The association between “Brain‐Age Score” (BAS) and traditional neuropsychological screening tools in Alzheimer's disease

Abstract

Introduction: We present the Brain‐Age Score (BAS) as a magnetic resonance imaging (MRI)‐based index for Alzheimer's disease (AD). We developed a fully automated framework for estimating the BAS in healthy controls (HCs) and individuals with mild cognitive impairment (MCI) or AD, using MRI scans.

Methods: We trained the proposed framework using 385 HCs from the IXI and OASIS datasets and evaluated it on 146 HCs, 102 stable‐MCI (sMCI), 112 progressive‐MCI (pMCI), and 147 AD patients from the J‐ADNI dataset. We used a correlation test to determine the association between the BAS and four traditional screening tools of AD: the Mini‐Mental State Examination (MMSE), Clinical Dementia Rating (CDR), Alzheimer's Disease Assessment Scale (ADAS), and Functional Assessment Questionnaire (FAQ). Furthermore, we assessed the association between the BAS and anatomical MRI measurements: the normalized gray matter (nGM), normalized white matter (nWM), normalized cerebrospinal fluid (nCSF), mean cortical thickness, and hippocampus volume.

Results: The correlation results demonstrated that the BAS is in line with traditional screening tools of AD (i.e., the MMSE, CDR, ADAS, and FAQ scores) as well as anatomical MRI measurements (i.e., nGM, nCSF, mean cortical thickness, and hippocampus volume).

Discussion: The BAS may be useful for assessing the level of brain atrophy and can be a reliable automated index for clinical applications, complementing neuropsychological screening tools.

patient's level of education. Thus, bias in these survey methods increases the uncertainty in the diagnosis of AD.
With respect to this point, an automated and physician-friendly computer screening of AD is urgently needed. Several research groups have investigated various fully automated approaches in AD studies (Bedell et al., 2014; Beheshti, Olya, & Demirel, 2016; Cuingnet et al., 2011; Franke, Ziegler, Klöppel, & Gaser, 2010; J Martinez-Murcia, Górriz, Ramírez, & Ortiz, 2016; Klöppel et al., 2008; Mikhno, Redei, Mann, & Parsey, 2015). Despite recent progress in the automated assessment of AD, the development of an automatic approach that addresses the uncertainty of diagnosing AD is still challenging and requires further data. Previous studies have shown that neuroimaging data along with advanced pattern recognition techniques can be used for predicting clinical scores (Moradi, Hallikainen, Hänninen, Tohka, & Neuroimaging, 2017; Shen et al., 2011; Stonnington et al., 2010) as well as chronological age (Cole, 2017). The brain-age technique was recently introduced as a powerful biomarker that can be used to estimate an individual's neuroanatomical age (Duchesne & Gravel, 2016; Franke et al., 2010). The use of the brain-age technique has helped reveal abnormal brain changes in many brain studies, such as those of AD (Franke et al., 2010), the prediction of the conversion of mild cognitive impairment (MCI) to AD (Gaser, Franke, Klöppel, Koutsouleris, & Sauer, 2013), and investigations of the brains of children and adolescents (Franke, Luders, May, Wilke, & Gaser, 2012), long-term meditation practitioners (Luders, Cherbuin, & Gaser, 2016), and schizophrenia patients (Koutsouleris et al., 2014).
Here, we introduce the "Brain-Age Score" (BAS), which we define as the difference between an individual's estimated neuroanatomical age and his or her chronological age, as a fully automated and reliable index that can be used to assess the level of AD in individual subjects. We developed an automated brain-age framework based on voxel-based morphometry (VBM) features obtained from structural magnetic resonance imaging (sMRI) data in order to estimate the neuroanatomical age of healthy controls, individuals with MCI, and individuals with AD.
The proposed framework was trained on 385 samples from the IXI dataset (http://www.brain-development.org/ixi-dataset/) and the OASIS dataset (http://www.oasis-brains.org/). We evaluated the proposed framework on 147 AD patients, 112 progressive-MCI (pMCI) patients, 102 stable-MCI (sMCI) patients, and 146 healthy controls (HCs) from the J-ADNI dataset. More information about J-ADNI is provided in the Appendix and in Fujishima et al. (2017). To test the efficiency of the proposed framework, we performed a correlation test between the BAS and the neuropsychological screening tools (i.e., MMSE, CDR, ADAS, and FAQ) as well as between the BAS and anatomical MRI measurements (i.e., normalized gray matter (nGM), normalized white matter (nWM), normalized cerebrospinal fluid (nCSF), mean cortical thickness, and hippocampus volume).
In light of the relevant literature, we hypothesized that the BAS would show adequate performance in the screening of the level of AD in a fully automated manner, using only a standard MRI scan.
Our findings demonstrate that (1) the BAS can be used to assess an individual's level of brain atrophy, and (2) it can be considered a reliable automated index for clinical applications alongside neuropsychological screening tools.

| LITERATURE REVIEW
In this section, we review the brain-age models as well as feature selection procedures in a series of neuroimaging studies. Several research groups have investigated automated methods to estimate the neuroanatomical age from sMRI data. As an example, the team of Gaser and colleagues (Franke et al., 2010) introduced an automated framework on the basis of GM density maps obtained through a standard VBM procedure and a regression model for estimating the brain age of healthy subjects. They examined the influence of different factors such as MRI preprocessing, regression models, data reduction, the use of different scanners, and the training sample size on the age estimation accuracy using sMRI data. In another study (Gaser et al., 2013), the researchers modeled a brain-age framework using a structural MRI database of 320 healthy controls obtained from the IXI and OASIS datasets, and they estimated the individual brain ages of MCI patients from the ADNI dataset. They reported an accuracy of up to 81% for predicting conversion to AD in MCI patients at the baseline. Another study presented longitudinal alterations in the BAS in 150 AD patients, 112 pMCI patients, 36 sMCI patients, and 108 HCs from the ADNI dataset, where correlation values of r = −0.46, r = 0.39, and r = 0.45 (p < 0.001) were achieved between the baseline BAS and the MMSE, the CDR, and the ADAS scores, respectively. In (Koutsouleris et al., 2014), the researchers examined neuroanatomical age estimation in individuals with schizophrenia and other mental disorders. They trained their proposed brain-age framework using a structural MRI database of 800 healthy subjects examined at five different centers, and then they evaluated individuals in at-risk mental states for psychosis and patients with borderline personality disorder, major depression, or schizophrenia.
The researchers in (Luders et al., 2016) considered a brain-age framework to investigate the neuroanatomical age in subjects who had been regularly engaging in meditation for a period of years. They examined over 650 subjects in the training phase and then evaluated a large sample of meditating subjects. They observed that at age 50, the brains of the meditating subjects were younger than those of age-matched healthy controls. In (Cole, Leech, & Sharp, 2015), the authors built a brain-age model on the basis of GM and WM density maps from 1,537 healthy individuals and then tested it on 113 independent healthy controls and 99 patients after traumatic brain injury (TBI). They reported a mean BAS of 4.66 and 5.97 years for the GM and WM modalities, respectively, among TBI patients. In another study, the authors built a brain-age model by combining the GM and WM modalities from 2,001 healthy controls to investigate the brain age of HIV-positive and HIV-negative subjects. They found that HIV-positive subjects have a significantly greater BAS than HIV-negative subjects.

Highlights
• We present the Brain-Age Score (BAS) as an MRI-based index for AD.
• Our automated computer-aided framework estimates the BAS in healthy and MCI/AD brains.
• The BAS is in line with traditional screening tools of AD as well as anatomical MRI measurements.
• The BAS could be used for diagnosing brain atrophy as a reliable automated index.
Many researchers have presented different dimensionality reduction and feature selection methods in machine learning for neuroimaging studies. For instance, the authors in (López et al., 2009) conducted a data reduction on features extracted from single photon emission computed tomography (SPECT) and positron emission tomography (PET) images by means of principal component analysis (PCA). In (Ramírez et al., 2010), a new data reduction method was introduced by means of partial least squares (PLS) to overcome the curse of dimensionality. The authors applied the proposed PLS-based method to SPECT data and extracted the score features for an AD classification task. PLS-based data reduction has been widely used in different neuroimaging studies (Chaves, Ramírez, Górriz, & Puntonet, 2012; Khedher, Ramírez, Górriz, Brahim, & Segovia, 2015; Ramírez et al., 2010; Segovia, Górriz, Ramírez, Salas-González, & Álvarez, 2013). The researchers in (Liu, Zhang, & Shen, 2016) used a sparse-based feature selection method to find informative features from template space for AD classification and MCI conversion prediction. In another study, the authors proposed a feature-ranking strategy for identifying the most informative features from a high-dimensional space in an AD classification task. The dimensionality of the selected features was determined by means of the Fisher criterion. It is worth noting that most of these approaches were employed in classification studies. In this study, we present an automatic feature selection approach on the basis of a feature-ranking strategy for a brain-age framework.

| STUDY PARTICIPANTS
A total of 892 structural MRI scans from the IXI, OASIS, and J-ADNI datasets were used. To separate the data used for training from the data used for evaluating, we divided the dataset into two main groups: (1) Training data: 385 healthy samples from the IXI and OASIS datasets of subjects aged ≥50 years, and (2) Evaluating data: 507 samples from the J-ADNI database. We divided the total of 507 participants into four groups based on the following criteria:
• AD subjects: MMSE score of 20-26, CDR of 0.5 or 1, and memory complaint.
• sMCI subjects: MMSE score 24-30, memory complaint (preferably corroborated by an informant), objective memory loss measured, a CDR of 0.5, absence of significant levels of impairment in other cognitive domains, and essentially preserved activities of daily living (if the diagnosis was MCI for ≥36 months).
• pMCI subjects: MMSE score 24-30, memory complaint (preferably corroborated by an informant), objective memory loss measured, a CDR of 0.5, absence of significant levels of impairment in other cognitive domains, essentially preserved activities of daily living (if the diagnosis was MCI at baseline but conversion to AD was reported after baseline within 6-36 months).
All of the subjects whose samples were evaluated had taken the MMSE, CDR, ADAS, and FAQ as neuropsychological screening tools.
The evaluating data were acquired at the baseline. Table 1 summarizes the demographic and clinical characteristics of the subjects. This study was approved by the Institutional Review Board at the National Center of Neurology and Psychiatry, Tokyo, Japan.

| METHODS
Here, we describe the computational processes that we applied to the data. The pipeline of the proposed framework is illustrated in Figure 1; the protocol was as follows. (1) We performed MRI preprocessing using a VBM approach on T1-weighted MRI data. (2) We used our proposed automated feature selection approach to select the most informative features. (3) We trained the proposed framework using the training data and then assessed the framework using the evaluating data. (4) We tested the results of our proposed approach using a correlation test between the BAS values and the subjects' neuropsychological scores (i.e., MMSE, CDR, ADAS, and FAQ) as well as the subjects' anatomical MRI measurements (i.e., nGM, nWM, nCSF, mean cortical thickness, and normalized hippocampus volume). Detailed explanations of these steps are provided next.

| MRI preprocessing
All T1-weighted MRI scans were preprocessed using the SPM8 package (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/) and the standard VBM8 toolbox (http://dbm.neuro.uni-jena.de/vbm). With the VBM8 toolbox, all of the subjects' MRI scans were bias-corrected, spatially normalized, and segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) components. In this study, we used only the GM images. As proposed in (Franke et al., 2010), we processed the GM images using an affine registration and then spatially smoothed them with an 8-mm full-width-half-maximum Gaussian kernel, followed by resampling to a spatial resolution of 8 mm. The voxel values for each subject were considered the MRI features.
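As a side note for readers reproducing the smoothing step: many image-processing libraries parameterize a Gaussian kernel by its standard deviation σ rather than by its full width at half maximum (FWHM). The conversion below is a general property of Gaussian kernels, not a detail reported by the study; applied to the 8-mm kernel it gives roughly 3.4 mm.

```latex
\sigma = \frac{\mathrm{FWHM}}{2\sqrt{2\ln 2}}
       \approx \frac{8\ \mathrm{mm}}{2.355}
       \approx 3.40\ \mathrm{mm}
```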
To evaluate the association between BAS values and anatomical MRI measurements, we performed a FreeSurfer image analysis (ver. 5.3.0; https://surfer.nmr.mgh.harvard.edu/) on the evaluating data to extract the hippocampus volumes as well as cortical thickness values. For each subject, the left and right hippocampus volumes were calculated and added together. The mean cortical thickness was computed by averaging the whole-brain cortical thickness values. In addition, the whole-brain GM, WM, and CSF volumes were extracted through VBM analysis. We normalized the GM, WM, and CSF volumes as well as the hippocampus volumes for head size by dividing each by the subject's intracranial volume, yielding the nGM, nWM, nCSF, and normalized hippocampus volume, respectively.
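The head-size normalization described above amounts to a simple ratio. The following is a minimal sketch, assuming all volumes share one unit and, purely for illustration, that the intracranial volume is the sum of the GM, WM, and CSF volumes (the text does not state how it was computed). The function name and the numbers are hypothetical.

```python
def normalize_by_icv(volume, icv):
    """Return a volume as a fraction of the intracranial volume (ICV)."""
    if icv <= 0:
        raise ValueError("intracranial volume must be positive")
    return volume / icv


# Hypothetical volumes for one subject (millilitres); not taken from the study.
subject = {"gm": 600.0, "wm": 500.0, "csf": 400.0,
           "hippo_left": 3.5, "hippo_right": 3.7}

# Assumption for this sketch only: ICV = GM + WM + CSF.
icv = subject["gm"] + subject["wm"] + subject["csf"]

ngm = normalize_by_icv(subject["gm"], icv)
nwm = normalize_by_icv(subject["wm"], icv)
ncsf = normalize_by_icv(subject["csf"], icv)
# Left and right hippocampal volumes are summed before normalization.
nhippo = normalize_by_icv(subject["hippo_left"] + subject["hippo_right"], icv)
```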

| Proposed feature selection
The purpose of feature selection is to identify the most informative subset of the available features, not only for data reduction but also for improving the performance. In this context, we present a new and automatic feature selection approach based on feature-ranking for our proposed brain-age framework. The purpose of any feature-ranking strategy is to sort the features based on their information and then select an optimal informative subset in order to speed up the learning process and improve the performance of models (Zhou & Wang, 2007). The details of the feature-ranking strategy were as described previously (Beheshti, Demirel, Farokhian, Yang, & Matsuda, 2016). The proposed feature selection approach is applied only to the training data. In the present study, we sorted all voxels based on the correlation between each voxel value and the subjects' chronological age, in an ascending order. Then, we increased the number of ranked features from 1 to the maximum number of features, calculating the respective mean absolute error (MAE) at each step within the cross-validation strategy.
Based on the cross-validation strategy, we divided the training data into k folds, where, in each step, k-1 folds are used for training a regression model and the remaining fold is used for testing. This process is performed k times, each time leaving out a different fold as test data, which is then used to estimate the respective MAE value. In this study, k = 5 was used. For each iteration (i.e., each increase in the number of ranked features), the total MAE was calculated by averaging the MAEs obtained across the cross-validation folds.
The optimal number of ranked features that minimizes the total MAE was selected as the optimal subset. Algorithm 1 shows the pseudo-code of the proposed feature selection method to determine the optimal-feature-subset. The advantage of this procedure is that the top informative voxels are determined in a fully automated manner.
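The ranking and subset-size search described in this subsection can be sketched as follows. This is not a transcription of Algorithm 1: it assumes that voxels are ordered by absolute correlation with age, strongest first, and, to stay dependency-free, it swaps in a deliberately simple stand-in regressor (the average of per-feature least-squares predictions) where the framework uses SVR. All function names are hypothetical.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 for a constant input."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def rank_features(X, ages):
    """Feature indices ordered by |r| with age, strongest first (assumed order)."""
    n_feat = len(X[0])
    strength = [abs(pearson_r([row[j] for row in X], ages)) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: strength[j], reverse=True)


def fit_line(x, y):
    """Least-squares slope and intercept for a single feature."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def predict_ages(X_train, y_train, X_test, cols):
    """Stand-in regressor (NOT SVR): average of per-feature linear predictions."""
    lines = [fit_line([row[j] for row in X_train], y_train) for j in cols]
    return [mean(s * row[j] + b for (s, b), j in zip(lines, cols)) for row in X_test]


def cv_mae(X, ages, cols, k=5, seed=0):
    """Average of the per-fold MAEs from k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_maes = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        preds = predict_ages([X[i] for i in train], [ages[i] for i in train],
                             [X[i] for i in test], cols)
        fold_maes.append(mean(abs(p - ages[i]) for p, i in zip(preds, test)))
    return mean(fold_maes)


def select_features(X, ages, k=5):
    """Grow the ranked subset one feature at a time; keep the size with lowest CV MAE."""
    ranking = rank_features(X, ages)
    scores = [cv_mae(X, ages, ranking[:m], k) for m in range(1, len(ranking) + 1)]
    best_m = min(range(len(scores)), key=scores.__getitem__) + 1
    return ranking[:best_m], scores[best_m - 1]
```

Here `X` is a list of per-subject feature rows and `ages` the matching chronological ages; `select_features(X, ages)` returns the chosen column indices and their cross-validated MAE. With thousands of voxels, this exhaustive loop over subset sizes would be slow in pure Python, so a practical implementation would likely vectorize it.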

| The support vector regression algorithm
For the prediction of the brain age, we established the framework using a support vector regression (SVR) algorithm (Smola & Schölkopf, 2004) because of its desirable characteristics and easy computation for a high-dimensional feature space. The estimated linear regression function f(x) can be represented as follows (Zhang, Zheng, Xia, Wang, & Chen, 2017):

$$f(x) = w^{\top}\varphi(x) + b$$

in which x stands for an input (i.e., the MRI features in our study) and φ refers to a kernel function. In addition, b and w denote the offset of the regression line and the slope, respectively.

Table 1. Characteristics of the study participants

| Validation and statistical analysis
We calculated the accuracy of the age estimator using the MAE and root mean square error (RMSE) as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|g'_i - g_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(g'_i - g_i\right)^{2}}$$

where n is the number of subjects in the test sample, and g'_i and g_i denote the estimated age and the chronological age, respectively.

Figure 2. The illustration of selected voxels based on the absolute correlation of each voxel value and the subjects' chronological age

Table 2. The performance comparison of the proposed feature selection method with using all features and the state-of-the-art techniques applied to HC subjects from the J-ADNI dataset

Figure 3. Scatterplot and linear fit for chronological age versus predicted age in the evaluation group (HC, n = 147; sMCI, n = 102; pMCI, n = 112; and AD, n = 146)
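As a sketch of the two accuracy measures named in this subsection, the MAE and RMSE can be computed from paired lists of estimated and chronological ages. The function names and the example ages are hypothetical and are not taken from the study.

```python
from math import sqrt


def mae(estimated, chronological):
    """Mean absolute error between estimated and chronological ages."""
    n = len(chronological)
    return sum(abs(e - c) for e, c in zip(estimated, chronological)) / n


def rmse(estimated, chronological):
    """Root mean square error between estimated and chronological ages."""
    n = len(chronological)
    return sqrt(sum((e - c) ** 2 for e, c in zip(estimated, chronological)) / n)


# Hypothetical ages in years for three test subjects.
chronological_ages = [60.0, 70.0, 80.0]
estimated_ages = [63.0, 66.0, 80.0]

example_mae = mae(estimated_ages, chronological_ages)    # (3 + 4 + 0) / 3
example_rmse = rmse(estimated_ages, chronological_ages)  # sqrt((9 + 16 + 0) / 3)
```

Because the RMSE squares each error before averaging, it is never smaller than the MAE and weights large individual misestimates more heavily.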

Figure 4. The box plots with the BAS (years) in the HC, sMCI, pMCI, and AD subjects from the J-ADNI dataset. *p < 0.05, **p < 0.001

| RESULTS
In this section, we evaluate the proposed framework on the J-ADNI dataset including 147 AD, 112 pMCI, and 102 sMCI patients and 146 HCs for the determination of the association between the BAS and the neuropsychological screening tools as well as between the BAS and anatomical MRI measurements. All data used in the evaluation step were acquired at the baseline.

| Performance of the proposed feature selection
As described above in section 4.1, the normalized and smoothed GM images were resampled to 8-mm isotropic spatial resolution.
This procedure generated 3,747 voxel values per subject, which were used as MRI features. We applied the proposed feature selection approach to select the most informative voxels for the brain-age estimation. A total of 665 voxels were selected through the proposed feature selection procedure; Figure 2 illustrates the selected voxels. For the evaluation of the proposed feature selection procedure, we used the HCs from the J-ADNI dataset.
The proposed feature selection method not only introduced a dimensionality reduction but also reduced the overall MAE from 4.35 years (95% confidence interval [3.83-4.87]) to 4.02 years, with an overall correlation of r = 0.68 (p < 0.001). Table 2 presents the details of the performance of the brain-age framework using all feature vectors and our proposed feature selection approach, as well as a comparison with other studies. As can be seen from Table 2, the performance of the proposed framework is quite comparable to other recent brain-age estimations of healthy control subjects (Franke & Gaser, 2014; Franke et al., 2010; Gaser et al., 2013; Koutsouleris et al., 2014; Lin et al., 2016; Wang, Dai, Li, Hua, & He, 2012).

| Estimating the brain age in the evaluating group
We applied the proposed brain-age estimation framework to the HCs and sMCI, pMCI, and AD patients from the J-ADNI dataset to estimate the respective BAS values. A positive BAS indicates that the individual's brain is estimated to be older than his or her chronological age and, consequently, that accelerated brain aging has occurred for this individual. Conversely, a negative BAS indicates that the individual's brain is estimated to be younger than his or her chronological age. The mean BAS values were as follows: HCs: +0.07 year, sMCI patients: +2.38 years, pMCI patients: +3.15 years, and AD patients: +5.36 years. As can be seen from Figures 3 and 4, our proposed brain-age estimator indicated differences in the BAS at different stages of AD, and our results suggest that the mean BAS increases across the spectrum from healthy controls to individuals with sMCI, pMCI, or AD.
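The sign convention described above (a positive BAS for a brain estimated to be older than the chronological age) corresponds to subtracting the chronological age from the estimated age. The sketch below illustrates that convention and the per-group averaging; the function names and the four records are hypothetical and are not study data.

```python
from statistics import mean


def brain_age_score(estimated_age, chronological_age):
    """BAS in years: positive when the brain is estimated to be older."""
    return estimated_age - chronological_age


def mean_bas_by_group(records):
    """Average BAS per diagnostic group from (group, chronological, estimated) tuples."""
    scores = {}
    for group, chronological, estimated in records:
        scores.setdefault(group, []).append(brain_age_score(estimated, chronological))
    return {group: mean(values) for group, values in scores.items()}


# Hypothetical records: (group, chronological age, estimated age), in years.
records = [
    ("HC", 70.0, 70.5),
    ("HC", 65.0, 64.5),
    ("AD", 72.0, 78.0),
    ("AD", 68.0, 72.0),
]
group_means = mean_bas_by_group(records)
```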

| Association between the BAS and the traditional neuropsychological screening tools as well as between the BAS and anatomical MRI measurements
To determine the association between the BAS results and those of the four traditional neuropsychological screening tools as well as between the BAS results and anatomical MRI measurements, we analyzed the respective neuropsychological screening scores as well as anatomical MRI measurements in the 147 AD patients, 112 pMCI patients, 102 sMCI patients, and 146 HCs from the J-ADNI dataset. The subjects' neuropsychological scores (i.e., MMSE, CDR, ADAS, and FAQ) and anatomical MRI measurements (i.e., nGM, nWM, nCSF, mean cortical thickness, and normalized hippocampus volume) are given in Figures 5 and 6, respectively.

Figure 6. Box plots of anatomical MRI measurement distributions for the evaluating group: (a) nGM, (b) nWM, (c) nCSF, (d) mean cortical thickness, and (e) normalized hippocampus volume. The anatomical MRI measurements were acquired at the baseline. *p < 0.05; **p < 0.001
The associations between the BAS results and the neuropsychological scores as well as the associations between the BAS results and anatomical MRI measurements are illustrated in Figures 7 and 8, respectively. According to the correlation test results, the BAS is related to the MMSE, CDR, ADAS, and FAQ scores as well as to the nGM, the nCSF, the mean cortical thickness, and the normalized hippocampus volume. There was no significant correlation between the BAS and the nWM.
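A correlation test of this kind can be sketched as below. The text does not specify which test statistic was used, so this example assumes a Pearson correlation and approximates the two-sided p-value with the Fisher z-transformation (a t-distribution-based test is the more common exact choice). The function names and the five paired values are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: no correlation (Fisher z)."""
    if n <= 3:
        raise ValueError("need more than three observations")
    if abs(r) >= 1.0:
        return 0.0
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical paired values (BAS in years, MMSE score); not study data.
bas = [0.1, 2.0, 3.5, 5.0, 6.2]
mmse = [29, 27, 26, 23, 21]
r_example = pearson_r(bas, mmse)
p_example = fisher_z_p_value(r_example, len(bas))
```

A negative coefficient for the MMSE is the expected direction, since lower MMSE scores indicate greater impairment while a higher BAS indicates more advanced brain aging.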

| DISCUSSION
We conducted the present empirical study to develop an automated framework for estimating the neuroanatomical age in individuals with AD or MCI and healthy subjects, and to determine the utility of our index (the BAS) as a computerized AD screening score. The BAS of a subject is determined by subtracting the subject's chronological age from his or her estimated neuroanatomical age. The results of our present analyses demonstrated that the difference between the neuroanatomical age and chronological age increased along with the level of AD; the mean BAS values were +0.07, +2.38, +3.15, and +5.36 years for our HCs, sMCI, pMCI, and AD subjects, respectively.
A higher positive BAS implies higher accelerated/precocious brain aging, with an increasing BAS reflecting increasing severity of the level of AD.
Consequently, the mean BAS values of +5.36 and +3.15 years for our present AD and pMCI groups indicate that the accelerated brain aging among AD and pMCI patients is higher than that of the sMCI and HC subjects, whose mean BAS values were +2.38 and +0.07 years, respectively. Our statistical analysis showed a greater elevation of predicted age over chronological age in younger AD patients than in older AD patients. This finding is in agreement with other studies that demonstrated a faster rate of AD progression in early-onset than in late-onset AD patients (Koss et al., 1996). In several recent studies, the authors used the PCA method to reduce the high-dimensional input space to a low-dimensional space (Franke, Hagemann, Schleussner, & Gaser, 2015; Franke et al., 2010, 2017). In the PCA method, the number of principal components has a major effect on the performance, and it is usually determined manually. The effect of PCA data reduction and the influence of different numbers of principal components on the accuracy of an age estimation model have been addressed (Franke et al., 2010). One advantage of our proposed brain-age framework is the introduction of a novel and fully automated feature selection approach based on feature-ranking for brain-age estimations. The proposed feature selection method not only introduces data reduction but also improves the framework's performance by decreasing the MAE.

Figure 8. The correlations between the BAS results and the anatomical MRI measurements in the evaluation group: (a) nGM, (b) nWM, (c) nCSF, (d) mean cortical thickness, and (e) normalized hippocampus volume. The anatomical MRI measurements were acquired at the baseline
We also assessed the association between the obtained BAS values and the neuropsychological scores as well as between the BAS results and anatomical MRI measurements. The correlations obtained with our proposed brain-age approach are consistent with the results of the traditional screening tools (i.e., the MMSE, CDR, ADAS, and FAQ) and anatomical MRI measurements (i.e., nGM, nCSF, mean cortical thickness, and hippocampus atrophy) for the diagnosis of AD. The significant correlation (p < 0.001) between the BAS and the two general cognitive assessments in AD (the MMSE and CDR) implies that the BAS has potential as a fully automated criterion that can be used to assess the level of brain atrophy in AD.
The main advantage of our proposed brain-age approach is that it is fully automated at all stages (from preprocessing and feature selection to training a regression model) and requires only an MRI scan of the brain. Our approach summarizes a pattern across the whole brain in one single value (the BAS). In contrast, common approaches in the diagnosis of AD (e.g., the MMSE, CDR, ADAS, and FAQ) rely on the results of a survey, which can be influenced by common method bias.
One limitation of our study is that the proposed approach using which are widely used in AD studies. This is why the prediction of MCI-to-AD conversion is one of the most difficult tasks in AD studies. In future research, we plan to use the cortical thickness and diffusion properties (Irimia, Torgerson, Goh, & Van Horn, 2015) and to combine MRI features with nonimaging characteristics

(Folstein et al., 1975), Japanese version; a global score of 0 on the CDR, Japanese version; and an education-adjusted score above the cutoff level on the Wechsler Memory Scale-Revised (WMS-R) Logical Memory II (Wechsler & Stone, 1987), Japanese version (education for 0-9 years was ≥3, for 10-15 years was ≥5, and for >15 years was ≥9). The inclusion criteria for the MCI subjects were a score of 24-30 on the MMSE, memory disturbance identified by the study partner with or without the subjective complaint of the participant, a score of 0.5 on the CDR, and an education-adjusted score below the cutoff level on the WMS-R Logical Memory II (education for 0-9 years was ≤2, for 10-15 years was ≤4, and for >15 years was ≤8). The inclusion criteria for AD subjects were a score of 20-26 on the MMSE, a score of 0.5 or 1 on the CDR, and an education-adjusted score below the cutoff level on the WMS-R Logical Memory II (same as for MCI). AD subjects also had to meet the criteria of the NINCDS-ADRDA (the National Institute of Neurological and Communicative Diseases and Stroke and the Alzheimer's Disease and Related Disorders Association) for probable AD (McKhann et al., 1984). Exclusion criteria included brain lesions on screening or baseline MRI, neurological and psychiatric disorders other than AD, addiction to alcohol or other drugs, and use of psychoactive drugs or warfarin.
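The education-adjusted WMS-R Logical Memory II cutoffs listed above can be read as a small lookup rule. The sketch below encodes only the thresholds stated in the text (scores of at least 3, 5, and 9 count as above the cutoff for 0-9, 10-15, and more than 15 years of education, respectively); the function names are hypothetical, and integer scores are assumed.

```python
def lm2_cutoff(education_years):
    """Lowest Logical Memory II score counted as above the cutoff."""
    if education_years < 0:
        raise ValueError("education years cannot be negative")
    if education_years <= 9:
        return 3
    if education_years <= 15:
        return 5
    return 9


def is_above_cutoff(score, education_years):
    """True for the 'above cutoff' range; False for the 'below cutoff' range."""
    return score >= lm2_cutoff(education_years)
```

For example, a participant with 12 years of education and a score of 4 falls below the cutoff (the range used for the MCI and AD groups), whereas a score of 5 falls above it.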
The institutional review boards at all participating sites approved the data collection procedures and written informed consent was obtained from all participants. If participants were not capable of agreeing, their study partner signed the informed consent form in substitution. A total of 750 participants were first recruited at the 38 clinical sites in Japan. Those who provided written informed consent and passed screening based on the above inclusion/exclusion criteria were enrolled in the J-ADNI study. Finally, 537 participants were enrolled. The 537 participants underwent brain MRI at baseline.
Follow-up MRI was performed at 6, 12, and 24 months for all participants and at 36 months only for MCI and CN participants.
MCI participants additionally underwent MRI at 18 months. Clinical and cognitive assessments were also performed for all participants at the time of the baseline and follow-up scans. These assessments included MMSE, ADAS-Cog, and CDR-SB. Data were used for analysis from 146 AD, 102 sMCI, 112 pMCI and 147 NC participants.