Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.ucla.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data, but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.ucla.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf
Structural magnetic resonance imaging (MRI) is sensitive to neurodegeneration and can be used to estimate the risk of converting to Alzheimer's disease (AD) in individuals with mild cognitive impairment (MCI). Brain changes in AD and prodromal AD involve a pattern of widespread atrophy. The use of multivariate analysis algorithms could enable the development of diagnostic tools based on structural MRI data. In this study, we investigated the possibility of combining multiple MRI features in the form of a severity index.
We used baseline MRI scans from two large multicentre cohorts (AddNeuroMed and ADNI). On the basis of volumetric and cortical thickness measures at baseline with AD cases and healthy control (CTL) subjects as training sets, we generated an MRI-based severity index using the method of orthogonal projection to latent structures (OPLS). The severity index tends to be close to 1 for AD patients and 0 for CTL subjects. Values above 0.5 indicate a more AD-like pattern. The index was then estimated for subjects with MCI, and the accuracy of classification was investigated.
Based on the data at follow-up, 173 subjects converted to AD, of whom 112 (64.7%) were classified as AD-like and 61 (35.3%) as CTL-like.
We found that joint evaluation of multiple brain regions provided accurate discrimination between progressive and stable MCI, with better performance than hippocampal volume alone, or a limited set of features. A major challenge is still to determine optimal cut-off points for such parameters and to compare their relative reliability.
Alzheimer's disease (AD) is a progressive age-related neurodegenerative disease and a growing health problem. Definite diagnosis can only be made post-mortem, and requires histopathological confirmation of amyloid plaques and neurofibrillary tangles. At the time of clinical manifestation of dementia, significant irreversible brain damage is already present. Therefore, an accurate diagnosis of AD at an early stage is a prerequisite for initiating disease-modifying treatments. Mild cognitive impairment (MCI) is a heterogeneous syndrome recently recognized as a diagnostic entity that includes the prodromal stage of AD. Thus, subjects with MCI have a markedly increased risk of developing AD, with a conversion rate to AD of 15–20% per year in memory clinic settings (in the general population, the conversion rate is 1–2%). However, not all subjects with MCI go on to develop AD and some may even revert to normal cognition.
Neuroimaging biomarkers follow a dynamic model of change during different stages of the disease and could be valuable predictors of patient outcome. Structural magnetic resonance imaging (MRI) is sensitive to neurodegeneration and analysis of structural changes can be used to estimate the risk of converting to AD in individuals with MCI. The ability to identify an individual at risk of developing AD will be critical if disease-modifying treatments become available.
Brain changes in AD and prodromal AD lead to a pattern of widespread atrophy (measured as both volume and thickness), involving a number of different structures across the brain (e.g. hippocampus, entorhinal cortex, cingulate gyrus and frontal cortices) [4, 5].
Advances in statistical learning with the development of new multivariate and machine learning algorithms capable of dealing with high-dimensional data [e.g. support vector machines and orthogonal projection to latent structures (OPLS)] could enable the development of new diagnostic tools based on structural MRI data [6, 7]. These techniques, based on the principle of multivariate statistics, can be used to perform studies across multiple dimensions of data, whilst taking into account the effects of all variables on the responses of interest.
Published neuroimaging results are usually difficult to compare because of two main issues: sample size and unaligned MRI acquisition protocols. We have shown previously that the pattern of structural brain differences is similar when comparing two large cohorts with aligned MRI acquisition protocols, regardless of the demographic characteristics. Results from combined large datasets with long follow-up periods would be easier to extrapolate to the general population, giving a more complete picture of dementia/AD. It is anticipated that this will lead to an earlier diagnosis for individual subjects and provide suitable markers for treatment response.
The main goal of this study was to derive an MRI-based severity index, based on multiple MRI features, with potential clinical value for estimating the future clinical progression of subjects with MCI. We used baseline MRI data from two large multicentre cohorts: the AddNeuroMed study, part of the Innovative Medicines in Europe (InnoMed) project, and the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. On the basis of volumetric and cortical thickness measures, and using the AD cases and healthy control (CTL) subjects as training sets, we used the multivariate technique OPLS to generate a severity index. The index was then estimated for subjects with MCI and the accuracy of classification was evaluated using the available follow-up clinical diagnosis as a priori information. Analysis was performed to investigate the cognitive profile of the subjects and the influence of apolipoprotein E ε4 (ApoE4) status in connection with the severity index. We also investigated whether additional characteristics of the study subjects (age, education, cognitive profile and ApoE4 status) validate the severity index in those with MCI who did not progress to AD during the study period.
Material and methods
Data of subjects from two large multicentre studies, AddNeuroMed and ADNI, were used for this study.
The AddNeuroMed project is part of the InnoMed European Union FP6 programme, designed to develop and validate novel surrogate markers in AD. It includes a human neuroimaging component [9, 10] which combines MRI data with other biomarker and clinical information. Data were collected from six different sites across Europe: University of Kuopio, Finland; University of Perugia, Italy; Aristotle University of Thessaloniki, Greece; King's College London, UK; University of Łodz, Poland; and University of Toulouse, France. Written consent was obtained from research participants where possible; in those individuals in whom capacity was compromised by dementia, assent from the patient and written consent from a relative, according to local laws, was obtained. This study was approved by ethical review boards in each participating country. A total of 348 subjects from the AddNeuroMed project were included in this study: 119 AD patients, 119 MCI patients and 110 healthy CTL subjects.
Data from the ADNI cohort were obtained from the ADNI database (www.loni.ucla.edu/ADNI). ADNI was launched in 2003 by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, the Food and Drug Administration, private pharmaceutical companies and nonprofit organizations as a 5-year public–private partnership. The primary goal of ADNI has been to test whether serial MRI in MCI and early AD could establish a set of sensitive and specific markers of very early AD progression, to aid researchers and clinicians in developing new treatments and monitoring their effectiveness, as well as to lessen the duration and cost of clinical trials. Subjects aged 55–90 years from more than 50 sites across the USA and Canada participated in the ADNI study; more detailed information is available at www.adni-info.org. For the present study, 716 subjects were included from the ADNI cohort: 176 AD patients, 315 MCI patients and 225 healthy CTL subjects.
Inclusion and exclusion criteria
For the AddNeuroMed cohort, inclusion criteria for the AD group were the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS/ADRDA) and Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria for probable AD, a Mini Mental State Examination (MMSE) score between 12 and 28, and age 65 years or above. Exclusion criteria were significant neurological or psychiatric illness other than AD and significant unstable systemic illness or organ failure. All AD subjects had a Clinical Dementia Rating (CDR) scale score of ≥0.5.
Criteria for inclusion in the CTL and MCI groups were MMSE score between 24 and 30, Geriatric Depression Scale score of <5, age 65 years or above, stable medication and good general health, whereas exclusion criteria were DSM-IV criteria for dementia, significant neurological or psychiatric illness other than AD, and significant unstable systemic illness or organ failure. Discrimination between patients with MCI and CTL subjects was based on two criteria: (i) CDR score of 0 for CTL subjects and CDR of 0.5 for those with MCI; and (ii) reported occurrence of memory problems (by subject or informant) for MCI patients.
CDR, MMSE and the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) cognitive battery scores were assessed for each subject. The CERAD cognitive battery was replaced with the Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog) for patients with AD. This cognitive test is specially designed for AD trials. Both the ADAS-Cog and the CERAD battery use the same 10-word recall task, although the scoring in the two tests is in the opposite direction. The mean number of words that were not recalled in the word list of the CERAD immediate recall task was calculated. The variable obtained was termed ADAS1, corresponding to the first subtest of ADAS-Cog. This was performed to provide comparable measures for the ADNI and AddNeuroMed cohorts.
For the ADNI cohort, a detailed description of the inclusion criteria can be found at http://www.adni-info.org/Scientists/AboutADNI.aspx#. Subjects were between 55 and 90 years of age, had a study partner who was able to provide independent evaluation of functioning, and spoke either English or Spanish. All subjects were willing and able to undergo all test procedures including neuroimaging and agreed to longitudinal follow-up. Subjects using specific psychoactive medications were excluded.
Inclusion criteria for the AD group were MMSE score between 20 and 26, CDR scale score of 0.5 or 1 and NINCDS/ADRDA criteria for probable AD. For inclusion in the MCI group, criteria were MMSE score between 24 and 30, memory problems with objective memory loss measured with the Wechsler Memory Scale Logical Memory II (education-adjusted scores), CDR score of 0.5, absence of significant levels of impairment in other cognitive domains, preservation of activities of daily living and absence of dementia. Inclusion criteria for the CTL group were MMSE score between 24 and 30, CDR score of 0, and absence of depression, MCI and dementia.
A total of 1064 subjects were included in this current study. Ten MCI subjects were classified as CTLs at follow-up and were excluded from the analysis. The characteristics of the study subjects are presented in Table 1, with the demographics of the individual ADNI and AddNeuroMed cohorts shown in Supplementary Table 1. Although both studies have a longitudinal design, only baseline MRI data were analysed in this study.
Table 1. Baseline characteristics
Data are mean ± SD. AD, Alzheimer's disease; MCI, mild cognitive impairment (c, converter; s, stable); CTL, healthy control; MMSE, Mini Mental State Examination; CDR-SOB, Clinical Dementia Rating – Sum of Boxes; ADAS1, word list nonlearning. Subjects with MCI are divided into MCI-s and MCI-c subgroups based on their follow-up diagnosis at 12, 18, 24 and 36 months.
Statistically significant difference between MCI-s and MCI-c, ANOVA.
Based on the follow-up diagnosis (12 months for the AddNeuroMed study and 12, 18, 24 and 36 months for the ADNI study), subjects with MCI were divided into two groups: those who did not progress to AD (stable MCI; MCI-s) and those who did progress to AD (MCI converting to AD; MCI-c).
MRI acquisition protocol
Data acquisition for the AddNeuroMed study was designed to be compatible with that of ADNI. The imaging protocol for both studies included a high-resolution sagittal 3D T1-weighted MPRAGE volume (voxel size 1.1 × 1.1 × 1.2 mm³) and axial proton density/T2-weighted fast spin echo images. The MPRAGE volume was acquired using a customized pulse sequence specifically designed for the ADNI study to ensure compatibility across scanners. Full brain and skull coverage was required and detailed quality control was carried out on all MRI data according to the AddNeuroMed quality control procedure [9, 10].
Postacquisition image analysis
Volumetric segmentation, cortical surface reconstruction and cortical parcellation, based on the FreeSurfer software package, version 4.5.0 (http://surfer.nmr.mgh.harvard.edu/), were used to quantify the baseline thicknesses and volumes of brain regions, as described in detail previously [14, 15]. The procedure automatically assigns a neuroanatomical label to each voxel in an MRI volume based on probabilistic information automatically estimated from a manually labelled training set. The regional cortical thickness was measured from 34 areas and the regional volumes were measured from 23 areas (see Table 2). Left- and right-sided thicknesses were averaged. Volumetric measures were corrected for differences in head size by dividing each measurement by the estimated total intracranial volume. This segmentation approach has been previously shown to be comparable in accuracy with manual labelling. The atlas-based normalization procedure increases the robustness and accuracy of the segmentation across scanner platforms. This segmentation approach has been used for multivariate classification of patients with AD and healthy CTL subjects [17, 18], neuropsychological image analysis, imaging genetic analysis and biomarker discovery.
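The two normalization steps described above (averaging left and right thickness values, and dividing volumes by the estimated total intracranial volume) can be sketched as follows. This is a minimal illustration only: the function names and the numbers are hypothetical, and it is not part of the FreeSurfer pipeline.

```python
import numpy as np

def average_hemispheres(left_mm, right_mm):
    """Mean of left- and right-hemisphere cortical thickness for each region."""
    left = np.asarray(left_mm, dtype=float)
    right = np.asarray(right_mm, dtype=float)
    return (left + right) / 2.0

def normalize_by_icv(volumes_mm3, icv_mm3):
    """Divide each regional volume by that subject's estimated total intracranial volume."""
    volumes = np.asarray(volumes_mm3, dtype=float)
    icv = np.asarray(icv_mm3, dtype=float)
    return volumes / icv[:, None]  # one ICV value per subject (row)

# Illustrative values only (two subjects, two regions); not study data.
left = np.array([[2.4, 3.1], [2.0, 2.8]])
right = np.array([[2.6, 3.3], [2.2, 2.6]])
thickness = average_hemispheres(left, right)

volumes = np.array([[3500.0, 1500.0], [3000.0, 1800.0]])
icv = np.array([1.5e6, 1.2e6])
volume_fraction = normalize_by_icv(volumes, icv)
```

The resulting volume fractions are dimensionless, so subjects with different head sizes become directly comparable.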
Table 2. Variables included in OPLS analysis
Cortical thickness measures
Banks of superior temporal sulcus
Caudal anterior cingulate
Caudal middle frontal gyrus
Inferior parietal cortex
Inferior temporal gyrus
Isthmus of cingulate cortex
Lateral occipital cortex
Lateral orbitofrontal cortex
Medial orbitofrontal cortex
Middle temporal gyrus
Triangular part of inferior frontal gyrus
Posterior cingulate cortex
Rostral anterior cingulate cortex
Rostral middle frontal gyrus
Superior frontal gyrus
Superior parietal gyrus
Superior temporal gyrus
Transverse temporal cortex
Volumetric measures
Corpus callosum anterior
Corpus callosum central
Corpus callosum midanterior
Corpus callosum midposterior
Corpus callosum posterior
Cerebellar white matter
Cerebral white matter
Inferior lateral ventricle
To determine the sensitivity and specificity for discriminating between AD and healthy CTL subjects, orthogonal projection to latent structures (OPLS), a supervised multivariate data analysis method, was employed using the SIMCA-P+ software package (UMETRICS AB, Umeå, Sweden). FreeSurfer-derived MRI measures were analysed using the OPLS method [22-24].
Preprocessing was performed using unit variance scaling and mean centring. Variables with a high level of variance are more likely to be expressed in modelling than those with a low variance. Therefore, unit variance scaling was selected to scale the data appropriately. This method uses the inverse standard deviation as a scaling weight for each variable. Mean centring improves the interpretability of the data, by subtracting the variable average from the data; thus the dataset is repositioned around the origin. In this study, we used seven-fold cross-validation, which means that one-seventh of the data is omitted for each cross-validation round. Balanced groups were maintained during all cross-validation rounds so that further analysis was not affected. The predictive component is given a Q2(Y) value that describes its statistical significance for separating groups. Q2(Y) values >0.05 are regarded as statistically significant (http://www.umetrics.com/Content/Document%20Library/Files/UserGuides-Tutorials/SIMCA-P_12_UG.pdf).
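The preprocessing and cross-validation steps described above can be sketched in Python. This is an illustration under stated assumptions rather than the SIMCA-P+ implementation: the function names are hypothetical, the fold assignment is one simple way of keeping groups balanced, and Q2 is written with the commonly used definition 1 − PRESS/SS, which may differ in detail from the software's computation.

```python
import numpy as np

def autoscale(X_train, X_new=None):
    """Mean-centre and scale to unit variance using training-set statistics."""
    X_train = np.asarray(X_train, dtype=float)
    mean = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # guard against constant columns
    target = X_train if X_new is None else np.asarray(X_new, dtype=float)
    return (target - mean) / sd

def stratified_folds(y, n_folds=7, seed=0):
    """Assign each sample to a fold so that class proportions stay balanced."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    fold = np.empty(len(y), dtype=int)
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        fold[idx] = np.arange(len(idx)) % n_folds
    return fold

def q2(y_true, y_cv_pred):
    """Assumed definition: Q2 = 1 - PRESS / SS, with SS taken around the mean of y."""
    y_true = np.asarray(y_true, dtype=float)
    press = np.sum((y_true - np.asarray(y_cv_pred, dtype=float)) ** 2)
    ss = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - press / ss

# Synthetic illustration (not study data): 35 subjects per group, 5 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(70, 5))
y = np.repeat([0, 1], 35)
Xs = autoscale(X)
folds = stratified_folds(y)
```

In each cross-validation round, the subjects in one fold would be left out, the model fitted on the remaining six folds, and the left-out subjects predicted; the pooled predictions are then passed to `q2`.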
A total of 57 variables were used for OPLS analysis. No feature selection was performed; in other words, all measured variables were included in the analysis. OPLS classifiers were trained on all data (the 57 variables in Table 2) from all subjects in the combined group of AD and CTL subjects and were then applied to data from subjects with MCI. If specific regions were excluded, the models might be less representative, and structural features measured from a limited set of predefined regions might not reflect the complete pattern of structural abnormalities. Furthermore, Cuingnet et al. have shown that feature selection does not improve the classification, but does increase computational time. The effect of feature selection was investigated in another recent study and was shown to improve the results for small cohorts, but to have little effect on larger samples. The cohort size in this study was much larger than the largest sample used in the latter study.
It is, however, important to compare new methods with established approaches. Therefore, as well as using the OPLS technique on the 57 regional MRI measures, we created a further three models based on (i) hippocampal volume alone, (ii) the combination of hippocampal volume and lateral ventricles as recently suggested by Heister et al. and (iii) 10 temporal lobe and ventricular measures based on previously reported information about the most affected regions (hippocampus, entorhinal cortex, inferior temporal gyrus, middle temporal gyrus, superior temporal gyrus, parahippocampal gyrus, lateral ventricles, inferior lateral ventricles and third and fourth ventricles).
Orthogonal projection to latent structures is a supervised method, which means that it uses both X and Y variables. The X variables are the original variables (volumes and cortical thickness measures) and Y contains the information about group membership. Y is set to 1 for AD cases and 0 for CTL subjects in the AD versus CTL model. The prediction value for a subject to belong to a group is equal to 1 for maximum likelihood and 0 for minimum likelihood, or vice versa depending on the group. The cut-off value for accepting the observation as correctly predicted is 0.5. When the model is generated, each subject receives a predicted Y value whilst it is omitted from the modelling during the cross-validation rounds and then predicted on to the model. The AD versus CTL model was used as a classifier to investigate how well it could predict conversion from MCI to AD. Each individual MCI subject was predicted on to the AD versus CTL model and this produced a discriminant index (the severity index based on MRI data) for each individual with MCI, reflecting the degree to which the individual's MR pattern resembled the pattern of AD subjects or the pattern of CTL subjects. MCI subjects demonstrating a more AD-like or CTL-like pattern than the AD and CTL subjects used to generate the OPLS model may be characterized by OPLS scores above 1 or below 0, respectively, as shown by others. A more detailed description of the method has been reported previously.
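How a severity index of this kind arises can be sketched in Python. The sketch rests on an assumption: a single-component PLS regression is used as a stand-in, since an OPLS model with one predictive and no orthogonal components reduces to that form. The function names and the synthetic data are hypothetical, and this is not the SIMCA-P+ implementation.

```python
import numpy as np

def fit_one_component_pls(X, y):
    """Fit a single-component PLS regression (assumed stand-in for a 1+0 OPLS model)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, x_sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean = y.mean()
    Xc = (X - x_mean) / x_sd       # mean centring and unit variance scaling
    yc = y - y_mean
    w = Xc.T @ yc
    w /= np.linalg.norm(w)         # weight vector of the predictive component
    t = Xc @ w                     # scores of the training subjects
    q = (t @ yc) / (t @ t)         # regression of Y on the scores
    return {"x_mean": x_mean, "x_sd": x_sd, "y_mean": y_mean, "w": w, "q": q}

def severity_index(model, X_new):
    """Predict new subjects on to the model: near 1 is AD-like, near 0 is CTL-like."""
    Xc = (np.asarray(X_new, dtype=float) - model["x_mean"]) / model["x_sd"]
    return model["y_mean"] + (Xc @ model["w"]) * model["q"]

def classify(index, cutoff=0.5):
    """Label each subject by comparing the index with the cut-off."""
    return np.where(np.asarray(index) > cutoff, "AD-like", "CTL-like")

# Synthetic illustration (not study data): lower values mimic atrophy.
rng = np.random.default_rng(0)
n = 100
ctl = rng.normal(loc=0.0, size=(n, 5))
ad = rng.normal(loc=-1.5, size=(n, 5))
X_train = np.vstack([ctl, ad])
y_train = np.concatenate([np.zeros(n), np.ones(n)])   # Y = 0 for CTL, 1 for AD
model = fit_one_component_pls(X_train, y_train)

mci = rng.normal(loc=-0.75, size=(10, 5))             # hypothetical MCI subjects
mci_index = severity_index(model, mci)
mci_label = classify(mci_index)
```

Because the prediction is linear in the score, subjects with a more extreme pattern than the training groups can receive values above 1 or below 0, as noted in the text.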
To further evaluate the use of the OPLS score for MCI prediction, we created survival curves for the ADNI cohort alone, as follow-up data to 36 months were available.
Receiver operating characteristic (ROC) curves were computed from the resulting scores by using the cross-validated prediction values of the OPLS models and the areas under the ROC curve (AUCs) were computed.
The AUC is equivalent to the Wilcoxon (Mann–Whitney) statistic, which provides a way to approximate its standard error. This enables the comparison of two algorithms based on formal statistical criteria [29-31, 33]. We used the ROCKIT ROC analysis software package (developed at the University of Chicago and part of the Metz ROC software) for statistical comparisons. Values of P < 0.05 were considered significant. Bonferroni correction was used to correct for multiple comparisons. ANOVA was used to compare continuous measures (e.g. FreeSurfer-derived variables, MMSE score and age) between groups. Four main study groups were used for analyses: CTL, MCI-s, MCI-c and AD.
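The link between the AUC and the Wilcoxon statistic, and the standard error approximation of Hanley and McNeil, can be sketched as follows. This is an illustration with hypothetical function names, not the ROCKIT implementation, and the simple symmetric interval shown here need not match intervals produced by that software.

```python
import numpy as np

def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of the AUC using the Hanley and McNeil approximation."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return float(np.sqrt(var))

# Group sizes taken from the cohort descriptions: 295 AD and 335 CTL subjects.
auc_full = 0.948
se_full = hanley_mcneil_se(auc_full, n_pos=295, n_neg=335)
ci_low, ci_high = auc_full - 1.96 * se_full, auc_full + 1.96 * se_full
```

Comparing two models evaluated on the same subjects additionally requires the correlation between their AUCs, which this sketch does not estimate.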
The severity index is based only on MRI data and characterizes the atrophy pattern of each individual subject as AD-like or CTL-like. To investigate how demographic factors and cognitive scores (e.g. age, gender, ApoE4 status, baseline MMSE, 1-year change in MMSE, baseline CDR-SOB and 1-year change in CDR-SOB) alter with the severity index, we divided all subjects (CTL, MCI-s, MCI-c and AD) into groups based on AD-like or CTL-like classification and ApoE4 status.
Demographic and clinical characteristics of the study groups are summarized in Table 1. There were no differences in age between the main study groups. Overall the AD subjects had a higher and the MCI-c subjects a lower level of education compared with the MCI-s group.
Of the 434 MCI subjects, 173 progressed to a diagnosis of AD at follow-up (12 months for the AddNeuroMed cohort and 12, 18, 24 and 36 months for the ADNI cohort).
First, we assessed whether the FreeSurfer measures replicated previously published differences between the MCI-s and MCI-c groups. We found significant differences between groups for the following structures: banks of superior temporal sulcus, entorhinal cortex, fusiform gyrus, inferior parietal cortex, inferior temporal gyrus, isthmus of cingulate gyrus, lateral occipital cortex, lateral orbitofrontal cortex, medial orbitofrontal cortex, middle temporal gyrus, parahippocampal gyrus, precuneus cortex, rostral middle frontal gyrus, superior parietal gyrus, superior temporal gyrus, supramarginal gyrus, temporal pole, nucleus accumbens, amygdala, hippocampus and inferior lateral ventricle [ANOVA followed by unequal n honestly significant difference (HSD) post hoc analysis]. Figure 1 shows the differences for some of the most AD-specific brain structures (hippocampus, entorhinal cortex, middle temporal gyrus and superior temporal gyrus) in the study groups.
OPLS modelling and quality
An OPLS model was created for CTL versus AD subjects including 57 FreeSurfer measures (34 cortical thickness and 23 volume measures; Table 2). The final model resulted in one predictive and zero orthogonal (1+0) components with cross-validated predictability Q2(Y) = 0.592.
The mean ± SD severity indexes were 0.19 ± 0.20 and 0.78 ± 0.28 for the CTL and the AD groups and 0.38 ± 0.28 and 0.65 ± 0.26 for the MCI-s and MCI-c groups respectively (Figure 2).
CTL versus AD subjects
The classification results for CTL versus AD subjects are summarized in Table 3. The OPLS classifier enabled the discrimination of subjects with AD from CTLs with high cross-validated sensitivity (86.1%) and specificity (90.5%). Characteristics of the AD and CTL subjects based on their OPLS classification are shown in Table 4. Of 295 AD subjects, 242 were classified as AD-like and 53 as CTL-like. AD subjects classified as CTL-like were younger, although the difference was not statistically significant. This group also had a higher level of education. We found that CTL subjects classified as AD-like were often older than the true negatives, meaning that the oldest controls were more often misclassified, without any influence of the ApoE4 status. Misclassified CTL subjects were also more highly educated.
Table 3. OPLS modelling: sensitivity, specificity and area under the curve
AD, Alzheimer's disease; MCI, mild cognitive impairment (c, converter; s, stable); CTL, healthy control; CI, confidence interval; AUC, area under the curve.
AD versus CTL model as training set and MCI subjects as test set. Sensitivity refers to MCI-c classified as AD and specificity to MCI-s classified as CTL.
Table 4. Characteristics of the AD and CTL subjects based on the OPLS classifier results
No of subjects
Change in MMSE
Change in CDR-SOB
Data are mean ± SD. AD, Alzheimer's disease; CTL, healthy control; MMSE, Mini Mental State Examination; CDR-SOB, Clinical Dementia Rating – Sum of Boxes. MMSE and CDR-SOB changes are calculated between baseline and 12-month follow-up.
Statistically significant difference between groups.
To compare the combination of the 57 measures with hippocampal volume alone, hippocampal volume plus lateral ventricle volume, and features selected based on prior knowledge, three additional models were created. The following results were obtained: (i) hippocampus alone: sensitivity 82.7%, specificity 82.1%, accuracy 82.4%, AUC 0.895 [95% confidence interval (CI) for AUC 0.868–0.917]; (ii) hippocampus and lateral ventricles: sensitivity 82.7%, specificity 81.9%, accuracy 82.3%, AUC 0.893 (CI 0.865–0.915); and (iii) hippocampus, entorhinal cortex, inferior temporal gyrus, middle temporal gyrus, superior temporal gyrus, parahippocampal gyrus, lateral ventricles, inferior lateral ventricles and third and fourth ventricles: sensitivity 83.0%, specificity 88.6%, accuracy 85.8%, AUC 0.923 (CI 0.902–0.943). Comparison of the AUC values using the method of Hanley and McNeil showed that each of these three models provided inferior results to the use of the full 57 measures [AUC 0.948 (CI 0.929–0.963); P < 0.001].
MCI-c versus MCI-s subjects
Using FreeSurfer-derived measures as input to the OPLS model, subjects in the MCI-c group were separated from those in the MCI-s group with 69.6% sensitivity and 66.8% specificity.
Histograms of the OPLS severity index for both MCI-s and MCI-c groups are presented in Figure 3. Subjects in the MCI-c group were significantly more likely to show the AD-like atrophy pattern on the OPLS index (112/173, 64.74%) than the CTL-like pattern (61/173, 35.26%). ANOVA and post hoc analysis (unequal n HSD method) revealed statistical differences in the severity index between the MCI-s and MCI-c groups (Figure 2). MCI-c subjects classified as CTL-like were younger than those classified as AD-like, but this difference did not reach statistical significance. Of the 173 subjects who progressed to AD (MCI-c), 64.2% were ApoE4 carriers and 28 of these were homozygotes (4/4 genotype). There were no statistically significant differences between the MCI-c AD-like carrier and noncarrier groups nor between the MCI-c CTL-like carrier and noncarrier groups with regard to age, baseline MMSE, 1-year change in MMSE, baseline CDR-SOB, 1-year change in CDR-SOB and years of education (see Table 6).
MCI-s subjects were divided into three subgroups based on the OPLS-derived index: <0.25 (MCI-s-1), 0.25–0.75 (MCI-s-2) and >0.75 (MCI-s-3), similar to the reported method of Davatzikos et al. The characteristics of subjects in the three subgroups are presented in Table 5. Although there were no statistically significant differences between these three groups, a clear trend for all variables was present. Using pattern classification to differentiate stable from progressive MCI subjects, the same trend was also observed for the SPARE-AD score.
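The three-way split of the MCI-s group by severity index can be sketched as below. The function name and the example values are hypothetical; values lying exactly on 0.25 or 0.75 are placed in the middle group, following the ranges given above.

```python
import numpy as np

def mci_s_subgroup(index):
    """Assign MCI-s-1 (<0.25), MCI-s-2 (0.25 to 0.75) or MCI-s-3 (>0.75)."""
    index = np.asarray(index, dtype=float)
    return np.where(index < 0.25, "MCI-s-1",
                    np.where(index > 0.75, "MCI-s-3", "MCI-s-2"))

# Illustrative severity index values; not study data.
example_index = np.array([0.10, 0.25, 0.50, 0.75, 0.90])
example_groups = mci_s_subgroup(example_index)
```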
Table 5. Distribution of the FreeSurfer OPLS score in the MCI cohort
Values for each group are listed in the order: OPLS score; age (years); MMSE; change in MMSE; CDR-SOB; change in CDR-SOB; ADAS1; change in ADAS1.
MCI-c (all): 0.65 ± 0.26; 74.42 ± 6.97; 26.6 ± 1.73; −2.0 ± 2.53; 1.81 ± 1.0; 1.2 ± 1.5; 5.04 ± 1.27; −0.50 ± 1.21
MCI-c, AD-like: 0.78 ± 0.18; 74.98 ± 6.80; 26.50 ± 1.76; −1.91 ± 2.47; 1.89 ± 1.01; 1.19 ± 1.47; 5.16 ± 1.31; −0.44 ± 1.20
MCI-c, CTL-like: 0.35 ± 0.14; 73.16 ± 7.00; 26.82 ± 1.64; −1.43 ± 2.77; 1.66 ± 0.86; 0.88 ± 1.49; 4.83 ± 1.17; −0.60 ± 1.24
MCI-s-3: 0.85 ± 0.07; 76.5 ± 5.40; 26.6 ± 1.71; −0.6 ± 2.88; 1.60 ± 1.0; 0.7 ± 1.3; 5.26 ± 1.25; −0.40 ± 1.16
MCI-s-2: 0.46 ± 0.14; 75.60 ± 6.33; 27.10 ± 1.67; −0.2 ± 2.42; 1.40 ± 0.80; 0.4 ± 1.6; 4.71 ± 1.35; −0.36 ± 1.39
MCI-s-1: 0.07 ± 0.13; 72.34 ± 6.86; 27.77 ± 1.59; +0.4 ± 1.83; 1.20 ± 0.60; 0.4 ± 1.5; 4.28 ± 1.47; 0.14 ± 1.67
Data are mean ± SD. AD, Alzheimer's disease; MCI, mild cognitive impairment (c, converter; s, stable); CTL, healthy control; MMSE, Mini Mental State Examination; CDR-SOB, Clinical Dementia Rating – Sum of Boxes; ADAS1, word list nonlearning. The MCI-s group was divided into three subgroups based on the histogram of the OPLS score. MMSE and CDR-SOB changes are calculated between baseline and 12-month follow-up.
Figure 4 shows survival curves for the AD-like and CTL-like MCI subgroups based on baseline MRI data, illustrating the much higher subsequent conversion rate to AD of the AD-like subgroup.
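Survival curves of this kind are commonly estimated with the Kaplan–Meier method. The sketch below assumes that estimator, which the text does not name, and uses hypothetical follow-up data in which subjects who had not converted by their last visit are treated as censored.

```python
import numpy as np

def kaplan_meier(time, event):
    """Return event times and the Kaplan-Meier estimate of remaining conversion-free."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)               # still under observation at t
        converted = np.sum((time == t) & event)   # conversions observed at t
        s *= 1.0 - converted / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Illustrative follow-up (months) and conversion flags; not study data.
months = np.array([12, 12, 18, 24, 36, 36])
converted = np.array([1, 0, 1, 1, 0, 0])
km_times, km_survival = kaplan_meier(months, converted)
```

Running the function separately on the AD-like and CTL-like subgroups would give the two curves to be compared.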
Of the 261 subjects who remained stable (MCI-s), only 36% were ApoE4 carriers (Table 6). Amongst MCI-s subjects classified as CTL-like, ApoE4 carriers were younger than noncarriers.
Table 6. Characteristics of subjects with MCI (stable and converters) based on the atrophy pattern as detected by the OPLS classifier and ApoE status
Change in MMSE
Change in CDR-SOB
Data are mean ± SD. AD, Alzheimer's disease; MCI, mild cognitive impairment (c, converter; s, stable); CTL, healthy control; MMSE, Mini Mental State Examination; CDR-SOB, Clinical Dementia Rating – Sum of Boxes. MMSE and CDR-SOB changes are calculated between baseline and 12-month follow-up.
A challenge in developing informative neuroimaging biomarkers for early AD diagnosis is the need to identify biomarkers that are altered before the onset of clinical symptoms, and which have adequate sensitivity and specificity on an individual patient basis.
We found differences in AD-specific brain structures, which is in line with previous findings of structural MRI as a sensitive biomarker for AD pathology. Volumetric and cortical thickness data from this study extend previously published findings [5, 35]. There were significant differences between the MCI-s and MCI-c groups in hippocampal volume, as well as in the thickness of middle temporal gyrus, superior temporal gyrus and entorhinal cortex. This indicates a widespread pathology for MCI-c subjects; however, whether these findings can be integrated to provide a meaningful clinical intervention remains an open question. Machine learning algorithms and multivariate statistical techniques could have the potential to assist in the early diagnosis of AD. However, manual measures of different brain regions are time consuming and operator dependent and hence are not regularly used in clinical settings. For this reason, in this study we investigated the applicability of OPLS using only automated regional subcortical volumes and cortical thickness measures. The results show that using all 57 acquired structural measures with OPLS is superior to using single measures such as hippocampus alone or using feature selection based on prior knowledge (regions known to be affected in AD). Nevertheless, the challenges of translating research techniques into clinical practice should not be underestimated. Jack et al. have highlighted the substantial work required for standardization of hippocampal measures and the significant regulatory hurdles to overcome for even a single measure such as this.
Here, we investigated whether structural brain measures (both volume and thickness) combined in the form of a severity index can be used to differentiate between clinically relevant groups. The OPLS model resulted in a prediction accuracy which was significantly better than chance for the discrimination of patients with AD from normal ageing (88.4%), with high sensitivity (86.1%) and specificity (90.4%). Misclassified AD subjects tended to be younger, and misclassified CTL subjects older, than correctly classified subjects; both misclassified groups were more highly educated. This finding is in line with previously published results. The severity index was higher than 0.5 in most MCI-c subjects. As expected, those in the MCI-s group presented a more heterogeneous pattern for the severity index.
The performance of our classifier is comparable to that of previously published methods and suggests similar conclusions [7, 25]. Cuingnet et al. compared several different methods for the separation of MCI-c and MCI-s using subjects from the ADNI study and confirmed that discrimination is difficult, even with different analysis approaches. Similarly, Davatzikos et al. derived a classification index which they compared with clinical assessment and prognosis, with a performance similar to that of the method used in the current study (66.6% sensitivity) when applied to a subset of the ADNI cohort. We used the same MCI-c subgroups as Davatzikos et al. to allow direct comparison between the two methods. However, a strength of this study is that we used a much larger cohort than they did. McEvoy et al. used rigid regularized quadratic discriminant analysis on a subsample of the ADNI cohort. Fifty-eight FreeSurfer measurements were used as input into the analysis and, as in our study, an atrophy score was generated for each MCI subject and used to compute an average risk of conversion. The authors concluded that individuals with atrophy scores in the highest percentile had a greater than two-fold increase in risk of conversion to AD, whereas those with atrophy scores in the lowest percentile had a five-fold decreased risk.
The issue of nonconverter MCI subjects cannot be solved without long-term follow-up, although MRI changes can be detected at least 3 years before the diagnosis of AD. Longitudinal studies have shown that the majority of subjects with MCI progress to AD in the first 2 years. Thus, a 1-year follow-up period (as in the AddNeuroMed cohort) may be insufficient to ensure adequate clinical separation between the MCI-c and MCI-s groups. This may explain the lower accuracy achieved for the MCI predictions. The survival curves shown in Figure 4, however, do illustrate that a substantially higher proportion of the MCI subjects who demonstrate an AD-like MRI pattern at baseline convert to AD in the following 36 months compared to those with a CTL-like pattern.
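Survival curves of the kind described above can be obtained with a product-limit (Kaplan-Meier) estimate computed separately for the AD-like and CTL-like groups. The sketch below assumes this standard estimator; the article does not state how the curves in Figure 4 were computed, and the follow-up times and conversion indicators are hypothetical toy values, not study data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the proportion not yet converted.

    times  : follow-up time in months for each subject
    events : 1 if the subject converted to AD at that time, 0 if censored
    Returns a list of (time, survival probability) at each conversion time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        conversions = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            conversions += data[i][1]
            leaving += 1
            i += 1
        if conversions > 0:
            survival *= 1 - conversions / n_at_risk
            curve.append((t, survival))
        # Converters and censored subjects both leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical toy groups (months of follow-up, conversion indicator).
ad_like_curve = kaplan_meier(
    [6, 12, 12, 18, 24, 36, 36, 36], [1, 1, 1, 1, 0, 1, 0, 0]
)
ctl_like_curve = kaplan_meier(
    [12, 24, 36, 36, 36, 36, 36, 36], [0, 1, 0, 0, 0, 0, 0, 1]
)
```

In this toy example the estimated proportion remaining free of AD at 36 months is lower in the AD-like group than in the CTL-like group, which is the qualitative pattern described for Figure 4.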
These results further support the notion that sophisticated statistical methods, such as multivariate analysis, are necessary to capture complex patterns of brain atrophy. Such methods are more informative for predicting clinical course, compared to the use of a limited number of predefined regions known to be affected early in the disease course. Using a limited set of predefined regions may not reflect completely the spatial and temporal pattern of structural and physiological abnormalities. Most existing pattern classification methods use only a single biomarker modality, as is the case in this study, which may limit overall classification performance. In addition to neuroimaging biomarkers (structural MRI and FDG-PET), biological and genetic biomarkers are available. Different biomarkers provide complementary information, which may be useful for the diagnosis of AD and MCI when used together. Recently, Zhang et al. proposed a new multimodal data fusion and classification method based on kernel combination for AD and MCI subjects. We have also previously combined MRI measures and cerebrospinal fluid (CSF) markers to predict conversion at several future time-points using OPLS in a subsample of the ADNI cohort. We found that the addition of CSF markers further improved the predictions.
Multiple follow-up diagnostic visits were available for the subjects in the ADNI cohort, but not those in the AddNeuroMed study. Although the clinical evolution of MCI remains poorly understood, our study is one of the largest to date to investigate an MRI-based severity index.
As the annual conversion rate for the MCI subjects is 10–15%, it is anticipated that many in the MCI-s group will convert to AD in the near future. The distribution of the severity index suggests that a subgroup of MCI-s subjects has normal brain structures. However, a large subgroup has a distinct AD-like pattern, which we believe reflects the underlying AD pathology.
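To make the expected scale of future conversion concrete, the short sketch below compounds an annual conversion rate over several years. It assumes a constant annual rate applied independently each year, which is a simplification not made in this article; the function name is hypothetical.

```python
def cumulative_conversion(annual_rate, years):
    """Expected proportion converted after `years`, assuming a constant annual rate."""
    return 1 - (1 - annual_rate) ** years


# Range implied by annual conversion rates of 10% and 15% over 3 years.
low_3yr = cumulative_conversion(0.10, 3)
high_3yr = cumulative_conversion(0.15, 3)
```

Under this assumption, roughly 27% to 39% of an MCI group would be expected to convert within 3 years, which is consistent with the expectation that many MCI-s subjects will convert with longer follow-up.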
Subjects with MCI who did not progress to AD within the follow-up period of this study differed in characteristics such as age, MMSE score and ApoE4 status, depending on their severity index subgroup. The subgroup of MCI-s with the highest severity index (MCI-s-1) was associated with a faster decline in MMSE score and older age. In contrast, subjects belonging to the MCI-s-2 and MCI-s-3 groups, although showing relatively similar decline in MMSE scores, had statistically significant differences in severity index. Amongst MCI-s subjects, the subgroup with the highest severity index had a different proportion of ApoE4 carriers compared with the other two subgroups.
Dividing the MCI-s group into three subgroups revealed that gradual brain changes over long periods of time might eventually lead to clinical progression. These results are important because they demonstrate the robustness of the structural MRI dementia measure that we have used to detect structural brain differences between groups. In a previous study by Davatzikos et al., the MCI-s group was categorized in a similar way. The present results confirm their findings in a much larger cohort. Although not statistically significant, we observed a trend towards lower MMSE scores, higher CDR-SOB and ADAS1 scores and a higher percentage of ApoE4 carriers with a higher severity index.
Magnetic resonance imaging is a promising adjunct to the clinical diagnosis of AD and is useful for assessing the early stages of disease. However, there are several obstacles to the widespread use of volumetric MRI in the clinical setting: variation in imaging protocols, spatial distortion of MRI data, the absence of normative values and labour-intensive methods have all reduced the potential impact of MRI measures. Developments are needed to allow consistency in acquisition of MRI data across sites and fully automated image segmentation before being able to introduce the clinical use of volumetric MRI. Large, multicentre trials, such as ADNI and AddNeuroMed, are important for facilitating greater use of MRI in clinical settings by providing image standardization, correction for spatial distortion, improved data throughput, and on-site quality control procedures.
These results confirm that joint evaluation of brain regions is beneficial for increasing the accuracy of predicting progression to AD. Future studies with longer follow-up periods will improve our estimates of specificity. In addition, information such as age, ApoE4 status and level of education should be used as co-factors when deciding cut-off values for severity indices similar to the index proposed in this study.
This study was supported by InnoMed (Innovative Medicines in Europe), an Integrated Project funded by the European Union of the Sixth Framework programme priority FP6-2004-LIFESCIHEALTH-5, Life Sciences, Genomics and Biotechnology for Health.
Data collection and sharing in this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Abbott; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Amorfix Life Sciences Ltd.; AstraZeneca; Bayer HealthCare; BioClinica, Inc.; Biogen Idec Inc.; Bristol-Myers Squibb Company; Eisai Inc.; Elan Pharmaceuticals Inc.; Eli Lilly and Company; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; GE Healthcare; Innogenetics, N.V.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Medpace, Inc.; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Servier; Synarc Inc.; and Takeda Pharmaceutical Company. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of California, Los Angeles. This research was also supported by NIH grants P30 AG010129, K01 AG030514, and the Dana Foundation.
The authors thank Swedish Brain Power, the Strategic Research Programme in Neuroscience at Karolinska Institutet (StratNeuro), Hjärnfonden, the Gamla Tjänarinnor foundation, the Swedish Alzheimer's Association, the regional agreement on medical training and clinical research (ALF) between Stockholm County Council and Karolinska Institutet, Health Research Council of Academy of Finland, University of Eastern Finland UEFBRAIN, EVO funding from Kuopio University Hospital and Stockholm Medical Image Laboratory and Education (SMILE). AS and SL were supported by funds from NIHR Biomedical Research Centre for Mental Health at the South London and Maudsley NHS Foundation Trust and Institute of Psychiatry, King's College London.