Multiparametric hippocampal signatures for early diagnosis of Alzheimer's disease using 18F‐FDG PET/MRI Radiomics

Abstract Purpose This study aimed to explore the utility of hippocampal radiomics using multiparametric simultaneous positron emission tomography (PET)/magnetic resonance imaging (MRI) for early diagnosis of Alzheimer's disease (AD). Methods A total of 53 healthy control (HC) participants, 55 patients with amnestic mild cognitive impairment (aMCI), and 51 patients with AD were included in this study. All participants accepted simultaneous PET/MRI scans, including 18F‐fluorodeoxyglucose (18F‐FDG) PET, 3D arterial spin labeling (ASL), and high‐resolution T1‐weighted imaging (3D T1WI). Radiomics features were extracted from the hippocampus region on those three modal images. Logistic regression models were trained to classify AD and HC, AD and aMCI, aMCI and HC respectively. The diagnostic performance and radiomics score (Rad‐Score) of logistic regression models were evaluated from 5‐fold cross‐validation. Results The hippocampal radiomics features demonstrated favorable diagnostic performance, with the multimodal classifier outperforming the single‐modal classifier in the binary classification of HC, aMCI, and AD. Using the multimodal classifier, we achieved an area under the receiver operating characteristic curve (AUC) of 0.98 and accuracy of 96.7% for classifying AD from HC, and an AUC of 0.86 and accuracy of 80.6% for classifying aMCI from HC. The value of Rad‐Score differed significantly between the AD and HC (p < 0.001), aMCI and HC (p < 0.001) groups. Decision curve analysis showed superior clinical benefits of multimodal classifiers compared to neuropsychological tests. Conclusion Multiparametric hippocampal radiomics using PET/MRI aids in the identification of early AD, and may provide a potential biomarker for clinical applications.


| INTRODUC TI ON
Alzheimer's disease (AD) typically presents with gradually worsening amnesia, which is characterized by an early distribution of predominantly neurofibrillary tangles pathology within the medial temporal lobe.6][7][8] Currently, available biomarkers allow for effective identification of patients with AD in the early stages, making individualized clinical intervention possible.However, the existing biomarkers are not sufficiently convenient and generalizable.
The hippocampus, a structure responsible for memory storage, is one of the most susceptible hallmarks of AD.0][11] In addition to volume measurement, alterations in hippocampal glucose metabolism and blood flow have also been reported in AD and MCI patients, providing an additional information for diagnosis. 12,13 18F-fluorodeoxyglucose positron emission tomography ( 18 F-FDG PET) is a superior examination to evaluate hypometabolism induced by cognitive decline. 14,15recent study revealed that glucose metabolism decreased in the hippocampus bilaterally in AD patients.16 Furthermore, cerebral blood flow (CBF) of arterial spin labeling (ASL)-magnetic resonance imaging (MRI) is believed to be tightly coupled to glucose metabolism and is a reliable indicator of hippocampal atrophy.17 However, single-level features such as volume are insufficient for accurately characterizing the hippocampus as a generalizable biomarker.This limitation has led many studies to focus on characterizing more features through radiomics analysis.
Radiomics is an emerging image analysis method that captures the heterogeneity of tissue to provide valuable information for disease diagnosis.Subtle changes that cannot be detected visually in neuroimaging can be analyzed by extracting high-throughput quantitative features through radiomics.Recent studies in structural MRI have demonstrated that hippocampal radiomics features have the potential as biomarkers for AD classification. 18In a multicenter study, machine learning was utilized to show that hippocampal radiomics signatures were strongly correlated with apolipoprotein E, cerebrospinal fluid β-amyloid (Aβ), and cerebrospinal fluid tau. 19 Related Disorders Association criteria for probable. 21And the diagnosis of aMCI was based on diagnostic criteria defined by Petersen et al. 22 The HC participants were age-and gender-matched to AD patients and had no cognitive decline complaints, with the MMSE score ≥24.Those participants were excluded who had current or previous psychiatric or neurological disorders such as major depression, stroke, brain tumor, head injury, or cerebrovascular injury related to cognitive impairment.

This study was approved by the Ethics Committee of Xuanwu
Hospital, Capital Medical University, China.All participants or their legal guardians gave written informed consent before the start of the study.

| Imaging acquisition
All PET/MRI scans were performed within a 30-day window following the neuropsychological assessment, including MMSE and Montreal Cognitive Assessment (MoCA), using a simultaneous timeof-flight (TOF) PET/MRI scanner (Signa, GE Healthcare, Waukesha, WI, USA).A 19-channel head and neck union coil was used for imaging.Before scanning, all participants fasted for 4-6 h to ensure normal blood glucose levels.Each participant received an intravenous injection of 3.7 MBq/kg of 18 F-FDG while lying supine with their eyes closed in a dimly lit room.
MR images were acquired with the PET acquisition at the same time.High-resolution T1-weighted imaging (3D T1WI) were acquired using the following parameters: repetition time/echo time = 8.5/3.2 ms, flip angle = 15°, voxel size = 1.00 × 1.00 × 1.00 mm 3 , and 188 slices.Additionally, 3D ASL images were acquired using the following parameters: post-labeling delay = 2.0 s, 23  1.82 × 1.82 × 2.78 mm 3 ).This was achieved using an algorithm, incorporating TOF information, point spread function, and an ordered subset expectation maximization method.The algorithm involved 8 iterations, 32 subsets, and applied a 3 mm Gaussian filter. 12

| Data preprocessing
The diagram of this machine learning-based analysis was overviewed in Figure 1.
PET and MRI images were processed using MATLAB 2018b (The MathWorks Inc.) and Statistical Parametric Mapping (SPM12, Wellcome Department of Imaging Neuroscience, London, United Kingdom).SPM cortical segmentation was used to divide the images into gray matter, white matter, and cerebrospinal fluid images.
First, the 18 F-FDG PET and CBF images were aligned to the matched 3D T1WI images.The 3D T1WI images were then spatially normalized into the standard Montreal Neurological Institute (MNI) space.
Subsequently, PET and CBF images were also normalized to the MNI space with the same transformation matrix.Moreover, 18 F-FDG PET images were converted to standardized uptake value ratio (SUVR) maps, referring to the pons. 24,25

| Hippocampal segmentation and feature extraction
The bilateral hippocampal segmentation was performed using an MRI template, which was proposed by the Johns Hopkins Department of Radiology and drawn in standard MNI space manually by PMOD software (PMOD 4.002, Technologies Ltd., Zürich, Switzerland). 26e hippocampal volume, mean CBF, and mean SUVR of each subject were calculated for region of interest (ROI)-wise analysis.For each modality, 1316 features were extracted respectively within the left and right hippocampus, including 18 first-order features, 14 shapebased features, 75 texture features, 186 Laplacian of Gaussian (LoG) features, 744 wavelet features, and 279 local binary patterns (LBP) features.The quantitative radiomics features were calculated using the PyRadiomics package which was implemented in Python 3.8.1.

| Feature selection and radiomics model
For feature selection, two feature selection methods, Max-Relevance and Min-Redundancy (mRMR) and Least Absolute Shrinkage and Selection Operator (LASSO) were employed.Initially, mRMR was utilized to filter the redundant and irrelevant features.Then LASSO was employed to select the optimized subset of features by choosing the regular parameter λ and determining the number of the feature.
Once the number of features was determined, the most predictive subset of features was selected and the corresponding coefficients

| Statistical analysis
The statistical analysis was performed using R 3.6.1 software.The metrics of the area under the receiver operating characteristic curve (AUC) with 95% confidence interval (CI), accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV) were evaluated to assess the performance of the LR model.The radiomics score (Rad-Score) for each participant was calculated by summing the selected features of LR models which were weighted by their coefficients.
Furthermore, the decision curve analysis was conducted to assess the clinical utility of the model, which was constructed from the net benefit within a threshold probability range in the testing group.
The codes used in this study are available from the corresponding author upon request.
The differences between demographic characteristics and imaging features were compared using SPSS 26 software.Shapiro-Wilk test was used to evaluate the distribution of continuous variables.
One-way Analysis of Variance (ANOVA) was conducted for data with continuous variables and a normal distribution, while the Wilcoxon test was used for data without normal distribution.Additionally, Kruskal-Wallis test was used for comparison of more than two groups.The differences between any two groups were verified using post hoc comparisons.Categorical variables were compared using chi-squared testing.The statistical significance was determined at a level of p < 0.05.

| Demographic characteristics
A total of 159 subjects were evaluated in this study.Significant group differences were found in age (p = 0.002), MMSE and MoCA scores (p < 0.001).The aMCI group was older than HC group (p < 0.001).However, no significant differences were found in terms of gender (p = 0.995) and education (p = 0.264).There were significant differences for the hippocampal volume, 18 F-FDG PET SUVR value, and CBF value (all p < 0.01).Specifically, the hippocampal volume (p < 0.001) and 18 F-FDG PET SUVR value (p < 0.001) decreased in the progressive stages of HC, aMCI, and AD.The CBF value of HC and aMCI were similar (p = 0.966) and both were higher than AD (p = 0.001).The detailed information is listed in Table 1.
All participants of HC, aMCI, and AD were randomly divided 7 to 3 into a training group and a group (Table 2).

| Radiomics features and Rad-Score
The comparisons of the Rad-Score in multimodal classifier for classifying AD and HC, AD and aMCI, aMCI and HC showed significant differences (all p < 0.01) (Figure 2).For other classifiers with radiomics features, the Rad-Scores were detailed in Data S1.
In the multimodal classifier for distinguishing aMCI from HC, the categories and coefficients of the selected radiomics features were showed in Figure 3. Additionally, the Rad-Score, which is the result of the linear combination of LR model, was calculated as below:

| Diagnostic performance
The performance of using single modal (3D T1WI, 18 PET and CBF), MRI (3D T1WI + CBF) and multimodal (3D T1WI + 18 F-FDG PET + CBF) classifiers in the hippocampus were evaluated, as shown in Figure 4, Tables 3 and 4. We found that the multimodal Furthermore, the performance of the multimodal classifier was evaluated in female and male subgroups with AUC = 0.85 and ACC = 83.6% for the female subgroup and AUC = 0.94 and ACC = 90.2%for the male subgroup in the classification of aMCI and HC (Data S2, Table S1).

| Clinical classifier and application for aMCI
Both univariate logistic regression and multivariate logistic regression were conducted for gender, age, education, MMSE, and MoCA.The results showed that MoCA was p < 0.001, and the AUC of MoCA to identify aMCI from HC was 0.82 (95% CI: 0.73-0.92) in the training group, 0.75 (95% CI: 0.57-0.93) in the testing group.
The decision curve analysis indicated a threshold probability between 0 and 0.85 to benefit from the Radiomics model and was higher than the Clinical model in the range of 0.1 to 0.85, as shown in Figure 5.

| DISCUSS ION
In this study, we developed LR models by utilizing single-level hippocampal radiomics features derived from 3D T1WI, 18 F-FDG PET,  reduced hippocampal volume and increased risk of AD. 27 Our research supports these findings, that is, the hippocampal volume differs significantly among HC, aMCI, and AD, and exhibits a trend toward worsening atrophy with disease exacerbation.The same results could also be concluded from 18 F-FDG PET SUVR of the hippocampus in this study.This is consistent with the study of Laforce et al. 28 in which they reported that decreased glucose metabolism in the hippocampus preceded volume atrophy and was also a potential biomarker for AD.For patients with AD and aMCI, the voxel-wise and ROI-based group differences in a simultaneous PET/MRI study demonstrated that the pattern of abnormal CBF was similar to the hypometabolism pattern of 18 F-FDG PET. 29 Similarly, another simultaneous PET/MRI study suggested CBF had comparable efficacy to 18 F-FDG PET for AD classification. 30Our study also showed that blood perfusion in the hippocampus was significantly reduced in patients with AD, whereas not in patients with aMCI when compared to the HC group.This indicated that CBF can be effectively used as a biomarker for identifying AD from HC, but is insufficient for identifying aMCI from HC.
One of the challenges for modern neuroimaging is the early diagnosis of AD, particularly for patients with aMCI, because the current AD therapeutics were developed to focus on early-stage AD for optimal efficacy. 31-338][39] In contrast to the random forest models based on the whole brain area analysis of Huang et al. 40 (AUC: 0.99 and ACC: 93.5% for AD; AUC: 0.88 and ACC: 80.8% for MCI), our hippocampal radiomics classifiers (AUC: 0.98 and ACC: 96.7% for AD; AUC: 0.86 and ACC: 80.6% for aMCI) outperformed comparably in discriminating AD and aMCI from HC.The single brain region analysis in our study evidently minimized the instability variables related to multiple brain regions, simplified the training, and enhanced the robustness of machine learning.
In this study, the hippocampus was defined as ROI for single brain region analysis, and the radiomics approach was employed for texture analysis of PET/MRI data for efficient performance.
Hippocampal volume can only coarsely represent complex anatomical atrophy in AD.A study of the hippocampus suggested that hippocampal anatomical heterogeneity determined AD-induced heterogeneous microscopic changes within the hippocampus. 41fferential changes within the hippocampal tissue are reflected as altered texture patterns in medical images, which can be quantitatively measured by radiomics features.Previous studies by Zhao et al., 19 Feng et al., 18 and Du et al. 42   regions, which explains why the efficacy of the CBF classifier in our study is not as superior as other classifiers.We further constructed an MRI model and found it to be effective in AD classification.However, the multimodal hippocampal parameters ensemble had better performance in the classification of aMCI and HC.This suggests that the metabolic information provided by 18 F-FDG PET can be complementary to MRI.
Overall, classifiers composed of single-level hippocampal features were effective in diagnosing AD or aMCI from HC, and the multi-feature ensemble classifier performed best, especially in aMCI, and had favorable clinical benefits.
However, our study had several limitations that we need to address.Firstly, this study was conducted in a single center and only recruited patients from our hospital.To enhance the efficacy and robustness of the machine learning model in this study, a multi-center study could increase the size of the data and validate the generalization ability of the model.In the future, the results of this study will be evaluated on multi-centers data.Secondly, the hippocampal segmentation utilized in this study was atlas-based and was subject to individual differences.An accurate automatic segmentation that can be performed in the individual space will be a valuable research perspective.Lastly, we did not analyze Aβ PET in this study, which is available for risk stratification of patients with early AD. 43We in the process of acquiring Aβ PET data and the results of amyloid PET data will be evaluated further in the near future at our PET center and other larger cohort studies.

| CON CLUS ION
In this study, we developed a novel approach based on 18

2 |
Moreover, at a fiveyear follow-up of MCI patients, hippocampal signatures were found to change with the longitudinal variation in cognitive performance measured by the Mini-Mental State Examination (MMSE). 20In light of the robust and reproducible association of hippocampal radiomics signatures with AD, exploring the use of hippocampal radiomics in improving the accuracy of early diagnosis of AD is warranted.PET/MRI offers the unique capability to simultaneously evaluate brain structure, CBF, and glucose metabolism with optimal spatial and temporal registration of both modalities.This methodology can aid in better understanding the impact on neuronal function and shed light on the mechanisms involved in the development of AD.In this study, we hypothesized that radiomics features extracted from the hippocampus imaging could provide insights into the underlying pathology of AD.Consequently, we aimed to develop and evaluate a machine learning model with hippocampal radiomics features extracted from PET/MRI for early diagnosis of AD.MATERIAL S AND ME THODS 2.1 | Participants A total of 159 subjects were enrolled in this study from July 2017 to August 2022 at Xuanwu Hospital, Capital Medical University.The participants were predominantly right-handed and included 53 HC participants, 55 aMCI patients, and 51 AD patients.The diagnosis of AD was based on the National Institute of Neurological and Communicative Diseases and Stroke/Alzheimer's Disease and

| 3 of 11 CHEN
repetition time/ echo time = 10.7/5337ms, voxel size = 1.88 × 1.88 × 4.00 mm 3 , and 36 slices.Simultaneous PET data were acquired, using a 2-point Dixon scan for MRI-based PET attenuation correction.The PET data were reconstructed to a matrix of size 192 × 192, with a field of view of 35 cm and a slice thickness of 2.78 mm (each voxel measuring et al. were evaluated.The logistic regression (LR) models were trained to classify AD from HC, AD from aMCI, aMCI from HC with five-fold cross-validation.The data of all subjects were randomized into training groups and testing groups in a ratio of 7:3.Based on the features extracted from 3D T1WI, 18 F-FDG PET, and CBF images, the classifiers were divided into multiple categories: 3D T1WI classifier, F I G U R E 1 Radiomics workflow.3D T1WI, 3 dimensional T1-weighted imaging; 18 F-FDG PET, 18 F-fluorodeoxyglucose positron emission tomography; CBF, cerebral blood flow; MNI, Montreal Neurological Institute; Original features, including first-order features, shapebased features and texture features; LBP, local binary patterns; LoG, Laplacian of Gaussian; LR, logistic regression.sisted of 3D T1WI and CBF, and multimodal classifier which combined all categories.And the clinical classifier consisted of clinical information.
classifier achieved better performance in discriminating AD, aMCI, and HC than single modals and MRI classifiers.In the classification of AD and HC, the multimodal classifier yielded AUC = 0.98 and ACC = 96.7% in the testing group, which is higher than the MRI classifier (AUC = 0.96, ACC = 90.0%)and the single-level hippocampal features (3D T1WI, AUC = 0.96, ACC = 90.0%;18F-FDG PET, AUC = 0.96, ACC = 93.3%;CBF, AUC = 0.78, ACC = 76.7%), as shown in Figure 4A,B.Regarding the classification of aMCI and HC, using the comprehensive characterization of hippocampal feature ensemble resulted in an AUC = 0.86 and ACC = 80.6% in the testing group, and the MRI classifier had an AUC = 0.79 and ACC = 77.4% (Figure 4E,F).
and CBF respectively, to classify AD, aMCI, and HC.The hippocampal multi-features ensemble based on structural, metabolic, and blood flow information in the multimodal classifier improved the accuracy of AD and aMCI identification.Hippocampal atrophy widely observed in AD and aMCI patients and is strongly correlated with disease progression.A fiveyear follow-up study has consistently shown a relationship between developed a 3D T1WIbased hippocampal radiomics classifier to diagnose AD.These studies reported AUCs between 0.92 and 0.95, ACCs between 86.8% and 88.2% for the classification of AD and HC, and AUCs between 0.69 and 0.70, ACCs between 67.8% and 70.5% for the classification of MCI and HC.In comparison, our multimodal classifier significantly improved the diagnostic performance of AD (AUC = 0.98, ACC = 96.7%),especially in aMCI (AUC = 0.86, ACC = 80.6%).Zhao et al. 19 found a significant correlation (up to r = 0.40, p < 0.001) between texture features, represented by gray level size zone matrix features, and MMSE.Consistent with this, the gray level size zone matrix features were an important component of the selected features in this study.This may indicate that radiomics does have the potential to reflect robust and reproducible changes in hippocampal pathological alterations in AD and aMCI.Furthermore, the clinical benefit of our multimodal classifier outperformed the MoCA and had the potential to replace the clinical cognitive function assessment scale to simplify tedious diagnostic work.The significant advantage of our study lies in the utilization of 18 F-FDG PET and CBF to complement 3D T1WI by providing metabolic and blood flow information.As a result, this allows for better identification of aMCI for early diagnosis.A recent study has found that 18 F-FDG hypometabolism in the hippocampus differs among subregions in AD.

TA B L E 4 F I G U R E 5
Performance of combination models.Decision curve analysis of Radiomics model and Clinical model in the testing group for aMCI patients and HC participants.The Y-axis means the model benefit, and the Xaxis represents the threshold probability.Radiomics, including multimodal features extracted from 3D T1WI, 18 F-FDG PET, and ASL-CBF of aMCI and HC; Clinical, including MoCA scores of aMCI and HC.aMCI, amnestic mild cognitive impairment; HC, healthy control; MoCA, Montreal Cognitive Assessment.
Demographics of datasets in training and testing group.
12 18F-FDG PET-based hippocampal radiomics can also be ap-Performance of single-modal machine learning models.