Genetic architecture of hippocampus subfields volumes in Alzheimer’s disease

Abstract

Background: The hippocampus is a heterogeneous structure comprising histologically and functionally distinguishable subfields. Volume reductions in hippocampal subfields have been linked with Alzheimer's disease (AD). The aim of our study was to investigate the genetic architecture of the hippocampal subfields based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) data set.

Methods: After preprocessing the genetic variants and imaging data downloaded from the ADNI database, a co-sparse reduced rank regression model was applied to analyze the genetic architecture of hippocampal subfield volumes. Homology modeling, docking, molecular dynamics (MD) simulations, and Co-IP experiments for protein-protein interactions were then used in turn to verify the function of the target protein on hippocampal subfields. After that, association analyses between the candidate genes, hippocampal subfield volumes, and clinical scales were performed.

Results: The association analysis revealed five unique genetic variants (e.g., in ubiquitin-specific protease 10 [USP10]) associated with nine hippocampal subfields (e.g., the granule cell and molecular layer of the dentate gyrus [GC-ML-DG]). Among the five genetic variants, USP10 had the strongest interaction with BACE1, which affects hippocampal subfields; this interaction was verified by MD and Co-IP experiments. The association analysis between the candidate genes, hippocampal subfield volumes, and clinical scales showed that the candidate genes influenced the volume and function of hippocampal subfields.

Conclusions: Current evidence suggests that hippocampal subfields have partly distinct genetic architectures, which may improve the sensitivity of AD detection.


| INTRODUCTION
The hippocampus plays an important role in learning, memory, and spatial navigation. 1 It is implicated in several brain disorders, especially Alzheimer's disease (AD). [5][6][7][8] Hippocampal atrophy is believed to be associated with functional deficits in AD. 9 Hippocampal atrophy, determined by magnetic resonance imaging (MRI), 10 is considered one of the most validated and easily accessible biomarkers of AD and has been widely used. 11 The hippocampus is composed of several subfields with different histological characteristics and a heterogeneous structure. 12 It includes the cornu ammonis (CA1-CA4) and the dentate gyrus (DG), 13 and preliminary findings suggest that, for differentiating prodromal AD, volume estimates of hippocampal subfields are more sensitive than total hippocampal volume. 14,15 Pathologically, neurofibrillary tangles (NFTs) have been found in CA1, subiculum, CA2, CA3, and CA4/DG in patients with mild cognitive impairment (MCI). 16 Aβ precedes NFT formation. 8 Aβ is detected extra- and intracellularly, whereas NFTs are located intracellularly within Aβ-containing neurons in the CA1 of AD mice. 8 In addition, a large body of hippocampal research has linked volume reductions in subfields such as CA1, subiculum, and DG with AD. 2,[17][18][19][20] For instance, CA1 shows anatomical, physiological, and functional heterogeneity along the proximal-distal, dorsal-ventral, and anterior-posterior axes of the hippocampus. 21,22 These studies demonstrate that the hippocampal subfields have unique properties and differential vulnerability to some neuropsychiatric diseases, and they are considered sensitive biomarkers for early AD detection.
Imaging genetic studies confirm that hippocampal volume is a highly polygenic trait. 2,23 With the emergence of high-field MRI scanners and more sophisticated neuroimaging methods, 24 the genetic architecture of hippocampal subfield volumes, their lifespan changes, and their functions have been investigated. 25,26 Using multiple linear regression, Wang et al. showed that carriers of the TREML2 gene in a cognitively normal elderly population have larger CA1 volumes. 25 Furthermore, Ambrée et al. reported that the number of proliferative cells in the DG decreases in H1R knockout mice, which have deficits in spatial learning and memory. 27 The first hypothesis of our study is that volume changes in different hippocampal subfields have different genetic architectures, because the subfields differ in cytoarchitecture, connectivity patterns, and function.
Protein-protein interactions (PPIs) build the metabolic and signaling pathways through which proteins function; dysfunction of these pathways and alterations in PPIs have been shown to be related to some diseases, such as neurodegenerative diseases 28 (for example, AD). 29 BACE1 cleaves APP in the first step of β-amyloid (Aβ) peptide production.
The interaction of nuclear factor kappa-B (NF-κB) with BACE1 enhances BACE1 transactivation and promotes amyloid production in AD. 32,33 The regulation of BACE1 is also related to AD. 34 For example, BACE1 accumulation in axonal swellings is triggered by GGA3, which is linked to late-onset AD. 34 BACE1 exhibits prominent localization in the stratum lucidum of the hippocampus, which is composed of axons and presynaptic terminals of mossy fibers from granule cells in the dentate gyrus. 35 Local elevation in BACE1 processing could contribute to amyloid burden during the progression of AD. 36 Atomic-level molecular dynamics (MD) simulation and coimmunoprecipitation (Co-IP) can reveal a complete microscopic model of PPIs and help determine the potential structure-to-function relationships of the candidate proteins in AD. Hence, the second hypothesis is that candidate proteins may be involved in BACE1 regulation in AD through PPIs, which can be verified by MD simulation and Co-IP in our study.
According to these two hypotheses, after downloading the imaging, clinical, and genetic data from the Alzheimer's Disease Neuroimaging Initiative (ADNI; http://adni.loni.usc.edu/), we extracted the hippocampal subfields with the FreeSurfer software (version 6.0) 37 and used filtering pipelines to select the coding nonsynonymous variants, which constitute more than 50% of the mutations known to be involved in human inherited diseases. 38 Co-sparse reduced rank regression (CSRRR) and simple linear regressions were used to analyze the association between the 12 hippocampal subfields of each hemisphere and the selected nonsynonymous variants. Finally, we combined experimental methods (Co-IP) and computational methods (homology modeling, molecular docking, and MD simulation) to reveal the biological mechanism of effector genes involved in AD.

| MATERIALS AND METHODS
Detailed procedures for the association analysis between imaging phenotypes and genetic variants are provided in Figure 1.

| Participants
Normality was tested with the two-sample Kolmogorov-Smirnov test, and homogeneity of variance was checked with Levene's test. The variance inflation factor (VIF) and the minimum covariance determinant (MCD) were used to assess collinearity and to detect mis-measured outliers in our study.
To compare demographic characteristics and the 12 hippocampal subfields between the two groups, two-sided parametric or nonparametric tests were performed, depending on the distribution of the data.

| FreeSurfer-based segmentation of hippocampal subfields
The FreeSurfer software was applied to analyze hippocampal subfield volumes in patients with AD and in matched controls because of its automation, availability, and higher accuracy. 39 The hippocampal subfield segmentation in FreeSurfer is based on a Bayesian modeling approach and manual delineations of each hippocampal subfield. 40,41 The outputs of the hippocampal segmentation are left and right hemisphere images in which each voxel in the hippocampal area is assigned to one of twelve subregions 42: CA1, molecular layer (ML), hippocampal tail, subiculum, presubiculum, granule cell and molecular layer of the dentate gyrus (GC-ML-DG), CA4, CA3, hippocampal fissure, hippocampus-amygdala-transition-area (HATA), and fimbria. After hippocampal segmentation, FreeSurfer was used to obtain the volumes of the hippocampal subfields, the total hippocampal formation volume, and the intracranial volume 40 in our study.
The segmentation of hippocampal subfields was fully automated, without manual editing. All images were checked and interpreted by one psychiatric resident physician and one radiologist. One subject was excluded because of poor image quality.

| Selection of nonsynonymous mutations
Nonsynonymous mutations change the amino acid sequence and can thereby affect gene function, whereas synonymous mutations do not affect gene function. 43 Nonsynonymous mutations were obtained by filtering according to the following pipeline. First, quality control (QC) was carried out using PLINK 1.90 beta (developed by Christopher Chang with support from the NIH-NIDDK's Laboratory of Biological Modeling, the Purcell Lab, and others). Second, genetic imputation was performed on the Michigan Imputation Server (https://imputationserver.sph.umich.edu/index.html#!pages/home), a web-based imputation service that facilitates access to new reference panels. Third, annotation was carried out to identify nonsynonymous variants using ANNOVAR, an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (http://www.openbioinformatics.org/annovar/), 44,45 since nonsynonymous variants are affected by the degree of genetic diversity and the pattern of linkage disequilibrium. 46 Finally, principal component (PC) analysis was done using EIGENSTRAT, a leading association mapping method in terms of its popularity, power, and type I error control.
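The annotation-filtering step of this pipeline can be sketched as follows. This is a minimal illustration and not the authors' script: it assumes the gene-based annotation was written as tab-separated rows whose second column holds the functional class (for example `nonsynonymous SNV`), and the sample records and the helper name `select_nonsynonymous` are hypothetical.

```python
from typing import Iterable, List


def select_nonsynonymous(lines: Iterable[str]) -> List[dict]:
    """Keep records whose functional class is 'nonsynonymous SNV'.

    Assumed column layout (tab-separated): line id, functional class,
    gene/transcript detail, chromosome, start, end, ref allele, alt allele.
    """
    kept = []
    for raw in lines:
        fields = raw.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # skip malformed or truncated rows
        if fields[1] != "nonsynonymous SNV":
            continue  # drop synonymous and all other classes
        kept.append({
            "gene_detail": fields[2],
            "chrom": fields[3],
            "start": int(fields[4]),
            "ref": fields[6],
            "alt": fields[7],
        })
    return kept


# Hypothetical records, made up for illustration only.
example = [
    "line1\tnonsynonymous SNV\tGENE_A:NM_000001:exon2:c.G100A:p.V34I,\t1\t1000\t1000\tG\tA",
    "line2\tsynonymous SNV\tGENE_B:NM_000002:exon5:c.C300T:p.A100A,\t2\t2000\t2000\tC\tT",
    "line3\tnonsynonymous SNV\tGENE_C:NM_000003:exon1:c.T50C:p.L17P,\t3\t3000\t3000\tT\tC",
]
variants = select_nonsynonymous(example)
print([v["chrom"] for v in variants])
```

In practice the rows would be read from the annotation output file rather than from an in-memory list; the in-memory list only keeps the sketch self-contained.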

| Association analysis between imaging phenotypes and genetic variants
To eliminate the influence of covariates (an intercept, age, sex, intracranial volume, 50 APOE4, and several top significant PCs of the SNPs), we regressed the hippocampal subfield volumes on these covariates. The resulting residuals and the selected 11,596 nonsynonymous variants were treated as the response (Y; 24 hippocampal subfields, that is, 12 subfields per hemisphere) and the explanatory variables (X), respectively. CSRRR 51 was then performed because it efficiently selects causal nonsynonymous variants and affected hippocampal subfields simultaneously via a nonconvex penalty, based on a group primal dual-active set formulation. The following formula depicts the CSRRR model used in our study:

$$\min_{C} \; \lVert Y - XC \rVert_F^2 \quad \text{subject to} \quad \operatorname{rank}(C) \le r, \quad \lVert C \rVert_{2,0} \le k_x, \quad \lVert C^{\top} \rVert_{2,0} \le k_y,$$

where $C \in \mathbb{R}^{p \times q}$ denotes the coefficient matrix linking the genetic variants to the volumes of the 24 hippocampal subfields ($\mathbb{R}^{p}$ represents the p-dimensional (p = 11,596) genetic variation and $\mathbb{R}^{q}$ the q-dimensional (q = 24) imaging phenotype), $\lVert \cdot \rVert_F$ denotes the Frobenius norm, rank(·) indicates the matrix rank, $\lVert C \rVert_{2,0}$ counts the number of nonzero rows in C, $k_x$ and $k_y$ represent the desired levels of sparsity in genotypes and phenotypes, respectively, and r represents the rank of the coefficient matrix. 51 Based on the results of the CSRRR analysis, each selected hippocampal subfield was regressed on each selected SNP, and the minimum p-value of a SNP was defined as the smallest p-value across these regressions. Bonferroni correction was applied to correct for multiple comparisons.
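The two sparsity terms of the CSRRR model can be made concrete with a small sketch. It only counts the nonzero rows and columns of a toy coefficient matrix and compares them with the sparsity levels k_x and k_y; it is not the CSRRR solver and does not check the rank constraint. The toy matrix and the helper names `nonzero_rows`, `transpose`, and `satisfies_sparsity` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def nonzero_rows(C: Matrix, tol: float = 1e-12) -> int:
    """||C||_{2,0}: number of rows whose Euclidean norm exceeds tol."""
    return sum(1 for row in C if math.sqrt(sum(v * v for v in row)) > tol)


def transpose(C: Matrix) -> Matrix:
    """Swap rows and columns so column sparsity can reuse nonzero_rows."""
    return [list(col) for col in zip(*C)]


def satisfies_sparsity(C: Matrix, k_x: int, k_y: int) -> bool:
    """Check ||C||_{2,0} <= k_x (variants) and ||C^T||_{2,0} <= k_y (phenotypes)."""
    return nonzero_rows(C) <= k_x and nonzero_rows(transpose(C)) <= k_y


# Toy coefficient matrix: 5 variants (rows) by 4 phenotypes (columns).
C = [
    [0.0, 0.0, 0.0, 0.0],
    [0.3, 0.0, -0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
selected_variants = nonzero_rows(C)               # rows with any nonzero entry
selected_phenotypes = nonzero_rows(transpose(C))  # columns with any nonzero entry
print(selected_variants, selected_phenotypes)
```

A nonzero row corresponds to a variant that is kept for at least one subfield, and a nonzero column to a subfield that is affected by at least one variant, which is how the model selects variants and subfields at the same time.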
To analyze the effect of the selected SNPs on hippocampal subfield function, imaging phenotypes, and clinical scales, statistical analysis of the hippocampal subfields was performed using Student's t-test, the Wilcoxon rank-sum test, or the chi-square test, depending on the disease grouping and variant grouping information.

| Docking and MD simulation of the selected SNPs with BACE1 complex
Proteins encoded by the genes carrying the significant nonsynonymous variants were modeled in three different ways: with SWISS-MODEL, the first fully automated protein homology modeling server for comparative modeling; with the I-TASSER server, which has been ranked as the best server for protein structure prediction in community-wide assessments 61; and with the AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk), an openly accessible and extensive database of high-accuracy protein structure predictions powered by AlphaFold v2.0 of DeepMind. The point mutations were then introduced using Chimera 1.14, a program for the interactive visualization and analysis of molecular structures and related data. 62 The quality of the protein structures was evaluated using PROSA, a suite of programs for checking the stereochemical quality of protein structures. 63 Each wild-type and mutant protein was subsequently docked with BACE1 using HADDOCK, 64 and MD simulations were performed.

| Co-IP experiment
To verify whether PPIs between the candidate proteins and BACE1 induce an increase in Aβ deposition, which caused

S2). The difference analysis between the AD and HC cohorts showed no differences in age (p-value = 0.907), sex (p-value = 0.815), or race (p-value = 0.357) (see Supplementary Materials Table S3). APOE4, an important genetic biomarker for AD pathophysiology, differed significantly between the AD and HC cohorts (p-value < 0.001) (see Supplementary Materials Table S3).

| Association analysis between imaging phenotypes and genetic variants
By applying CSRRR and linear regression, we identified five genetic variants associated with the hippocampal subfields. Table 1 describes the variants in detail. Figure 2 depicts the genetically affected hippocampal subfields on standard-resolution magnetic resonance imaging (MRI). The significance threshold was set at 0.05 divided by the number of independent variants.
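The thresholding rule above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the study's analysis code: the p-values, the variant labels, and the number of tests are hypothetical, and `bonferroni` is a helper name introduced here.

```python
from typing import Dict


def bonferroni(min_p_values: Dict[str, float], n_tests: int,
               alpha: float = 0.05) -> Dict[str, dict]:
    """Compare each minimum p-value with alpha / n_tests and report
    the Bonferroni-adjusted p-value, capped at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    threshold = alpha / n_tests
    results = {}
    for variant, p in min_p_values.items():
        results[variant] = {
            "p": p,
            "p_adjusted": min(1.0, p * n_tests),
            "significant": p < threshold,
        }
    return results


# Hypothetical minimum p-values, made up for illustration only.
hypothetical = {"variant_a": 0.0004, "variant_b": 0.02, "variant_c": 0.3}
results = bonferroni(hypothetical, n_tests=10)
for name, row in results.items():
    print(name, row)
```

Comparing the raw p-value with alpha / n_tests and comparing the adjusted p-value with alpha are equivalent decisions; the sketch reports both so that the two columns of a results table can be filled from one pass.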

| Docking and MD simulation of the selected SNPs with BACE1 complex
As shown in Table 2, USP10 had the strongest interaction with BACE1 among the proteins encoded by the selected SNPs. The interaction of the

TABLE 1
The results of the association analysis between imaging phenotypes and genetic variants.

of the van der Waals, electrostatic, and desolvation energies and the buried surface area, and represents the interaction strength of the protein complex.
suggesting that the system became tighter and more stable after mutation. 65 The solvent accessible surface area (SASA) of the protein structures shows the dimensional discrepancy over 50 ns between the wild type and its mutant. 66 The SASA of the mutant BACE1-USP10 Val204Leu was smaller than that of the wild type, indicating that the interaction contact area of its complex was smaller (see Figure 3C). As shown in Figure 3D, the number of hydrogen bonds, which accounts for protein rigidity and the protein's ability to interact with its partners, was lower in BACE1-USP10 Val204Leu than in the wild type. As shown in Figure 3E, the intermolecular hydrogen bond between tyrosine 384 (Tyr) of mutant USP10 Val204Leu and glutamic acid 62 (Glu) of BACE1 was shorter than that of wild-type USP10. Consequently, the enhanced interaction of the mutant may be caused by the shortened hydrogen bond.
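Two of the quantities discussed here, the radius of gyration (a compactness measure) and an interatomic distance, have simple definitions that a short sketch can make explicit. This is not the output or code of the MD software used in the study: the coordinates, the unit masses, and the helper names are hypothetical, and lengths are assumed to be in ångströms.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Point], masses: Sequence[float]) -> float:
    """Mass-weighted root-mean-square distance of atoms from their center of mass."""
    total = sum(masses)
    center = tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)
    )
    weighted_sq = sum(
        m * sum((c[i] - center[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


def atom_distance(a: Point, b: Point) -> float:
    """Euclidean distance between two atoms, e.g. a hydrogen-bond donor and acceptor."""
    return math.dist(a, b)


# Hypothetical toy system: four unit-mass atoms on a square of "radius" 1.
compact = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
expanded = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in compact]
unit_masses = [1.0, 1.0, 1.0, 1.0]

rg_compact = radius_of_gyration(compact, unit_masses)
rg_expanded = radius_of_gyration(expanded, unit_masses)
pair_distance = atom_distance((0.0, 0.0, 0.0), (0.0, 3.0, 4.0))
print(rg_compact, rg_expanded, pair_distance)
```

A smaller radius of gyration for the same set of atoms indicates a more compact structure, which is the sense in which a lower curve in a radius-of-gyration plot is read as a tighter complex.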

| Co-IP experiments
USP10 (NP_001259004.1) was selected for the following Co-IP experiment based on the results of the docking and MD simulation.
The results of the sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis using hemagglutinin (HA) rabbit polyclonal antibody showed that the HA-USP10 band was detected in the pull-down complex (Figure 4A). In addition, the results of western blotting revealed that the Flag-BACE1 band was detected in the pull-down complex (Figure 4B). These findings demonstrate that BACE1 interacts with the protein USP10.

| Association analysis between the candidate genes, hippocampal subfield volumes, and clinical scales

| The effects of candidate genes on hippocampal subfield volumes
Taking USP10 as an example, individuals homozygous for the USP10 variant had a smaller standardized GC-ML-DG volume in the right hippocampus than heterozygous individuals (p-value < 0.05) (Table 3). Among the 88 heterozygous individuals carrying the USP10 variant, GC-ML-DG volumes on both sides differed significantly between AD and CN (p-value < 0.05) (Table 4). Among the 229 individuals homozygous for the USP10 variant, GC-ML-DG volumes on both sides also differed significantly between AD and CN (p-value < 0.05) (Table 5). Therefore, carrying the USP10 variant might cause changes in GC-ML-DG.
Details of the effects of the other four candidate genes on hippocampal subfield volumes are described in the Supplementary [...] cause changes in right GC-ML-DG (Tables S4-S9, p-value < 0.05).

| DISCUSSION
Degeneration of adrenergic neurons in the locus coeruleus of the brainstem and/or of serotonergic neurons, which send projections to the cerebral cortex and hippocampus, leads to impaired metabolic and functional interactions of neurons in the hippocampus. 3,67,68 According to four distinct spatiotemporal trajectories of tau pathology, the most common of the four AD subtypes (33%) is the one in which tau spreads within the temporal lobe and affects memory. 69 The hippocampus, located in the temporal lobe, comprises histologically and functionally distinguishable subfields with differential vulnerability to AD. The hippocampus was subdivided using FreeSurfer according to the cytoarchitecture of the hippocampal subfields. Using brain scans in the ADNI dataset, we demonstrated that differences among hippocampal subfields are related to differences in their genetic architecture. Identifying the genetic architecture of the hippocampal subfields and the specific genetic variants associated with them helps to better understand the underlying biological functions of the subfields and their roles in the development of AD.
We identified several genetic variants (in USP10, TRPV1, NDUFA11, MRGPRX3, and SEPT9) associated with the volumes of the hippocampal subfields. These findings largely agree with previous studies. [72] The atrophy of synapses between the cortex and hippocampus has been shown to be caused by the reduction in CA4 volume. 73 Mutations in NDUFA11 are associated with severe mitochondrial complex I deficiency.
Mitochondrial complex I dysfunction accelerates amyloid toxicity, and mitochondrial complex I dysfunction in aging may contribute to the pathogenesis of sporadic AD. 74 SEPT9 interacts with kinesin KIF17 and interferes with the mechanism of NMDA receptor cargo binding and transport. [78][79]

The results of homology modeling, molecular docking, MD simulations, and Co-IP experiments show that USP10 has the strongest interaction with BACE1 among the five identified genes. USP10 is a member of the USP domain family of deubiquitinating enzymes (DUBs, a new therapeutic target in neurodegenerative diseases 80), which comprises over 50 members, including USP8 and USP25. 81,82 Yeates et al. demonstrated that BACE1 is a direct substrate of USP8 deubiquitination, which induces an increase in Aβ. 83 Zheng's study demonstrated that USP25 promotes the cleavage of APP and the generation of Aβ through deubiquitination of BACE1. 82 Our finding is also consistent with previous reports that reduced USP10 activity decreases Aβ secretion, ameliorating Aβ plaque load and improving deficits in learning and memory. 84

The association analysis between the candidate genes, hippocampal subfield volumes, and clinical scales showed that the candidate genes influenced the volume and function of hippocampal subfields. Taking USP10 as an example, individuals homozygous for the USP10 variant had a smaller standardized granule cell and molecular layer of the dentate gyrus (GC-ML-DG) volume in the right hippocampus than heterozygous individuals (p-value < 0.05). Homozygous individuals also differed significantly from heterozygous individuals on the cognitive scale (p-value < 0.05). GC-ML-DG volume was found to be smaller in patients with MCI or early MCI than in CN. 85 GC-ML-DG atrophy is associated with abnormal Aβ1-42 and P-Tau181 (A+T+) in AD patients and MCI subjects. 85 In the AD group, G. Šimić et al. found a significant loss of neurons in the DG (https://pubmed.ncbi.nlm.nih.gov/9067838/).
However, our study has several limitations. First, the use of different MRI scanner types at different centers may introduce bias. Second, our small sample size limits the generalizability of our results. In addition, we only investigated the interaction between USP10 and BACE1. Additional genes for AD will likely be identified if other AD-related proteins besides BACE1 are included.
Taken together, the involvement of USP10 in the pathological and molecular mechanisms underlying AD is preliminarily demonstrated by the MD simulations and Co-IP experiments and warrants further exploration.

TABLE 6
Results of the association analysis between the candidate genes and clinical scales.

In conclusion, we identified novel nonsynonymous variants that influence specific hippocampal subfields and demonstrated that the genetic architecture differs across hippocampal subfields and is associated with specific biological processes and functions, showing that the hippocampal subfields have greater specificity.
We believe that the specificity may help us to understand the underlying hippocampal neurobiology and its related functions in AD.
In our study, approximately 600,470 variants on chromosomes 1-22 and 1.5 T accelerated T1-weighted structural MRI scans, primarily of the hippocampus, in 175 AD and 214 NC individuals were acquired from the ADNI-1 database (http://adni.loni.usc.edu/). Demographic and clinical data (e.g., age, gender, APOE4, the Mini-Mental State Examination (MMSE) scale, and the Geriatric Depression Scale (GDS)) were also gathered. Selection and exclusion criteria are available on the ADNI website (http://adni.loni.usc.edu/wp-content/uploads/2010/09/ADNI_GeneralProceduresManual.pdf). The investigators within ADNI did not participate in the analysis or writing of this manuscript. Information about written informed or phone consent, all relevant ethical guidelines, and/or ethics committee approvals is available in the ADNI-1 data set.
the hippocampal atrophy in AD patients, we conducted a Co-IP experiment based on the results of the MD simulation. Human embryonic kidney 293 T cells (HEK 293 T) carrying the SV40 large T antigen (Cat: CRL-11268), not human tissues, were obtained from the ATCC and used in the Co-IP experiment. We first cotransfected the pcDNA3.1(+)-Flag-BACE1 and pcDNA3.1(+)-HA-USP10 plasmids into HEK 293 T cells for 48 hours. Then, the HEK 293 T cells were subjected to a Co-IP assay using anti-Flag magnetic beads. Anti-HA magnetic beads were also used for coimmunoprecipitation with the cotransfected HEK 293 T cells. Finally, western blotting was conducted using Flag rabbit polyclonal antibody. Detailed information on the Co-IP experiment is available in the supplementary materials.

BACE1-ubiquitin-specific protease 10 (USP10) complex was weaker than that of the mutant BACE1-USP10 Val204Leu (rs1812061). The high ambiguity driven protein-protein docking (HADDOCK) score of BACE1-USP10 Val204Leu (−63.0 ± 17.5) was the lowest, which suggested that its interaction was the strongest. We also found that the proteins MRGPRX3 and TRPV1 did not interact with BACE1 because their HADDOCK scores were > 0. All three modeling modalities showed that USP10 interacted with BACE1 (HADDOCK score < 0). Docking and MD simulations showed that no erratic fluctuations existed in the molecular systems and that all the complexes were stable (Figure 3A). Figure 3B displays the volumetric and compactness variations induced by the complex,

FIGURE 2
Association of significant genes with hippocampal subfields. (A) Hippocampal subfields of a normal sample. (B) MRGPRX3 was associated with right GC-ML-DG and left CA4. (C) NDUFA11 was associated with left GC-ML-DG and left CA3. (D) SEPT9 was associated with left CA4, left GC-ML-DG, right GC-ML-DG, right CA4, and right CA3. (E) TRPV1 was associated with left CA4, right HATA, left molecular layer, right subiculum, and left CA3. (F) USP10 was associated with right GC-ML-DG. Columns from left to right: axial, coronal, sagittal, posterior 3D render, superior 3D render.

TABLE 2
Docking results of different proteins with BACE1 by different modeling software.

FIGURE 3
MD simulation results of BACE1-USP10 and BACE1-USP10 Val204Leu. The blue line represents the wild type and the red line represents the mutant BACE1-USP10 Val204Leu. (A) The root mean square deviation (RMSD) plot shows that there were no erratic fluctuations in the molecular systems and that all complexes were stable. (B) The radius of gyration (Rg) shows the volumetric and compactness variation induced by the complex. (C) The solvent accessible surface area (SASA) of the protein structures shows the dimensional discrepancy. (D) The number of hydrogen bonds accounts for protein rigidity and the protein's ability to interact with its partners. (E) The intermolecular hydrogen bond between tyrosine 384 (Tyr) of mutant USP10 Val204Leu and glutamic acid 62 (Glu) of BACE1 was shorter than that of wild-type USP10.

FIGURE 4
SDS-PAGE analysis results of the BACE1 and USP10 interaction. (A) The HA-USP10 band was detected in the pull-down complex. (B) The Flag-BACE1 band was detected in the pull-down complex.

TABLE 3
Results of the difference test for the effects of USP10 gene polymorphism on the hippocampal subfields' volume.

To confirm the effect of candidate genes on the volume of hippocampal subfields and clinical scale, the Mini-Mental State Examination (MMSE)

A p-value < 0.05 was considered statistically significant in some analyses. All statistical analyses were carried out using R (version 4.2.0).

| RESULTS

| Preprocessed data
After a series of preprocessing steps on the genetic data (see Supplementary Materials and Methods for details), a total of 11,596 SNPs in 150 AD patients and 180 normal controls (NCs) were retained as the independent variables. The six top significant PCs were treated as additional covariates. After hippocampal segmentation, 24 hippocampal subfield volumes (continuous data) were extracted as the high-dimensional dependent variables. The VIF results indicated that there was no collinearity between variables. No outliers were detected according to the MCD results (see Supplementary Materials Table

Gene name Hippocampal subfield Estimate Position rs number p value * p adjust *
*The p-values are not corrected for multiple comparisons; the adjusted p-values are corrected with the Bonferroni method.
Note:
1. SWISS-MODEL (https://swissmodel.expasy.org) was the first fully automated protein homology modeling server for comparative modeling.
2. The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on ab initio folding.
3. The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein structure predictions, powered by AlphaFold v2.0 of DeepMind.
4. The HADDOCK score was obtained by the weighted average

Table 6
The results of the difference test for the effects of USP10 homozygous variants on hippocampal subfields' volume.