Nonlinear biomarker interactions in conversion from mild cognitive impairment to Alzheimer's disease

Abstract Multiple biomarkers can capture different facets of Alzheimer's disease. However, statistical models of biomarkers to predict outcomes in Alzheimer's rarely model nonlinear interactions between these measures. Here, we used Gaussian Processes to address this, modelling nonlinear interactions to predict progression from mild cognitive impairment (MCI) to Alzheimer's over 3 years, using Alzheimer's Disease Neuroimaging Initiative (ADNI) data. Measures included: demographics, APOE4 genotype, CSF (amyloid‐β42, total tau, phosphorylated tau), [18F]florbetapir, hippocampal volume and brain‐age. We examined: (a) the independent value of each biomarker; and (b) whether modelling nonlinear interactions between biomarkers improved predictions. Each measured added complementary information when predicting conversion to Alzheimer's. A linear model classifying stable from progressive MCI explained over half the variance (R2 = 0.51, p < .001); the strongest independently contributing biomarker was hippocampal volume (R2 = 0.13). When comparing sensitivity of different models to progressive MCI (independent biomarker models, additive models, nonlinear interaction models), we observed a significant improvement (p < .001) for various two‐way interaction models. The best performing model included an interaction between amyloid‐β‐PET and P‐tau, while accounting for hippocampal volume (sensitivity = 0.77, AUC = 0.826). Closely related biomarkers contributed uniquely to predict conversion to Alzheimer's. Nonlinear biomarker interactions were also implicated, and results showed that although for some patients adding additional biomarkers may add little value (i.e., when hippocampal volume is high), for others (i.e., with low hippocampal volume) further invasive and expensive examination may be warranted. Our framework enables visualisation of these interactions, in individual patient biomarker ‘space', providing information for personalised or stratified healthcare or clinical trial design.

the accumulation of amyloid-β plaques is a defining pathological feature of Alzheimer's, many cognitively-normal older adults also have elevated amyloid-β plaque levels (Villemagne et al., 2011). Seemingly, amyloid-β deposition is not solely sufficient to cause dementia (Aizenstein et al., 2008). Furthermore, while CSF tau was found to positively correlate with severity of cognitive impairment (Shaw et al., 2009), increased CSF tau appears to indicate neuronal injury and neurodegeneration in different diseases (Schoonenboom et al., 2012). Indeed, a broad range of potential biomarkers are available for Alzheimer's. This includes hippocampal volume, whole-brain volume and as well as apparent 'brainage'. Previous work has shown that having an older-appearing brain is associated with conversion to Alzheimer's within 3 years, and is more statistically informative than hippocampal volume (Franke & Gaser, 2012;Franke, Ziegler, Klöppel, & Gaser, 2010;Gaser et al., 2013). We have previously used the brain-age paradigm to demonstrate abnormal brain-ageing after a traumatic brain injury, in treatment resistant epilepsy, multiple sclerosis and Down's syndrome Cole et al., 2020;Cole, Leech, & Sharp, 2015;Pardoe et al., 2017). Brain-age also relates to cognitive performance and mortality risk in older adults from the general population (Cole et al., 2018), suggesting that this metric reflects something of the brain's sensitivity to more general health. The range of available biomarkers indicates the biological heterogeneity of Alzheimer's and more work is needed to incorporate this heterogeneity into predictive models of disease progression. This should improve specificity and better enable treatment decisions to be made at the individual level, to aid in clinical practice and evaluate potential treatments.
To improve specificity in the use of biomarkers for staging Alzheimer's, multiple measures are likely to be necessary. This way, complementary information from different biological sources can be combined to build a more comprehensive picture of the underlying diseases processes, and capture disease heterogeneity more accurately.
An essential consideration when combining these multiple sources of information is how they interact. For instance, while amyloid-β deposition may precede and potentially drive subsequent neurodegeneration in some instances, the magnitude of amyloid-β deposition is likely to be key. An individual may be 'positive' for amyloid-β on a PET scan but remain below some latent threshold for neuronal or glial loss (Fricker, Tolkovsky, Borutaite, Coleman, & Brown, 2018). Once over that threshold, neurodegeneration might occur, though if they remain below this threshold, neurodegeneration may be driven by other factors, or not occur at all. This accords with Jack et al. (2016) who proposed a multistate transition model, with two distinct pathways to Alzheimer's; one where cerebral amyloid-β deposition occurs prior to neurodegeneration, and one where neurodegeneration occurs prior to amyloid-β deposition. In this second pathway, individuals may never accumulate sufficient amyloid-β to cross the threshold to amyloidmediated neurodegeneration. However, they may have been exposed to negative genetic or environmental factors, not necessarily Alzheimer's related, that have influenced the rate of age-associated change in brain structure. On this backdrop of poorer brain health, potentially only minimal Alzheimer's-specific pathology is necessary to drive disease progression. The presence of such thresholds in how two biological processes interact is inherently nonlinear, however, nonlinear relationships are under-studied in neurology and remain in the domain of systems biology, for example in modelling cell signalling networks (Sung & Hager, 2012). Nonlinear models that incorporate thresholds, plateaus and other complex patterns are theoretically better positioned to detect such relationships, and machine learning analysis offers a range of tools for modelling nonlinearities.
Machine learning emphasises out-of-sample prediction and is readily able to incorporate high-dimensional data, hence is well-suited to making individualised predictions of disease outcomes. However, only a limited number of studies have specifically modelled interactions between biomarkers and crucially, these have only been linear interactions (Bilgel et al., 2018;Caroli et al., 2015;Fortea et al., 2014;Li et al., 2014;. For example,  showed that being positive for both amyloid-β-PET and CSF phosphorylated-tau (P-tau) was associated greater cognitive decline and greater risk of disease progression in people with MCI, than being positive for either biomarker in isolation. However, when dichotomising the continuous measures of amyloid-β-PET and CSF P-tau into positive or negative, it is not possible to detect more complex relationships between biomarkers.
Hence, this motivates research into alternative methods that enable both linear and nonlinear interactions to be modelled.
One approach to modelling nonlinear interactions is Gaussian Processes, a family of models that assigns a nonparametric Gaussian smoothing function to each biomarker ( Figure 1). These 'kernels' can then be combined, allowing nonlinear effects in both additive (i.e., main effects) and interactions to be modelled (an example of nonlinear main effects is shown in Supplementary Figure S1). Here, we tested the hypothesis that the conversion from MCI to Alzheimer's can be better predicted by modelling biomarkers interactions (linear or nonlinear), than by using the additive sum of independent effects on disease progression.
Using the ADNI data set, we analysed people with stable and progressive MCI to investigate: (a) the independent value each biomarker has in predicting conversion; and (b) whether modelling nonlinear interactions between biomarkers could improve the fit of classification models that predict conversion to Alzheimer's. We included multiple biological indices, to capture different facets of Alzheimer's: APOE genotype, CSF measures of amyloid-β42, total tau and P-tau, PET measure of amyloid-β deposition, and structural MRI measures of global 'brain age' and of hippocampal volume.

| METHODS
A high-level overview of the methods used in the study is included in

| Participants
Participants were drawn from the publicly-available ADNI data set.
Inclusion criteria included the availability of all biomarkers (genetic, fluid, PET, MRI), a diagnosis of MCI at baseline imaging assessment and the availability of clinical follow-up 3 years after the imaging assessment to determine disease progression. Stable MCI participants were those who still met the criteria for MCI after 3 years; progressive MCI participants met the criteria for a clinical diagnosis of dementia at or before the three-year assessment. The diagnostic criteria for MCI is based on a mini-mental state exam score between 24 and 30 (inclusive) and a clinical dementia rating = 0.5 with a memory box score of at least 0.5, indicating that general cognition and functional performance are sufficiently preserved such that a diagnosis of Alzheimer's disease cannot be made by the site physician at the time of the screening visit.
In total, n = 206 people with MCI were included in the analysis;

| Biomarkers: Fluid
Samples of 20 ml of CSF were obtained from participants by a lumbar puncture with a 20-or 24-gauge spinal needle, around the time of F I G U R E 1 Overview of study methods. (a) Raw T1-weighted MRI scans are pre-processed via the DARTEL pipeline in SPM12 to obtain grey matter and white matter volume maps. These are fed into a pretrained Gaussian Processes Regression based 'brain-age' prediction software to arrive at an estimation of Brain-PAD, which is the difference between chronological age and neuroimaging-predicted 'brain-age'. (b) CSF features and hippocampal volume alongside genetic and demographic information are also used as biomarkers; (c) Subsampling stable MCI to overcome the class imbalance. Repeated random subsampling of the stable MCI group 100 times for Gaussian Processes based models; (d) Partitioning of total variance explained into independent and shared variance explained for each biomarker; Gaussian Processes that allow just for main effects of biomarkers to be modelled are constructed by adding all the univariate squared exponential kernels corresponding to each biomarker. Fullorder interactions between all biomarkers considered are captured through a single multivariate squared exponential kernel; Full set of statistics (sensitivity, specificity, accuracy, AUC) are computed for each replicate. Paired t-tests are conducted between accuracy scores stemming from main effects models and a model containing a multivariate kernel to assess if the increase in generalisation to unseen data are statistically significant their baseline scan. All samples were sent same-day on dry ice to the University of Pennsylvania ADNI Biomarker Core laboratory. There, levels of the proteins (amyloid-β42, total tau, and P-tau) were measured and recorded, as described previously (Shaw et al., 2009). By design, only a subset of ADNI participants had measurement of CSF levels.

| Biomarkers: PET
Cerebral amyloid-β deposition was indexed using the PET tracer [18 F ] florbetapir (F-AV45). PET imaging was performed within 2 weeks of the baseline clinical assessments, as described previously . In brief, four late-time 5-minute frames are co-registered and averaged. The resulting image is converted to a 160 × 160 × 96 voxel static image with voxel dimension of 1.5mm 3 . Finally, an 8 mm full-width half-maximum (FWHM) Gaussian kernel was applied (corresponding to the lowest resolution scanner used in the study).
These primary data were downloaded from the ADNI database and used in the subsequent analyses.
F-AV45 data were nonlinearly registered into Montreal Neurological Institute 152 space (MNI152 space) using DARTEL (Ashburner, 2007). Initially the structural MRI images were segmented into grey matter and white matter using the Statistical Parametric Mapping (SPM12) software package (University College London, UK) and registered to a group average template. The group average template was then registered to MNI152 space. The F-AV45 standardised uptake value ratio (SUVr) image for each participant was registered to the corresponding T1-weighted MRI using a rigid-body registration.
Finally, the individuals' DARTEL flow field and template transformation was applied without modulation resulting in F-AV45 images in MNI152 space. The normalised maps were spatially smoothed (8 mm FWHM Gaussian kernel). A neuroanatomical atlas (Tziortzi et al., 2011) and a grey-matter probability atlas in MNI152 space were employed to calculate regional SUVr values. SUVr values were quantified using the grey matter cerebellum as the reference region (defined as the intersection between the cerebellum region from the anatomical atlas and the grey-matter atlas, thresholded at p > .5). The mean uptake value for cerebellar grey matter was obtained and each image was divided by this to generate an SUVr image for each participant. Finally, an average cortical SUVr value was obtained by calculating the mean SUVr value for all cortical regions (weighted by regional volume).

| Biomarkers: Structural MRI
Three-dimensional T1-weighted MRI scans were acquired at either 1.5T (ADNI-1) or 3T (ADNI-2 and ADNI-GO) using previously described standardised protocols at each site 2008. All MRI scans were pre-processed using SPM12. This entailed tissue segmentation into grey matter and white matter, followed by a nonlinear registration procedure using DARTEL (Ashburner, 2007) to MNI152 space, subsequently followed by resampling to 1.5 mm 3 with a 4 mm Gaussian FWHM kernel.
Processed grey matter and white matter images were then entered in the Pattern Recognition for Neuroimaging Toolbox (PRoNTo) software (Schrouff et al., 2013). Using a previously trained model that predicts chronological age from structural neuroimaging data, as per our previous work (Cole et al., 2015;Cole et al., 2018;Cole, Annus, et al., 2017), we generated a brainpredicted age value for each participant. This step used a Gaussian Processes regression model with a linear kernel, with processed neuroimaging data as the independent variables, and age as the dependent variable. The training data set used to define the model included n = 2001 health adults aged 18-90 years; further details have been reported previously . Finally, we generated brainpredicted age difference (brain-PAD) values; chronological age subtracted from brain-predicted age (i.e., the model's predictions). Note: Paired T-test were carried out to assess group differences. p values were uncorrected for multiple comparisons. Abbreviations: APOE4, apolipoprotein e4; Brain-PAD, brain predicted age difference.

| Hierarchical partitioning of variance
We examined the independent contribution of each biomarker for distinguishing between stable and progressive MCI participants in a multivariate model, using hierarchical partitioning of variance (Chevan & Sutherland, 1991). Here, a hierarchical series of logistic regression models are fit, with all possible combinations of biomarkers. By comparing the variance explained by each model across all possible models, an unbiased approximation of the variance attributable to each variable can be derived, both that independent of all other variables and that shared with other variables (due to the correlations between them).

| Gaussian processes classification
Gaussian Processes are a supervised learning algorithm, that has found applications in many fields including our previous neuroimaging research (Cole et al., 2015), due to its Bayesian and non-parametric properties. The principal behind Gaussian Processes is that a measured (dependent) variable is modelled by defining a multivariate Gaussian distribution, hence we can view them as being a distribution over functions (Rasmussen & Williams, 2006).
To obtain Gaussian Processes models where our predictor variables are independent, we assigned to each biomarker an individual squared exponential kernel, and subsequently all kernels are added together to derive a global kernel. To introduce interactions between biomarkers, we used multiple covariates within a single squared exponential kernel. Duvenaud, Nickisch, and Rasmussen (2011) proposed that this kernel construction defines an m-order interaction over covariate space, where m is the total number of covariates used in building the kernel. For example, two-way interactions were constructed by assigning the respective two biomarkers to a single kernel, subsequently adding this to the list of univariate kernels specific to each biomarker. For all models we used the nonsparse version of the Scalable Variational Gaussian Processes Classifier (Hensman, Matthews, & Ghahramani, 2015). In terms of data preprocessing, all input variables were whitened. We used the entire available data set, without excluding outliers. Continuous variables were used in their raw form without dichotomisation or cutoffs.  Figure S2.

| Bootstrapping to adjust for class imbalance
A list of the two-way and three-way interactions tested can be found in Table 2. To determine whether Gaussian Processes for classification resulted in improvements over less complex linear models, we performed the same classification task using logistic regression, with the same combination of different input variables and interaction terms (Supplementary Table S1). When considering model performance, we opted to focus on sensitivity to detecting progressive MCI as the chief criteria, under the assumption that a false-negative classification of stable MCI, is clinically more deleterious than a false positive.

| RESULTS
A total of n = 158 stable MCI participants and n = 48 progressive MCI participants were included in our study. Note: T-tests (paired) based on comparison between accuracy and log likelihood scores on 100 bootstrapped sets are provided as a mean to assess the relative increase in generalisation attributable to the introduction of either two-way interactions or three-way interactions between biomarkers in comparison to main effects only variants for two-way interactions, respectively main effects plus a summation of pairwise two-way interactions terms between the three variables in question for three-way interactions. Abbreviation: AUC, area under receiver operator characteristic curve. age = 0.028, sex = 0.004, APOE4 = 0.069, amyloid-β42-CSF = 0.063, total tau = 0.054, P-tau = 0.045, amyloid-β-PET = 0.048, hippocampal volume = 0.126, brain-PAD = 0.076. A substantial proportion of the variance was shared between variables, indicating that these biomarkers reflect some common as well as unique processes underlying progression from MCI to Alzheimer's ( Figure 2).

| Logistic regression does not benefit from interaction terms
When using logistic regression to classify stable and progressive MCI patients, performance was moderately accurate, with balanced accuracies ranging from 0.67-0.72 (see Supplementary Table S1). Notably, for all models, classification sensitivity (range 0.61-0.65) was substantially lower than specificity (range 0.86-0.88), indicating that the logistic regression was better at identifying stable MCI patients than progressive MCI patients. When comparing the models containing interactions terms with their additive counterparts, t-tests showed than none of the interaction models were significant. This indicates that using standard logistic regression methods, where interactions are linear, there is no improvement in classification performance when including interaction terms.

| Nonlinear interactions between hippocampus and amyloid-based biomarkers
Different models were compared to assess the influence of including nonlinear interactions when classifying stable from progressive MCI (

| Visualising nonlinear interactions: Contour plots
The greatest performance improvement occurred when including a two-way nonlinear interaction between amyloid-β-PET and hippocampal volume when adjusting for P-tau (Model 3). To investigate the relationships between these three biomarkers we used 'contour' plots From inspection of the contour plots it is evident that for different levels of P-tau (based on this illustrative tertiary split), the relationship between hippocampal volume and amyloid-β-PET changes. In particular, for participants with medium P-tau levels, the contours are a better fit to the distribution of stable (represented by crosses on Figure 3) and progressive MCI participants (represented by dots) for the interaction model compared to the additive model.

| DISCUSSION
Here, we modelled nonlinear interactions between a panel of biomarkers (neuroimaging, genetic, CSF) to predict disease progression from MCI to Alzheimer's disease. Statistical models that included nonlinear interaction term explained conversion risk moderately  Two-way interaction model, using univariate and bivariate kernels. Comparison of the first row (additive model) and second row (interactive model) conveys the influence of incorporating nonlinear interaction kernels on the biomarker space better than additive linear models (in terms of model fit), though the influence of these interaction models of the sensitivity to progressive MCI was negligible. Importantly, our panel of biomarkers all independently explained a proportion of the variance in a logistic classification model. Though perhaps unsurprising, given evidence of the influence of these biomarkers in Alzheimer's (Caminiti et al., 2018;Gaser et al., 2013;Jack et al., 1999;Xu et al., 2013), this justifies the inclusion of different sources of information, despite the evident shared variance (i.e., bivariate correlations) between biomarkers. This is even the case for measures of amyloid; whereby the PET-derived and CSFderived measures were correlated r = −0.54. Crucially, our results show that a proportion of the non-shared variance between these two biomarkers still relates to disease progression. This is in line with previous findings (Mattsson et al., 2014), which showed that amyloid-β42-CSF and amyloid-β-PET provide partially independent information about a wide range of Alzheimer's measures. While a practitioner's decision about whether to acquire PET or CSF measures of amyloid for a single patient are likely influenced by cost and invasiveness, statistically including both will improve predictive accuracy.
We found that the nonlinear interaction between amyloid-β-PET and hippocampal volume significantly increases our model fit on unseen data, suggesting an element of synergy between the two bio-  clinical trials to assess how much change in a given biomarker must occur for an individual to move to from being at low risk from conversion from MCI to Alzheimer's to being at high risk (i.e., moving from blue regions to red in Figure 3).
Weaknesses of the current study include the limited sample size and lack of a replication data set. While we did use 'held-out' data to test model generalisability, we were unable to test the predictive performance of our models to truly independent data. This is one key lim- In conclusion, our findings suggest that multiple underlying neurobiological processes both act independently and interact in a nonlinear fashion during progression from MCI to Alzheimer's. By capturing atrophy (hippocampal volume), amyloid-β deposition (using PET) and neurofibrillary tangle formation (CSF P-tau) in a nonlinear interactive model, we can better fit models to predict conversion than independent effects alone, though main-effects multi-modality models still offer reasonable performance. Our result highlighting three-way interactions in Model 3 provides an unbiased quantification of change in probability of progressing from MCI to Alzheimer's when we modify the value of a certain covariate while maintaining the others stable, in effect providing a quantitative extension to the A/T/N classification framework detailed by Jack and colleagues (2018) (see Supplementary Table S2). For example, the presence of high amyloid deposition increases the chances of progression by 17.5% in tau-negative and neurodegeneration-positive patients. The impact of modelling nonlinear interactions implies that threshold effects and pathological 'plateaus' are present during disease progression and that measuring multiple biomarkers will be necessary to best predict outcomes. The current data do not suggest clinically meaningful benefits of nonlinear-interaction models yet, though our novel visualisation method (contour plots) could be helpful in clinical contexts and provides strong face validity for our approach. Finally, multifaceted therapeutic interventions are likely to be necessary, as simply influencing the levels of a single biomarker will be insufficient to modify the disease in all individuals.

ACKNOWLEDGEMENTS
Data collection and sharing for this project was funded by the

DATA AVAILABILITY STATEMENT
All data used for the preparation of this article was obtained from ADNI and are publicly available. For up-to-date information, see www.adni-info.org. Our statistical analysis code is available online (https://github.com/SebastianPopescu/nonlinear_interaction_GPs).