Multimodal fusion of structural and functional brain imaging in depression using linked independent component analysis

Abstract Previous structural and functional neuroimaging studies have implicated distributed brain regions and networks in depression. However, there are no robust imaging biomarkers that are specific to depression, which may be due to clinical heterogeneity and neurobiological complexity. A dimensional approach and fusion of imaging modalities may yield a more coherent view of the neuronal correlates of depression. We used linked independent component analysis to fuse cortical macrostructure (thickness, area, gray matter density), white matter diffusion properties and resting‐state functional magnetic resonance imaging default mode network amplitude in patients with a history of depression (n = 170) and controls (n = 71). We used univariate and machine learning approaches to assess the relationship between age, sex, case–control status, and symptom loads for depression and anxiety with the resulting brain components. Univariate analyses revealed strong associations between age and sex with mainly global but also regional specific brain components, with varying degrees of multimodal involvement. In contrast, there were no significant associations with case–control status, nor symptom loads for depression and anxiety with the brain components, nor any interaction effects with age and sex. Machine learning revealed low model performance for classifying patients from controls and predicting symptom loads for depression and anxiety, but high age prediction accuracy. Multimodal fusion of brain imaging data alone may not be sufficient for dissecting the clinical and neurobiological heterogeneity of depression. Precise clinical stratification and methods for brain phenotyping at the individual level based on large training samples may be needed to parse the neuroanatomy of depression.

modalities may yield a more coherent view of the neuronal correlates of depression.
We used linked independent component analysis to fuse cortical macrostructure (thickness, area, gray matter density), white matter diffusion properties and restingstate functional magnetic resonance imaging default mode network amplitude in patients with a history of depression (n = 170) and controls (n = 71). We used univariate and machine learning approaches to assess the relationship between age, sex, case-control status, and symptom loads for depression and anxiety with the resulting brain components. Univariate analyses revealed strong associations between age and sex with mainly global but also regional specific brain components, with varying degrees of multimodal involvement. In contrast, there were no significant associations with case-control status, nor symptom loads for depression and anxiety with the brain components, nor any interaction effects with age and sex. Machine learning revealed low model performance for classifying patients from controls and predicting symptom loads for depression and anxiety, but high age prediction accuracy. Multimodal fusion of brain imaging data alone may not be sufficient for dissecting the clinical and neurobiological heterogeneity of depression. Precise clinical stratification and methods for brain phenotyping at the individual level based on large training samples may be needed to parse the neuroanatomy of depression.

| INTRODUCTION
With an estimated prevalence of 4.4%, depression affects more than 300 million worldwide (World Health Organization, 2017) and is a substantial contributor to disability and health loss (Friedrich, 2017).
Identifying useful imaging based and other biomarkers to aid detection of individuals at risk for depression and facilitating individualized treatment is a global aim (Cuthbert & Insel, 2012;Insel, 2014Insel, , 2015. A host of studies across a range of neuroimaging modalities have implicated various brain regions and networks in depression. Metaanalyses of structural magnetic resonance imaging (MRI) studies have suggested thinner orbitofrontal and anterior cingulate cortex in patients with depression compared to healthy controls (Lai, 2013;Schmaal et al., 2017;Suh et al., 2019). A large-scale meta-analysis comprising 2,148 patients and 7,957 controls from 20 different cohorts reported slightly smaller hippocampal volumes in patients with depression compared to controls (Schmaal et al., 2016), but the overall pattern of results suggested substantial heterogeneity and otherwise striking similarity across groups for all other investigated subcortical structures (Fried & Kievit, 2016). A meta-analysis of diffusion tensor imaging (DTI) studies including 641 patients and 581 healthy controls reported fractional anisotropy (FA) reductions in the genu of the corpus callosum and the anterior limb of the internal capsule (Chen et al., 2016), implicating interhemispheric and frontal-striatalthalamic connections among the neuronal correlates of depression.
However, despite meta-analytical evidence suggesting brain aberrations in large groups of patients with depression, the reported effect sizes are small and the direct clinical utility is unclear (Müller et al., 2017;Paulus & Thompson, 2019). One explanation for the lack of robust imaging-based markers in depression may be that previous studies either have focused on a single imaging modality or have analyzed different imaging modalities along separate pipelines and thus failed to model the common variance across features. In contrast, linked independent component analysis (LICA: Groves, Beckmann, Smith, & Woolrich, 2011;Groves et al., 2012) offers an integrated approach by fusing different structural and functional imaging modalities (Groves et al., 2012). LICA identifies modes of variation across modalities and disentangles independent sources of variation that may account both for large and small parts of the total variance that may otherwise be overlooked by conventional approaches. By decomposing the imaging data into a set of independent components, LICA enables an integrated perspective that may improve clinical sensitivity compared to unimodal analyses Francx et al., 2016;Wu et al., 2019).
Apart from the predominantly unimodal approaches in previous imaging studies, large individual differences and heterogeneity in the configuration and load of depressive symptoms represent other factors that could explain the lack of robust imaging markers. Symptombased approaches have revealed more than 1,000 unique symptom profiles among 3,703 depressed outpatients based on only 12 questionnaire items (Fried & Nesse, 2015), suggesting large heterogeneity.
Additionally, depression is highly comorbid with anxiety, with reported rates exceeding 50% (Johansson, Carlbring, Heedman, Paxling, & Andersson, 2013;Lamers et al., 2011). Furthermore, depression can be conceptualized along a continuum including individuals of the general, healthy population that may experience transient symptoms to varying degrees, and thus warrants a dimensional approach.
The main aim of the current study was to determine whether fusion of neuroimaging modalities would capture modes of brain variations which discriminate between patients with a history of depression (n = 170) and healthy controls with no history of depression (n = 71), and which are sensitive to current symptoms of depression and anxiety across groups. To this end, we used LICA to combine measures of cortical macrostructure (cortical surface area and thickness, and gray matter density [GMD]), white matter diffusion properties (DTI-based FA, mean diffusivity [MD], and radial diffusivity [RD]), and resting-state fMRI DMN amplitude.
There is evidence of sex and age differences in the prevalence and clinical characteristics of depression, including lower age at onset of first major depressive episode in women compared to men (Marcus et al., 2005), and longer duration of illness and different symptoms in older compared to younger patients (Husain et al., 2005), which may reflect differential neuronal correlates. Therefore, we tested for main effects of age and sex and their interactions with the resulting brain components' subject weights on group and symptoms.
In addition, we assessed the overall clinical sensitivity of all measures combined using machine learning to classify patients and controls and to predict symptom loads for depression and anxiety, which we compared with age prediction. Based on the above reviewed studies and current models, we anticipated (a) that brain variance related to depression would be captured in components primarily reflecting the previously extended functional neuroanatomy of depression, including limbic and frontotemporal networks and their connections.
Irrespective of having a history of depression, we hypothesized (b) several strong age and sex differences, reflecting well-documented age, and sex-related variance in brain structure, including global thickness and volume reductions with increasing age, and larger brain volume and surface area in men compared to women. To the extent that having a history of depression interacts with sex-and age-related processes in the brain, we hypothesized (c) interactions between the agerelated trajectories and sex differences identified above with casecontrol status or symptoms of depression. To increase robustness and generalizability, we corrected for multiple comparisons across all univariate analyses and performed cross-validation and robust model evaluation in the machine learning analyses.

| Sample
Patients (n = 194) were primarily recruited from outpatient clinics, while healthy controls (n = 78) were recruited through posters, newspaper advertisements, and social media. The patient group was drawn from two related clinical trials (ClinicalTrials.gov ID NCT0265862 and NCT02931487). All participants were evaluated with the Mini International Neuropsychiatric Interview (M.I.N.I 6.0: Sheehan et al., 1998).
Exclusion criteria for all participants were MRI contraindications and a self-reported history of neurological disorders. The study was approved by the Regional Ethical Committee of South-Eastern Norway, and we obtained a signed informed consent from all the participants. Symptom loads for depression and anxiety were evaluated using the Becks Depression Inventory (BDI-II; Beck, 1996) and the Becks Anxiety Inventory (BAI; Beck & Steer, 1993), respectively. The demographics for the final sample (after exclusions, see below) are shown in Table 1. The range of symptom load for depression and anxiety for the control group was from 0 to 20 and 0 to 19, respectively, while the range for the patient group was from 0 to 51 and 0 to 45, respectively (see Figure 1 for the distributions).

| Image acquisition
MRI data were obtained on a 3 T Philips Ingenia scanner (Phillips Healthcare) at the Oslo University Hospital using a 32-channel head coil. The same protocol was used for all participants, but there was a change in the phase-encoding direction during the course of study recruitment and data collection which affected the T1-weighted data for four controls and 95 patients, and resting-state fMRI data for 64 patients. Resting-state fMRI data were collected using a T2* weighted single-shot gradient echo EPI sequence was acquired for 72 controls and 178 patients with the following parameters: TR/TE/FA = 2,500 ms/30 ms/80 ; 3.00 mm isotropic voxels; 45 slices, 200 volumes; scan time ≈ 8.5 min. Participants were instructed to have their eyes open, and refrain from falling asleep.

| Structural MRI preprocessing
Vertex-wise cortical thickness and surface area measures  were estimated based on the T1-weighted scans using FreeSurfer (http://surfer.nmr. mgh.harvard.edu) (Fischl et al., 2002). Details are described elsewhere Fischl et al., 1999) but in short, after gray/white boundary and pial reconstruction, cortical thickness was defined as the shortest distance between the surfaces vertex-wise , before resampling to the Freesurfer common template (fsaverage, 10,242 vertices; Fischl et al., 1999). The vertex-wise expansion or compression was used to calculate vertex-wise maps of arealization. None of the thickness nor surface area data for healthy control (n = 74) nor patients (n = 194) were excluded after visual QC.

| Voxel-based morphometry
GMD maps were created based on voxel-based morphometry (VBM) using the computational anatomy toolbox (CAT12: http://www. neuro.uni-jena.de/cat/) within SPM12 (http://www.fil.ion.ucl.ac.uk/ spm/). This involved brain extraction, gray matter segmentation, and then registration to MNI152 standard space. The resulting images were averaged and flipped along the x axis to create a left-right symmetric, study-specific gray matter template. The modulated gray matter maps were smoothed with a sigma of 4 mm (FWHM = 9.4 mm).
T A B L E 1 Demographics of the final sample. Six patients were missing information about (ISCED level, two controls and two patients were missing AUDIT scores, and two controls and four patients were missing DUDIT scores. p denotes the p-value from group comparisons using chi-square test for sex, handedness, history of additional disorders, and current SSRI medication status while we used Mann-Whitney U tests for the rest. Depression severity was based on BDI-II sum score criteria: minimal (0-13), mild (14-19), moderate (20-28), and severe (29-63 None of the GMD data for healthy control (n = 74) nor patients (n = 194) were excluded after visual QC.

| DTI preprocessing
Processing steps included correction for motion and geometrical distortions based on the two b = 0 volumes and eddy currents by using FSL topup (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/TOPUP) and eddy (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/eddy). We also used eddy to automatically identify and replace slices with signal loss (see Figure S1, Supporting Information) within an integrated framework using Gaussian process (Andersson & Sotiropoulos, 2016), which substantially improved the temporal signal-to-noise ratio (tSNR: Roalf et al., 2016) (t = 24.139, p < .001, Cohen's d = 2.13). We fitted a diffusion tensor model using dtifit in FSL to generate maps of FA, MD, and RD. Based on manual QC, we excluded three subjects due to insufficient brain coverage, and three subjects due to poor data quality. One additional subject was flagged with a tSNR of >2 SD lower than the mean and discarded after additional manual QC. This yielded a total number of DTI scans of 71 healthy controls and 178 patients.

| Resting-state fMRI preprocessing
Resting-state fMRI data were processed using the FSL's FMRI Expert Analysis Tool. This included coregistration with T1 images, brain

| Linked independent component analysis
We used FMRIB's LICA (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLICA) to perform data-driven multimodal fusion, which evaluates shared intersubject variations across the brain imaging measures (Groves et al., 2011(Groves et al., , 2012. LICA is a Bayesian extension of ICA (Hyvarinen, 1999;Jutten & Herault, 1991), providing a factorization over participants rather than time (Groves et al., 2011(Groves et al., , 2012. This produces spatial maps based on the commonalities across features (e.g., GMD, DTI measures, DMN maps) and subjects, and corresponding subject weights (i.e., the degree to which a subject contributes to an LICA component). An advantage of this method is its ability to combine modalities with different numbers of spatial dimensions or features by applying an ICA decomposition on each modality while at the same time accounting for the spatial correlation of each modality (Groves et al., 2011(Groves et al., , 2012. We included complete data from 70 patients and 171 controls in the decomposition. We chose a relatively low model order of 40 based on previous recommendation of estimating robust components (Wolfers et al., 2017), and the biological meaningfulness of the spatial maps. For transparency and comparison, we also performed similar analysis using a higher dimensionality (80, more details in Supporting Information). For both model orders, we discarded components which were highly driven by one subject (threshold: >20%) yielding a total of 40 and 67 components, respectively ( Figure S3, Supporting Information). One component was strongly associated with phase encoding direction (IC4 in both decompositions, t = 33.07 F I G U R E 1 Histogram of symptom loads of (a) depression based on Becks Depression Inventory (BDI-II) and (b) anxiety based on Becks Anxiety Inventory (BAI) and t = 32.47, p < .001, respectively, Figures S4 and S5, Supporting Information) but not removed from the analyses because of the biologically meaningful spatial patterns.

| Statistical analysis
Statistical analyses were performed in R version 3.5.1 (R Core Team, 2018) and MATLAB 2014A (The MathWorks). We used linear models to test for main effects of clinical characteristics (case-control status, total symptom severity for depression and anxiety), age, and sex on each LICA subject weight with each IC as the dependent variable. In additional models, we tested for interactions between age or sex and clinical characteristics (casecontrol status, total symptom severity for depression and anxiety) on each IC. For the analyses involving symptoms, one healthy control was removed due to missing data. We included phase encoding direction as an additional covariate in all the univariate analyses, and we controlled the false discovery rate (FDR) across tests using p.adjust in R.

| Machine learning approach
For group classification, we submitted all LICA subject weights to shrinkage discriminant analysis (Ahdesmäki & Strimmer, 2010) in the R-package "sda" (http://www.strimmerlab.org/software/sda/). For the main analyses, we used the residuals of each component's subject weight after regressing out age and sex, and additionally, phase encoding for IC4. As a supplemental analysis, we used the residuals of the subject weights after regressing out age, sex, and phase encoding direction from all the ICs. For robustness and to reduce overfitting, we performed cross-validation with 10 folds across 100 iterations.
We calculated area under the receiver-operating curve (AUC) as our main measure of model performance using the R-package "pROC" (Robin et al., 2011). We additionally estimated accuracy (i.e., proportion of correct classification), sensitivity (i.e., ability to correctly detect cases) and specificity (i.e., ability to correctly detect controls) using a probability threshold of .65 to partly account for the imbalanced group size. The relative feature importance was determined by calculating correlation-adjusted t-scores (CAT scores) between the group centroids and the pooled mean (Ahdesmäki & Strimmer, 2010). We determined statistical significance based on AUC using permutation-based testing across 10,000 iterations. Several supplemental analyses were conducted to account for imbalanced group sizes and depression severity in the group classification, as well as SSRI medication use (see the Methods section, Supporting Information).
We used the same framework to predict depression and anxiety total symptom severity, but by implementing shrinkage linear estimation (Schäfer & Strimmer, 2005)  Here, statistical significance was based on RMSE and permutation testing. As a comparison, we also predicted age using the same framework, using residuals of the subject weights after regressing out phase encoding direction in the methods shown above (using Pearson's r instead of Spearman's ρ).

| RESULTS
3.1 | Linked independent component analysis Figure 2a shows the degree of fusion across MRI measures for each component. Figure 2b shows the percentage of the total variance explained by each IC. Most of the components were characterized by region-specific features that were mainly bilateral, with the exception of four global components shown in Figure 3. Briefly, there was no substantial fusion between DMN maps and the other modalities, except for IC26 ( Figure S6, Supporting Information). There were significant main effects of age and sex on 10 and 5 LICA components (see Figure 4), respectively, but no significant main effects of group, with similar results for the decomposition with 80 components (see Tables S2 and S3, Supporting Information). Figure 3 shows the global LICA components associated with age and sex (see Figure S7, Supporting Information for associations with additional LICA components). Age was negatively associated with IC1,  subject weights after multiple comparison correction, but no robust group differences between patients with a history of depression and healthy controls, and no significant interactions between group and sex or between group and age. Likewise, we observed no robust associations with symptom loads for depression or anxiety, nor interactions with age or sex. In line with the univariate analyses, the machine learning approach revealed overall low prediction accuracy for group classification and prediction of symptom loads for depression and anxiety, but high prediction accuracy for age.

| Machine learning analyses
Our univariate analyses revealed no main effects of history of depression or symptom load for depression on any of the LICA components. The machine learning analyses here revealed overall low predictive value both for case-control status and symptoms of depression and anxiety, which is generally in line with the univariate analyses and an increasing body of literature suggesting small differences in brain structure between patients with MDD and healthy controls (Schmaal et al., 2016;Schmaal et al., 2017;Varoquaux, 2018;Wolfers, Buitelaar, Beckmann, Franke, & Marquand, 2015). While considering the overall low performance, the most important feature for classifying patients with a history of depression from healthy controls was a component encompassing covarying patterns of both high and low GMD in cerebellar regions (IC19). There is some evidence that depression is linked to cerebellar regions that communicate with networks related to cognitive and introspective processing (Depping, Schmitgen, Kubera, & Wolf, 2018). Cerebellar structural characteristics have recently been demonstrated to rank among the most sensitive brain features when comparing adult patients with schizophrenia and healthy controls , and also for predicting psychiatric symptoms in youths . The most important feature for predicting depression symptoms was IC0, which involves complex covarying patterns of low GMD and cortical thickness in mainly temporal but also frontal regions. This pattern is largely in line with previous research (Lai, 2013;Schmaal et al., 2017;Suh et al., 2019). IC0 also encompassed high FA and low MD and RD in interhemispheric connections and frontal-striatal thalamic pathways, in line with previous studies, albeit in the opposite direction (Chen et al., 2016). One study reported a positive association between symptom load for depression and FA in the thalamus (Osoba et al., 2013), and another study suggested this association (although in the opposite direction) is related to late onset MDD, especially in the corpus callosum (Cheng et al., 2014). However, while the implicated brain patterns are in line with previous reports, the overall poor model performance for classifying cases from controls, and predicting symptom loads for depression and anxiety warrants caution when interpreting feature importance.
F I G U R E 6 Age prediction based on machine learning using 10-fold crossvalidation across 100 repetitions. Here, phase encoding direction is regressed out from the subject weights in IC4. (a) Model performance results, with the dotted lines denoting the mean. (b) Feature importance based on CAR scores. Direction shows whether the features is positive or negatively associated with age Higher age was related to lower global cortical thickness (IC7), in line with previous studies (e.g., Fjell et al., 2015). As hypothesized, higher age was also associated with lower global volume and smaller surface area (IC1), similar to previous studies using LICA (Doan, Engvig, Zaske, et al., 2017;Douaud et al., 2014), and a consistent finding in lifespan studies. Additionally, advancing age was negatively associated with IC2, indicating decreased FA globally, but also increased RD and to some extent MD, consistent with the aging literature (Davis et al., 2009;Sexton et al., 2014;Westlye et al., 2010).
Higher age was associated with IC5, reflecting age-related decreases in DMN amplitude, in line with previous research (Damoiseaux et al., 2008;Mevel et al., 2013;Mowinckel, Espeseth, & Westlye, 2012;Razlighi et al., 2014;Vidal-Piñeiro et al., 2014). We observed high prediction accuracy for age in the machine learning approach. This finding supports the potential utility of LICA in estimating the gap between chronological and biological age (i.e., brain age gap).
Men had larger global brain volume and surface area than women (IC1), which is consistent with previous studies (e.g., Ritchie et al., 2018). Additionally, women had higher subject weights in IC13, reflecting lower FA in the corticospinal tract, portions of the superior longitudinal fasciculi and posterior thalamic radiation compared to men, generally in line with a large-scale UK Biobank study (Ritchie et al., 2018). We also found that men had greater DMN amplitude (IC5) than women, which adds to previous inconclusive findings (Mowinckel et al., 2012;Weissman-Fogel, Moayedi, Taylor, Pope, & Davis, 2010) and contrasts a previous report suggesting effects in the opposite direction (Jamadar et al., 2018).
In general, our analyses did not provide support for our hypothesis that a history of depression and symptoms of depression interact with age-related trajectories or sex differences of the LICA components.
To the best of our knowledge, although studies have found specific cortical abnormalities related to adults with MDD, adolescents with MDD  and age at onset of depression (Ho et al., 2019), few or no studies have reported age-by-group interactions. A recent large-scale analysis of regional brain age estimations based on brain morphometry reported increased temporal lobe brain age in patients with MDD compared to matched healthy peers (Kaufmann et al., in press), with a relatively small effect size. Although there have been some early reports of a sex-by-group interaction in hippocampal volumes (e.g., Frodl et al., 2002), the recent large-scale ENIGMA MDD study reported no sex-by-group interactions in any subcortical volumes (Schmaal et al., 2016). Another recent study identified sex-bygroup interactions using VBM, including higher GMD in the left cerebellum of male patients only, and lower GMD in the dorsal medial prefrontal gyrus in female patients only . However, the sample size was relatively small (less than 100 in the patient and control group each) which may affect the reproducibility of these findings.
Despite separate reports of a link between resting-state DMN connectivity and rumination in depression (e.g., Hamilton, Farmer, Fogelman, & Gotlib, 2015) and evidence of sex-differences in rumination among adolescents (e.g., Jose & Brown, 2008), no studies have reported a sex-by-group interaction on DMN functional connectivity or amplitude.
The lack of positive findings in the univariate analyses and low predictive accuracy in the machine learning approach can be attributed to at least two factors. First, as illustrated by the large-scale ENIGMA studies (Ho et al., 2019;Schmaal et al., 2016;Schmaal et al., 2017), the effect sizes in neuroimaging studies of mental disorders and depression are overall small (Paulus & Thompson, 2019). Similarly, small sample sizes may contribute to inflated estimates of prediction accuracy in machine learning approaches (Wolfers et al., 2015). This is one possible explanation why other multimodal fusion studies of depression have reached higher prediction accuracies for classifying patients with depression from healthy controls (He et al., 2017;Ramezani et al., 2014;Yang et al., 2018), with patient groups consisting of no more than 60 individuals. Secondly, and also related to the small effect sizes, mental disorders including depression are clinically highly heterogeneous.
As an example, Müller et al. (2017) partially attribute the lack of convergence in their meta-analysis of activation-based fMRI experiments involving 1,058 MDD patients to clinical heterogeneity. As a result, future research probing the neurobiology of depression should aim for large sample sizes (Rutledge, Chekroud, & Huys, 2019), and more importantly, stratifying patients (Feczko et al., 2019) with depression at the individual level. Pursuant to this, there has been considerable interest in identifying clinically relevant subgroups based on brain imaging, with initially encouraging results (Drysdale et al., 2017). However, the robustness and generalizability of such studies have been brought into question (Dinga et al., 2019), which may be partly due to substantial brain heterogeneity within groups, which has been illustrated in terms of morphometry in schizophrenia . Alternatively, dimensional measures such as brain age prediction (Kaufmann et al., In press) and normative modeling  Rezek, Buitelaar, & Beckmann, 2016) have shown promising results in elucidating brain heterogeneity in mental disorders such as schizophrenia (Wolfers et al., 2018) and attention deficit/hyperactivity disorder . An important consideration is the multicausal nature of depression, which means that it is likely that including other measures such as psychosocial, cognitive, and genetic factors would increase the explained variance, beyond brain imaging by itself.  (Müller et al., 2017). Additionally, the confounding of SSRI use and depression severity in the classification of cases and controls were addressed in supplemental analyses and revealed that results remained the same.
One weakness of this study is that the change in phase encoding direction may have introduced systematic differences in the MRI signal. However, only one component (IC4 in both decompositions) was strongly sensitive to phase encoding direction, suggesting that the remaining components were largely unaffected. Furthermore, we accounted for phase encoding direction in both the univariate and the machine learning analyses. This along with previous studies Doan, Engvig, Zaske, et al., 2017;Groves et al., 2012) provide additional evidence that LICA is a promising tool to account for various scanner effects, particularly relevant for multisite and longitudinal studies. We used a model order of 40 for LICA decomposition in the main analyses. Even though we did find similar feature importance ranking in the decomposition with higher model order, prediction accuracy was slightly lower across all models.
Although there is no consensus on the optimal model order, this may imply that we were modeling more noise components in the higher model order decomposition, which was also supported by the number of discarded components due to dominance by a single subject.
In conclusion, based on fusion of structural, diffusion-weighted, and resting-state fMRI data from 241 individuals with or without a history of depression, we identified multimodal and modality specific components that revealed strong associations with age and sex. None of the components showed significant association with categorical or dimensional measures of depression, nor any interaction effects with age and sex. Similarly, machine learning revealed low prediction accuracy for classifying patients from controls and predicting symptom loads. This study supports accumulating evidence of small effect sizes when comparing brain imaging features between patients with a history of depression and healthy controls, and indicates the need for more precise methods of stratifying individuals with depression, as well as large samples sizes.

ACKNOWLEDGMENTS
This work was supported by the South-Eastern Norway Regional Health Authority (2014097,2015073,2015052)

CONFLICT OF INTERESTS
N.I.L. has previously received consultancy fees and travel expenses from Lundbeck. All the other authors have nothing to declare.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.