Greater male than female variability in regional brain structure across the lifespan

Abstract For many traits, males show greater variability than females, with possible implications for understanding sex differences in health and disease. Here, the ENIGMA (Enhancing Neuro Imaging Genetics through Meta‐Analysis) Consortium presents the largest‐ever mega‐analysis of sex differences in variability of brain structure, based on international data spanning nine decades of life. Subcortical volumes, cortical surface area and cortical thickness were assessed in MRI data of 16,683 healthy individuals 1‐90 years old (47% females). We observed significant patterns of greater male than female between‐subject variance for all subcortical volumetric measures, all cortical surface area measures, and 60% of cortical thickness measures. This pattern was stable across the lifespan for 50% of the subcortical structures, 70% of the regional area measures, and nearly all regions for thickness. Our findings that these sex differences are present in childhood implicate early life genetic or gene‐environment interaction mechanisms. The findings highlight the importance of individual differences within the sexes, that may underpin sex‐specific vulnerability to disorders.


| INTRODUCTION
For a diverse set of human traits and behaviors, males are often reported to show greater variability than females (Hyde 2014). This sex difference has been noted for aspects of personality (Borkenau, McCrae, and Terracciano 2013), cognitive abilities (Arden and Plomin 2006; Johnson, Carothers, and Deary 2008; Roalf et al. 2014), and school achievement (Baye and Monseur 2016). A fundamental question is to what degree these sex differences are related to genetic mechanisms or social factors, or their interactions. Lehre et al. (2009) found compelling evidence for an early genetic or in utero contribution, reporting greater male variability in anthropometric traits (e.g. body weight and height, blood parameters) already detectable at birth. Recent studies suggest greater male variability also in brain structure and its development (Forde et al. 2020; Ritchie et al. 2018; Wierenga et al. 2019), but studies with larger samples that cover both early childhood and old age are critically needed. Specifically, we do not know when sex differences in variability in brain structure emerge and whether they change with development and throughout life. Yet, data on this could inform us on the origins and factors that influence this phenomenon. For this reason, we set out to analyze magnetic resonance imaging (MRI) data from a large sample of individuals across a very wide age range (n = 16,683, age 1-90) to robustly characterize sex differences in variability of brain structure and test how these differences interact with age.
Many prior studies report sex differences in brain structure, but the specificity, regional pattern and functional relevance of such effects are not clear (Herting et al. 2018; Koolschijn and Crone 2013; Marwha, Halari, and Eliot 2017; Ruigrok et al. 2014; Tan et al. 2016). One reason could be that most studies have examined mean differences between the sexes, while sex differences in variability remain understudied (Del Giudice et al. 2016; Joel et al. 2015). As mean and variance measure two different aspects of the distribution (center and spread), knowledge on variance effects may provide important insights into sex differences in the brain. Recent studies observed greater male variance for subcortical volumes and for cortical surface area to a larger extent than for cortical thickness (Ritchie et al. 2018; Wierenga et al. 2019). However, further studies are needed to explore regional patterns of variance differences, and, critically, to test how sex differences in variability in the brain unfold across the lifespan.
An important question pertains to the mechanisms involved in sex differences in variability. It is hypothesized that the lack of two parental X-chromosomal copies in human males may directly relate to greater variability and vulnerability to developmental disorders in males compared to females (Arnold 2012). All cells in males express an X-linked variant, while female brain tissues show two variants. In females, one of the X-chromosomes is randomly silenced; as such, neighboring cells may have different X-related genetic expression (Wu et al. 2014). Consequently, one could expect that in addition to greater variability across the population, interregional anatomical correlations may be stronger in male relative to female brains. This was indeed observed for a number of regional brain volumes in children and adolescents, showing greater within-subject homogeneity across regions in males than females. These results remain to be replicated in larger samples as they may provide clues about mechanisms and risk factors in neurodevelopmental disorders (e.g. attention-deficit/hyperactivity disorder and autism spectrum disorder) that show sex differences in prevalence (Bao and Swaab 2010), age of onset, heritability rates (Costello et al. 2003), or severity of symptoms and course (Goldstein, Seidman, and O'Brien 2002).
In the present study, we performed mega-analyses on data from the Enhancing Neuro Imaging Genetics through Meta-Analysis (ENIGMA) Lifespan working group (Frangou et al. 2020; Jahanshad and Thompson 2016). A mega-analysis pools data from multiple sites in a single statistical model that fits all data while simultaneously accounting for the effect of site. Successfully pooling lifespan data was recently shown in a study combining 18 datasets to derive age trends of brain structure (Pomponio et al. 2020). This contrasts with meta-analysis, where data are analyzed at each site and the resulting summary statistics are then combined and weighted (van Erp et al. 2019).
MRI data from a large sample (n = 16,683) of participants aged 1 to 90 years were included. We investigated subcortical volumes and regional cortical surface area and thickness. Our first aim was to replicate previous findings of greater male variability in brain structure in a substantially larger sample. Based on prior studies (Forde et al. 2020; Ritchie et al. 2018; Wierenga et al. 2019) and reports of somewhat greater genetic effect on surface area than thickness (Eyler et al. 2011; Kremen et al. 2013), we hypothesized that greater male variance would be more pronounced for subcortical volumes and cortical surface area than for cortical thickness, and that greater male variance would be observed at both upper and lower ends of the distribution. Our second aim was to test whether observed sex differences in variability of brain structure are stable across the lifespan from birth until 90 years of age, or e.g. increase with the accumulation of experiences (Pfefferbaum, Sullivan, and Carmelli 2004). Third, in line with the single X-chromosome hypothesis, we aimed to replicate whether males show greater interregional anatomical correlations (i.e. within-subject homogeneity) across brain regions that show greater male compared to female variance (Wierenga et al. 2019).

| Participants
The datasets analyzed in the present study were from the Lifespan working group within the ENIGMA Consortium (Jahanshad and Thompson 2016). There were 78 independent samples with MRI data, in total including 16,683 (7,966 males) healthy participants aged 1-90 years from diverse ethnic backgrounds (see detailed descriptions at the cohort level in Table 1). Samples were drawn from the general population or were healthy controls in clinical studies. Screening procedures and the eligibility criteria (e.g. head trauma, neurological history) may be found in Supplemental Table 1. Participants in each cohort gave written informed consent at the local sites. Furthermore, at each site local research ethics committees or Institutional Review Boards gave approval for the data collection, and all local institutional review boards permitted the use of extracted measures of the completely anonymized data that were used in the present study.

| Imaging data acquisition and processing
For definition of all brain measures, whole-brain T1-weighted anatomical scans were included. Detailed information on scanner model and image acquisition parameters for each site can be found in Supplemental Table 1. T1-weighted scans were processed at the cohort level, where subcortical segmentation and cortical parcellation were performed by running the T1-weighted images in FreeSurfer using versions 4.1, 5.1, 5.3 or 6.0 (see Supplemental Table 1 for specifications per site). This software suite is well validated, widely used, documented and freely available online (surfer.nmr.mgh.harvard.edu).
The technical details of the automated reconstruction scheme are described elsewhere (Fischl et al. 1999, 2002). The outcome measures comprised subcortical volumes (Fischl et al. 2002), and cortical surface area and thickness measures (Fischl et al. 1999) of 68 regions of the cerebral cortex (Desikan-Killiany atlas) (Desikan et al. 2006). Quality control was also implemented at the cohort level following detailed protocols (http://enigma.ini.usc.edu/protocols/imaging-protocols). The statistical analyses included 13,696 participants for subcortical volumes, 11,338 for surface area measures, and 12,533 participants for cortical thickness analysis.

| Statistical analysis
Statistical analyses were performed using R Statistical Software. The complete scripts are available in the Appendix. In brief, we first adjusted all brain structure variables for cohort, field strength and FreeSurfer version effects. As age ranges differed for each cohort this was done in two steps: initially, a linear model was used to account for cohort effects and non-linear age effects, using a third-degree polynomial function.
Next, random forest regression modelling (Breiman 2001) was used to additionally account for field strength and FreeSurfer version. See Supplemental Figure 1 for adjusted values. This was implemented in the R package randomForest, which can accommodate models with interactions and non-linear effects.
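The first adjustment step can be illustrated with a minimal sketch. It is written in Python with NumPy purely for illustration (the analyses themselves were run in R), covers only the linear-model step with cohort indicators and a third-degree age polynomial, and leaves out the random forest step. The function name and the simulated data are hypothetical.

```python
import numpy as np

def adjust_for_cohort_and_age(y, cohort, age):
    """Residuals of y after regressing out cohort indicators and a cubic age polynomial."""
    y = np.asarray(y, dtype=float)
    age = np.asarray(age, dtype=float)
    # Centre and scale age so that its powers stay numerically well behaved.
    z = (age - age.mean()) / age.std()
    # One indicator column per cohort; together they play the role of the intercept.
    labels, codes = np.unique(cohort, return_inverse=True)
    dummies = np.eye(len(labels))[codes]
    X = np.column_stack([dummies, z, z**2, z**3])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Simulated example (assumed values): three cohorts with different offsets
# and a curved age trend.
rng = np.random.default_rng(0)
n = 600
cohort = rng.integers(0, 3, size=n)
age = rng.uniform(1, 90, size=n)
y = 10 + 2.0 * cohort + 0.05 * age - 0.0005 * age**2 + rng.normal(0, 1, n)
resid = adjust_for_cohort_and_age(y, cohort, age)
```

After this step the residuals have zero mean within every cohort and are uncorrelated with age, which is the property the later variance analyses rely on.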

| Mean differences
Mean sex differences in brain structure variables were tested using t-tests (FDR corrected, see (Benjamini and Hochberg 1995)) and effect sizes were estimated using Cohen's d-value. A negative effect size indicates that the mean was higher in females, and a positive effect size indicates it was higher in males. The brain structure variables were adjusted for age and covariates described above. Graphs were created with R package ggseg (Mowinckel and Vidal-Pineiro, 2019).
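As an illustration of the two quantities used here, the sketch below computes Cohen's d and Benjamini-Hochberg adjusted p-values in Python with NumPy. It assumes the pooled-standard-deviation form of Cohen's d, which the text does not specify, and the function names are hypothetical.

```python
import numpy as np

def cohens_d(males, females):
    """Cohen's d with a pooled standard deviation; positive when the male mean is higher."""
    males = np.asarray(males, dtype=float)
    females = np.asarray(females, dtype=float)
    n1, n2 = len(males), len(females)
    pooled_var = ((n1 - 1) * males.var(ddof=1) + (n2 - 1) * females.var(ddof=1)) / (n1 + n2 - 2)
    return (males.mean() - females.mean()) / np.sqrt(pooled_var)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted

d_example = cohens_d([2, 4, 6], [1, 3, 5])
q_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

With this sign convention a negative d means the female mean is higher, matching the convention described above.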

| Variance ratio
Variance differences between males and females were examined, after accounting for age and other covariates as described above.
Fisher's variance ratio (VR) was estimated by dividing variance measures for males and females. VR was log transformed to account for VR bias (Katzman and Alliger 1992; Lehre et al. 2009). Letting $y_i$ denote the observed outcome for observation number $i$ and $\hat{y}_i$ its predicted outcome, the residuals were then formed:

$$r_i = y_i - \hat{y}_i$$

The residual variances $\mathrm{Var}_{\mathrm{males}}$ and $\mathrm{Var}_{\mathrm{females}}$ were computed separately for males and females, and used to form the test statistic

$$T = \mathrm{Var}_{\mathrm{males}} / \mathrm{Var}_{\mathrm{females}}$$

For each outcome, a permutation test of the hypothesis that the sex specific standard deviations were equal was performed. This was done by random permutation of the sex variable among the residuals. Using $B$ permutations, the p-value for the $k$-th outcome measure was computed as

$$p_k = \frac{1}{B} \sum_{b=1}^{B} I(T_b \geq T)$$

where $I(T_b \geq T)$ is an indicator function that is 1 when $T_b \geq T$, and 0 otherwise. Thus, the p-value is the proportion of permuted test statistics ($T_b$) that were at least as large as the observed value $T$ of the test statistic above. Here $B$ was set to 10,000. FDR corrected values are reported as significant.
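The permutation procedure can be sketched as follows in Python with NumPy. This is an illustrative re-implementation under stated assumptions, not the R scripts from the Appendix: the function name is hypothetical, the data are simulated, and the demonstration uses fewer permutations than the 10,000 reported.

```python
import numpy as np

def variance_ratio_test(resid, is_male, n_perm=10_000, seed=0):
    """Log variance ratio (males / females) and a one-sided permutation p-value."""
    resid = np.asarray(resid, dtype=float)
    is_male = np.asarray(is_male, dtype=bool)

    def ratio(mask):
        return resid[mask].var(ddof=1) / resid[~mask].var(ddof=1)

    t_obs = ratio(is_male)
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_perm):
        # Shuffle the sex labels among the residuals and recompute the statistic.
        if ratio(rng.permutation(is_male)) >= t_obs:
            exceed += 1
    return np.log(t_obs), exceed / n_perm

# Simulated residuals (assumed values): male SD 1.3, female SD 1.0.
rng = np.random.default_rng(42)
is_male = np.repeat([True, False], 400)
resid = np.where(is_male, rng.normal(0, 1.3, 800), rng.normal(0, 1.0, 800))
log_vr, p_value = variance_ratio_test(resid, is_male, n_perm=2000)
```

A positive log variance ratio indicates greater male variance, and the p-value is the share of label shuffles that produce a ratio at least as large as the observed one.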

| Shift Function
To assess the nature of the variability difference between males and females, shift functions were estimated for each brain measure that showed significant variance differences between males and females using quantile regression forests (Meinshausen 2006; Rousselet, Pernet, and Wilcox 2017), implemented in the R package quantregForest. First, as described above, brain measures were adjusted for site, age, field strength and FreeSurfer version. Next, quantile distribution functions were estimated for males and females separately after aligning the distribution means. Let $q$ be a probability between 0 and 1. The quantile function specifies the value at or below which a brain measure falls with probability $q$. The quantile function for males is given as $Q(q \mid \mathrm{males})$ and for females as $Q(q \mid \mathrm{females})$. The quantile distance function is then defined as:

$$D(q) = Q(q \mid \mathrm{males}) - Q(q \mid \mathrm{females})$$
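The idea of a shift function can be illustrated with plain empirical quantiles. The sketch below, in Python with NumPy, deliberately swaps the quantile regression forests used in the study for `numpy.quantile`, so it shows the concept only. The function name and the simulated data are hypothetical.

```python
import numpy as np

def shift_function(males, females, probs=np.arange(0.1, 1.0, 0.1)):
    """Male minus female quantiles at the given probabilities, after aligning the means."""
    m = np.asarray(males, dtype=float)
    f = np.asarray(females, dtype=float)
    m = m - m.mean()
    f = f - f.mean()
    return probs, np.quantile(m, probs) - np.quantile(f, probs)

# Simulated adjusted brain measures (assumed values): same mean, wider male spread.
rng = np.random.default_rng(7)
males = rng.normal(0.0, 1.5, 5000)
females = rng.normal(0.0, 1.0, 5000)
probs, distance = shift_function(males, females)
```

When one group is more variable at both tails, the quantile distance is negative at low probabilities and positive at high probabilities, which is the pattern the shift function is meant to reveal.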

| Variance change with age
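To study whether the sex differences in variance are stable across the age range, we used the residuals between the observed and predicted outcome measure for each individual $i$:

$$r_i = y_i - \hat{y}_i$$

The absolute value of $r_i$ was then used in a regression model. It was next explored whether there was a significant (FDR corrected) age by sex interaction effect using a linear model (model 1) and a quadratic model (model 2):

$$\text{Model 1:}\quad |r_i| = \beta_0 + \beta_1\,\mathrm{sex}_i + \beta_2\,\mathrm{age}_i + \beta_3\,(\mathrm{age}_i \times \mathrm{sex}_i) + \varepsilon_i$$

$$\text{Model 2:}\quad |r_i| = \beta_0 + \beta_1\,\mathrm{sex}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{age}_i^2 + \beta_4\,(\mathrm{age}_i \times \mathrm{sex}_i) + \beta_5\,(\mathrm{age}_i^2 \times \mathrm{sex}_i) + \varepsilon_i$$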
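A regression of absolute residuals on age, sex and their interaction can be sketched with ordinary least squares. The Python/NumPy example below is an assumption-laden illustration: the coding of sex as 0/1, the standardization of age, the function name and the simulated data are all choices made here, not details taken from the study.

```python
import numpy as np

def fit_abs_residual_model(resid, age, is_male, quadratic=False):
    """OLS of |residual| on sex, age and age-by-sex; returns (coefficient, t-value) per term."""
    r = np.abs(np.asarray(resid, dtype=float))
    sex = np.asarray(is_male, dtype=float)  # 1 = male, 0 = female (assumed coding)
    age = np.asarray(age, dtype=float)
    a = (age - age.mean()) / age.std()
    names = ["intercept", "sex", "age", "age:sex"]
    cols = [np.ones_like(a), sex, a, a * sex]
    if quadratic:
        names += ["age^2", "age^2:sex"]
        cols += [a**2, a**2 * sex]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, r, rcond=None)
    dof = len(r) - X.shape[1]
    sigma2 = np.sum((r - X @ beta) ** 2) / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return {name: (b, b / s) for name, b, s in zip(names, beta, se)}

# Simulated residuals (assumed values): male spread shrinks with age, female spread is constant.
rng = np.random.default_rng(1)
n = 4000
age = rng.uniform(1, 90, n)
is_male = rng.random(n) < 0.5
sd = np.where(is_male, 1.6 - 0.6 * age / 90, 1.0)
resid = rng.normal(0.0, sd)
fit = fit_abs_residual_model(resid, age, is_male)
coef, t_value = fit["age:sex"]
```

A negative age-by-sex coefficient in this setup corresponds to a male excess in variability that attenuates with increasing age.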

| Anatomical correlation analysis
Inter-regional anatomical associations were assessed by defining the correlation between two brain structures, after accounting for age and other covariates as described above. Anatomical correlation matrices were estimated as previously applied in several structural MRI studies for males and females separately (see e.g. Baaré et al. 2001;Lerch et al. 2006). Next, the anatomical correlation matrix for females was subtracted from the anatomical correlation matrix for males, yielding a difference matrix.
Thus, the Pearson correlation coefficient between any two regions $i$ and $j$ was assessed for males and females separately. This produced two group correlation matrices $M_{ij}$ and $F_{ij}$, where $i, j = 1, 2, \ldots, N$, and $N$ is the number of brain regions.
Sex specific means and standard deviations were removed by performing sex specific standardization. The significance of the differences between $M_{ij}$ and $F_{ij}$ was assessed by the difference in their Fisher's z-transformed values, and p-values were computed using permutations. Whether the proportions of significantly stronger correlations differed between the sexes was tested using a chi-square test.
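The comparison of one pair of regions can be sketched as follows in Python with NumPy. The function names and simulated data are hypothetical, and the demonstration uses fewer permutations than the study. The last helper computes a chi-square statistic against an equal split of counts, which is an assumption about the form of the test. It reproduces the statistics reported in the Results (10.889 for N = 18 and 460.300 for N = 484) if the underlying counts were 2 versus 16 and 478 versus 6, but those counts are inferred here and are not stated in the text.

```python
import numpy as np

def standardize_within_sex(v, is_male):
    """Remove sex specific means and standard deviations."""
    v = np.asarray(v, dtype=float).copy()
    for mask in (is_male, ~is_male):
        v[mask] = (v[mask] - v[mask].mean()) / v[mask].std()
    return v

def fisher_z_difference(x, y, is_male):
    """Difference (males minus females) of Fisher z-transformed Pearson correlations."""
    r_m = np.corrcoef(x[is_male], y[is_male])[0, 1]
    r_f = np.corrcoef(x[~is_male], y[~is_male])[0, 1]
    return np.arctanh(r_m) - np.arctanh(r_f)

def permutation_p_value(x, y, is_male, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the sex difference in correlation."""
    rng = np.random.default_rng(seed)
    observed = fisher_z_difference(x, y, is_male)
    permuted = np.array([fisher_z_difference(x, y, rng.permutation(is_male))
                         for _ in range(n_perm)])
    return observed, np.mean(np.abs(permuted) >= abs(observed))

def chi_square_equal_split(n_male_stronger, n_female_stronger):
    """Chi-square statistic (1 df) against an equal split of the two counts."""
    expected = (n_male_stronger + n_female_stronger) / 2
    return ((n_male_stronger - expected) ** 2 + (n_female_stronger - expected) ** 2) / expected

# Simulated pair of regions (assumed values): correlation 0.8 in males, 0.3 in females.
rng = np.random.default_rng(3)
is_male = np.repeat([True, False], 500)
rho = np.where(is_male, 0.8, 0.3)
x = rng.normal(size=1000)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=1000)
x, y = standardize_within_sex(x, is_male), standardize_within_sex(y, is_male)
observed, p_value = permutation_p_value(x, y, is_male, n_perm=1000)
```

Repeating the pairwise test over all region pairs gives the difference matrix, and the chi-square helper then summarizes whether significant differences fall predominantly on one side.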

| Sex differences in mean and variance
All brain measures were adjusted for cohort, field strength, FreeSurfer version and (non-linear) age. As a background analysis, we first assessed whether brain structural measures showed mean differences between males and females to align our findings to previous reports (Figure 1, Table 2). All subcortical volumes were significantly larger in males, with effect sizes (Cohen's d-values) ranging from 0.41 (left accumbens) to 0.92 (right thalamus), and an average effect size of 0.7.
In follow-up analyses with total brain volume as an additional covariate we found a similar pattern, although effect sizes were smaller (Supplemental Table S2A). Also for cortical surface area, all regions showed significantly larger values in males than females, with effect sizes ranging from 0.42 (left caudal anterior cingulate area) to 0.97 (left superior temporal area), on average 0.71. When total surface area was included as an additional covariate, a similar pattern was observed, although effect sizes were smaller (Supplemental Table S2B). Cortical thickness showed significant mean sex differences in 43 (out of 68) regions, of which 38 regions showed larger thickness values in females than males. These were mostly frontal and parietal regions.
The largest effect size, however, was only 0.12 (right caudal anterior cingulate cortex). When total average cortical thickness was included as an additional covariate, nine regions showed a male advantage that was not observed in the raw data analysis, and six of the 38 regions showing female advantage did not reach significance (Supplemental Table S2C).
We then tested for sex differences in variance of brain structure, adjusted for cohort, field strength, FreeSurfer version and (non-linear) age (Figure 2, Table 2). All subcortical volumes had significantly greater variance in males than females. Log transformed variance ratios ranged from 0.12 (right accumbens) to 0.36 (right pallidum), indicating greater variance in males than females. Similar results were also observed when total brain volume was taken into account (Supplemental Table S2A). Cortical surface area also showed significantly greater variance in males for all regions: variance ratios ranged from 0.13 (left caudal anterior cingulate cortex) to 0.36 (right parahippocampal cortex). This pattern was also observed when total surface area was included in the model (Supplemental Table S2B).
Cortical thickness showed significantly greater male variance in 41 out of 68 regions, with the greatest variance ratio being 0.11 (left precentral cortex). Notably, 37 of these 41 regions did not show significantly larger mean thickness values in males. When additionally accounting for total average thickness, we found greater male variance in 39 regions and greater female variance in 5 regions. Also here, significant variance ratios were present in the absence of mean sex differences (Supplemental Table S2C).
Next, we directly tested whether the regions showing larger variance effects were also those showing larger mean differences, by correlating the variance ratios with the vector of d-values (Supplemental Figure 2). There was a significant association for subcortical volumes (r (12) = 0.7, p-value = .005), but no significant relation for regional cortical surface area (r (66) = 0.18, p-value = .14), or thickness (r (66) = -0.21, p-value = .09).

| Greater variance in males at upper and lower tails
In order to characterise how the distributions of males and females differ, quantiles were compared using a shift function (Rousselet et al. 2017). As in the previous models, brain measures were adjusted for cohort, field strength, FreeSurfer version and age. In addition, the distribution means were aligned. Results showed greater male variance at both upper and lower tails for regions that showed significant variance differences between males and females. The top three variance ratio effects for subcortical volume, cortical surface area and cortical thickness are shown in Figure 3.

| Variance differences between sexes across age
We next tested whether the sex differences in variance interacted with age (Figure 4 and Supplemental Figure 3). In this set of analyses, brain measures were adjusted for cohort, field strength, and FreeSurfer version. For 50% of the subcortical volume measures there was a significant interaction, specifically for the bilateral thalami, bilateral putamen, bilateral pallidum and the left hippocampus (Table 3, Figure 5). Cortical surface area showed significant interaction effects in 30% of the cortical regions (Table 3, Figure 5). In both cases, younger individuals tended to show greater sex differences in variance than older individuals. For cortical thickness, an interaction with age was detected only in the left insula (Table 3, Figure 5). This region showed greater male than female variance in the younger age group, whereas greater female variance was observed in older individuals.
Next, these analyses were repeated using a quadratic age model (Supplemental Tables 3A-C). None of the subcortical or cortical surface area measures showed quadratic age by sex interaction effects in variance. Cortical thickness showed significant quadratic age by sex effects in two regions: left superior frontal cortex and right lateral orbitofrontal cortex.

| Sex differences in anatomical correlations
Finally, we tested whether females showed greater diversity than males in anatomical correlations by comparing inter-regional anatomical associations between males and females. Using permutation testing (B = 10000), the significance of correlation differences between males and females was assessed.
Of the 91 subcortical-subcortical correlation coefficients, 2% showed significantly stronger correlations in males, while, unexpectedly, 19% showed stronger correlations in females (tested two-sided) (Figure 6A). A chi-square test of independence showed that this significantly differed between males and females, χ²(1, N = 18) = 10.889, p < .001. For surface area, no significant differences between males and females were observed: significantly stronger male homogeneity was observed in 4% of the 2,278 unique anatomical correlations, and similarly females also showed significantly stronger correlations in 4% of the anatomical associations (Figure 6B). For thickness, stronger male than female homogeneity was observed in 21% of the correlations, while stronger female correlations were observed in <1% of the correlations (Figure 6C). This difference was significant, χ²(1, N = 484) = 460.300, p < .001.

| DISCUSSION
[…] et al. 2018, 2019). Variance differences were more pronounced for subcortical volumes and regional cortical surface area than for regional cortical thickness. We also corroborated prior findings of greater male brain structural variance at both upper and lower tails of brain measures. These variance effects seem to describe a unique aspect of sex differences in the brain that does not follow the regional pattern of mean sex differences. A novel finding was that sex differences in variance appear stable across the lifespan for around 50% of subcortical volumes, 70% of cortical surface area measures and almost all cortical thickness measures. Unexpectedly, regions with significant change in variance effects across the age range showed decreasing variance differences between the sexes with increasing age. Finally, we observed greater male inter-regional homogeneity for cortical thickness, but not for surface area or subcortical volumes, partly replicating prior results of greater within-subject homogeneity in the male brain. Unexpectedly, subcortical regions showed stronger interregional correlation in females than in males.

FIGURE 2 Sex differences in variance ratio for subcortical volumes (left), cortical surface area (center), and cortical thickness (right). Shown are log transformed variance ratios, where significantly larger variance ratio for males than females is displayed in blue ranging from 0 to 1. Darker colors indicate a larger variance ratio.
Greater male variance was most pronounced in brain regions […]
Although most results showed stable sex differences with increasing age, half of the subcortical regions and 30% of the cortical surface area measures showed decreasing sex differences in variance. What stands out is that in all these regions, sex differences in variance were largest in young compared to older age. This is indicative of early mechanisms being involved. Furthermore, for subcortical regions, the patterns showed larger volumetric increases in females than in males. For surface area, interaction effects showed mostly stable variance across age in females, but decreases in variability in males. The absence of significant quadratic interactions makes it unlikely that pubertal hormones drive the greater male variance. Yet, the decrease in male variance in older age may be indicative of environmental effects later in life. An alternative explanation may be the higher clinical or even death rates in males, which may lead to some sex difference in survival (Chen et al. 2008; Ryan et al. 1997).
Factors underlying or influencing sex differences in the brain may include sex chromosomes, sex steroids (both perinatal and pubertal), and the neural embedding of social influences during the life span (Dawson, Ashman, and Carver 2000). Although we could not directly test these mechanisms, our findings of greater male variance, which are mostly stable across age, together with the greater male inter-regional homogeneity for cortical thickness, are most in line with the single X-chromosome expression in males compared to the mosaic pattern of X-inactivation in females (Arnold 2012). Whereas female brain tissue shows two variants of X-linked genes, males only show one. This mechanism may lead to increased male vulnerability, as is also seen for a number of rare X-linked genetic mutations (Chen et al. 2008; Craig, Haworth, and Plomin 2009; Johnson, Carothers, and Deary 2009; Reinhold and Engqvist 2013; Ryan et al. 1997). None of the other sex effects mentioned above predict these specific inter- and intra-individual sex differences in brain patterns. Future studies are, however, needed to directly test these different mechanisms. Furthermore, the observation that greater male homogeneity was only observed in cortical thickness, but not cortical surface area or subcortical volumes, may speculatively indicate that X-chromosome related genetic mechanisms have the largest effect on cortical thickness measures.
This paper has several strengths including its sample size, the age range spanning nine decades, the inclusion of different structural measures (subcortical volumes and cortical surface area and thickness) and the investigation of variance effects. These points are important, as most observed mean sex differences in the brain are modest in size (Joel and Fausto-Sterling 2016). We were able to analyze data from a far larger sample than those included in recent meta-analyses of mean sex differences (Marwha et al. 2017;Ruigrok et al. 2014;Tan et al. 2016), and a very wide age range covering childhood, adolescence, adulthood and senescence. The results of this study may have important implications for studies on mean sex differences in brain structure, as analyses in such studies typically assume that group variances are equal, which the present study shows might not be tenable.
This can be particularly problematic for studies with small sample sizes (Rousselet et al. 2017).
The current study has some limitations. First, the multi-site sample was heterogeneous and specific samples were recruited in different ways, not always representative of the entire population.
Furthermore, although structural measures may be quite stable across different scanners, the large number of sites may increase the variance in observed MRI measures, but this would be unlikely to be systematically biased with respect to age or sex. In addition, variance effects may change in non-linear ways across the age-range. This may […] Future studies including longitudinal data are warranted to further explore the lifespan dynamics of sex differences in variability in the brain. Last, one caveat may be the effect of movement on data quality and morphometric measures. As males have been shown to move more than females in the scanner (Pardoe, Kucharsky Hiess, and Kuzniecky 2016), this may have resulted in slight underestimations of brain volume and thickness measures for males (Reuter et al. 2015). Although quality control was conducted at each site using the standardized ENIGMA cortical and subcortical quality control protocols (http://enigma.ini.usc.edu/protocols/imaging-protocols/), which involve a combination of statistical outlier detection and visual quality checks, and a similar number of males and females had partially missing data (52.4% males), we cannot exclude the possibility that in-scanner subject movement may have affected the results. Nevertheless, we do not think this can explain our finding of greater male variance in brain morphometry measures, as this was seen at both the upper and lower ends of the distributions.

FIGURE 4 Regions where sex differences in variability of brain structure interacted with age displayed for subcortical volumes (left), cortical surface area (center), and cortical thickness (right).

| CONCLUSIONS
The present study included a large lifespan sample and robustly confirmed previous findings of greater male variance in brain structure in humans. We found greater male variance in all brain measures, including subcortical volumes and regional cortical surface area and thickness, at both the upper and the lower end of the distributions. The results have important implications for the interpretation of studies on (mean) sex differences in brain structure. Furthermore, the results of decreasing sex differences in variance across age open a new direction for research focusing on lifespan changes in variability within sexes. Our findings of sex differences in regional brain structure being present already in childhood may suggest early genetic or gene-environment interaction mechanisms. Further insights into the ontogeny and causes of variability differences in the brain may provide clues for understanding male biased neurodevelopmental disorders.

FIGURE 5 Sex differences in variability interacted with age in 50% of the subcortical volumes, 30% of the surface area measures, and only one thickness measure. Three representative results are shown: right thalamus volume (top left), surface area of the right parahippocampal gyrus (top right) and thickness of the left insula (bottom center). Absolute residual values are modeled across the age range. Effects showed larger male than female variance in the younger age group; this effect attenuated with increasing age.

FIGURE 6 (a-c) Stronger anatomical correlations for males than females are indicated in blue (larger homogeneity in males than females), while stronger correlations for females are displayed in red (larger homogeneity in females than males). The bottom left half shows the significant variance ratios only, using two-sided permutation testing. Results are displayed for subcortical volumes (a), surface area (b), and cortical thickness (c).