Evaluating brain structure traits as endophenotypes using polygenicity and discoverability

Abstract
Human brain structure traits have been hypothesized to be broad endophenotypes for neuropsychiatric disorders, implying that brain structure traits are comparatively “closer to the underlying biology.” Genome-wide association studies with large sample sizes allow the common variant genetic architectures of traits to be compared, testing the evidence for this claim. Endophenotypes, compared to neuropsychiatric disorders, are hypothesized to be less polygenic, with a greater effect size for each susceptibility SNP (sSNP), so that smaller sample sizes are required to discover them. Here, we compare polygenicity and discoverability of brain structure traits, neuropsychiatric disorders, and other traits (91 in total) to directly test this hypothesis. We found reduced polygenicity (FDR = 0.01) and increased discoverability (FDR = 3.68 × 10⁻⁹) of cortical brain structure traits, as compared to aggregated estimates of multiple neuropsychiatric disorders. We predict that ~8 M individuals will be required to explain the full heritability of cortical surface area by genome-wide significant SNPs, whereas sample sizes over 20 M will be required to explain the full heritability of depression. In conclusion, our findings are consistent with brain structure satisfying the higher power criterion of endophenotypes.

(a) Increased effect size in cortical surface area compared to cortical thickness. In the forest plots, the 50th percentile of ranked sSNP absolute effect size is shown with 95% CIs as error bars. * indicates phenotypes where the lower or upper limit of the proportion of sSNPs in cluster 1 fell out of range (>1 or <0) and was limited to 1 or 0. $ indicates phenotypes where the lower limit of the 95% CI of σ₁² or σ₂² was negative and was limited to 0. The CIs for those phenotypes should therefore be interpreted with caution.

Pearson's correlation coefficient (corr) showed (a) a significant negative correlation between polygenicity (πc) and discoverability (absolute value of effect size). (b) The covariance matrix from the GENESIS output indicated a negative correlation between the estimates of πc and σ², which is likely producing the negative correlation in (a). We also observed (c) a significant positive correlation between heritability (h²) and discoverability, but (d) no significant correlation between heritability and polygenicity. Only phenotypes best fit by M2 are shown in (a), to simplify assessment of the correlation between the estimated parameters in the model (b). Error bars indicate 95% CIs of the estimates.
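The 50th percentile of absolute sSNP effect size shown in the forest plots can be illustrated with a minimal sketch, assuming sSNP effect sizes follow a zero-mean normal mixture (one component for traits labeled M2, two components for M3). The function name, the parameter values, and the bisection solver are illustrative assumptions; this is not the GENESIS implementation.

```python
from statistics import NormalDist


def median_abs_effect(p1, sigma1, sigma2, tol=1e-12):
    """Median of |beta| for beta ~ p1*N(0, sigma1^2) + (1 - p1)*N(0, sigma2^2).

    For a single-component (M2-like) model, pass p1=1.0 and sigma2=sigma1.
    """
    std = NormalDist()

    def cdf_abs(x):
        # P(|beta| <= x) under the mixture, for x >= 0
        return (p1 * (2 * std.cdf(x / sigma1) - 1)
                + (1 - p1) * (2 * std.cdf(x / sigma2) - 1))

    # cdf_abs is increasing in x, so bisection finds where it crosses 0.5.
    lo, hi = 0.0, 10 * max(sigma1, sigma2)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf_abs(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical parameter values, chosen only for illustration.
single = median_abs_effect(1.0, 0.01, 0.01)     # M2-like
mixture = median_abs_effect(0.1, 0.02, 0.005)   # M3-like
```

Under the single-component assumption the median reduces to roughly 0.674 times the standard deviation, so a larger σ² translates directly into a larger median effect size.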

Supplementary Figure 8: Estimates of heritability across multiple complex brain-relevant traits
The SNP-based heritability estimated by GENESIS across traits shows (a) that subcortical volume traits generally have the highest heritability, followed by cortical surface area and then cortical thickness, and (b) increased heritability of global cortical traits compared to neuropsychiatric disorders, addiction traits, cognition, and anthropometric measurements. (c) Significance of the differences between categories after FDR correction, calculated via a heterogeneity test. The horizontal line indicates -log10(FDR = 0.05).
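The category comparison described in panel (c) can be sketched as follows, assuming a two-group heterogeneity test (Cochran's Q with one degree of freedom, which is equivalent to a z-test on the difference of two estimates) followed by Benjamini-Hochberg FDR correction. The caption does not state which heterogeneity test was used, so this choice is an assumption, and the estimates and standard errors below are made-up placeholders.

```python
import math


def heterogeneity_p(est1, se1, est2, se2):
    """Two-group Cochran's Q (1 df): Q = z^2 and p = P(chi2_1 > Q)."""
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (estimate, SE) pairs for two trait categories per comparison.
comparisons = {
    "cortical vs disorders": (0.30, 0.02, 0.12, 0.03),
    "cortical vs cognition": (0.30, 0.02, 0.25, 0.04),
}
pvals = [heterogeneity_p(*args) for args in comparisons.values()]
fdr = benjamini_hochberg(pvals)
# Values above -log10(0.05) would sit above the horizontal line in the plot.
neg_log10_fdr = [-math.log10(q) for q in fdr]
```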
Supplementary Figure 9: Correlation between measurement error of MRI segmentations and discoverability/polygenicity
Correlation between measurement error of MRI phenotypes and discoverability/polygenicity. Test-retest correlation (TRC; i.e., the similarity between MRI segmentations from two scans of the same individual) for subjects that passed visual inspection was obtained from [Iscan et al., 2015]. The blue line indicates a regression line with 95% confidence intervals. Three regions (temporal pole, frontal pole, and entorhinal cortex) with low TRC (<0.7) drove these significant correlations. When these three regions were removed, there was no detectable relationship between genetic architecture and measurement error (r = 0.24, p = 0.059 for discoverability; r = -0.21, p = 0.100 for polygenicity).
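The sensitivity check described here (recomputing the correlation after dropping regions with TRC below 0.7) can be sketched as below. The data values are hypothetical placeholders rather than the study's measurements, and the helper names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_with_and_without_low_trc(trc, value, threshold=0.7):
    """Correlation over all regions, then over regions with TRC >= threshold."""
    kept = [(t, v) for t, v in zip(trc, value) if t >= threshold]
    r_all = pearson_r(trc, value)
    r_kept = pearson_r([t for t, _ in kept], [v for _, v in kept])
    return r_all, r_kept


# Hypothetical per-region TRC and discoverability values (illustration only).
trc = [0.55, 0.60, 0.65, 0.85, 0.88, 0.90, 0.92, 0.95]
disc = [0.8, 0.9, 1.0, 1.6, 1.4, 1.7, 1.5, 1.6]
r_all, r_kept = r_with_and_without_low_trc(trc, disc)
```

Comparing `r_all` with `r_kept` shows how much of an apparent association is carried by the few low-reliability regions.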

Supplementary Figure 10: Correlation between population stratification and discoverability/polygenicity
No association between population stratification and polygenicity. LDSC intercept, a measure of population stratification, is plotted against the number of sSNPs, a measure of polygenicity, either including height (left), for which population stratification has previously been shown to have a strong effect on summary statistics [Sohail et al., 2019], or excluding height (right). Pearson's correlation coefficients suggest no correlation between LDSC intercept and the estimated number of sSNPs in either case. While the LDSC intercept for height suggests strong population stratification, the brain-relevant traits tested in this study do not show strong evidence of population stratification (LDSC intercepts close to 1).

Supplementary Figure 11: Replication in summary statistics from UK Biobank
Comparisons of polygenicity and discoverability from GWAS summary statistics without meta-analysis in the UK Biobank (UKBB) cohort. Similar to the findings in Figures 2 & 3, the predicted number of sSNPs shows (a) decreased polygenicity for global cortical traits compared to depression, addiction-relevant traits, cognition, and anthropometric measurements. (b) Significance of the differences between categories after FDR correction, calculated via a heterogeneity test. The effect size distributions across UKBB traits suggest increased effect sizes in cortical structure compared to depression, addiction-relevant traits, cognition, and anthropometric measurements (c-e). Joint effect sizes are an approximation of Pearson's correlation coefficient between sSNPs and phenotype. M2/M3 indicates the best-fit model for each trait. Predicted percentages (%) of genetic variance explained by genome-wide significant SNPs (y-axis) are shown for the given sample sizes (50K to 1.5M). At the right, the predicted % of genetic variance explained at a sample size of 1.5M is labeled. For regional cortical structures, only the regions with the best or worst % of genetic variance explained are labeled.
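The idea behind projecting the share of genetic variance explained by genome-wide significant SNPs at a given sample size can be sketched as follows. This is a simplified illustration under stated assumptions: effect sizes of sSNPs follow a single zero-mean normal distribution (M2-like), genotypes and phenotype are standardized so that the association z-statistic is approximately N(sqrt(N)·beta, 1), linkage disequilibrium is ignored, and the value of sigma is hypothetical. It is not the projection procedure implemented in GENESIS.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def power_gws(beta, n, alpha=5e-8):
    """Approximate power to reach genome-wide significance for effect beta."""
    c = STD.inv_cdf(1 - alpha / 2)   # two-sided z threshold, about 5.45
    ncp = math.sqrt(n) * beta        # expected z-statistic
    return STD.cdf(ncp - c) + STD.cdf(-ncp - c)


def frac_variance_explained(n, sigma, alpha=5e-8, steps=4000):
    """Expected share of sSNP variance carried by significant SNPs.

    Computes E[beta^2 * power(beta)] / E[beta^2] for beta ~ N(0, sigma^2)
    with the trapezoidal rule on [-8*sigma, 8*sigma].
    """
    dist = NormalDist(0.0, sigma)
    lo, hi = -8 * sigma, 8 * sigma
    h = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps + 1):
        b = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        f = w * b * b * dist.pdf(b)
        num += f * power_gws(b, n, alpha)
        den += f
    return num / den


# Hypothetical per-sSNP effect-size SD; sample sizes match the plotted range.
sigma = 0.003
curve = {n: frac_variance_explained(n, sigma)
         for n in (50_000, 500_000, 1_500_000)}
```

Under these assumptions the curve rises toward 1 as the sample size grows, and it rises sooner for traits with larger per-SNP effect sizes, which is the sense in which higher discoverability implies smaller required sample sizes.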
Supplementary   proportion of sSNPs (and standard error) in cluster 1 (larger effect sizes)
Heritability in cluster 1: heritability estimates (and standard error) explained by sSNPs in cluster 1
Heritability in cluster 2: heritability estimates (and standard error) explained by sSNPs in cluster 2
Total Heritability: heritability estimates (and standard error) explained by all sSNPs