MicroRNA binding site variation is enriched in psychiatric disorders

Abstract Psychiatric disorders have a polygenic architecture, often associated with dozens or hundreds of independent genomic loci. Most associated loci impact noncoding regions of the genome, suggesting that the majority of disease heritability originates from the disruption of regulatory sequences. While most research has focused on variants that modify regulatory DNA elements, those affecting cis‐acting RNA sequences, such as miRNA binding sites, are also likely to have a significant impact. We intersected genome‐wide association study (GWAS) summary statistics with the dbMTS database of predictions for miRNA binding site variants (MBSVs). We compared the distributions of MBSV association statistics to non‐MBSVs within brain‐expressed 3′UTR regions. We aggregated GWAS p values at the gene, pathway, and miRNA family levels to investigate cellular functions and miRNA families strongly associated with each trait. We performed these analyses in several psychiatric disorders as well as nonpsychiatric traits for comparison. We observed significant enrichment of MBSVs in schizophrenia, depression, bipolar disorder, and anorexia nervosa, particularly in genes targeted by several miRNA families, including miR‐335‐5p, miR‐21‐5p/590‐5p, miR‐361‐5p, and miR‐557, and a nominally significant association between miR‐323b‐3p MBSVs and schizophrenia risk. We identified evidence for the association between MBSVs in synaptic gene sets in schizophrenia and bipolar disorder. We also observed a significant association of MBSVs in other complex traits including type 2 diabetes. These observations support the role of miRNA in the pathophysiology of psychiatric disorders and suggest that MBSVs are an important class of regulatory variants that have functional implications for many disorders, as well as other complex human traits.


Analysis of non-psychiatric traits
For comparison and to determine whether these phenomena were characteristic of psychiatric disorders in particular, or of complex human traits in general, we performed the same analyses on several non-psychiatric traits.We identified similar trends within GWAS of height, CAD, BMI, and T2D, including enrichment of strongly-associated MBSVs, smaller effect sizes of MBSVs compared to non-MBSV 3'UTR variants, as well as aggregation of association within specific miRNA families and relevant gene sets, such as blood eQTL MBSVs in BMI being associated with the gene ontology set "lipid localisation".For further details, see the supplementary text.

Differential distribution of CADD score ranks
CADD scores were found to be significantly differentially distributed for all classes of MBSVs in each version of the CADD database, except for pancreas eQTL MBSVs in type 2 diabetes (Supplementary Tables 19a-b).Interestingly, a significant effect on the median CADD score rank was only observed in v1.6 of the CADD database; except for the all blood-expressed MBSV class, which was non-significant, all other MBSV classes, both blood and pancreas-expressed, were associated with smaller CADD score ranks, mirroring the effect observed in psychiatric disorders (Supplementary Tables 22a-b and 23a-b).

Enrichment of MBSVs in each non-psychiatric trait GWAS
We found significantly altered p-value distributions for all MBSV classes in all non-psychiatric GWAS except for height (Supplementary Tables 20a-b).Quantile regression revealed that for each non-psychiatric trait, at least one class of MBSVs were associated with elevated negative log10 p-values at multiple tested quantiles (Supplementary Tables 24a-b).Absolute log-effect sizes were also significantly differentially distributed for most comparisons, except for: all MBSVs in BMI; GTEx eQTL MBSVs in height; and GTEx blood-specific eQTL MBSVs in BMI and height (Supplementary Tables 21a-b).Quantile regression showed that for BMI, CAD, and T2D (BMI adjusted and unadjusted, both relative to blood and pancreas gene and miRNA expression), multiple MBSV classes were associated with smaller absolute log-effect sizes at multiple quantiles; for height, only GTEx blood eQTL MBSVs were associated with reduced effect sizes at the 99.9 th percentile (Supplementary Tables 25a-b).
Similar to the psychiatric analyses, no significant enrichments of genome-wide significant variants was observed in any non-psychiatric trait.However, at the liberal GWAS p-value threshold of 1x10 -5 , T2D (BMI adjusted and unadjusted, both relative to blood and pancreas gene and miRNA expression), all MBSVs and GTEx eQTL MBSVs were significantly enriched; unadjusted T2D, relative to both blood and pancreas expression also showed an enrichment of suggestivesignificant GTEx blood and pancreas eQTL MBSVs, respectively.Height also showed a significant enrichment of MBSVs at a p-value threshold of 1x10 -5 (Supplementary Tables 26a-b, Supplementary Figures 30-31).
When aggregating p-values of MBSVs for each trait, we found a significant association between all MBSVs and all traits except CAD; similarly, we found a significant association between all eQTL MBSVs and all traits (Supplementary Tables 27a-b).We further found that, relative to blood gene and miRNA expression, all non-psychiatric traits showed an association with GTEx eQTL MBSVs, and BMI (unadj.),CAD, and height also showed an association with GTEx blood eQTL MBSVs.Relative to pancreas expression, T2D (BMI adj.) showed an association with GTEx pancreas eQTL MBSVs (Supplementary Table 27b).

MBSV p-value aggregation within miRNA families
Aggregation of all MBSV p-values into individual miRNA families revealed that there were 114 unique miRNA families associated with one or more of the non-psychiatric traits with respect to blood gene and miRNA expression (Supplementary Table 28a), and 30 associated with T2D (BMI adj.and unadj.)with respect to pancreas expression (Supplementary Table 28b).Height displayed the strongest associations, with miR-144-5p, miR-576-5p, miR-103-3p/107 and miR-423-5p being the most significantly associated miRNA families.Similarly, GTEx eQTL MBSVs affecting 99 unique miRNA families in blood-expressed genes were significantly associated with one or more non-psychiatric traits, and GTEx eQTL MBSVs affecting 34 unique miRNA families in pancreas-expressed genes were significantly associated with T2D.Finally, 48 families affected by blood-specific eQTL MBSVs were associated with non-psychiatric traits, and 10 families affected by pancreas-specific eQTL MBSVs were associated with T2D, respectively.Again, height demonstrated the most significant associations.

Associations between difference scores and effect sizes
We identified a single miRNA family, miR-1343-3p, that reached significance in a linear regression analysis between the difference score and the effect size of the alternate allele for height (p = 1.71x10 -4 , FDR = 0.0386, Bonf.p = 0.0386) (Supplementary Tables 31a and 32a).In this case, smaller difference scores, i.e. alternate alleles predicted to increase binding affinity were associated with increased height.In T2D (BMI adj., relative to pancreas expression) one miRNA family, miR-199-5p, was nominally significant, but did not pass multiple test correction (p = 0.0089, FDR = 0.55, Bonf.p = 0.58) (Supplementary Tables 31b and 32b).

Gene set association analyses
For non-psychiatric traits, we found that each trait had multiple significantly associated gene sets.
For T2D (both BMI adj.and unadj., blood MBSVs) "Semi lunar valve development" and "cardiac septum morphogenesis" were significantly associated (FDR < 0.05) with eQTL MBSVs (Supplementary Table 29a); further for both, "regulation of osteoclast development" was associated with all MBSVs.For T2D (both MBI adj.and unadj.)MBSVs relative to pancreasexpressed miRNA and gene expression, a similar theme of heart-related gene sets was observed, including "Semi lunar valve development" for eQTL MBSVs, and "cardiac chamber morphogenesis" and "heat morphogenesis" for pancreas eQTL MBSVs (Supplementary Table 29b).Blood eQTL MBSVs in BMI were associated with "lipid localisation", while all MBSVs were associated with "methylation dependent chromatin silencing", "negative regulation of protein impot into nucleus", "negative regulation of protein localisation to nucleus", "negative regulation of nucleocytoplasmic transport", and "acid phosphatase activity".In CAD, all MBSVs were associated with "cell morphogenesis involved in differentiation", and importantly, 126 genes in this set were affected by MBSVs.In height, all MBSVs were associated with several gene sets, including "response to vitamin A" and "glucose metabolic process".
At the gene level, we observed 515 unique MBSV-affected genes associated with one or more non-psychiatric traits relative to blood expression, and 31 unique MBSV-affected genes associated with T2D (BMI adj.and unadj.)relative to pancreas expression (Supplementary Table 30a).Height again showed the most significant associations, with VAPA, TNFSF14, and PLVAP being among the most significant genes (p = 1x10 -50 , Bonf.p = 2.12x10 -46 for all three genes).
Among the genes associated with T2D (BMI adj.and unadj., pancreas gene expression), NOTCH2 was the most significantly associated gene with both eQTL and pancreas eQTL MBSVs (Supplementary Table 30b).