Rare and common variants associated with alcohol consumption identify a conserved molecular network

Background: Genome-wide association studies (GWAS) have identified hundreds of common variants associated with alcohol consumption. In contrast, genetic studies of alcohol consumption that use rare variants are still in their early stages. No prior studies of alcohol


INTRODUCTION
Alcohol use disorder (AUD) is a highly heritable disease (Verhulst et al., 2015) with a significant public health burden (MacKillop et al., 2022). AUD can be viewed as the endpoint of a series of transitions, which begins with the initiation of use, continues with regular alcohol consumption and escalation to hazardous drinking, and culminates in compulsive harmful use that persists despite negative consequences (Sanchez-Roige et al., 2020). As such, alcohol consumption is frequently studied as a proxy for AUD, as it is a component of AUD and is a quantitative trait that is widely measured, providing large sample sizes for genetic studies. In particular, genome-wide association studies (GWAS) have identified numerous common variants that contribute to AUD and related traits (Clarke et al., 2017; Kranzler et al., 2019; Liu et al., 2019; Mallard et al., 2022; Mallard & Sanchez-Roige, 2021; Sanchez-Roige et al., 2019; Saunders et al., 2022; Walters et al., 2018; Zhou et al., 2023), but the best-powered GWAS of alcohol use have been conducted for alcohol consumption (Clarke et al., 2017; Liu et al., 2019; Saunders et al., 2022). In this study, we focus on alcohol consumption both for its relationship with AUD and for the availability of well-powered studies.
Recently, genetic studies of other psychiatric disorders have extended their reach to rare variants. One advantage of studies of rare exonic variants is that they unambiguously identify the causal gene (Sazonovs & Barrett, 2018). Such studies have already shown that rare exonic variants influence risk for multiple psychiatric disorders, including intellectual disability, autism spectrum disorder, and schizophrenia (Antaki et al., 2022; Charney et al., 2019; Fu et al., 2023; Ganna et al., 2018; Singh et al., 2022; Weiner et al., 2023).
Because they are uncommon, rare variants are not represented on genotyping microarrays and can be difficult or impossible to impute. However, they can be directly observed by sequencing (Backman et al., 2021; Karczewski et al., 2022; Manolio et al., 2009; Wang et al., 2021). Because few exome sequencing studies and rare variant studies for alcohol phenotypes have been undertaken (Ahangari et al., 2023; Curtis, 2022; Marees et al., 2018; Vrieze et al., 2014), the contribution of rare variation to alcohol behaviors remains poorly characterized, as does the relationship between common and rare variants. Even if the same genes are not identified, it is possible that shared biology might be implicated by studies of common and rare variants.
One way to identify this shared biology is by using biological knowledge networks. These networks contain information about the molecular interactions among genes and their products, both broadly and in disease contexts (Farris et al., 2015; Rosenthal et al., 2023). By defining a gene network from common and rare variants, we also produce a more holistic understanding of the biological mechanisms, one that is more actionable than lists of individual genes. While the interplay between rare and common variant-implicated genes has been studied in network space for other psychiatric traits (Ben-David & Shifman, 2012; Chang et al., 2018; Gilman et al., 2011), it has not been studied for alcohol-related traits or other substance use disorders (SUDs). Based on evidence from comparisons of common and rare variants for other psychiatric traits, we hypothesized that the same genes and molecular pathways would be identified by both approaches.
To test this hypothesis, we assembled data from UK Biobank (UKB) and other sources pertaining to both common and rare variants that are associated with alcohol consumption. We then used a network approach to investigate the biological overlap between common and rare variants for alcohol consumption. This approach allowed us to compare their relative contributions at the variant, gene, and molecular pathway levels.

Rare variant and gene experimental and control data acquisition and filtering
Rare variant summary statistics were downloaded from Genebass's Hail library (gs://ukbb-exome-public/500k/results/variant_results.mt) and queried for alcohol consumption by phenotype code (alcohol_intake_custom) using Hail (https://hail.is). To guard against statistical artifacts, we selected alcohol consumption rare variants that had both MAF < 0.05 and minor allele count > 2.
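The variant filter described above can be sketched in plain Python. This is only an illustration of the two stated cutoffs (MAF < 0.05 and minor allele count > 2); the record layout, field names, and variant identifiers are hypothetical and do not reflect the Genebass schema or the Hail API.

```python
# Hypothetical variant records; the field names are illustrative only.
variants = [
    {"id": "var_a", "maf": 0.0004, "mac": 310},
    {"id": "var_b", "maf": 0.12, "mac": 90000},   # too common (MAF >= 0.05)
    {"id": "var_c", "maf": 0.000002, "mac": 2},   # minor allele count not > 2
    {"id": "var_d", "maf": 0.03, "mac": 24000},
]

MAF_MAX = 0.05  # cutoff stated in the text
MAC_MIN = 2     # cutoff stated in the text


def passes_rare_filter(variant, maf_max=MAF_MAX, mac_min=MAC_MIN):
    """Keep a variant only if it is rare (MAF) and observed often enough (MAC)."""
    return variant["maf"] < maf_max and variant["mac"] > mac_min


kept = [v["id"] for v in variants if passes_rare_filter(v)]
print(kept)
```

Both conditions are strict inequalities, so a variant with a minor allele count of exactly 2 is excluded.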
Rare variant gene-level summary statistics were downloaded from the Genebass browser (https://app.genebass.org/). Due to limited power for computing genetic correlation and heritability from rare variants for some traits (Karczewski et al., 2022; Weiner et al., 2023), rare variant controls were chosen based on heritability and genetic correlations calculated from common variants, as described above.
As such, rare variant control traits were chosen if they had non-zero heritability (h^2 > 0.01), minimal genetic correlation with a comparable alcohol consumption trait (The Amount of Alcohol Drunk on a Typical Drinking Day; phenotype code: 20403) (0.0 < |r_g| < 0.2), and the minimum number of rare seed genes recommended for network propagation using NetColoc (n > 5).
Phenotype codes for traits from Genebass that were used in this study are as follows: Alcohol Consumption (alcohol_intake_custom). For all traits, rare seed genes were defined as any gene that had a False Discovery Rate-corrected q_burden < 0.25, q_SKAT-O < 0.25, or q_SKAT < 0.25, calculated for each individual dataset. For alcohol consumption, these cutoffs corresponded to p_SKAT-O < 1.5 × 10^-4, p_burden < 1.1 × 10^-4, or p_SKAT < 2.7 × 10^-5. The stringent alcohol consumption rare seed genes were defined as any genes that were significant based on the recommended significance threshold from Genebass (p_SKAT-O < 2.5 × 10^-7; p_burden < 6.7 × 10^-7; p_SKAT < 2.5 × 10^-7) (Karczewski et al., 2022).
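To make the lenient seed-gene rule concrete, the sketch below computes q-values separately for each test and takes the union of genes with q < 0.25 in any test. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, and the gene names and p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def seed_genes(pvalues_by_test, q_cutoff=0.25):
    """Union of genes with q < q_cutoff in at least one gene-level test."""
    seeds = set()
    for gene_to_p in pvalues_by_test.values():
        genes = list(gene_to_p)
        qvalues = benjamini_hochberg([gene_to_p[g] for g in genes])
        seeds.update(g for g, q in zip(genes, qvalues) if q < q_cutoff)
    return seeds


# Invented p-values for four hypothetical genes under two tests.
toy_pvalues = {
    "burden": {"G1": 0.001, "G2": 0.04, "G3": 0.5, "G4": 0.9},
    "skat": {"G1": 0.6, "G2": 0.7, "G3": 0.02, "G4": 0.8},
}
print(sorted(seed_genes(toy_pvalues)))
```

Because the correction is applied per test, the p-value that corresponds to q = 0.25 differs between tests, which is consistent with the three different p-value cutoffs reported for alcohol consumption.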

Common variant gene mapping
We generated gene-level significance values from the SNP-level summary statistics using MAGMA v1.10 (de Leeuw et al., 2015) with default parameters (https://ctg.cncr.nl/software/magma); annotation windows were 10 kb, the 1000 Genomes European sample was used as the reference panel (1000 Genomes Project Consortium, 2015), and the hg38 gene locations were used for mapping (Schneider et al., 2017). MAGMA projects the SNP matrix for a gene onto its principal components and uses the principal components as predictors of the phenotype in a linear regression. The fit of this regression is used to calculate an F statistic, from which the p-value for each gene is obtained. Bonferroni correction was used to define significant gene-level associations (p < 2.6 × 10^-6).

Molecular interaction networks
The Parsimonious Composite Network (PCNet) (Huang et al., 2018; Wright et al., 2024) v1.4 was obtained from NDEx, UUID: c3554b4e-8c81-11ed-a157-005056ae23aa. PCNet is a molecular interaction resource formed by integrating 21 interaction databases that contain various evidence types, including physical protein-protein, genetic, co-expression, and co-citation evidence. Each interaction in PCNet is supported by at least two of the component databases, a threshold chosen to maximize the ability of PCNet to perform gene set recovery tasks via network propagation. All seed genes were mapped to the nodes of PCNet via gene symbols.

Network propagation and co-localization
We used the Python package NetColoc (Rosenthal et al., 2023) (https://pypi.org/project/netcoloc/) for network propagation and co-localization. The sets of significant trait-associated genes from GWAS were used as "seed" genes for network propagation using a Random Walk with Restart (Vanunu et al., 2010) algorithm. Following network propagation with α = 0.5, we calculated a network proximity score (NPS) for each gene in the network by comparing the observed results to a null distribution, formed by propagating 1000 randomly selected seed gene sets. Each set was sampled to preserve the size and degree distribution of the original input set. As previously implemented (Rosenthal et al., 2021, 2023; Wright et al., 2023), we binned all genes in the network by degree with a minimum of 10 nodes per bin. For each gene, the NPS was calculated as a z-score comparing the observed heat at that gene after network propagation of the gene set to the mean of the null distribution heats at that gene. All heat values were log-transformed to ensure the distributions are approximately normal. From input seed genes calculated from common and rare variants, we independently calculated NPS_common and NPS_rare for each trait. We then defined a gene as colocalized between both if it had high proximity to both input sets. Therefore, we defined the combined network proximity NPS_common-rare as the product of the independent NPS vectors: NPS_common-rare = NPS_common × NPS_rare.
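As a rough illustration of the propagation and scoring steps described above, the following sketch runs a random walk with restart on a toy graph and converts log-transformed heats into z-scores. It is not the NetColoc implementation: the function names, the toy graph, the degree normalization, and the use of unconstrained random seed sets (rather than degree-matched sampling) are all simplifying assumptions.

```python
import math
import random
import statistics


def propagate(adj, seeds, alpha=0.5, n_iter=50):
    """Random walk with restart: heat <- alpha * W @ heat + (1 - alpha) * start.

    W spreads each node's heat equally over its neighbours (an assumption
    of this sketch); `start` puts equal mass on every seed gene.
    """
    nodes = list(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    heat = dict(start)
    for _ in range(n_iter):
        heat = {
            n: alpha * sum(heat[nb] / len(adj[nb]) for nb in adj[n])
            + (1 - alpha) * start[n]
            for n in nodes
        }
    return heat


def network_proximity_scores(adj, seeds, n_null=200, alpha=0.5, rng_seed=0):
    """Z-score of observed log-heat against log-heats from random seed sets."""
    rng = random.Random(rng_seed)
    nodes = list(adj)
    observed = propagate(adj, seeds, alpha)
    null_logs = {n: [] for n in nodes}
    for _ in range(n_null):
        # Simplification: random sets match only the size of the seed set.
        random_seeds = set(rng.sample(nodes, len(seeds)))
        null_heat = propagate(adj, random_seeds, alpha)
        for n in nodes:
            null_logs[n].append(math.log(null_heat[n]))
    return {
        n: (math.log(observed[n]) - statistics.mean(null_logs[n]))
        / statistics.stdev(null_logs[n])
        for n in nodes
    }


# Toy undirected graph: two four-node cliques joined by the edge d-e.
toy_graph = {
    "a": ["b", "c", "d"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
    "d": ["a", "b", "c", "e"], "e": ["d", "f", "g", "h"],
    "f": ["e", "g", "h"], "g": ["e", "f", "h"], "h": ["e", "f", "g"],
}
nps = network_proximity_scores(toy_graph, seeds={"a", "b"})
print({gene: round(score, 2) for gene, score in nps.items()})
```

In this toy example, gene c sits in the same clique as both seeds and receives a higher score than gene h in the distant clique. Running the same function on common and rare seed sets and multiplying the two scores per gene would give the combined score defined in the text.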
NetColoc recommends fewer than 500 input seed genes given the sample space of PCNet (~18,000 genes). Therefore, as described previously (Rosenthal et al., 2023), we employed a weighted sampling procedure for any trait having more than 500 significantly associated genes. We sampled 500 genes from the set of all significant genes (weighted by -log10(p) from GWAS) and ran the propagation analysis from this subset. After 100 repetitions, the 75th percentile NPS was selected to approximate a consensus score for each gene.
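The resampling procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the sampling scheme (Efraimidis-Spirakis keys), the percentile interpolation method, the function names, and the placeholder `score_fn` (which stands in for a full propagation run) are all hypothetical choices rather than the procedure used by the authors.

```python
import math
import random
import statistics


def weighted_sample_without_replacement(genes, weights, k, rng):
    """Draw k distinct genes with probability proportional to weight.

    Uses Efraimidis-Spirakis keys: key = u ** (1 / weight), keep the k largest.
    Weights must be positive, which holds for -log10(p) when p < 1.
    """
    keyed = [(rng.random() ** (1.0 / w), g) for g, w in zip(genes, weights)]
    keyed.sort(reverse=True)
    return [g for _, g in keyed[:k]]


def consensus_scores(gene_pvalues, score_fn, max_seeds=500, n_reps=100, seed=0):
    """75th percentile of per-gene scores across repeated weighted subsamples."""
    rng = random.Random(seed)
    genes = list(gene_pvalues)
    weights = [-math.log10(gene_pvalues[g]) for g in genes]
    k = min(max_seeds, len(genes))
    per_gene = {}
    for _ in range(n_reps):
        subset = weighted_sample_without_replacement(genes, weights, k, rng)
        for gene, score in score_fn(subset).items():
            per_gene.setdefault(gene, []).append(score)
    return {
        gene: statistics.quantiles(scores, n=4, method="inclusive")[2]
        for gene, scores in per_gene.items()
    }


# Toy inputs: ten hypothetical genes with increasingly small p-values.
toy_pvalues = {f"g{i}": 10.0 ** -(i + 1) for i in range(10)}


def toy_score_fn(subset):
    """Placeholder for propagation: counts how many top genes were sampled."""
    return {"hits": float(sum(1 for g in subset if g in {"g8", "g9"}))}


consensus = consensus_scores(toy_pvalues, toy_score_fn, max_seeds=3, n_reps=20)
print(consensus)
```

Taking an upper percentile rather than the mean favors genes that score highly in a substantial fraction of the subsamples, even when some draws omit the seeds closest to them.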

Definition of the alcohol consumption network
From these NPS scores, we selected genes with high proximity scores from both common and rare sources to define the network using the following thresholds: NPS_common-rare > 3, NPS_common > 1.5, and NPS_rare > 1.5. NPS cutoffs were selected after a sensitivity analysis using seed genes from the alcohol consumption network (Figure S2). These cutoffs were used as they had the most significant network overlap compared with a permuted control (p < 0.05, Bonferroni corrected), while having a moderate to high observed/expected network overlap and moderate network size (N_nodes > 100). As previously shown, these parameters performed well for other datasets as well (Rosenthal et al., 2023). To calculate the significance of the network co-localization, we compared the conserved network size and the mean NPS_common-rare to a permuted null distribution. We permuted the labels of NPS_rare and NPS_common 10,000 times, each time calculating the mean NPS_common-rare across all genes and the number of genes passing the above thresholds. For genes present in both input sets, labels were permuted separately to maintain the higher expected distribution for these genes. The significance of the conserved network size and mean NPS_common-rare was calculated using the Z-test.
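The thresholding and permutation test can be illustrated with a small sketch. It applies the three stated cutoffs and compares the observed network size with a null obtained by shuffling one score vector across genes. The shuffling scheme is one plausible reading of the label permutation described above; it omits the separate handling of shared seed genes, and the toy scores and function names are hypothetical.

```python
import math
import random
import statistics


def colocalized_genes(nps_common, nps_rare, t_single=1.5, t_product=3.0):
    """Genes passing NPS_common > 1.5, NPS_rare > 1.5, and product > 3."""
    return {
        g for g in nps_common
        if nps_common[g] > t_single
        and nps_rare[g] > t_single
        and nps_common[g] * nps_rare[g] > t_product
    }


def network_size_z_test(nps_common, nps_rare, n_perm=1000, seed=0):
    """Compare the observed network size with sizes from permuted rare scores."""
    rng = random.Random(seed)
    genes = list(nps_common)
    observed = len(colocalized_genes(nps_common, nps_rare))
    rare_values = [nps_rare[g] for g in genes]
    null_sizes = []
    for _ in range(n_perm):
        rng.shuffle(rare_values)  # break the gene-to-score pairing
        permuted = dict(zip(genes, rare_values))
        null_sizes.append(len(colocalized_genes(nps_common, permuted)))
    expected = statistics.mean(null_sizes)
    z = (observed - expected) / statistics.stdev(null_sizes)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided upper-tail normal p-value
    return observed, expected, z, p


# Toy scores for 200 hypothetical genes: 40 near the common seeds, 40 near the
# rare seeds, with 20 genes near both.
genes = [f"g{i}" for i in range(200)]
toy_common = {g: (3.0 if i < 40 else 0.0) for i, g in enumerate(genes)}
toy_rare = {g: (3.0 if 20 <= i < 60 else 0.0) for i, g in enumerate(genes)}

observed, expected, z, p = network_size_z_test(toy_common, toy_rare)
print(observed, round(expected, 1), round(z, 1), p)
```

The product threshold is not redundant: two scores just above 1.5 multiply to roughly 2.25, so a gene must exceed the single-score cutoffs by some margin on at least one side to be included.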

GWAS catalog
To identify previously annotated genes, we used GWAS findings aggregated by the GWAS catalog (https://www.ebi.ac.uk/gwas/). We used the GWAS catalog's gene level associations version v1.0.2-associations_e111_r2024-04-22. We identified genes that had previously been associated with categories of interest by querying the Mapped Traits and the Disease/Traits, listed as "Mapped Trait: Disease/Trait (Pubmed ID)" in Tables S1, S3-S5. Traits were grouped into the following categories: alcohol use, smoking and nicotine use, non-alcohol or smoking SUDs (for example, opioid use disorder), and non-SUD neurological and neuropsychiatric (including cognitive, mental health and psychiatric, and neuro-degenerative). All traits from the GWAS catalog identified in each group are listed in Table S1. All groups are mutually exclusive. In Tables S3-S5, traits are listed only once per gene for readability. Enrichment for each group was calculated using a hypergeometric test.
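For readers unfamiliar with the enrichment calculation, a one-sided hypergeometric test can be written with the standard library alone. The sketch below is generic: the function name and the toy numbers are illustrative, and the gene universe used by the authors is not stated in this section, so no attempt is made to reproduce the reported p-values.

```python
from math import comb


def hypergeom_enrichment_p(k, universe, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from a
    universe of `universe` genes, of which `annotated` carry the annotation."""
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(universe - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(universe, drawn)


# Toy example: 10 genes in the universe, 5 annotated, 2 genes in the set,
# at least 1 of which is annotated.
p_toy = hypergeom_enrichment_p(k=1, universe=10, annotated=5, drawn=2)
print(p_toy)
```

In the setting of the text, `drawn` would be the number of genes in the gene set being tested, `annotated` the number of genes in the universe belonging to a trait category, and `k` the observed overlap.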

Rare variant PheWAS
To assess the association of network genes with other phenotypes through rare variants, gene level PheWAS results were downloaded from Genebass's Hail database (gs://ukbb-exome-public/500k/results/results.mt). PheWAS associations were mapped to network genes. Network genes were determined to be significantly associated with a phenotype using the same p-value cutoffs as used for lenient seed genes from alcohol consumption (p_SKAT-O < 1.5 × 10^-4; p_burden < 1.1 × 10^-4; p_SKAT < 2.7 × 10^-5).

Tissue enrichment
To assess the tissue-specific expression of network genes, we used the Functional Mapping and Annotation of Genome-Wide Association Studies (FUMA) suite's in-browser gene to function tool (Watanabe et al., 2017). We used FUMA to calculate the enrichment of gene sets for 54 tissue types from human GTEx v8 using default parameters (GTEx Consortium, 2020). As described previously, this method takes normalized gene expression values (reads per kilobase per million, RPKM) from each GTEx tissue and maps these genes to Entrez IDs (Watanabe et al., 2017). Pre-calculated differentially expressed gene (DEG) sets were defined using a two-sided t-test per label versus all remaining tissue types. Genes with a Bonferroni corrected p-value < 0.05 and absolute log fold change ≥ 0.58 were selected as DEGs. For the signed DEGs, the direction of expression was taken into account. The -log10(p-values) in the graph were calculated by hypergeometric test (Watanabe et al., 2017).
The mutations in ADH1C and C4orf54 were both protective. ADH1C and ADH1B both have known roles in ethanol metabolism (Le Daré et al., 2019; Tolstrup et al., 2008), but despite C4orf54 being associated with addiction risk in prior GWAS (Hatoum et al., 2023), its function is poorly understood.

Common and rare gene-level associations
Common loci were assigned to genes based on proximity using MAGMA (de Leeuw et al., 2015), identifying 294 genes (Figure 1C; Table S3, p < 2.6 × 10^-6). Rare variants were previously aggregated into gene level associations using SKAT, SKAT-O, and a gene burden test (Karczewski et al., 2022). These tests identified four genes that were significantly associated with alcohol consumption via both SKAT-O and gene burden tests (p_SKAT-O < 2.5 × 10^-7; p_burden < 6.7 × 10^-7): ADH1C, PMM2, GIGYF1, and ANKRD12. Only ADH1C was significantly associated by SKAT (p_SKAT < 2.5 × 10^-7), and it was the only one of these four genes that had been previously associated with alcohol-related traits by common gene analysis.
We also considered a more lenient cutoff for genes from rare variants (q < 0.25, Figure 1D; Figure S3A; Table S4), which identified 35 genes across all tests. A total of 20 genes were identified by both SKAT-O and the burden test (Figure S3B); however, only ADH1C and PMM2 were significant in all tests. Of these genes, 51% were functionally annotated as loss of function, followed by missense and low confidence loss of function (40%), and the remaining 9% as synonymous (Figure S3C). Fourteen of these genes had previously been identified by common variants as mediating alcohol consumption and alcohol use traits in the GWAS catalog (Sollis et al., 2023) (Table S4; p = 8.24 × 10^-33, hypergeometric test). These include the alcohol dehydrogenase genes ADH1A, ADH1B, and ADH1C, and the signaling genes FOXP1, AKAP6, AKAP9, and GRM5, highlighting the overlapping regulation of SUDs and psychiatric traits.
ADH1B and ADH1C were identified by both the rare and common gene-based analyses (Figure 2A, p = 0.01, hypergeometric test).

Generation of the alcohol consumption network
Next, we examined the molecular pathways wherein these alcohol consumption genes function (Figure S1). We used PCNet, a resource of 2.7 million pairwise associations among genes (Huang et al., 2018; Wright et al., 2024). We found that 264 common seed genes and 32 rare seed genes were present in PCNet. NPS were generated for each gene in PCNet based on their proximity to the common (NPS_common) and rare (NPS_rare) seed genes (Figure S4A, Table S5). We used these NPS scores, and their product, NPS_common-rare, to define the alcohol consumption network (Figure 2B), thus identifying genes that were close in the molecular network to both common and rare seed genes, even if they were not identified by the individual studies (Table S6).
We found that the alcohol consumption network contained significantly more genes (Figure 2C, p = 3.09 × 10^-8) and that the mean of NPS_common-rare was significantly higher (Figure S4B, p = 5.51 × 10^-6) than expected under the null. As a negative control, we produced networks using both the alcohol rare and common seed genes in conjunction with arbitrary traits; these negative controls did not produce networks that were larger than the permuted control (Figure 2D; Table S7). Additionally, when we considered a more stringent threshold for rare seed genes (p_SKAT-O < 2.5 × 10^-7, n = 4), we obtained similar results (Figure S4C).
However, exclusion of the shared seed gene ADH1C removed the significance of the network overlap shown in Figure 2C. This reinforces the importance of genes that encode alcohol metabolizing enzymes in this network.
As shown in Figure 3, the alcohol consumption network contained 208 nodes connected by 1226 edges. Only 27 of the 264 seed genes from common variants and five of the 34 seed genes from rare variants were included in the alcohol consumption network. ADH1B and ADH1C, which were seed genes for both common and rare, were both present in the network (Table S6).

The structure of the alcohol consumption network
One of the goals of generating the network shown in Figure 3 is to identify the underlying biology identified by common and rare seed genes. Several gene families previously known to play a role in ethanol metabolism were present in the network (Figure S5). For example, eight genes from the alcohol dehydrogenase (ADH) family (Le Daré et al., 2019) and seven aldehyde dehydrogenase (ALDH) family genes (Edenberg, 2007) were in the network. Six cytochrome P450 (CYP) genes, which mediate about 10% of alcohol metabolism via the microsomal pathway (Hamitouche et al., 2006), were also in the network. In addition, genes from the non-oxidative ethanol metabolism pathways, which primarily function in phase II drug metabolism (Le Daré et al., 2019), were also present. These include two sulfotransferase (SULT) family genes, which metabolize ethanol into ethyl sulfate, and 18 genes in the UDP-Glycosyltransferase (UGT) superfamily, whose encoded proteins glucuronidate ethanol into ethyl glucuronide, a minor non-oxidative metabolite of ethanol (Walsham & Sherwood, 2014). Thus, the network recapitulates previously known biology relevant to ethanol metabolism.
Another benefit of the network is the ability to identify relevant tissues. We found 25 tissues that were significantly enriched for differential gene expression (Figure 4A; Table S6). Consistent with the presence of genes involved in ethanol metabolism in the network, the highest enrichment was in the liver and consisted of 115 genes, including 28 genes from the ADH, ALDH, UGT, CYP, and SULT families. In addition to the liver, numerous gastrointestinal tissues were also enriched: the gastrointestinal tract mediates absorption and gastric metabolism of alcohol, and chronic alcohol consumption may lead to inflammation and increased risk of gastrointestinal and esophageal cancers (Bode & Bode, 1997; Edenberg, 2007). As expected, all brain tissues were significantly enriched.

FIGURE 3
The alcohol consumption network. Subnetwork of PCNet including all genes proximal to both rare and common alcohol consumption seed genes. Purple nodes indicate common seed genes, green nodes indicate rare seed genes, dark blue nodes indicate seeds from both sources, and white nodes are network-implicated genes. Edges maintained from PCNet. Red outlined nodes have previously been annotated in the GWAS catalog for alcohol use traits.

FIGURE 4
Validation of alcohol consumption network. (A) Enrichment of gene sets from the alcohol consumption network with bidirectional differential expression for 54 tissues from GTEx v8. DEG sets were defined by a two-sided t-test per label, versus all remaining tissue types. Genes with p < 9.26 × 10^-4 (Bonferroni corrected) and absolute log fold change ≥ 0.58 are selected as differentially expressed. Significance was calculated via hypergeometric test. Tissues are colored by type; non-significant (NS) associations are indicated in gray. (B) Upset plot showing the overlap of genes in the alcohol consumption network that have previously been annotated in the GWAS catalog for alcohol use traits, nicotine use and smoking traits, other SUDs, and neuropsychiatric traits.
To determine whether the genes had been previously implicated by common variants in alcohol use, other SUDs, and related psychiatric disorders, we examined annotations from the GWAS catalog (Sollis et al., 2023). Specifically, we considered annotations for alcohol use, smoking and nicotine use, other SUDs, including opioid, cannabis, and polysubstance use, and neuropsychiatric disorders (Figure 4B). In total, 203 of the 208 genes in the network are annotated in the GWAS catalog.
Finally, to determine whether these genes had been previously implicated by rare variants in alcohol use, we examined gene-level annotations from Genebass of genes in the network (Karczewski et al., 2022). In total, six of the 208 network genes (ADH1C, AKAP7, ATG101, DTNA, NKX6-2, and SYNJ2) were associated with secondary alcohol use traits by rare variants, including use status and frequency of use, negative societal impacts from use, and alcoholic liver disease (Table S6). Only ADH1C was also associated with alcohol use traits by common variants. Notably, all of these genes, excluding ATG101, were associated with other SUDs and neuropsychiatric traits through common variants.

DISCUSSION
The contribution of common variants in mediating alcohol consumption has been well documented, while rare variants represent a newer frontier that has recently become feasible due to the availability of large scale sequencing data. Prior rare variant analysis used in our study identified four genes at a stringent threshold (p_SKAT-O < 2.5 × 10^-7; p_burden < 6.7 × 10^-7) and 35 genes at a lenient threshold (q < 0.25), demonstrating the importance of rare variants for alcohol-related behaviors (Figure 1). We combined the findings from common and rare variants to determine whether they identify convergent biological networks (Figure 2). We identified a highly significant network (Figure 3). The network emphasized the role of ethanol metabolism, which was further supported by the tissue specific enrichment in both brain and liver (Figure 4), consistent with decades of research on the genetics of alcohol consumption.
The role of common variants in ethanol metabolizing enzymes is well established for alcohol consumption and related traits (Sanchez-Roige et al., 2020). Similarly, rare variant analysis of alcohol consumption identified ADH1A, ADH1B, and ADH1C, which have well documented roles in ethanol metabolism (Edenberg, 2007). By the joint analysis of common and rare variants, the network further identified genes for both oxidative and non-oxidative ethanol metabolism, including ADH, ALDH, UGT, CYP, and SULT family genes. Ethanol is primarily metabolized in the liver, but is also metabolized by the stomach and the brain (Zakhari, 2006), which was reflected in the high enrichment of network genes in the liver, gastrointestinal tissues, and the brain. Disulfiram is an effective treatment for AUD due to its inhibition of ALDH enzymes (Lanz et al., 2023), a gene family that is prevalent in the alcohol consumption network, suggesting the possibility that other genes identified by our network might also be viable pharmacological targets.
In addition to ethanol metabolism, genes found by our analyses have also been associated with neuropsychiatric conditions that are correlated (Walters et al., 2018) and highly comorbid with AUD, such as depression, schizophrenia, bipolar disorder, neuroticism, and cognitive dysfunction (Tables S3, S4 and S6). For example, the rare variant analysis identified KIF21B, which has been associated with smoking initiation (Saunders et al., 2022), ADHD (Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013), and schizophrenia (Trubetskoy et al., 2022). GIGYF1 has been associated with Alzheimer's disease and schizophrenia (Ding et al., 2023; Sollis et al., 2023). Finally, SCN7A has been associated with response to cognitive behavioral therapy in unipolar depression and educational attainment (Okbay et al., 2022; Rayner et al., 2019). Similarly, the alcohol consumption network identified genes that have also been associated with neuropsychiatric conditions, such as genes from the FOXP family (i.e., FOXP1, FOXP2, FOXP4; Davies et al., 2018; Sherva et al., 2023). Another example is CACNB3 and CACNG4, calcium channel genes that have been associated with bipolar disorder and major depression (Marshe et al., 2021; Psychiatric GWAS Consortium Bipolar Disorder Working Group, 2012). Finally, the gene ADGRG6, which was identified by the alcohol consumption network, has been associated with depression and smoking initiation (Saunders et al., 2022; Yao et al., 2021). Integrative analyses may help clarify the shared mechanisms of these conditions, but together these findings emphasize shared genetic susceptibility across these traits.
While this study found that common and rare variants that were associated with alcohol consumption identified a shared network, there are several limitations to consider. We found that ADH1C is needed for network colocalization, showing that it is a hub gene for this network; this may reflect the need for increased power to detect additional rare variants. We only studied alcohol consumption; however, as sample sizes grow, future studies should consider other AUD-relevant phenotypes such as AUD and problematic alcohol use. Similarly, methods for mapping common SNPs to genes are imperfect; we used MAGMA, but other more or less stringent methods might produce different results. Additionally, we used a lenient significance threshold to select rare variants (q < 0.25), which likely introduced some false positives into the network analysis.
However, we repeated this analysis with a more stringent cutoff for rare variants (p_SKAT-O < 2.5 × 10^-7) and found little change in significance of network overlap. Additionally, NetColoc is robust to false positives, but functions best with a moderate number of input genes (Rosenthal et al., 2023). Larger sample sizes will increase power for both common and rare variant discovery, improving the ratio of true to false positive findings in the future.
While future improvements to our methodology and the underlying data will improve our ability to understand rare and common variant interaction, this work identified the first gene network from common and rare variants of alcohol consumption.

AUTHOR CONTRIBUTIONS
BSL, SSR, and AAP conceptualized the study. BSL and JJM performed the analysis. All authors wrote and edited the manuscript.

ACKNOWLEDGMENTS
Montana Kay Lara contributed scientific input to this manuscript.

CONFLICT OF INTEREST STATEMENT
TI is a co-founder, member of the advisory board, and has an equity interest in Data4Cure and Serinus Biosciences. TI is a consultant for and has an equity interest in Ideaya Biosciences. The terms of these arrangements have been reviewed and approved by the University of California San Diego in accordance with its conflict-of-interest policies.

FIGURE 1
Common and rare variants mediate alcohol consumption. Manhattan plot of (A) common variants and (B) rare variants associated with alcohol consumption. Significance cutoff indicated in red (common: p < 5 × 10^-8; rare: p < 8 × 10^-9). p-values for peaks outside of range labeled. Only rare variants with minor allele count > 2 are shown. (C) Manhattan plot of alcohol consumption common variant-implicated genes. Significance cutoff (p < 2.6 × 10^-6) indicated in red. Significant genes that overlap with rare-variant-implicated genes are labeled. (D) Porcupine plot of genes calculated by burden test, SKAT-O, and SKAT algorithms from rare variants. Significantly associated genes (q < 0.25) for each test are labeled and colored in yellow, blue, and pink, for burden, SKAT-O, and SKAT, respectively. See Figure S3A for individual Manhattan plots for each test.

FIGURE 2
Convergence of rare and common variants on the network level. (A) Left, Venn diagram showing overlap of common seed genes (purple) and rare seed genes (green). Overlapping genes are indicated in dark blue and labeled. Significance of overlap calculated via hypergeometric test. Right, Venn diagram of genes passing NPS thresholds after network colocalization. Purple indicates genes with NPS_common > 1.5, green indicates NPS_rare > 1.5, dark blue indicates genes with NPS_common-rare > 3, NPS_common > 1.5, and NPS_rare > 1.5. Significance of intersection calculated in C. (B) NPS_common and NPS_rare for all genes in PCNet, with genes passing all thresholds for the alcohol consumption network (NPS_common-rare > 3, NPS_common > 1.5, and NPS_rare > 1.5) shown in dark blue. Dotted lines indicate NPS thresholds. (C) Observed (dark blue arrow) versus expected (gray distribution) size of the alcohol consumption network following 10,000 permutations of NPS labels. p-value calculated via Z-test. (D) The observed-to-expected ratio of colocalized network size for networks calculated from common and rare seed genes from alcohol consumption and from control trait FEV1 (Forced Expiratory Volume in 1 Second). Vertical bars indicate 95% confidence intervals. Significance calculated by Z-test, Bonferroni corrected. * indicates p = 3.09 × 10^-8. See also Figure S4C and Table S7 for additional controls.

Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/acer.15399, Wiley Online Library on [08/07/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License.

Of the network genes annotated in the GWAS catalog, 39 have previously been associated with alcohol use (p = 9.2 × 10^-5, OR = 2.01, hypergeometric test), 50 with smoking traits (p = 3.4 × 10^-3, OR = 1.56, hypergeometric test) (Karlsson Linnér et al., 2021), 9 with other SUDs (p = 0.048, OR = 1.70, hypergeometric test), and 129 with other neuropsychiatric traits (p = 2.1 × 10^-3, OR = 1.45, hypergeometric test). Of the genes associated with these traits, many had annotations in multiple categories, such as EPM2A, EXOC2, NFAT5, and SNTB1. These findings highlight the neuropsychiatric function of the network and point to a shared underlying mechanism across alcohol and polysubstance use.

FUNDING INFORMATION
BSL was supported in part by NIGMS T32 GM008666. SSR was supported by NIH/NIDA DP1 DA054394. SSR and AAP were supported by NIAAA R01 AA029688.