Identifying functionally relevant candidate genes for inflexible ethanol intake in mice and humans using a guilt‐by‐association approach

Abstract Gene prioritization approaches are useful tools to explore and select candidate genes in transcriptome studies. Knowing the importance of processes such as neuronal activity, intracellular signal transduction, and synapse plasticity to the development and maintenance of compulsive ethanol drinking, the aim of the present study was to explore and identify functional candidate genes associated with these processes in an animal model of inflexible pattern of ethanol intake. To do this, we applied a guilt‐by‐association approach, using the GUILDify and ToppGene software, in our previously published microarray data from the prefrontal cortex (PFC) and striatum of inflexible drinker mice. We then tested some of the prioritized genes that showed a tissue‐specific pattern in postmortem brain tissue (PFC and nucleus accumbens (NAc)) from humans with alcohol use disorder (AUD). In the mouse brain, we prioritized 44 genes in PFC and 26 in striatum, which showed opposite regulation patterns in PFC and striatum. The most prioritized of them (i.e., Plcb1 and Prkcb in PFC, and Dnm2 and Lrrk2 in striatum) were associated with synaptic neuroplasticity, a neuroadaptation associated with excessive ethanol drinking. The identification of transcription factors among the prioritized genes suggests a crucial role for Irf4 in the pattern of regulation observed between PFC and striatum. Lastly, the differential transcription of IRF4 and LRRK2 in PFC and nucleus accumbens in postmortem brains from AUD compared to control highlights their involvement in compulsive ethanol drinking in humans and mice.


| INTRODUCTION
Alcoholism is a chronic disorder characterized by compulsive ethanol seeking and intake despite negative consequences (Koob & Volkow, 2016). Among the brain regions altered by chronic ethanol intake, the striatum and the prefrontal cortex (PFC) are considered central to reinforcement and decision-making over ethanol consumption (Koob & Volkow, 2010). Recent studies have shown that gene expression patterns differ among brain regions in both humans and animal models of chronic alcohol administration (Bogenpohl et al., 2019; Farris et al., 2015). It is proposed that changes in the regulation of gene expression contribute to the long-lasting, chronic ethanol-induced changes in neuronal plasticity that result in inflexible changes in behavior (Nestler, 2001).
We published two transcriptional studies, in the PFC and striatum, that compared inflexible drinker mice (which consume ethanol despite negative consequences, akin to human alcohol addiction) to light drinker mice (which preferred water before and after withdrawal and after ethanol adulteration with quinine) (da Silva E de Paiva Lima et al., 2017; Silva et al., 2016). The transcriptional analysis revealed that the Lrrk2, Camk2a, Camk2n1, Pkp2, and Gja1 genes were differentially regulated in inflexible drinkers compared to light drinkers, implicating them in the loss of control over ethanol consumption (da Silva E de Paiva Lima et al., 2017; Silva et al., 2016). However, many other genes differentially expressed in PFC and striatum remained unexplored. In this regard, gene prioritization emerges as an extremely useful tool to explore and select remaining candidate genes from transcriptome studies that could be associated with a specific disease or condition (Albert & Lemonde, 2004; Kominakis et al., 2017; Tian et al., 2008). The "guilt-by-association" approach is one type of network-based prioritization, whose principle is that genes whose products (proteins) interact with the products of known disease genes are more likely to be disease genes themselves (Guney et al., 2014).
The goal of the present study was to identify functional candidate genes associated with the regulation of dopamine pathways, neuronal activity, intracellular signal transduction, synapse plasticity, and behaviors, due to the relevance of these processes for the control of ethanol intake. To explore the genes on the microarrays and identify possible candidate genes, we used a guilt-by-association approach that took into consideration the ethanol-induced neurobiological processes already described in the literature. We then tested some of the prioritized genes that showed a tissue-specific pattern in postmortem brain tissue (PFC and nucleus accumbens (NAc)) from humans with alcohol use disorder (AUD). This preclinical and postmortem translational study allowed us to compare functional candidate genes in the PFC and striatum of an animal model of inflexible drinking with expression patterns in postmortem brain samples from individuals with AUD.

| Extended chronic ethanol intake
The present study was performed with striatum and PFC microarray data previously published by our group (da Silva E de Paiva Lima et al., 2017; Silva et al., 2016). These studies used samples from the animal model reported by Ribeiro and colleagues (Ribeiro et al., 2012), where a detailed description of the experimental design, ethanol consumption, and blood ethanol concentration is given.
In short, Swiss male mice were subjected to a three-bottle free-choice treatment: a 10% and a 5% (v/v) ethanol solution, and water. Only male mice were used to avoid interference of hormonal fluctuation, since we know that estrogen can enhance the reinforcing and rewarding effects of alcohol, contributing to the increase of alcohol intake in female mice (Hilderbrand & Lasek, 2018; Vandegrift et al., 2017). The experimental design consisted of four steps: (1) acquisition/free-choice (AC: 10 weeks) with simultaneous access to water and ethanol solutions 5 and 10% (v/v); (2) withdrawal of ethanol solutions (2 weeks); (3) reinstatement of ethanol solutions (RE: 2 weeks); and (4) adulteration of ethanol solutions with 0.005 g/L quinine (AD: 2 weeks). The control group had access to water only throughout the experiment (Ribeiro et al., 2012). Swiss mice are an outbred strain and were chosen for this model in order to assess the phenotypic variability in the pattern of alcohol intake that could reflect genotypic variability, thus better representing what is observed in humans.
At the end of the three-bottle free-choice paradigm, mice were classified based on their ethanol consumption and preference: "light drinkers" (significantly higher water than ethanol consumption throughout all experiment phases); "heavy drinkers" (higher ethanol consumption than water with significant reduction of ethanol intake after adulteration with quinine); and "inflexible drinkers" (higher ethanol than water consumption throughout the experiment, without significant reduction in ethanol intake after adulteration with quinine). The individual ethanol consumption profile is shown in Table S1. Animals that did not meet any of the classification criteria were excluded. At the end of the AD phase, mice from Inflexible, Heavy, and Light groups were exposed to the same free-choice task for an extra week, allowing them to return to their previous ethanol intake patterns.

KEYWORDS
alcohol use disorders, GUILDify, IRF4, LRRK, microarray data, prefrontal cortex, striatum, ToppGene OR guilt-by-association approaches
Inflexible drinkers showed high and stable ethanol consumption even under an aversive condition generated by quinine, a nonpalatable compound. Quinine adulteration is a well-established approach to model compulsive drinking in animals and is a suitable way to demonstrate aversion resistance, which has face validity for human alcoholism (Blegen et al., 2018; H. Chen & Lasek, 2019; Hopf & Lesscher, 2014). In addition, this group (113 ± 11.3 mg/dl), along with the heavy drinker mice (79 ± 19.8 mg/dl), presented blood ethanol concentrations (BEC) at intoxication levels that were significantly higher than those in the light drinkers (48 ± 13.3 mg/dl) (Ribeiro et al., 2012).

| Microarray analysis
The gene expression of bilateral striatum (dorsal and ventral) and PFC was analyzed using an Affymetrix GeneChip® Mouse Genome 430 2.0 Array (Affymetrix, São Paulo, Brazil). For light and inflexible drinkers, a pooled sample of 4 animals of each group was hybridized in triplicate, totaling 6 chips. The evaluation of the two extreme ethanol drinking groups allowed us to find possible genes related to the compulsive drinking phenotype while controlling for the chronic presence of alcohol, even though BEC levels differed (da Silva E de Paiva Lima et al., 2017; Silva et al., 2016). The fragmentation and hybridization steps were performed in accordance with the GeneChip 3'IVT Express Kit (Affymetrix, São Paulo, Brazil) manual. The fluorescent scanning step was performed using the GeneChip® Scanner 3000 (Affymetrix, São Paulo, Brazil). The array data were normalized with the RMA (Robust Multi-array Average) method using the package "affy" in the R environment. Differentially expressed genes (DEGs) were identified using the RankProd algorithm with a significance level set at 99% (p < .01). After the acquisition of the DEG list, the R package mouse4302.db (version 3.2.2) was used to retrieve important information such as gene name, chromosome locus, and function for each array probe. The volcano plot and heatmap clustering analysis showing the differentially expressed genes can be found in (da Silva E Silva et al., 2016). The microarray data are available on the Gene Expression Omnibus (GEO), NCBI, and can be accessed using the following ID: GSE123114.
For the present study, we used the DEG list generated by the analysis described above. As we chose to apply a guilt-by-association approach to prioritize those genes, we used an absolute fold change (FC) value >1.3 for the first gene selection. This value reflects at least a 30% expression difference (either up- or downregulated) in the inflexible drinkers versus light drinkers, giving us a higher number of genes to start the analysis. The value of >1.3 has been used by others [17,18,19], and it is an effect size large enough to assess the relationship between two variables and determine their biological relevance.
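The fold-change filter described above can be sketched as follows. This is only a minimal illustration, not the R pipeline used in the study: it assumes that fold changes are stored as signed values (for example, -1.5 for a 1.5-fold decrease), and the gene names and numbers are hypothetical placeholders.

```python
# Hypothetical signed fold changes (inflexible vs. light drinkers).
# Assumption: downregulation is encoded as a negative fold change.
FC_THRESHOLD = 1.3


def select_candidates(fold_changes, threshold=FC_THRESHOLD):
    """Keep genes whose absolute fold change exceeds the threshold."""
    return {gene: fc for gene, fc in fold_changes.items() if abs(fc) > threshold}


example_fc = {"GeneA": 1.45, "GeneB": -1.62, "GeneC": 1.10, "GeneD": -1.25}
candidates = select_candidates(example_fc)
print(sorted(candidates))  # genes passing the |FC| > 1.3 filter
```

Taking the absolute value keeps both up- and downregulated genes in the statistical candidate list, which matches the description of the filter as symmetric.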

| Functional prioritization of differentially expressed genes
The prioritization of DEGs followed three steps: (1) DEGs in the microarray analysis for PFC and striatum were selected based on fold change value to generate the statistical candidate gene list; (2) GUILDify software was used, through keyword selection, to retrieve well-established functional candidate genes (trained list) for the neurobiological processes already known to be triggered by alcohol to induce its effects; and (3) ToppGene software was used to perform a candidate gene prioritization using the trained list and the statistical candidate gene list simultaneously. The workflow is represented in Figure 1.
In the first step, DEGs from the microarray data for PFC and striatum (da Silva E de Paiva Lima et al., 2017; Silva et al., 2016) were selected based on FC > 1.3, to generate the statistical candidate gene list. In the second step, the GUILDify database (BIANA knowledge base) was used to link genes and phenotypes in animal models (Mus musculus). GUILDify uses keywords chosen by users to search the UniProt, OMIM, and GO databases for gene products (proteins) that match these keywords. GUILDify maps the selected proteins onto a genome-wide protein-protein interaction (PPI) network and runs the global topology-based prioritization algorithm (NetScore). As an output, GUILDify provides a likelihood score (GUILDify score) associating the gene product with the phenotype for each gene product in the PPI network (Guney et al., 2014).
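The guilt-by-association principle behind this step can be illustrated with a toy example. The sketch below uses plain neighbor voting, which is a deliberately simplified stand-in and is not GUILDify's NetScore algorithm; the network, the protein names, and the seed set are all hypothetical.

```python
# Toy illustration of guilt-by-association on a protein interaction network.
# NOTE: this is plain neighbor voting, NOT GUILDify's NetScore algorithm.


def neighbor_vote(edges, seeds):
    """Score each node by the fraction of its neighbors that are seed genes."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {
        node: len(nbrs & seeds) / len(nbrs)
        for node, nbrs in neighbors.items()
    }


# Hypothetical network: P1 and P2 are known (seed) phenotype-associated proteins.
toy_edges = [("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
toy_seeds = {"P1", "P2"}
scores = neighbor_vote(toy_edges, toy_seeds)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], scores[ranked[0]])
```

In this toy network, P3 interacts with both seed proteins and therefore ranks first, which is the intuition that network-based prioritization tools formalize with more elaborate scoring.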
In the third step, ToppGene was used to perform a candidate gene prioritization using the trained list (obtained with GUILDify) and the candidate gene list (FC > 1.3) simultaneously. Briefly, ToppGene performs an annotation-based prioritization analysis through a fuzzy-based multivariate approach to compute the similarity between any two genes based on semantic annotations. The similarity scores from individual features are combined into an overall score using a statistical meta-analysis. A p-value of each annotation of a test gene is derived by random sampling of the whole genome (J. Chen et al., 2009).
In our study, the functional information shared between the "trained" gene list and the candidate genes was used to perform the multivariate analysis. The following sources were used to retrieve the functional information for the genes in both lists: Gene Ontology (GO) terms for molecular function (MF), biological process (BP), and cellular component (CC); human and mouse phenotypes; metabolic pathways; PubMed publications; coexpression pattern; and diseases. Finally, p-values were obtained using a statistical meta-analysis, in which a random sampling of 5,000 genes from the whole genome for each annotation source was combined to estimate an overall p-value. Subsequently, a 5% false discovery rate (FDR) multiple-testing correction (p-value ≤ 10e-4) was applied and the significant prioritized genes were selected. It is important to highlight that genes present in both the trained and candidate gene lists were automatically selected as prioritized genes. These analyses were performed independently for the candidate genes identified in striatum and PFC.

| Gene Ontology and metabolic pathway enrichment analyses
The WebGestalt application was used to perform the Gene Ontology (GO) and metabolic pathway enrichment analyses for the prioritized genes in striatum and PFC, independently (Zhang et al., 2005). An overrepresentation enrichment analysis (ORA) was performed for each GO term category (biological process (BP), molecular function (MF), and cellular component (CC)) using a nonredundant database.
The ORA was also performed for the metabolic pathways present in the Kyoto Encyclopedia of Genes and Genomes (KEGG). For both analyses, the terms analyzed were annotated specifically for the Mus musculus genome. Terms were considered enriched at a p-value < .05 after a 5% FDR multiple-testing correction. Results were visualized using the GOplot package in the R statistical software (R Core Team, 2013; Walter et al., 2015). These analyses were performed independently for the candidate genes identified in striatum and PFC. To evaluate the fold change profile in each enriched term, a z-score was calculated using the following formula:

$$z\text{-score} = \frac{\mathit{up} - \mathit{down}}{\sqrt{\mathit{count}}}$$

where up is the number of genes with positive fold change, down is the number of genes with negative fold change, and count is the total number of genes related to the enriched term.
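As a worked illustration of this z-score, the sketch below assumes the definition used by the GOplot package, (up - down) / sqrt(count), and applies it to hypothetical signed fold changes for the genes annotated to a single term.

```python
import math


def term_z_score(fold_changes):
    """z = (up - down) / sqrt(count) for the genes annotated to one term."""
    up = sum(1 for fc in fold_changes if fc > 0)
    down = sum(1 for fc in fold_changes if fc < 0)
    count = len(fold_changes)
    return (up - down) / math.sqrt(count)


# Hypothetical term with 3 upregulated genes and 1 downregulated gene.
z = term_z_score([1.4, 1.6, -1.5, 1.9])
print(z)
```

A positive value indicates that most genes annotated to the term are upregulated, and a negative value indicates the opposite; the score is a descriptive summary rather than a test statistic.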
FIGURE 1 Workflow of gene prioritization on the microarray analysis in prefrontal cortex (PFC) and striatum of inflexible drinker mice. Using keywords that describe the biological processes that underlie addiction and compulsive ethanol intake, GUILDify generated a list of genes associated with these phenotypes (trained list). ToppGene related the functional information of the trained list genes with candidate genes of the microarray of each structure separately through a fuzzy-based multivariate analysis, which generated a list of prioritized genes. WebGestalt was used to perform Gene Ontology and metabolic pathway enrichment analyses for the prioritized genes in PFC and striatum by means of an overrepresentation enrichment analysis. Subsequently, the most functionally relevant enriched terms were selected and the Hamming distance among the genes was estimated using an incidence matrix composed of the genes and the terms. The Hamming distance was then used to calculate the Euclidean distance and the prioritized genes were clustered. Finally, NetworkAnalyst was used to identify potential transcription factors with higher regulatory potential for the prioritized genes in PFC and striatum. The microarray data are available on the Gene Expression Omnibus (GEO), NCBI, and can be accessed using the following ID: GSE123114

To evaluate the functional similarity of the prioritized genes, the enriched terms associated with the selected processes used during the guilt-by-association approach were selected, and the Hamming distance among the genes was estimated using the incidence matrix composed of the genes and the enriched terms. The Hamming distance matrix was used to compute the number of differences among the DEGs regarding the enriched terms (obtained from BP, MF, CC, and KEGG terms). In other words, all pairs of DEGs were compared using the incidence matrix to count the number of enriched processes that were annotated to the first gene of the pair but not to the second, and vice versa. Consequently, the Hamming distance matrix, obtained from the original incidence matrix (composed of genes and enriched terms), was used to calculate the Euclidean distance between the pairs of genes, resulting in a similarity matrix. Once the Euclidean distance was calculated, the similarity matrix was used as input for the multidimensional scaling analysis in order to create a map of the distances among the genes using two dimensions. Subsequently, the proportion of the variance explained by the two dimensions used to create the distance map among the genes was calculated.
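The two distance steps described above can be sketched on a toy incidence matrix. This is an illustrative reading of the procedure, not the authors' code: the gene names and annotations are hypothetical, and the final multidimensional scaling step is omitted because it would require a linear algebra library.

```python
import math


def hamming_matrix(incidence):
    """Pairwise Hamming distances between binary gene-by-term rows."""
    genes = list(incidence)
    return {
        g: [sum(a != b for a, b in zip(incidence[g], incidence[h])) for h in genes]
        for g in genes
    }


def euclidean_matrix(rows):
    """Pairwise Euclidean distances between the rows of a distance matrix."""
    genes = list(rows)
    return {g: [math.dist(rows[g], rows[h]) for h in genes] for g in genes}


# Hypothetical incidence matrix: 1 if the gene is annotated to the enriched term.
toy_incidence = {
    "GeneA": [1, 1, 0, 0],
    "GeneB": [1, 0, 0, 0],
    "GeneC": [0, 0, 1, 1],
}
hamming = hamming_matrix(toy_incidence)
euclid = euclidean_matrix(hamming)
print(hamming["GeneA"], euclid["GeneA"])
```

Here GeneA and GeneB share most annotations and end up close together, while GeneC, annotated to a disjoint set of terms, is far from both; a two-dimensional map built from such distances would place it apart.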

| Identification of potential transcription factor for the best functional candidate genes
The prioritized genes identified in the striatum and PFC were subjected to a gene network analysis in order to identify potential transcription factors (TFs) using the NetworkAnalyst application (Xia et al., 2015). The potential TFs were obtained from the ENCODE ChIP-seq data using only peak intensity signal <500 and the predicted regulatory potential score <1 (using BETA Minus algorithm). Subsequently, a "regulatory network" was created using the interactions between the prioritized genes and the potential TFs, where the nodes represent either the genes or the TF (circles and squares, respectively), and the edges represent the predicted interaction between them. The centrality metrics (degree and betweenness) for each network were analyzed to identify those TFs that explain most of the network topology. Consequently, using this methodology, it is possible to identify those TFs that have a higher regulatory potential for the functionally prioritized genes.
To evaluate the relationship between the potential TFs and the prioritized genes between and within tissues, we used a Venn diagram.
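Degree centrality, one of the two metrics used above to rank TFs, can be computed on a small TF-target network as in the sketch below. The edges and node names are hypothetical placeholders rather than results, and the sketch does not reproduce NetworkAnalyst; betweenness is left out for brevity.

```python
from collections import Counter


def degree_centrality(edges):
    """Normalized degree: number of edges per node divided by (n_nodes - 1)."""
    degree = Counter()
    for tf, gene in edges:
        degree[tf] += 1
        degree[gene] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


# Hypothetical TF -> target edges (each pair listed once).
toy_tf_edges = [
    ("TF1", "GeneA"),
    ("TF1", "GeneB"),
    ("TF1", "GeneC"),
    ("TF2", "GeneC"),
]
centrality = degree_centrality(toy_tf_edges)
top_tf = max(("TF1", "TF2"), key=centrality.get)
print(top_tf, centrality[top_tf])
```

A TF connected to many prioritized genes receives a high degree, which is the sense in which it "explains" more of the network topology.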

| Postmortem Human Brain: subjects, clinical assessment, behavioral measures, and real-time PCR
Human postmortem brain tissue was obtained from the New South Wales Brain Tissue Resource Centre (NSWBTRC) at the University of Sydney, Australia. PFC and nucleus accumbens (NAc) were analyzed from males with severe AUD (PFC: n = 10 and NAc: n = 8) and from male controls (PFC: n = 13 and NAc: n = 12) who consumed less than 20 g of absolute alcohol per day (Sutherland et al., 2016). All AUD subjects had alcohol detected in blood at the time of death.
Real-time PCR reactions for each gene were performed using 10 µl of TaqMan™ Universal PCR Master Mix (Thermo Fisher), 0.5 µl of TaqMan assay, and 3.5 µl of ultra-pure water. For all reactions, a negative control without cDNA template (NTC) was tested, and the final reaction volume was kept at 10 µl. The relative quantities of the transcripts were calculated by the delta-delta Ct method (Pfaffl, 2001) using the GAPDH gene as an endogenous control according to Vandesompele et al. (2002). Data were tested for Gaussian distribution using the Shapiro-Wilk and Anderson-Darling normality tests. The ROUT method was used to identify outliers (Q = 1%).
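The delta-delta Ct calculation mentioned above can be sketched as follows. This minimal example assumes roughly 100% amplification efficiency for both the target and the endogenous control (so that the fold change is 2 raised to the negative delta-delta Ct); the Ct values and the choice of calibrator are hypothetical.

```python
def relative_quantity(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by 2^(-ddCt), assuming ~100% PCR efficiency."""
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: target gene vs. endogenous control, measured in
# one sample and in the calibrator (e.g., the mean of the control group).
rq = relative_quantity(
    ct_target=26.0,
    ct_reference=18.0,
    ct_target_cal=25.0,
    ct_reference_cal=18.0,
)
print(rq)
```

In this example the target amplifies one cycle later in the sample than in the calibrator after normalization, which corresponds to half the relative expression.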
Independent t tests were used to calculate differences in gene expression between AUD and controls for IRF4 and DNM2 in NAc and for IRF4 and PRKCB in PFC. The Mann-Whitney test was used for LRRK2 in NAc and PLCB1 in PFC. We report both uncorrected (p < .05) and 5% false discovery rate (FDR)-corrected (described as q value) results. Statistical tests were performed using GraphPad Prism version 7.01 and R software.
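To show how q values relate to uncorrected p-values, the sketch below implements the Benjamini-Hochberg adjustment. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption made here for illustration, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


toy_p = [0.01, 0.04, 0.03, 0.20]  # hypothetical uncorrected p-values
toy_q = benjamini_hochberg(toy_p)
print(toy_q)
```

Each q value is the smallest FDR level at which the corresponding test would be declared significant, so a result is retained at a 5% FDR when its q value is at most 0.05.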

| Ethics statement
Animal experimentation was carried out in compliance with institutional guidelines and approved by the Ethics Committee for Animal

(Table 1 and Table 2 and Table S4) Table S3). Several terms related to the regulation of the nervous system (i.e., synaptic transmission, synaptic vesicle cycle, regulation of membrane potential), behavior, and response to stimulus were identified as enriched in both functional candidate gene lists. These results reinforce the potential of guilt-by-association approaches to identify candidate genes associated with target phenotypes among a new list of candidate genes using the functional profile of previously reported candidate genes. Tables S5 and S6 present all the enriched terms for GO and KEGG analyses.

terms (e.g., Pink1, Bdnf, Gria1), other genes were associated with just one or few terms (e.g., Il1rap, Scn1a, Cep97).

| Potential transcription factors
Figure 3 depicts the TF-target gene network for PFC (3A) and striatum (3B). Each node in this network represents a gene (circles) or a TF (squares), and each edge between two nodes represents evidence of regulatory interaction. Table 3 shows the 10 TFs with the highest centrality metric in each network, as well as the potential target genes. The centrality metrics for all the nodes presented in Figure 4 are listed in Table S7. The interferon regulatory factor gene (Irf4), prioritized in the striatum, was identified as one of the TFs with the highest centrality metric in the PFC network (Figure 5).
The PCA plot in Figure 5 was created using the first two principal components of a multidimensional scaling (MDS) analysis. The components were obtained using the Euclidean distance between each pair of genes in the dataset. The Euclidean distance was estimated from a nongeometric distance (Hamming distance) in order to avoid geometric approximations. In summary, the MDS analysis is the final step of the functional similarity analysis among the genes. After some transformations, the incidence matrix composed of the DEGs and enriched terms (BP, MF, CC, and KEGG) is represented in a two-dimensional map. The first and second components explain 79.77% and 6.93% of the variance, respectively. Together, both components explain more than 86% of the total variance in the differences between genes regarding the functional profile. In sum, Figure 5 reflects the results of a functional clustering analysis performed using all the GO terms associated with the prioritized genes (no filtering based on p-value was applied). Additionally, Figure 5 indicates that Irf4 has a functional pattern more similar to that of the PFC-prioritized genes (red circle). In addition, the cluster analysis (Figure 5) showed that in the striatum Dnm2, Lrrk2, and Drd2 are the genes with the largest weight in the first component, which explains around 80% of the variance, and in the second component (along with BDNF). Furthermore, these striatal genes, along with Plcb1 and Prkcb in PFC, appeared detached from the other genes within and between the tissues, suggestive of a tissue-specific functional pattern.

TABLE 1 Prioritized genes in prefrontal cortex. ToppGene related the functional information (retrieved from Gene Ontology; PubMed publications; coexpression pattern; and diseases) of the trained list genes with candidate genes of the microarray of each structure separately through a fuzzy-based multivariate analysis, which generated a list of prioritized genes. *Genes were prioritized in both prefrontal cortex and striatum. The microarray data are available on the Gene Expression Omnibus (GEO), NCBI, and can be accessed using the following ID: GSE123114

| Postmortem human brain qPCR results
Table 4 summarizes the demographic and clinical characteristics of AUD and control subjects. Compared to controls, the AUD subjects had higher BMI, daily alcohol intake, drinks per week, blood alcohol concentration (BAC) at time of death, and pack-years of cigarettes, and a younger age at drinking initiation, but they did not differ in age. Moreover, AUD subjects had a lower brain weight and smaller brain volumes than controls.
Exploratory correlations between mRNA levels and drinking,

The cluster analysis for the prioritized genes (Figure 5) suggested a tissue-specific functional pattern for Irf4, Dnm2, Lrrk2, Prkcb, and Plcb1 genes in the context of compulsive ethanol drinking.
Postmortem human brain from individuals with AUD was used to test whether those prioritized genes in our animal model that present face validity for human alcohol addiction would also be found

| DISCUSSION
In the present study, using a guilt-by-association approach in microarray data from an animal model of inflexible ethanol consumption (da Silva E Ribeiro et al., 2012; Silva et al., 2016), we prioritized 44 DEGs in PFC and 26 in striatum. Among those genes, Irf4 and Lrrk2, in addition to presenting a tissue-specific pattern of regulation in the inflexible drinker mice, were also differentially regulated in the PFC and NAc of postmortem brains from AUD subjects. These results suggest a crucial role for Irf4 and Lrrk2 in the context of compulsive ethanol intake in mice and humans.
The guilt-by-association heuristic has led to the identification of genes that are believed to be associated with a specific disease, phenotype, or common cellular function. Although the guilt-by-association approach is widely applied in studies aiming to scrutinize the biological processes associated with complex traits (Albert & Lemonde, 2004;Altshuler et al., 2000;Bowcock, 2007;Guo et al., 2013;Stuckenholz et al., 1999;Ziganshin & Elefteriades, 2016),

The selection of terms and biological processes to build the trained list using GUILDify can be considered a biased approach. However, this bias is consciously introduced in the analysis due to the functional relevance of the processes to the target phenotype. In our specific case, the phenotype is the inflexible pattern of ethanol intake, which includes characteristics such as long-term high ethanol intake, heightened anxiety during withdrawal, and persistent intake despite ethanol adulteration with quinine.
Those behaviors can result both from pre-existing genetic differences and from persistent ethanol-induced changes in neuronal processes that are already described in the literature and can be represented by the keywords chosen here (e.g., "Firing midbrain dopamine"; "Long-term potentiation"; "Inhibition NMDA").
ToppGene does not use these keywords to select our genes; instead, the software uses the similarities between the functional patterns of the genes in the candidate gene list and in the trained gene list. Therefore, the prioritization of each gene presented in this study can be interpreted as a statistical measure of how similar the functional profile of that candidate gene is to the whole functional profile of the trained list (GUILDify), which reflects the processes behind alcohol addiction. Consequently, even if some of the genes in our initial list of candidate genes were not previously assigned to our selected terms, we were able to identify a possible function of these genes in our candidate processes due to the functional similarity. However, it is important to highlight that it is not our goal, nor is it possible, to detect all genes that are associated with the inflexible pattern of ethanol intake. Our goal is to find and select genes with stronger evidence of association with the processes that are crucial to the development and maintenance of the inflexible phenotype observed in mice.

TABLE 2 Prioritized genes in striatum. ToppGene related the functional information (retrieved from Gene Ontology; PubMed publications; coexpression pattern; and diseases) of the trained list genes with candidate genes of the microarray of each structure separately through a fuzzy-based multivariate analysis, which generated a list of prioritized genes. *Genes were prioritized in both prefrontal cortex and striatum. The microarray data are available on the Gene Expression Omnibus (GEO), NCBI, and can be accessed using the following ID: GSE123114

FIGURE 2 Circle plots for the most functionally relevant enriched terms for PFC (first row) and striatum (second row), depicting the relationship between the enriched terms and the gene expression profile for biological processes (first column) and KEGG pathways (second column). The outer circle indicates the up- (red dots) or downregulated (blue dots) state of each gene associated with each term. The inner circle represents the z-score calculated for each term using the number of up- and downregulated genes. Negative z-scores indicate a downregulation of the genes annotated for the current biological process or KEGG pathway. Positive z-scores indicate upregulation of the genes annotated for the current biological process or KEGG pathway. For the biological process enriched terms in PFC and striatum, only the 10 most significant terms are shown in order to keep all the IDs legible

Additionally, the prioritized genes in the PFC and striatum are differently regulated in comparison with all DEGs found in the same tissue, highlighting that prioritized genes are working in distinct ways in response to chronic alcohol. Unfortunately, our study could not determine causal interactions between brain regions; thus, further studies are necessary to elucidate the regulatory role that the striatum exerts over the PFC, or vice versa, in response to alcohol intake.
The transcription factor analysis showed a pattern for Irf4 suggestive of a possible regulation of genes in the PFC over the striatum or vice versa. This gene belongs to the interferon regulatory factor (IRF) family of TFs related to gene expression regulation and immune response activation (Negishi et al., 2017). Despite our finding that Irf4 was prioritized in the striatum, it was a TF with the highest centrality metric in the PFC. Furthermore, this gene appeared in the cluster analysis together with genes in the PFC and showed a more similar functional pattern with this tissue. This result suggests that Irf4 may play a crucial role in the opposite pattern of regulation observed between PFC and striatum. The activation of TFs and the neuroimmune responses are two crucial mechanisms of the brain in response to chronic ethanol and can trigger longer-term molecular neuroadaptations (Koob & Volkow, 2016). In the TF-target gene network, the Irf4 in the striatum is also associated with diverse TFs such as Tbp, Elf1, Mxl1, Jun, Zmiz1, and Chd. So far, studies have only reported on the role of the IRF family in inflammation and secondary diseases from chronic alcohol (Petrasek et al., 2011;Seki & Brenner, 2008). Therefore, the association found here highlights the Irf4 as an important target to be investigated in animal models of alcohol intake.
To investigate whether the genes that showed a tissue-specific pattern of regulation in the inflexible drinker animals (Irf4, Plcb1, Prkcb,

FIGURE 3 TF-target gene network for the prioritized genes identified in the PFC (a) and striatum (b). The blue squares represent the potential transcription factors (TFs), and the circles, the prioritized genes. Each edge between a TF and a gene represents a potential regulatory activity. The colors of the circles, as well as the area of the circle, represent the number of possible TFs associated with this gene.

We also observed that LRRK2 was significantly downregulated in the NAc of humans with AUD. We had previously suggested a role  Paiva et al., 2020). Though these transcriptional differences could reflect distinct responses between the NAc and the dorsal striatum or between species, it is also possible that it is not the up- or downregulation of this gene specifically, but its dysregulation in general, that is relevant to the loss of control over ethanol intake.
In conclusion, the present study is the first one in the alcohol field to apply the guilt-by-association approach using GUILDify and ToppGene to prioritize genes. We generated a list of DEGs in both PFC and striatum that we believe to be implicated in the transition from normal to compulsive ethanol intake and that can be tested in future functional studies. Most of the prioritized genes are involved in the establishment of synapse plasticity, a crucial process that leads to neuroadaptations and ethanol-related behaviors. Testing some of the prioritized genes that showed a tissue-specific pattern in postmortem brain tissue allowed us to uncover evidence from both human AUD and inflexible drinker animals for Irf4 underlying the pattern of regulation observed between the PFC and striatum. Our results also highlight a prominent role of LRRK2 in the pattern of responses to compulsive alcohol drinking in humans and mice.

FIGURE 5 Multidimensional scaling plot (MDS) clustering the prioritized genes identified in the PFC (red symbols), striatum (green symbols), and both tissues (blue symbols) based on the functional annotation. The genes were clustered based on the Euclidean distance obtained from the Hamming distance for the incidence matrix composed of genes and the most functionally relevant enriched GO and KEGG terms. The red circle highlights the position of the IRF4 gene

Characteristics AUD (n = 10) Controls (n = 13) p-value

ACKNOWLEDGMENTS
The authors would like to thank the Conselho de Auxílio do

CONFLICT OF INTEREST
No potential conflict of interest was reported by the authors.

AUTHOR CONTRIBUTION
LMC, PASF, and ALBG were responsible for the in silico study concept and design. DSS contributed to the acquisition of microarray data. PASF performed the bioinformatic analysis. LMC and PASF assisted with data analysis and interpretation of findings. LMC drafted the manuscript. ALBG, CW, NDV, PASF, IMP, SD, and ASP provided critical revision of the manuscript for important intellectual content.
CW and NDV provided the postmortem brain tissue from humans with alcohol use disorder. All authors critically reviewed content and approved the final version for publication.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/brb3.1879.

FIGURE 6
Relative mRNA quantification in postmortem brain tissue from alcohol use disorder (AUD) and control subjects. Prefrontal cortex (PFC) and nucleus accumbens (NAc). Relative mRNA levels of (a) IRF4, (b) PLCB1, (c) PRKCB, (d) IRF4, (e) LRRK2, and (f) DNM2. In a, d, and e, * # p < .05 different from control. IRF4 in NAc (p = .034, q = 0.066) and PFC (p = .030, q = 0.066) and LRRK2 (p = .005, q = 0.030) in NAc, survived the FDR 5% correction. Unpaired t test was used to analyze the differences between the groups for IRF4 and DNM2 in NAc and for IRF4 and PRKCB in PFC. Mann-Whitney was used to analyze LRRK2 in NAc and PLCB1 in PFC. Results are presented as mean ± SEM for (a, c, d, and f) and median with 95% CI for (b and e)