Human Variation in Alcohol Response Is Influenced by Variation in Neuronal Signaling Genes


  • Geoff Joslyn,

    1. From the Ernest Gallo Clinic and Research Center (GJ, AR, GB, RLW), Emeryville, California; Department of Psychiatry (MS), University of California, San Diego, and the Veterans Affairs San Diego Healthcare System, San Diego, California; and Department of Neurology (RLW), University of California, San Francisco, California.
  • Ajay Ravindranathan,

  • Gerry Brush,

  • Marc Schuckit,

  • Raymond L. White


Reprint requests: Geoff Joslyn, PhD, Ernest Gallo Clinic and Research Center, 5858 Horton St Suite 200, Emeryville, CA 94608; Fax: 510-985-3101; E-mail:


Background:  Alcohol use disorders (AUDs) exhibit the properties shared by common conditions and diseases classified as genetically complex. The etiology of AUDs is heterogeneous, involving mostly unknown interactions of environmental and heritable factors. A person’s level of response (LR) to alcohol is inversely correlated with a family history of AUDs and with their development. As an AUD endophenotype, alcohol LR is hypothesized to be less genetically complex and closer to the primary etiology of AUDs.

Methods:  A genome wide association study (GWAS) was performed on subjects characterized for alcohol LR phenotypes. Gene Set Enrichment Analysis (GSEA) of the GWAS data was performed to determine whether, as a group, genes that participate in a common biological function (a gene set) demonstrate greater genetic association than would be randomly expected.

Results:  The GSEA analysis implicated variation in neuronal signaling genes, especially glutamate signaling, as being involved in alcohol LR variability in the human population.

Conclusions:  These data, coupled with cell and animal model data implicating neuronal signaling in alcohol response, support the conclusion that neuronal signaling is mechanistically involved in alcohol’s cellular and behavioral effects. Further, these data suggest that genetic variation in these signaling pathways contributes to human variation in alcohol response. Finally, this concordance of the cell, animal, and human findings supports neuronal signaling, particularly glutamate signaling, as a prime target for translational studies to understand and eventually modulate alcohol’s effects.

Multiple genetic and environmental components contribute to alcohol use disorders (AUDs) (Goldman et al., 2005). As with other genetically complex common psychiatric and medical conditions, etiological heterogeneity decreases the power to detect marker-phenotype associations in a Genome Wide Association Study (GWAS). In this article, we used two strategies to overcome this difficulty: (i) AUD endophenotypes were employed as the association phenotype, and (ii) functionally related gene sets, rather than individual markers, were examined for phenotype association.

Alcohol use disorder diagnoses are not precise and are at least partially based upon symptoms that are a consequence of the AUD, rather than etiological. The DSM-IV diagnosis of alcohol dependence requires an individual to fulfill three of seven symptoms, making it possible for two diagnosed alcohol-dependent individuals to have no symptoms in common (American Psychiatric Association, 2000). Endophenotypes, measurable subcomponents that fall in the pathway between genotype and obvious disease, provide a more homogeneous grouping of subjects using criteria that are closer to the etiologic components of the disease (see, for review, Gottesman and Gould, 2003). Because endophenotypes are subcomponents that together contribute to the onset of obvious disease, each endophenotype is genetically less complex than the overall disease and should, therefore, increase a study’s power to detect genetic factors (Almasy, 2003). A number of endophenotypes have been proposed and studied for alcoholism (Almasy, 2003; Bierut et al., 2002), including alcohol metabolism; brain electrophysiological measures; personality disorders; and the level of response (LR) to alcohol.

Level of response to alcohol is defined as a subject’s psychological (e.g., feeling of “high”) and physiological (e.g., motor coordination) response to a given dose and blood level of alcohol. In this study, two LR to alcohol variables were measured in the laboratory with an alcohol challenge protocol (see Materials and Methods for details). After rapidly consuming alcohol equivalent to approximately three standard drinks, a subject’s body sway (BSA) and subjective feeling of “high” [Subjective High Assessment Scale (SHAS)] are measured to assess alcohol LR. A third LR variable [Self-Report of the Effects of Ethanol (SREF)] is obtained via a questionnaire in which the subject estimates how many drinks it took to reach a given level of intoxication during their first five drinking experiences. The two alcohol challenge variables, BSA and SHAS, are moderately correlated (0.318) while the SREF is less so (SHAS/SREF = −0.172; BSA/SREF = −0.045).

Level of response to alcohol variables have the properties of good endophenotypes: they are continuous variables independent of disease state that are genetically influenced in both animals (Baldwin et al., 1991; Li, 2000; Moore et al., 1998) and humans (Heath and Martin, 1992; Madden et al., 1995; Martin et al., 1981); they are associated with alcoholism in both families and the population (Erblich and Earleywine, 1999; Pollock, 1992; Schuckit and Smith, 1996, 2000; Schuckit et al., 1996, 2000); and, importantly, independent studies have demonstrated genetic association of alcohol dependence and alcohol LR to the same locus (Joslyn et al., 2008; Wang et al., 2008).

As the results from numerous GWAS studies were reported in 2008, it emerged that the disease-associated single nucleotide polymorphisms (SNPs) discovered explained only a small proportion of the population attributable risk (reviewed by McCarthy and Hirschhorn, 2008). Because of the small population attributable risk coupled with the large number of statistical tests performed in a GWAS analysis, differentiating true from false positives is challenging, requiring either replication in new samples (often not available) or extensive corroborating laboratory studies (Wang et al., 2005). Thus, a common result of a GWAS experiment is a long list of markers/genes that are associated with the phenotype with unremarkable statistical significance. Interpreting such a result, knowing that the false positive rate is high, is problematic.

Strategies to analyze and interpret such gene lists were initially developed for genome wide expression array data (reviewed by Curtis et al., 2005). The core idea of these strategies is to evaluate the experimental results at the level of gene sets rather than individual genes. Gene sets are defined using biological knowledge that links genes together such as shared sequence motifs; participation in common biochemical pathways; shared gene product localization; gene product interaction; or any other organizing principle. To facilitate such analyses, several research groups, academic consortia, and private companies have developed organizational systems; examples include KEGG (Kanehisa and Goto, 2000); GenMAPP (Dahlquist et al., 2002); GO (Ashburner et al., 2000); and Ingenuity. The essence of gene set analysis strategies is to determine whether an experimentally derived list of genes, for example genes whose expression differs between experimental groups, is enriched for members in a given gene set, thus conferring the organizational principle of the gene set to the experimental results. Many algorithms have been developed to evaluate enrichment (reviewed by Goeman and Buhlmann, 2007).

Wang and colleagues (2007) described techniques to apply Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005), originally designed to analyze gene expression data, to the analysis of GWAS data. Most gene set methods require the input of a list of “positive” genes; a list of genes that meet or exceed a target p-value that will then be investigated for functional enrichment. This, of course, leads to the conundrum of what to use as a p-value cut-off, because the point of the analysis is to seek meaning for results that do not meet the strict statistical significance required by GWAS. In contrast, the GSEA method of Subramanian and colleagues does not require the establishment of an arbitrary p-value cut-off. The input is a ranked list, from low to high p-values, of all genes analyzed for association. The GSEA algorithm then looks for clustering of functionally related genes toward the top of the ranked list. The extent of clustering and the position of the cluster within the ranked list determine the significance of the GSEA test statistic. Thus, the Subramanian GSEA method eliminates the need to make arbitrary p-value cut-offs while retaining the ability to detect enrichment of gene sets among the most significantly associated genes.

In this article, we demonstrate the utility of the Wang and colleagues approach by analyzing GWAS results of a subject sample characterized for alcohol LR. The results of the analysis are consistent with human “general addiction” GWAS studies, mouse Quantitative Trait Locus studies, cell experiments, and animal behavior studies (see, for review, Feltenstein and See, 2008; Uhl et al., 2008) in implicating neuronal connections and signaling. Further, the analysis leads to a large but manageable number of candidate genes to investigate in greater detail.

Materials and Methods

Subjects and Testing Protocol

The San Diego Sibling Pair investigation is described in greater detail elsewhere (Schuckit et al., 2005; Wilhelmsen et al., 2003). Under a protocol approved by the Human Subjects Protection Committee of the University of California, San Diego (UCSD), participants were chosen from among 18- to 25-year-old respondents to a questionnaire mailed to randomly selected students at UCSD. The questionnaire was used to ascertain siblings meeting the following criteria: (i) minimum family size of 2 siblings, male or female, 18 to 25 years old; (ii) have consumed alcohol but are NOT alcohol dependent; (iii) have at least one parent with repetitive alcohol-related life problems who met the criteria for alcohol dependence using the Diagnostic and Statistical Manual of the American Psychiatric Association, Fourth Edition (American Psychiatric Association, 2000); and (iv) are not afflicted with antisocial personality disorder or other psychiatric conditions. Selected subjects were telephoned to verify the questionnaire information and invited to a face-to-face interview where they completed the Semi-Structured Assessment for the Genetics of Alcoholism interview (SSAGA) (Bucholz et al., 1994; Hesselbrock et al., 1999); participated in an acclimation session regarding the alcohol challenge testing; completed the SREF questionnaire; were scheduled for an alcohol challenge protocol; and were asked for 40 ml of whole blood for genetic analyses.

The SREF score used in this, and most evaluations, records the subject’s perception of the number of drinks (10 to 12 g of alcohol) that were required to reach four different intoxication levels (such as slurring speech) during their first five drinking experiences. Only those events actually experienced are recorded. The SREF score is defined as the total number of drinks divided by the sum of the number of events experienced (Schuckit et al., 1997, 2007).
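The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sref_score`, the use of `None` for an effect that was never experienced, and the example values are hypothetical and do not come from the study's scoring software.

```python
from typing import Optional, Sequence


def sref_score(drinks_per_effect: Sequence[Optional[float]]) -> Optional[float]:
    """Total drinks divided by the number of effects actually experienced.

    Each entry is the number of drinks needed to reach one intoxication
    level; None marks a level the subject never experienced (assumption).
    """
    experienced = [d for d in drinks_per_effect if d is not None]
    if not experienced:
        return None  # no effect experienced, so the score is undefined
    return sum(experienced) / len(experienced)


# Hypothetical subject who experienced three of the four effects.
example_score = sref_score([2, 4, None, 6])
print(example_score)  # 4.0
```

A higher score means that more drinks were needed per experienced effect, that is, a lower level of response.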

The alcohol challenge measures, in a laboratory setting, a person’s reaction to consuming approximately 0.75 ml/kg of ethanol within an 8- to 10-minute period (dose was weight and sex adjusted to produce similar blood alcohol levels). At baseline, at 15 and 30 minutes, and every half hour thereafter, subjects filled out the SHAS indicating their feelings of intoxication on 13 items, each rated on a 36-point scale indicating perceived subjective changes from baseline. Using the same time series, BSA was measured using a harness attached to the chest at the level of the axilla from which two perpendicular ropes extended forward and to the left side, passing over pulleys that measured the number of centimeters of movement per minute as gathered through three 1-minute evaluations at each time point (Schuckit and Gold, 1988). Postalcohol measures were continued until 210 minutes after beverage consumption. The scores used in these analyses include the SHAS score at the time of peak alcohol effect (60 minutes), as well as the anterior-posterior BSA score at 60 minutes (BSA) representing the average of the three 1-minute evaluations.

The three variables, SREF, SHAS, and BSA, all measure an aspect of a subject’s LR to alcohol. As expected, the variables are correlated; the two alcohol challenge variables, BSA and SHAS, are moderately correlated (0.318) while the SREF is less so (SHAS/SREF = −0.172; BSA/SREF = −0.045). The distributions of all three variables are right skewed, so all three needed to be transformed to normality for genetic association analysis.

For this study, in an effort to reduce ethnic and age heterogeneity, all subjects analyzed were Caucasian and from the 18- to 25-year-old sibling generation (no parents were analyzed). The 367 family history positive subjects, 134 males and 233 females, were selected from 186 independent families: 38 singleton families; 121 two-sibling families; 23 three-sibling families; 3 four-sibling families; and a single six-sibling family. The actual number of subjects per marker-phenotype analysis varied slightly because of missing genotype and phenotype data.
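As a quick consistency check, the family-size breakdown can be tallied against the reported totals. The snippet below uses only the counts stated in the text; the variable names are illustrative.

```python
# Family-size distribution as reported: {siblings per family: number of families}
families_by_size = {1: 38, 2: 121, 3: 23, 4: 3, 6: 1}

n_families = sum(families_by_size.values())
n_subjects = sum(size * count for size, count in families_by_size.items())

print(n_families, n_subjects)  # 186 367
print(134 + 233)               # 367 (males plus females)
```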

DNA Preparation

DNA was extracted from blood specimens within 5 days of the draw using Gentra Puregene reagents and protocols. Extracted DNA was quantified using the Pico Green method (Molecular Probes/Invitrogen, Carlsbad, CA), and all stocks were normalized to a common concentration for genotyping assays.


Genotyping

Genotyping of self-reported Caucasian siblings was carried out using the Illumina HumanCNV370-Duo DNA Analysis BeadChip. Genotype generation and quality control were performed by deCODE Genotyping Service.

Association Analysis

Because the phenotypes in this analysis are not normally distributed (skewness and kurtosis, respectively, were BSA, 1.29 and 2.28; SHAS, 1.02 and 0.56; SRE First, 0.60 and −0.06), they were corrected for nonnormality using the Box-Cox transformation (Box and Cox, 1964; Venables and Ripley, 2002). They were then scaled to mean = 0 and SD = 1 to make them comparable, essentially as Z-scores.
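The transformation and scaling can be illustrated with a minimal, standard-library Python sketch. It is not the authors' R code: the grid of candidate λ values, the function names, and the example values are assumptions, and the Box-Cox transform as written requires strictly positive inputs (scores containing zeros would first need a shift, which is not shown).

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox transform for strictly positive values."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Box-Cox profile log-likelihood, up to an additive constant."""
    n = len(values)
    variance = statistics.pvariance(box_cox(values, lam))
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(v) for v in values)


def normalize_phenotype(values, lambdas=None):
    """Pick lambda on a grid by maximum likelihood, then scale to Z-scores."""
    if lambdas is None:
        lambdas = [i / 10 for i in range(-20, 21)]  # assumed grid: -2.0 to 2.0
    best = max(lambdas, key=lambda lam: profile_log_likelihood(values, lam))
    transformed = box_cox(values, best)
    mean = statistics.fmean(transformed)
    sd = statistics.stdev(transformed)
    return best, [(t - mean) / sd for t in transformed]


# Hypothetical right-skewed phenotype values.
best_lambda, z_scores = normalize_phenotype([1.2, 1.5, 1.9, 2.4, 3.1, 4.8, 7.5, 12.0])
```

After this step each phenotype has mean 0 and standard deviation 1, so effect sizes are expressed in the same units across phenotypes.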

The tests of association were performed in R (R Development Core Team, 2008) with the lmekin function of the kinship package (Atkinson and Therneau, 2008). This function provides a linear mixed effects model whereby the genetic relatedness among individuals (based on the kinship coefficient) is incorporated into the covariance structure of the random effects. This adjusts the model fit, and in particular the tests of association, for the fact that the siblings, and therefore their genotypes and phenotypes, are related; ignoring this relatedness would violate the assumption of independent observations in a linear regression model.

The fixed effects portion of the model is used for testing the association between a single SNP and a single phenotype. The SNP is treated in R as a factor with three levels (categories), which is similar to coding the major homozygotes as 1/1, the heterozygotes as 0/1, and the minor homozygotes as 0/0. Three tests of association could be performed for each SNP: the difference in average phenotype (a) between the major homozygotes (1/1) and the heterozygotes (0/1), (b) between the major homozygotes (1/1) and the minor homozygotes (0/0), and (c) between the heterozygotes (0/1) and the minor homozygotes (0/0). Tests (a) and (b) were performed by testing whether the regression coefficients of the heterozygote term and the minor homozygote term differ from 0, while holding the major homozygote coefficient constant at 0. For test (c), the minor homozygote term was held constant at 0 and the heterozygote coefficient tested for a difference from 0. For rare SNPs, it is possible that no minor homozygote individuals are present in the sample. In this case, only test (a) was available for the SNP. In all cases, the Wald test was used to examine the significance of the regression coefficients.

The model was changed slightly when analyzing the SRE First phenotype, by including in the fixed effects a covariate term to correct for the correlation between sex and SRE. Males report drinking one more drink on average to reach the same degree of impairment as reported by females (correlation = 0.31, p = 5.1 × 10⁻¹¹). In the model, sex is therefore included as an additive term to adjust the phenotype for this difference. It was coded 0 for females and 1 for males.

The very large number of tests performed in the analysis required that the nominal p-values be adjusted for multiple testing. For this, false discovery rate (FDR) q-values were calculated using the method described by Storey and Tibshirani (2003).
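To make the q-value idea concrete, here is a simplified Python sketch. It is an approximation, not the published procedure: Storey and Tibshirani estimate the proportion of true null tests (π0) by smoothing over a range of tuning values, whereas this sketch assumes a single fixed tuning value `lam = 0.5`. The function name and the example p-values are hypothetical.

```python
def storey_q_values(p_values, lam=0.5):
    """Simplified q-values with a fixed-lambda estimate of pi0 (assumption)."""
    m = len(p_values)
    # p-values above lam are assumed to come mostly from true null tests.
    pi0 = sum(1 for p in p_values if p > lam) / (m * (1.0 - lam))
    pi0 = min(1.0, pi0)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values; 4 of 10 exceed 0.5, so pi0 is estimated as 0.8.
example_p = [0.001, 0.01, 0.02, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
example_q = storey_q_values(example_p)
```

With π0 fixed at 1 the same loop reduces to the Benjamini-Hochberg adjustment, which is why q-values are never more conservative than that procedure.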

Gene Set Enrichment Analysis

Gene Set Enrichment Analysis was carried out using the methods of Subramanian and colleagues (2005) as implemented in the software provided by the Broad Institute. The method statistically tests (as described below) whether members of a predefined gene set are randomly distributed throughout a ranked list of genes or whether the members of the gene set cluster toward the top of the list.

The ranked gene lists were assembled using the GWAS results. Genes were ranked based upon the association p-value of SNPs located within the limits of the gene expanded 100 kb 5′ and 3′ (physical locations from NCBI Build 36). The large expansion was chosen so as not to exclude the possible contribution of gene regulatory elements that operate over long distances (Bartkuhn and Renkawitz, 2008; Kleinjan and van Heyningen, 2005). If a SNP did not map to a gene, it was eliminated from the analysis; if more than one SNP mapped to a gene, then the smallest p-value was chosen to represent the gene; if a SNP mapped to more than a single gene, then all genes overlapping the SNP were tagged with that SNP’s p-value. The over 300,000 SNPs genotyped thus tag 29,974 transcripts. After removing transcripts of unknown function (e.g., C4orf45 or LOC100133692), each of the three GWAS-analyzed phenotypes, SREF, BSA, and SHAS, produced p-value ranked lists containing 20,827 genes.
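The mapping rules in this paragraph can be sketched as follows. The gene and SNP records are invented for illustration (the names `GENE_A`, `rs_a`, and the coordinates are hypothetical), and a real analysis over roughly 300,000 SNPs would use an interval index rather than this nested loop.

```python
from typing import NamedTuple

FLANK_BP = 100_000  # 100 kb added on the 5' and 3' sides, as in the text


class Gene(NamedTuple):
    symbol: str
    chrom: str
    start: int
    end: int


class Snp(NamedTuple):
    snp_id: str
    chrom: str
    pos: int
    p_value: float


def rank_genes(genes, snps, flank=FLANK_BP):
    """Assign each gene the smallest p-value among SNPs in its expanded limits."""
    best = {}
    for snp in snps:
        for gene in genes:
            in_window = gene.start - flank <= snp.pos <= gene.end + flank
            if gene.chrom == snp.chrom and in_window:
                # A SNP inside several windows tags every overlapping gene.
                if gene.symbol not in best or snp.p_value < best[gene.symbol]:
                    best[gene.symbol] = snp.p_value
    # SNPs that map to no gene never enter `best`, so they are dropped.
    return sorted(best.items(), key=lambda item: item[1])


genes = [
    Gene("GENE_A", "chr1", 1_000_000, 1_050_000),
    Gene("GENE_B", "chr1", 1_120_000, 1_200_000),
    Gene("GENE_C", "chr2", 500_000, 600_000),
]
snps = [
    Snp("rs_a", "chr1", 1_060_000, 0.001),   # within both GENE_A and GENE_B windows
    Snp("rs_b", "chr1", 1_250_000, 0.0001),  # within the GENE_B window only
    Snp("rs_c", "chr1", 5_000_000, 1e-6),    # maps to no gene, so it is ignored
    Snp("rs_d", "chr2", 550_000, 0.04),
]
ranked = rank_genes(genes, snps)
print(ranked)  # [('GENE_B', 0.0001), ('GENE_A', 0.001), ('GENE_C', 0.04)]
```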

The three ranked gene lists, as described earlier, were tested against the MSigDB gene set c5.all.v2.5.symbols.gmt, which contains all three GO-organizing groups: cellular component, biological process, and molecular function. The MSigDB contains gene sets preprocessed and formatted for the GSEA software. The GO gene sets were chosen as they are well described, are commonly used in the literature, and have the largest number of annotated genes.

Enrichment was calculated using the GseaPreranked module. An enrichment score (ES) is calculated by walking down the ranked list and increasing the score when a gene set member is encountered and decreasing the score when no gene set member is encountered. The final ES, the maximum deviation from zero, corresponds to a weighted Kolmogorov–Smirnov-like statistic. The statistical significance of the ES scores is empirically estimated by performing 1,000 random permutations of the ranked gene list. Finally, the FDR (Benjamini and Yekutieli, 2001; Reiner et al., 2003) method is used to control for multiple testing.
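The running-sum statistic and its permutation test can be sketched in Python. This is not the GseaPreranked implementation. Using −log10(p) as the ranking metric, the weight exponent of 1, the `(count + 1) / (n_perm + 1)` p-value convention, and the toy genes are all assumptions made for illustration.

```python
import math
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return (ES, index of the maximum deviation from zero) for one gene set.

    ranked_genes: gene symbols ordered from most to least associated.
    scores: ranking metric per gene in the same order (assumed -log10 p).
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    running, best, best_index = 0.0, 0.0, -1
    for i, (s, h) in enumerate(zip(scores, hits)):
        if h:
            running += abs(s) ** weight / hit_total  # step up on a set member
        else:
            running -= 1.0 / n_miss                  # step down otherwise
        if abs(running) > abs(best):
            best, best_index = running, i
    return best, best_index


def permutation_p_value(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from shuffling gene labels against fixed scores."""
    rng = random.Random(seed)
    observed, _ = enrichment_score(ranked_genes, scores, gene_set)
    shuffled = list(ranked_genes)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_es, _ = enrichment_score(shuffled, scores, gene_set)
        if null_es >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Toy ranked list: ten hypothetical genes ordered by p-value.
toy_genes = [f"g{i}" for i in range(1, 11)]
toy_p = [1e-5, 1e-4, 1e-3, 1e-2, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
toy_scores = [-math.log10(p) for p in toy_p]
toy_set = {"g1", "g2", "g4"}

toy_es, toy_peak = enrichment_score(toy_genes, toy_scores, toy_set)
toy_pval = permutation_p_value(toy_genes, toy_scores, toy_set)
```

In the toy example all three set members lie in the first four positions, so the running sum peaks at the fourth gene; setting `weight=0` recovers the classic unweighted Kolmogorov-Smirnov form.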

Leading Edge analysis was then performed to identify the genes that are driving the high ES. Leading Edge genes are operationally defined as all gene set members that are present at or before the point where the running ES reaches its maximum (Fig. 1). Leading Edge genes were identified for all GO terms enriched in any of the three ranked lists at an FDR significance <0.5.
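The operational definition of Leading Edge genes translates directly into code. The sketch below is illustrative only: it uses an unweighted running sum to locate the peak (the GSEA software weights steps by the ranking metric), and the gene names are hypothetical.

```python
def peak_of_running_sum(ranked_genes, gene_set):
    """Index at which an unweighted running sum reaches its maximum."""
    n_hits = sum(g in gene_set for g in ranked_genes)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best, best_index = 0.0, 0.0, -1
    for i, gene in enumerate(ranked_genes):
        running += 1.0 / n_hits if gene in gene_set else -1.0 / n_miss
        if running > best:
            best, best_index = running, i
    return best_index


def leading_edge(ranked_genes, gene_set):
    """Gene set members located at or before the running-sum maximum."""
    peak = peak_of_running_sum(ranked_genes, gene_set)
    return [g for g in ranked_genes[: peak + 1] if g in gene_set]


ranked_list = [f"g{i}" for i in range(1, 11)]
le_genes = leading_edge(ranked_list, {"g1", "g2", "g4", "g9"})
print(le_genes)  # ['g1', 'g2', 'g4']
```

The set member `g9` is excluded because it falls after the peak and therefore does not contribute to the enrichment signal.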

Figure 1.

 (A) Gene Set Enrichment Analysis (GSEA)-enrichment plot of the body sway (BSA) genome wide association study (GWAS)-ranked gene list against the GO term: AXON_GUIDANCE. The vertical bars along the bottom of the figure mark the location of the GO term genes within the BSA p-value sorted gene list. The pattern of the running enrichment score is indicative of clustering at the top of the list. Genes located to the left of the maximum enrichment score (marked “LE” in the figure) drive the magnitude of ES-max and are defined as the “Leading Edge Genes.” (B) Leading Edge analysis of the 420 GSEA-enriched GO terms identified 3,194 unique genes. The Venn diagram illustrates how the genes identified for each phenotype overlap. The 173 Leading Edge genes that are common to all three phenotypes were selected for further analysis.

Leading Edge genes common to the SREF, BSA, and SHAS analyses were then examined using the software package Ingenuity Pathway Analysis (IPA). IPA provides tools to group and visualize genes according to function, interactions, localization, and canonical pathway membership. It is important to note that the GSEA analysis performed all of the statistics identifying GWAS-enriched gene sets and the key genes driving the gene set enrichment. The IPA analysis was used to organize and visualize those genes identified by the GSEA.


Results

A sample of 367 self-identified Caucasians was genotyped using the Illumina HumanCNV370-Duo DNA Analysis BeadChip genotyping array and analyzed for association to three alcohol LR phenotypes: BSA, SHAS, and SREF (first 5 drinking episodes); see Materials and Methods for details. Quantile–quantile plots of the genome wide association marker test statistics demonstrated little deviation from the null hypothesis distribution except for a slight inflation at low p-values for 2 of the 3 phenotypes (supplementary Fig. S1). This pattern indicates good control over Type I error and provides evidence of real phenotype associations (McCarthy et al., 2008). Although the number of nominally significant associations does exceed the random expectation, the modest sample size was not able to generate any single result that meets stringent genome wide significance. The top marker has a nominal association p-value of 1.4 × 10⁻⁸ and an FDR q-value of 0.11 (see Materials and Methods for analysis details).

Even in the absence of strong, individual associations, it is still possible to assess whether a functionally grouped set of genes is biased toward significant associations. The association results were tested for functional enrichment following the procedure outlined in Fig. 2. The essence of the protocol was to apply increasingly stringent filters at each step in an effort to balance specificity and sensitivity. This was achieved in the first two steps of the protocol by first loosening the statistical significance thresholds to increase sensitivity at the expense of specificity, followed by the specificity-enhancing step of only advancing those genes identified through the analysis of all three phenotypes.

Figure 2.

 Flowchart outlining the analysis protocol. Step 1: after assigning the GWAS marker p-values to the proper genes, Gene Set Enrichment Analysis (GSEA)-enriched GO terms for each of three phenotypes are identified. Step 2: the genes driving the enrichment statistic, the Leading Edge genes, are then identified for each of the enriched GO terms. Leading Edge genes identified in all three phenotypes are chosen for further analysis. Step 3: the Leading Edge genes are then categorized into functions/pathways to determine whether the Leading Edge genes implicate known biological processes.

The enrichment computations were carried out by adapting methods as outlined by Wang and colleagues (2007) using the preranked module of the GSEA software described by Subramanian and colleagues (2005) and provided by the GSEA team. GSEA, as applied to GWAS data, is a statistical method to determine whether an a priori defined set of genes, a gene set, contains a greater number of associated members than would be randomly expected. GSEA analysis requires two data inputs: a ranked gene list reflecting GWAS association p-values and gene set definitions. The GSEA algorithm then determines whether the members of a given gene set are randomly dispersed throughout the ranked list (null hypothesis) or clustered at the top of the list, indicating a correlation between being a member of a given gene set and being highly ranked (greater GWAS significance). Such a correlation confers the organizational principle of the gene set to the GWAS results.

The ranked gene list was produced based upon GWAS p-values. Genes were assigned the p-value of the most associated SNP (lowest p-value) located within the limits of the gene expanded 100 kb 5′ and 3′; the large expansion was chosen in an effort not to exclude long-range regulatory effects (Bartkuhn and Renkawitz, 2008; Kleinjan and van Heyningen, 2005). The over 300,000 SNPs on the Illumina 370-Duo thus tagged 29,974 transcripts, or 20,827 genes after transcripts annotated as open reading frames or hypothetical proteins were removed from the list. Three p-value ranked gene lists, one for each of the LR phenotypes, were evaluated.

This approach of assigning the smallest SNP p-value to a gene does produce false positives biased toward large genes. Nonassociated genes represented by many SNPs will, on average, be assigned lower p-values than nonassociated genes represented by a single or very few SNPs. Adjusting the p-value to take into account the different number of SNPs per gene will reduce the size bias in the false positive category but it would also penalize large genes that are truly associated. For example, a large gene spanning many linkage disequilibrium (LD) blocks with a true association within one of the LD blocks would have its p-value corrected to insignificance because of the large number of SNPs located within the nonassociated LD blocks of the gene. Wang and colleagues (2007) thus concluded, and we concur, that methods to adjust for this bias are overly conservative resulting in an unacceptable loss of power to detect true associations. Therefore, to preserve the ability to accurately represent the statistical significance of truly associated genes, we allowed the statistical significance of large, unassociated genes to be overestimated.
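The size bias and the cost of correcting for it can both be seen in a small calculation. The sketch assumes independent SNPs, which linkage disequilibrium violates in real data, and it uses a Bonferroni-style per-gene correction purely as an example of an adjustment; it is not the specific method evaluated by Wang and colleagues. Function names are hypothetical.

```python
def prob_min_p_below(alpha, n_snps):
    """Chance that a nonassociated gene's smallest p-value is <= alpha,
    assuming n_snps independent null SNPs (a simplifying assumption)."""
    return 1.0 - (1.0 - alpha) ** n_snps


def bonferroni_adjusted(p_min, n_snps):
    """Per-gene correction of the smallest p-value for the number of SNPs."""
    return min(1.0, p_min * n_snps)


# A nonassociated gene tagged by 1 SNP versus one tagged by 100 SNPs.
small_gene = prob_min_p_below(0.01, 1)
large_gene = prob_min_p_below(0.01, 100)

# A truly associated large gene: one strong SNP among 200 is heavily penalized.
penalized = bonferroni_adjusted(1e-4, 200)
```

Under these assumptions the nonassociated 100-SNP gene reaches p ≤ 0.01 about 63% of the time, compared with 1% for the single-SNP gene, while the correction moves a true signal of 1 × 10⁻⁴ in a 200-SNP gene to roughly 0.02.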

We executed a three-step analysis strategy to determine whether our GWAS results identify gene sets that illuminate the biological processes influencing alcohol LR (Fig. 2). In the first step, GSEA analysis was used to identify Gene Ontology (GO) (Ashburner et al., 2000) terms with genes that cluster at the top of the ranked GWAS list. Analysis was performed using all GO terms present in the Molecular Signature Database (MSigDB) (Subramanian et al., 2005). The MSigDB is a database of gene sets curated and formatted specifically for GSEA and contains an abbreviated list of GO terms; redundant terms and very small and very large gene sets have been removed. The MSigDB contains 1,454 GO terms annotating 8,299 human genes. GO terms identified as enriched at an FDR value ≤ 0.5 in any one of the three LR phenotypes, 420 GO terms in total, were promoted to the next step of the analysis.

The second step of the analysis was to identify the specific genes driving the GO term selection. In addition to the genes clustered near the top of the ranked list, enriched gene sets also contain genes from the middle and bottom of the ranked list. The genes from the top of the ranked list account for the enrichment signal. The GSEA method (Subramanian et al., 2005) identifies the genes driving the enrichment signal using the “Leading Edge” analysis module (Fig. 1). Leading Edge analysis was performed on the 420 GO gene sets enriched at an FDR value ≤ 0.5 for at least one of the LR phenotypes. In total, 3,194 Leading Edge genes were identified from the 420 GO gene sets. By making the FDR cut-off so high, we increased sensitivity by sacrificing specificity at this step of the analysis protocol. Specificity was later attained by selecting only those 173 Leading Edge genes that were independently identified in all three LR phenotypes (Fig. 1). It must be noted that while 173 genes were selected, they represent 156 independent loci. There are fewer independent loci because some of the associated SNPs, while defining a single associated locus, are in close proximity to more than a single gene; 10 of the 156 loci correspond to more than a single gene (Table 1).
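The specificity-enhancing step amounts to a set intersection across the three phenotypes. A minimal sketch with invented gene names follows; the real lists held 3,194 unique genes in total and 173 in the intersection.

```python
# Hypothetical Leading Edge gene sets, one per LR phenotype.
leading_edge_by_phenotype = {
    "SREF": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "BSA": {"GENE_A", "GENE_C", "GENE_E"},
    "SHAS": {"GENE_A", "GENE_C", "GENE_D", "GENE_F"},
}

# Sensitive step: every gene found for at least one phenotype.
all_le_genes = set.union(*leading_edge_by_phenotype.values())

# Specific step: only genes found independently for all three phenotypes.
common_le_genes = set.intersection(*leading_edge_by_phenotype.values())

print(len(all_le_genes), sorted(common_le_genes))  # 6 ['GENE_A', 'GENE_C']
```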

Table 1.   Leading Edge Genes

The 173 genes identified by Leading Edge Analysis (henceforth referred to as LE-genes) common to all three LR phenotypes were functionally grouped and explored using the commercial software IPA. IPA compared the 173 LE-genes with its human-curated and literature-supported database that places genes into functional/disease groups, canonical pathways, and gene product interactions. Importantly, 172 of the 173 LE-genes were present in the IPA database and 166 of the genes have information that enables them to be functionally categorized into pathways, networks, or functions (see Table 1). A high inclusion rate was expected, as all of the LE-genes are present in GO annotations, indicating that there are literature sources that contain functional insights. The top 10 enriched functions and pathways are presented in Fig. 3; more comprehensive data are available in supplementary Tables S1 and S2. The top enriched functions and pathways are predominantly involved in neuronal cell processes.

Figure 3.

 Functions and pathways enriched in the 173 Leading Edge genes. The bars indicate the Ingenuity Pathway Analysis (IPA) enrichment p-value for the listed function/pathway, with the scale on the left y-axis. The line indicates the number of LE-genes in the listed function/pathway, with the scale on the right y-axis. Functional enrichment is strongest for neural cell signal transmission. Glutamate signaling is the most enriched canonical pathway. Overall, the majority of enriched functions/pathways involve neural cell functions.


Discussion

We have performed a 300,000-marker GWAS on 367 Caucasian subjects characterized for their LR to alcohol, an endophenotype of AUDs. Three LR phenotypes were tested for association: BSA (Body Sway, Anterior/posterior), SHAS (Subjective High Assessment Scale), and SREF (Self-Report of the Effects of Ethanol, First 5 drinking episodes). BSA and SHAS measure the physical and cognitive effects following oral consumption of 3 drinks in about 10 minutes; SREF infers the physical and cognitive effects experienced during the subject’s first 5 experiences with alcohol based upon the subject’s memory of how many drinks it took to reach a specified level of impairment. While the data suggest that there is an excess of associated markers when compared to random expectations, there is no strong statistical evidence implicating a specific marker. In the context of recent reports of GWAS experiments looking at other genetically complex diseases, the lack of strong statistical support for an individual marker in this experiment is not unusual.

We next analyzed the data from a systems biology viewpoint. Rather than focus on individual markers/genes, we sought to determine whether there was any evidence that a functionally related set of genes, when taken together, has an effect on alcohol LR. We tested this hypothesis using gene set enrichment techniques initially developed to analyze expression array data to determine whether genes with similar expression profiles have a known functional interrelatedness. To perform this analysis, 20,827 genes were placed in a ranked list sorted from lowest to highest association p-value (analogous to ranking genes according to their expression profile in an expression array experiment). Using the GSEA method, we identified GO terms whose gene members are clustered toward the top of the ranked gene list. Using GSEA Leading Edge analysis, we then identified the specific genes that were driving the GO term enrichment signal. We thus identified 173 GSEA Leading Edge genes (LE-genes) that were commonly identified through the independent analysis of each of the three LR phenotypes. These 173 LE-genes were further explored using the software “Ingenuity Pathways Analysis” (IPA) to determine whether they fall into functional groups that shed light upon the hereditary mechanisms of alcohol LR and the development of AUDs.
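The ranking and leading-edge logic described above can be illustrated with a minimal sketch. It assumes the classical unweighted running-sum statistic rather than the weighted statistic that the GSEA software can apply, and it only looks for enrichment toward the top of the list. The gene names, p-values, and gene set are hypothetical placeholders, not data from this study, and the function name `enrichment_score` is our own.

```python
def enrichment_score(ranked_genes, gene_set):
    """Return (ES, leading_edge) for an unweighted running-sum statistic.

    ranked_genes: list of gene names, most strongly associated first.
    gene_set:     collection of gene names annotated to one term (e.g. a GO term).
    """
    members = set(gene_set) & set(ranked_genes)
    n, n_hit = len(ranked_genes), len(members)
    if n_hit == 0 or n_hit == n:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_step = 1.0 / n_hit           # increment when a set member is encountered
    miss_step = 1.0 / (n - n_hit)    # decrement when a non-member is encountered

    running, best, best_idx = 0.0, 0.0, -1
    for i, gene in enumerate(ranked_genes):
        running += hit_step if gene in members else -miss_step
        if running > best:           # track the maximum positive deviation
            best, best_idx = running, i

    # Leading edge: set members ranked at or before the peak of the running sum.
    leading_edge = [g for g in ranked_genes[: best_idx + 1] if g in members]
    return best, leading_edge


# Hypothetical association p-values (illustration only).
pvalues = {
    "GENE_A": 0.0001, "GENE_B": 0.0005, "GENE_C": 0.002, "GENE_D": 0.01,
    "GENE_E": 0.05, "GENE_F": 0.2, "GENE_G": 0.5, "GENE_H": 0.9,
}
ranked = sorted(pvalues, key=pvalues.get)        # lowest p-value first
gene_set = {"GENE_A", "GENE_B", "GENE_F"}        # hypothetical functional term

es, leading_edge = enrichment_score(ranked, gene_set)
print(es, leading_edge)
```

In this toy example the running sum peaks after the second ranked gene, so the leading edge contains the two set members at the top of the list and excludes the member ranked sixth. A full analysis would also assess significance by permutation, which this sketch omits.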

The LE-gene list is strongly enriched for cell signaling functions and pathways, with an emphasis on neuronal signaling as exemplified by the inclusion of Glutamate Receptor Signaling (Fig. 3). While this result may not be unexpected given the substantial cell and animal model evidence that neurological signaling mechanisms are involved in alcohol’s effects and related problems (Feltenstein and See, 2008; Uhl et al., 2008), it must be emphasized that the experimental model results provided no evidence that natural variation of these mechanisms in the human population contributes to human variability in alcohol response and dependence susceptibility. Our results provide evidence that natural variation in neuronal signaling functions plays a role in the variation of alcohol response in at least one human population.

The top four signaling pathways (cAMP, calcium, glutamate, and G-protein) can be integrated by using the NMDA receptor as a joining point (Fig. 4). All of these pathways have been implicated in alcohol studies. The NMDA receptor, in particular, has a long experimental history concluding that ethanol behaves as an antagonist of NMDA receptors [see Tsai et al. (1995) for an early review and Krystal et al. (2003) for a more recent review].

Figure 4.

 An Ingenuity Pathway Analysis (IPA) pathway diagram that incorporates the top four enriched signaling pathways. The glutamate receptor signaling pathway, as defined in the IPA software package, is the core of the illustration. The integration of the cAMP, calcium, and G-protein-coupled signaling pathways is illustrated. Molecules and cellular components shaded gray are members of the 173 Leading Edge gene set (LE-gene list) identified in this analysis.

Glutamate receptors in the mammalian CNS have been divided into two major groups (Collingridge and Lester, 1989; Michaelis, 1998). Ionotropic receptors provide the best evidence for the effects of alcohol on the CNS (Nevo and Hamon, 1995) and are further categorized as AMPA, kainate, or NMDA receptors based upon their agonists. Because NMDA receptors are sensitive to low levels of ethanol (5 to 50 mM) and the action of ethanol is considered to be most potent at NMDA receptors (Lovinger et al., 1989, 1990), the genes encoding them are considered ideal candidates for mediating AUDs. While kainate receptor-mediated currents are also inhibited by ethanol, higher concentrations are required for inhibition (Costa et al., 2000). NMDA receptors are assembled from an obligatory NR1 subunit and a combination of 4 other subunits (NR2a-d) and are permeable to divalent cations. Influx of calcium via the NMDA receptors is important for the induction of synaptic plasticity underlying learning, memory, and addiction (Bannerman et al., 1995; Morris et al., 1986; Ungless et al., 2001).

The LE-gene list is enriched for genes involved in glutamate signaling, including the gene encoding the NMDA subunit NR2b, along with the kainate receptor subunit genes GRIK1 and GRIK4. NR2b has been implicated in animal models of ethanol abuse and dependence (Ren et al., 2007; Yaka et al., 2003). GluR5, encoded by GRIK1, has been shown to bind topiramate, a glutamate modulator that shows robust efficacy in the treatment of alcohol disorders (Gryder and Rogawski, 2003; Kaminski et al., 2004), and has been genetically associated with alcohol dependence (Kranzler et al., 2009). The genes encoding the type III metabotropic glutamate receptors (mGLURs 4, 7, and 8) that negatively modulate NMDA receptor function (Ambrosini et al., 1995) are also present in the LE-gene list. SNPs in the mGLUR8 receptor gene have previously been demonstrated to be associated with alcohol dependence (Chen et al., 2009). Further, the genes for glutamate decarboxylase (GAD1 and GAD2), the enzyme that converts glutamate to GABA, are present in the LE-gene list. Variation in the function of these enzymes could potentially affect the availability of glutamate to activate GluRs in the synaptic cleft. Finally, the LE-genes are highly enriched (9 genes from a family of 20) for transmembrane receptor-like protein tyrosine phosphatases (PTPs) that can promote the disassembly of NMDA receptors into their constituent subunits (Ferrani-Kile and Leslie, 2005).
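To give a rough sense of how unusual it is to find 9 members of a 20-gene family among 173 LE-genes, the overlap can be compared with chance using a hypergeometric upper-tail probability. This is an illustrative sketch, not the test used in the analysis: it assumes that the 20,827 ranked genes form the background, that all 20 family members are in that background, and that the 173 LE-genes behave like a random draw under the null. The function name `hypergeom_upper_tail` is our own.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the family of interest."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total  # exact integer arithmetic, converted to float at the end


N = 20827   # assumed background: all ranked genes
K = 20      # size of the receptor-like PTP family
n = 173     # number of LE-genes
k = 9       # family members observed among the LE-genes

expected = n * K / N                      # overlap expected by chance
p_value = hypergeom_upper_tail(N, K, n, k)
print(f"expected overlap: {expected:.3f} genes; P(X >= {k}) = {p_value:.2e}")
```

Under these assumptions fewer than one family member would be expected by chance, so an overlap of 9 is far above expectation. Because the LE-genes were selected through GO term enrichment rather than at random, this figure should be read as a descriptive guide rather than a formal significance level.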

There is evidence that NMDA receptors could be a molecular focal point contributing to the relationship between a person’s LR to alcohol and increased risk for developing AUDs. A family history of AUDs is associated with a low LR to alcohol and is also a risk factor for AUD development. To explore the possibility that the relationship between alcohol LR and AUDs involves NMDA receptor function, Petrakis and colleagues (2004) treated family history positive and family history negative subjects with ketamine, an NMDA receptor antagonist that produces a variety of psychiatric effects that can be quantified. Interestingly, as is true for alcohol response, the family history positive group demonstrated an attenuated response to ketamine when compared to the family history negative group. This result supports the hypothesis that NMDA receptors are involved in a person’s response to alcohol and, importantly, that variation in response is mediated by variation in the NMDA receptor. Our GWAS/GSEA result implicating the involvement of Glutamate Receptor Signaling in alcohol LR variation strengthens this interpretation.

In conclusion, through the analysis of genome-wide association data, we have discovered evidence that variation in the genes involved in neuronal cell communication contributes to the population variation in an individual’s response to alcohol. Many of the 173 specific genes identified have reported involvement in alcohol response and addiction in cell, animal, and human experiments. Our data, when interpreted in the context of the wider published work, leave little doubt that the general functions and pathways presented here do influence alcohol response and dependence at the level of the individual and population. The evidence for specific genes, in this and other published work, is far less certain, probably reflecting the small phenotypic influence of any single allele.


This research was supported by funds provided by the State of California for medical research on alcohol and substance abuse through the University of California, San Francisco.