Candidate gene studies in the 21st century: meta-analysis, mediation, moderation


M. R. Munafò*

Department of Experimental Psychology, University of Bristol, Bristol, UK

*Corresponding author: M. R. Munafò, Department of Experimental Psychology, University of Bristol, Bristol, BS8 1TN, UK. E-mail:


The results of a large body of candidate gene studies of behavioural and psychiatric phenotypes have been largely inconclusive, with most findings failing to replicate reliably. A variety of approaches that augment the ‘traditional’ candidate gene approach are discussed, including the use of meta-analysis to combine findings from existing published reports, the investigation of mediating variables (including the use of intermediate phenotypes or endophenotypes) and awareness of possible moderating influences (such as sex or ethnicity) and gene–environment interactions on genetic associations, possibly operating via epigenetic mechanisms. Advances in genotyping technology will also allow the routine use of haplotype analysis and linkage disequilibrium mapping. Examples are given of how these approaches may improve our understanding of how genetic associations with behavioural and psychiatric phenotypes arise.

A quick search through PubMed should be enough to convince anyone that the yearly rate of published candidate gene studies is increasing exponentially. In the field of psychiatric genetics alone, the rate is currently about one paper a day (Munafò & Flint 2004). It is also abundantly clear that the volume of reports is no index of the reliability of the results: each convincing association can be paired with an equally convincing rebuttal, followed in turn by another positive finding, ad infinitum. A survey of 600 positive associations between common gene variants and disease phenotypes showed that most reported associations are not robust: of 166 associations studied three or more times, only six were consistently replicated (Hirschhorn et al. 2002). The results of several large and well-funded studies of major psychiatric disorders have been disappointing and inconsistent (Hamer 2002).

Moreover, candidate gene studies have typically focused on the ‘usual suspects’: studies of specific candidates, such as the serotonin transporter and the dopamine receptor D2 subtype, are ubiquitous in studies of behavioural and psychiatric phenotypes. While much good research is being done, it is not clear to what extent the increase in quantity of knowledge has been matched by an increase in the quality of that knowledge. Indeed, the increased volume brings with it unique difficulties: it is extremely difficult to gauge whether or not a genetic association is in fact genuine based on a cursory appraisal of the extant literature and, if it is, whether or not it obtains in all subpopulations (e.g. in both men and women, across ethnic groups, across age groups and so on).

Even if one can be reasonably confident that a genetic association is genuine, what this tells us is debatable. The direct effect of genetic variation is limited and represents only the first step in a chain of events that may ultimately lead to the behavioural phenotype that is of interest to psychologists, psychiatrists and the like. For this reason, journal editors increasingly only accept association studies of candidate genes for which there is robust evidence of a functional effect of the polymorphism under investigation. Understanding the mediating mechanisms that subserve simple genetic associations, and the role of moderating factors such as environmental events, takes more than simply the establishment of association.

Given these difficulties, what future directions will the study of candidate genes and their relationship to behavioural and psychiatric phenotypes take? A variety of possible approaches exist, including meta-analysis of existing individual candidate gene studies to obtain a more definitive conclusion regarding the role of a specific gene in relation to a specific phenotype, the investigation of possible mediator variables (both neurobiological and behavioural) in an attempt to elucidate the mechanisms that subserve any genetic association and the inclusion of putative moderator variables to clarify which subpopulations a particular genetic association obtains in.


Meta-analysis

A perennial difficulty of candidate gene studies is that they are underpowered to detect effects that, when one is considering behavioural and psychiatric phenotypes, are likely to be small. Small effect sizes consequently require extremely large sample sizes, or the employment of alternative study designs such as extreme-score designs (which may not help that much, given that a large sample will often need to be recruited to select extreme scorers on the dimension of interest). While one laboratory may not be able to obtain the numbers required to power a definitive candidate gene study adequately, the combined world literature may (Munafò & Flint 2004). Meta-analysis goes some way to providing a tool to analyse the data from several studies jointly. The term was coined by Glass (1976), although the basic elements of meta-analytic techniques can be traced back to Fisher (1925), and refers to the synthesis of disparate datasets to ascertain a summary conclusion derived from this global corpus of data. It is a quantitative approach to combining systematically the results of previous research to arrive at a conclusion about a body of evidence.
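As a rough illustration of how results from several studies can be analysed jointly, the sketch below pools odds ratios by inverse-variance (fixed-effect) weighting. This is one common approach rather than the method of any study cited here, and the odds ratios, standard errors and the function name `pool_fixed_effect` are hypothetical.

```python
import math

def pool_fixed_effect(log_ors, std_errs):
    """Inverse-variance (fixed-effect) pooled log odds ratio and its SE."""
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total
    return pooled, math.sqrt(1.0 / total)

# Hypothetical per-study odds ratios and standard errors of log(OR).
odds_ratios = [1.30, 0.95, 1.15, 1.10]
std_errs = [0.25, 0.20, 0.15, 0.10]

pooled, se = pool_fixed_effect([math.log(o) for o in odds_ratios], std_errs)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled OR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

Because each study is weighted by the inverse of its variance, the pooled standard error is smaller than that of any single study, which is the sense in which the combined literature may offer power that one laboratory cannot.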

Usually, meta-analysis relies on the assessment of individual effects, with subgroup analyses being used to investigate different levels of potential moderator variables. An alternative approach is to perform a multivariate meta-analysis, in the form of a meta-regression, with the inclusion of covariates within this framework. This approach allows the moderating effect of a covariate, such as sex, to be tested formally (e.g. Munafò et al. 2004a). In contrast to simple meta-analysis, meta-regression aims to relate effect size to one or more characteristics of the studies included (Thompson & Higgins 2002), and the potential utility of this approach has resulted in a marked increase in the use of meta-regression in meta-analytic reviews. In particular, this approach may reveal previously uninvestigated moderators of any association, which may then be investigated formally in a study explicitly designed for this purpose.
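To make the idea of relating effect size to a study characteristic concrete, the following sketch fits a fixed-effect meta-regression with a single covariate by weighted least squares. It assumes known within-study variances and no residual heterogeneity; the data (log odds ratios and the proportion of female participants per study) and the function name `meta_regression` are hypothetical.

```python
import math

def meta_regression(effects, std_errs, covariate):
    """Weighted least-squares fit of effect size on one study-level covariate."""
    w = [1.0 / se ** 2 for se in std_errs]
    sw = sum(w)
    # Weighted means of the covariate and of the effect sizes.
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)  # valid under the fixed-effect assumption
    return intercept, slope, se_slope

# Hypothetical studies: log(OR), its standard error, proportion female.
log_ors = [0.05, 0.12, 0.20, 0.31]
ses = [0.15, 0.10, 0.12, 0.20]
prop_female = [0.10, 0.35, 0.60, 0.90]

intercept, slope, se_slope = meta_regression(log_ors, ses, prop_female)
print(f"slope = {slope:.3f}, z = {slope / se_slope:.2f}")
```

A slope that differs reliably from zero would suggest that the covariate moderates the association, although with few studies such a test has little power and is best treated as hypothesis-generating.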

Meta-analysis is a potentially powerful tool for assessing population-wide effects of candidate genes on behavioural and psychiatric phenotypes and may provide evidence for previously unexpected diversity, for example by revealing heterogeneity in studies of apparently similar populations (Ioannidis et al. 2001; Ioannidis et al. 2003). But the results are only as good as the data that go into them, and the reporting of these data may constrain the extent to which a meta-analysis can be informative, or even performed at all. In the case of genetic association studies, this can often mean, for example, being constrained to investigate two genotype groups only, as smaller studies frequently combine rare homozygote and heterozygote groups to increase statistical power. Meta-analysis is therefore not a replacement for adequately powered genetic association studies (Munafò & Flint 2004).
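Between-study heterogeneity of the kind mentioned above is often summarized with Cochran's Q and the I² statistic. The sketch below shows one conventional way of computing them; the effect sizes, standard errors and the function name `heterogeneity` are hypothetical.

```python
def heterogeneity(effects, std_errs):
    """Cochran's Q, its degrees of freedom, and I-squared (as a percentage)."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributable to heterogeneity.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared

# Hypothetical log odds ratios and standard errors from five studies.
q, df, i2 = heterogeneity([0.40, -0.10, 0.25, 0.05, 0.60],
                          [0.15, 0.12, 0.20, 0.10, 0.25])
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.0f}%")
```

Q is referred to a chi-squared distribution with k − 1 degrees of freedom for k studies; a large I² indicates that the pooled estimate may be averaging over genuinely different effects.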

Several recent meta-analyses illustrate the potential utility of meta-analytic techniques in identifying genes of small effect associated with behavioural and psychiatric phenotypes. In the ADHD genetics literature, several recent meta-analyses have provided evidence for associations with dopaminergic candidate genes such as the DRD4 (Faraone et al. 2001; Maher et al. 2002) and DRD5 (Lowe et al. 2004; Maher et al. 2002) genes. Similarly, in the schizophrenia genetics literature, there has been a recent proliferation of meta-analyses, which has gone some considerable way to resolving the confusion regarding the evidence for the role of specific candidate genes, such as the COMT (Glatt et al. 2003), DRD2 (Jonsson et al. 2003a) and DRD3 (Jonsson et al. 2003b) genes. While it would be premature to consider these results definitive, they certainly offer weightier support for the involvement of each of these individual genes in the aetiology of the corresponding phenotype than any individual study.

For a full review of the application of meta-analytic techniques to genetic association studies, see Munafò and Flint (2004).


Mediation

In general, a given variable may be said to function as a mediator to the extent that it accounts for the relationship between the predictor and the outcome. Mediation can be said to occur when (i) the predictor (i.e. genotype) significantly affects the proposed mediator (i.e. intermediate phenotype), (ii) the predictor significantly affects the outcome (i.e. phenotype) in the absence of the mediator, (iii) the mediator has a significant unique effect on the outcome and (iv) the effect of the predictor on the outcome is reduced on the addition of the mediator to the model. These criteria can be used to judge informally whether or not mediation is occurring (Baron & Kenny 1986), but MacKinnon and Dwyer (1993) have described methods by which mediation may be formally assessed.
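One widely used formal test of an indirect (mediated) effect is the first-order Sobel test, sketched below. It is offered only as a common example of a formal test, not necessarily the procedure described by the authors cited above. Here `a` is the regression coefficient of the mediator on genotype and `b` the coefficient of the outcome on the mediator with genotype in the model; the numerical values and the function name `sobel_test` are hypothetical.

```python
import math
from statistics import NormalDist

def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test of the indirect effect a * b."""
    indirect = a * b
    # First-order (delta-method) standard error of the product a * b.
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p value
    return indirect, z, p

# Hypothetical path coefficients and standard errors.
indirect, z, p = sobel_test(a=0.50, se_a=0.10, b=0.40, se_b=0.10)
print(f"indirect effect = {indirect:.2f}, z = {z:.2f}, p = {p:.4f}")
```

The test assumes that the product of the two coefficients is approximately normally distributed, which may not hold in small samples; resampling approaches are a frequently used alternative.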

Although tests of possible mediation should be regarded as hypothesis-generating, they can be useful in unravelling the complex pathways that will inevitably determine the association between a specific genetic variant and a behavioural phenotype. The approach is particularly salient in candidate gene studies, given the likelihood that a single gene will affect a range of complex behavioural and psychiatric phenotypes, via a number of pathways. Such genetic pleiotropy makes it difficult to determine from simple studies of association whether the variance in phenotype accounted for by variation in the genotype of interest may in fact be entirely accounted for by an intermediate phenotype.

This potential problem may be compounded by the focus on a limited number of candidate genes. For example, the serotonin transporter gene has variously been investigated in relation to anxiety-related personality traits (Lesch et al. 1996), clinical depression (Hauser et al. 2003), smoking behaviour (Munafò et al. 2004b), alcohol consumption (Hammoumi et al. 1999) and so on. Because several of these behaviours and traits are themselves correlated, one might expect that any association between, say, the serotonin transporter gene and smoking behaviour may be entirely mediated by anxiety-related personality traits, given the well-established association between elevated trait anxiety and likelihood of being a smoker. A recent study of the possible mediation of the association between serotonin transporter genotype and nicotine dependence by trait neuroticism (Munafò et al. 2005) in fact found that the association was not mediated by this trait, which has important theoretical and clinical implications.

The investigation of possible mediating influences on genetic associations requires the definition of intermediate phenotypes of theoretical interest, which may be behavioural or neurobiological in nature. Recently, the term ‘endophenotype’ has been used to describe intermediate phenotypes that are more biologically proximal to the ultimate phenotype of interest. Testing mediational hypotheses between genes, biochemical or physiological measures and behaviours may significantly enhance our understanding of how such associations operate (Stoltenberg et al. 2002).

Examples of putative endophenotype measures exist for genetic susceptibility to, for example, schizophrenia and ADHD. Impaired response inhibition has been proposed as a cognitive endophenotype for ADHD (Slaats-Willemse et al. 2003), while measures of deficits in sustained attention and visual performance have received considerable attention in the schizophrenia genetics literature. This field is perhaps leading the way in the development and implementation of endophenotype measures, which have included cognitive (e.g. Chen & Faraone 2000; Cornblatt & Malhotra 2001), psychophysiological (e.g. Anokhin et al. 2003; Myles-Worsley et al. 2004) and psychophysical (e.g. Keri et al. 2004) measures. In addition to their potential to afford greater statistical power to detect genetic effects of small size, these measures may serve to improve our understanding of the aetiology of these disorders and, potentially, help in reshaping the classical nosological systems and diagnostic categories. Gottesman and Gould (2003) review the criteria for considering potential endophenotypes for use in behavioural and psychiatric genetics.


Moderation

In statistical terms, a moderator effect can be represented as an interaction between a major independent variable and a factor that specifies the appropriate conditions for its operation. That is, the effect of the major independent variable depends on the value of the moderator variable. In the context of candidate gene research, this equates to the existence of variables that define subpopulations in only one of which a specific genetic association may exist. The existence of specific subpopulations in which certain genetic associations do or do not obtain is particularly troublesome, as it removes the possibility of a relatively straightforward neurobiological explanation of the association. For example, several recent candidate gene studies have reported a moderating effect of sex on the observed association (e.g. Yudkin et al. 2004). Another likely moderator is ethnicity, although in this case the issue is further complicated by the possibility of different allele frequencies for a certain gene within an ethnically defined subpopulation.
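A simple way to examine such an interaction in a case-control design is to estimate the genotype odds ratio separately in each stratum of the moderator and test whether the two log odds ratios differ. The sketch below does this for two strata (for instance, men and women); the counts and function names are hypothetical, and the normal approximation assumes reasonably large cell counts.

```python
import math
from statistics import NormalDist

def log_odds_ratio(carrier_cases, carrier_controls,
                   noncarrier_cases, noncarrier_controls):
    """Log odds ratio and its standard error from a 2 x 2 table of counts."""
    cells = (carrier_cases, carrier_controls,
             noncarrier_cases, noncarrier_controls)
    log_or = math.log((carrier_cases * noncarrier_controls)
                      / (carrier_controls * noncarrier_cases))
    se = math.sqrt(sum(1.0 / n for n in cells))
    return log_or, se

def interaction_test(stratum_a, stratum_b):
    """z test of the difference between two stratum-specific log odds ratios."""
    log_or_a, se_a = log_odds_ratio(*stratum_a)
    log_or_b, se_b = log_odds_ratio(*stratum_b)
    z = (log_or_a - log_or_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical counts: (carrier cases, carrier controls,
#                       non-carrier cases, non-carrier controls).
men = (120, 80, 100, 140)
women = (95, 100, 110, 115)

z, p = interaction_test(men, women)
print(f"genotype x sex interaction: z = {z:.2f}, p = {p:.3f}")
```

A significant association in one stratum and a non-significant one in the other is not, by itself, evidence of moderation; it is the difference between the two estimates that the interaction test addresses.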

What the presence of such moderator variables highlights is that genetic associations are unlikely to be straightforward and will depend in large part on the environmental, social and cultural context within which they operate. To take a hypothetical example, if one were to find an association between a specific candidate gene and the likelihood of being, say, a basketball player, that association could only possibly exist in subpopulations where basketball was played as a sport. A less trivial example is that of smoking behaviour: as smoking prevalence in developed countries has declined over the past 50 years (Jarvis 2003), this has resulted in there being more variability in who is and is not a smoker. This increased variance needs to be accounted for, and genetic variability will inevitably account for some. Therefore, genetic associations may now be observed that would not have been observed 50 years ago (assuming the technology had been available then).

What the increasing number of reports of genetic associations that describe moderating influences such as these emphasize is the complexity of the causal pathways that will explain such associations. Moderator variables may be serving as proxy measures for systematic differences (for example, between men and women) in the environment in which genes are having their effect (for example, due to different social acceptability of particular behaviours). This merely serves to illustrate once more the likely complexity of the mechanisms under investigation.

Future directions

In addition to using the techniques and methodologies described above to ascertain with greater certainty the role of individual genes, to define with more clarity the phenotypes of interest and to design studies with an awareness of the possible existence of distinct subpopulations, future studies will increasingly move away from the study of individual loci of questionable functional significance.

It is now increasingly difficult to publish an association study that relies on a single marker, with the study of multiple loci and corresponding haplotype analyses becoming the gold standard. This is increasingly becoming feasible, as genotyping technologies afford the possibility of assaying multiple single nucleotide polymorphisms (SNPs) at far lower costs than would have been possible even a few years ago. This trend towards low-cost, high-throughput genotyping is likely to continue. Recently, attention has focused on the use of whole-genome linkage disequilibrium (LD) studies to map common genes. Such studies would employ a dense map of SNPs to detect association between a marker and phenotype (Kruglyak 1999). Construction of SNP maps is currently underway, for example as part of the HapMap Project.
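Linkage disequilibrium between two biallelic markers is conventionally summarized by D, D′ and r². The sketch below computes these from a haplotype frequency and the two allele frequencies, using the standard textbook definitions; the frequencies and the function name `linkage_disequilibrium` are hypothetical, and both loci are assumed to be polymorphic.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return D, D' and r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly in 0..1)
    """
    d = p_ab - p_a * p_b
    # D is bounded by the allele frequencies; D' rescales it to 0..1.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared

# Hypothetical frequencies for a marker SNP and a nearby functional variant.
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.30, p_a=0.40, p_b=0.50)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r2:.3f}")
```

Of the three, r² is the most directly relevant to association mapping, because the sample size needed to detect an effect through a marker rather than the causal variant scales approximately with 1/r².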

Relatedly, functional studies are becoming increasingly important. While a number of candidate gene polymorphisms that have been extensively studied in the recent past do show evidence for a functional effect, a large number of SNPs that have been investigated lie some distance (often several kb) from the candidate gene of interest and may provide only extremely limited information about that specific gene. Indeed, a commonly studied SNP in the DRD2 gene (the Taq1A polymorphism), which has been associated with a range of neuropsychiatric disorders, has recently been shown to lie within a novel and previously undescribed kinase gene (Neville et al. 2004). This case is particularly intriguing, as the Taq1A polymorphism (32806C>T) has been associated with reduced dopamine D2 receptor density (Thompson et al. 1997). Clearly, functional data will be of central importance if the causal pathway from candidate gene to behavioural phenotype is to be fully understood.

A final area that is likely to become increasingly important is that of epigenetics, which relates to the stable and heritable (or potentially heritable) changes in gene expression that do not entail a change in DNA sequence (Jiang et al. 2004). In recent years, there has been rapid progress in understanding epigenetic mechanisms, which include differences in DNA methylation, as well as differences in chromatin structure. Such mechanisms may be of increasing importance in understanding gene–environment interactions. For example, cigarette smoking and alcohol consumption are now widely accepted to be environmental factors that influence DNA methylation (Poschl et al. 2004; Ventorin von Zeidler et al. 2004).


Several recent studies illustrate the potential power of these approaches to enhance molecular genetic research into behavioural and psychiatric phenotypes. As has been noted, the serotonin transporter gene has received considerable attention, not least because of the known functional effects of variation at this locus and the well-established involvement of serotonin in a range of behaviours. In broad terms, the phenotypes that have attracted the most interest in respect of the serotonin transporter gene have been those associated with anxiety-related behaviours and traits, given the known importance of serotonergic neurotransmission in clinical anxiety and depression.

Since the first report of an association between the short allele of the serotonin transporter gene and anxiety-related personality traits (Lesch et al. 1996), the path of subsequent research has followed what has now become a familiar one of replication and non-replication, to a more-or-less even degree. By combining data from all published (and some unpublished) relevant studies, Munafò et al. (2003) reported that the evidence for the association remained modest and that if it did exist it was likely to be small (as one might expect). Despite over 20 published studies being brought together in a single analysis, the effect remained marginal, and a power analysis suggested that a sample of several thousand would be required to determine definitively whether or not the effect was genuine. Moreover, discrepancies in study design (in terms of measure of phenotype employed, sex and ethnicity distribution of samples, etc.) meant that there was considerable between-study heterogeneity. While providing a useful starting point and suggesting that the serotonin transporter gene may at least be worthy of further investigation, such an approach in itself is insufficient for determining the veracity of reports of an association with anxiety-related behaviours and can certainly shed no light on the mechanisms by which such an association might operate.
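The dependence of required sample size on effect size can be illustrated with a standard approximation based on the Fisher z transformation of a correlation. This is a sketch only and is not the power analysis reported in the study above; the proportions of variance explained, the conventional defaults (two-sided alpha of 0.05, power of 0.80) and the function name `required_n` are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def required_n(variance_explained, alpha=0.05, power=0.80):
    """Approximate sample size to detect a correlation of a given size.

    variance_explained: proportion of phenotypic variance accounted for
    by genotype (the squared correlation), strictly between 0 and 1.
    """
    r = math.sqrt(variance_explained)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    # Fisher z approximation: n = ((z_alpha + z_power) / atanh(r))^2 + 3
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)

# Hypothetical effect sizes, from very small to large.
for share in (0.001, 0.01, 0.20):
    print(f"{share:.1%} of variance explained: n = {required_n(share)}")
```

Under these assumptions, the required sample grows roughly in inverse proportion to the variance explained, so a tenfold smaller effect needs about ten times as many participants.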

Fortunately, other approaches exist, and these have been employed in respect of the same gene and related phenotypes. Hariri et al. (2002) examined the influence of the serotonin transporter gene on the response of the amygdala to fearful stimuli. This approach was predicated on the assumption that a biologically proximal endophenotype such as amygdala response would provide a stronger genetic signal than self-report measures such as personality questionnaires. As predicted, subjects with the short allele of the serotonin transporter gene demonstrated a greater amygdala response to fearful stimuli than those homozygous for the long allele. Importantly, the effect size was an order of magnitude greater than that found in association studies of the serotonin transporter gene with self-reported personality, with genotype accounting for 20% of the variance in amygdala response. This is what one would presumably expect if one regards the brain as the principal mediator in any genetic association with behaviour (Hamer 2002), although some caution is warranted in the blind acceptance of this assumption. But what can such studies tell us about how such neurobiologically mediated genetic associations arise in the first place? Why does the amygdala of an individual with one genotype respond differently to that of another with a different genotype?

Some progress towards an answer can be provided by longitudinal data that enable the moderating influence of genotype on environmental events to be investigated in a methodologically sound manner. Caspi et al. (2003) used data from a large, representative birth cohort, followed longitudinally for 26 years, to investigate the interaction of serotonin transporter genotype and early life stress on the subsequent development of depression and depressive symptoms. Individuals with one or two copies of the short allele exhibited more depressive symptoms and diagnosable depression in relation to stressful life events than individuals homozygous for the long allele. The conclusion reached was that an individual's response to environmental insults is moderated by genotype.

Although these studies provide a glimpse of how various methodologies can be employed and, most importantly, integrated in the study of complex behavioural traits, they remain at the somewhat primitive stage of investigating single genes (Hamer 2002). Inevitably, such traits will be under polygenic influence, which will require the investigation of several genes simultaneously. If multiple candidates are to be investigated simultaneously, this will require explicitly designed studies that are adequately powered to detect effect sizes that are likely to be modest. In addition, candidate genes that influence neurotransmitter biosynthesis systems (e.g. tryptophan hydroxylase, monoamine oxidase, serotonin transporter, in the case of serotonin neurotransmission) are conceptually distinct from those that influence receptor systems (e.g. serotonin receptor subtypes) so that it may be desirable to investigate multiple candidate genes grouped in this way. Relatedly, the study of single markers in any given gene will increasingly be supplanted by haplotype analysis of multiple markers. Only then will we begin to make progress towards a fuller understanding of the interplay of genetic and environmental influences on the mechanisms subserving complex human behaviours.


Acknowledgements

This research was supported by Cancer Research UK. There are no conflicts of interest to declare.