Three primary hypotheses currently prevail for correlations between heterozygosity at a set of molecular markers and fitness in natural populations. First, multilocus heterozygosity–fitness correlations might result from selection acting directly on the scored loci, such as at particular allozyme loci. Second, significant levels of linkage disequilibrium, as in recently bottlenecked-and-expanded populations, might cause associations between the markers and fitness loci in the local chromosomal vicinity. Third, in partially inbred populations, heterozygosity at the markers might reflect variation in the inbreeding coefficient and might associate with fitness as a result of effects of homozygosity at genome-wide distributed loci. Despite years of research, the relative importance of these hypotheses remains unclear. The screening of heterozygosity at polymorphic DNA markers offers an opportunity to resolve this issue, and relevant empirical studies have now emerged. We provide an account of the recent progress on the subject, and give suggestions on how to distinguish between the three hypotheses in future studies.
Inbreeding increases the level of homozygosity on a genome-wide basis which, in turn, might depress fitness as a result of the expression of partly recessive deleterious alleles and loss of heterozygote advantage. The occurrence of inbreeding depression is supported by numerous pedigree-based breeding experiments showing a decline in fitness-associated traits in individuals with high inbreeding coefficients (Charlesworth & Charlesworth 1987; Falconer & Mackay 1996; Lynch & Walsh 1998).
In natural populations, the immigration of individuals of unknown origin and relatedness, and the occurrence of extra-pair fertilizations, make pedigree-building an inefficient method to determine individual inbreeding coefficients. As a result, most studies that have evaluated the relationship between the inbreeding coefficient and fitness in the wild with some statistical power are either experimental (Jiménez et al. 1994), or performed on smaller isolated populations (van Noordwijk & Scharloo 1981; Gibbs & Grant 1989; Keller et al. 1994). To study the fitness effects of inbreeding in other breeding situations, researchers have relied on indirect estimates of inbreeding based on allelic data at molecular markers. The majority of such studies focus on the relationship between heterozygosity at a set of molecular markers and variation in fitness-associated traits, and several studies have shown significant multilocus heterozygosity–fitness correlations (Mitton & Grant 1984; Allendorf & Leary 1986; Ledig 1986; Houle 1989; Mitton 1993, 1997; David 1998). On the other hand, because of publication bias in favour of significant correlations, null results are likely to be under-represented in the literature. Therefore, it is difficult to assess the generality of heterozygosity–fitness correlations in natural populations (Britten 1996; David 1998).
In partially inbred populations, heterozygosity–fitness correlations might, indeed, be equivalent to inbreeding depression in its classical sense. Multilocus heterozygosity might reflect variation in the inbreeding coefficient in such populations and might associate with fitness because of the effects of homozygosity at genome-wide distributed loci. However, this hypothesis, ‘the general effect hypothesis’, is only one of three primary hypotheses that currently prevail for the explanation of heterozygosity–fitness correlations in natural populations (Ledig et al. 1983; Leary et al. 1987; Houle 1989; Mitton 1997; David 1998; Thelen & Allendorf 2001). The general effect hypothesis and the two other prevailing hypotheses, ‘the direct effect hypothesis’ and ‘the local effect hypothesis’, are summarized in Table 1. It should be noted that the direct effect hypothesis is sometimes referred to as the ‘functional overdominance hypothesis’ and that the local effect hypothesis and the general effect hypothesis are often jointly referred to as the ‘associative overdominance hypothesis’ (e.g. Zouros 1993; Pogson & Zouros 1994; David 1998).
Table 1. Primary hypotheses that currently prevail for multilocus heterozygosity–fitness correlations in natural populations
The direct effect hypothesis: heterozygote advantage as a result of functional overdominance at the scored loci per se. Potentially important in allozyme studies.
The local effect hypothesis: apparent heterozygote advantage at the markers as a result of effects of homozygosity at closely linked fitness loci. Requires linkage disequilibria (nonrandom associations of alleles at different loci in gametes) which, for example, are expected in recently bottlenecked-and-expanded populations.
The general effect hypothesis: apparent heterozygote advantage at the markers as a result of effects of homozygosity at genome-wide distributed fitness loci. Requires identity disequilibria (nonrandom associations of diploid genotypes in zygotes) which are mainly generated by partial inbreeding.
According to the local effect hypothesis and the general effect hypothesis, heterozygosity–fitness correlations result from associative overdominance, that is, apparent heterozygote advantage as a result of genetic associations between the neutral marker loci and the loci under selection (Frydenberg 1963; David 1998; Lynch & Walsh 1998). In contrast to the direct effect hypothesis, these two hypotheses need particular population structures and do not predict heterozygosity–fitness correlations in populations that have been large and panmictic for a long period of time (Houle 1989; David 1998; Bierne et al. 2000). Even though the local and general effect hypotheses have the associative overdominance mechanism in common, they are potentially important in different ecological situations, and for different genetic reasons. The local effect hypothesis requires linkage disequilibria (nonrandom associations of alleles at different loci in gametes) which are, for example, expected in recently bottlenecked-and-expanded populations (Hästbacka et al. 1992; Reich et al. 2001; Stephens et al. 2001). Because strong linkage disequilibria are mainly restricted to physically linked loci, the markers will reflect fitness effects of loci in the local chromosomal vicinity. The general effect hypothesis requires identity disequilibria (nonrandom associations of diploid genotypes in zygotes) which are mainly generated by partial inbreeding (Ledig et al. 1983; David 1998; Lynch & Walsh 1998). In such situations there will be correlations in homozygosity throughout the genome, and the markers will reflect the fitness effects of homozygosity at genome-wide distributed loci.
Despite numerous evaluations of multilocus heterozygosity–fitness correlations during the last decades (Mitton & Grant 1984; Allendorf & Leary 1986; Ledig 1986; Houle 1989; Mitton 1993, 1997; David 1998), there is still controversy over the underlying mechanisms of such correlations in natural, as well as experimental, populations (Britten 1996; Lynch & Walsh 1998; Brookfield 1999). For example, Houle (1989), Charlesworth (1991), and Lynch & Walsh (1998) adopt the view that most heterozygosity–fitness correlations could be explained by linkage disequilibrium or partial inbreeding in the studied populations, whereas Mitton (1997) suggests that the heterozygosity–fitness correlations in the majority of allozyme studies result from direct effects. The screening of heterozygosity at noncoding DNA markers offers an opportunity to resolve the issue, and relevant studies have recently emerged (Bierne et al. 1998; Coltman et al. 1998, 1999; Coulson et al. 1998; Slate et al. 2000; Hansson et al. 2001; Thelen & Allendorf 2001; Slate & Pemberton 2002). There are several reasons why the relative importance of the different hypotheses of heterozygosity–fitness correlations is of interest. For example, only in populations where the general effect hypothesis applies will the slope of a heterozygosity–fitness regression predict the fitness consequences of matings between close relatives. Here we provide an updated review of the importance of each specific hypothesis in explaining heterozygosity–fitness correlations, point out the difficulties in conclusively assigning an observed correlation to any one hypothesis (see examples in Table 2), and give suggestions on how to distinguish between the hypotheses in future studies.
Table 2. Observed multilocus heterozygosity–fitness correlations likely to be explained by any of the proposed hypotheses
Direct effect hypothesis because only one type of markers (allozymes) is informative
Local effect hypothesis because extended linkage disequilibrium is expected because of recent population bottleneck and expansion
General effect hypothesis because a large number of highly polymorphic markers make a correlation with the inbreeding coefficient likely
Local effect hypothesis because allozymes might be located in particular gene-rich chromosome regions, and therefore be in linkage disequilibrium with fitness loci
Direct effect hypothesis because microsatellites might not be selectively neutral; general effect hypothesis because there might be within-pedigree variation in level of homozygosity
Direct effect hypothesis because microsatellites might not be selectively neutral; local effect hypothesis because extended linkage disequilibrium is expected as a result of recent population bottleneck and expansion
General effect hypothesis because microsatellites are expected to correlate more strongly with the inbreeding coefficient than are allozymes
The direct effect hypothesis
Functional overdominance occurs when individuals that are heterozygous at the markers have an intrinsically higher fitness than homozygotes (Frydenberg 1963; Houle 1989; Mitton 1997; Lynch & Walsh 1998). As mentioned above, one suggested mechanism for this, in allozyme studies, is that the biochemical system will be more efficient in heterozygotes than in homozygotes, because of the combined catalytic properties of the different enzymes in heterozygotes (Mitton 1997). This scenario is supported in studies where the whole path from enzymological properties to fitness is known, such as at the phosphoglucose isomerase of Colias butterflies (Watt 1977; more examples are listed in Mitton 1997). Hence, the direct effect hypothesis is potentially important in allozyme studies (Mitton 1997), while being of lesser importance in studies using noncoding DNA markers, such as microsatellites (Pogson & Zouros 1994; Thelen & Allendorf 2001). It is, however, important to point out that, even if microsatellites are considered neutral (Queller et al. 1993; Jarne & Lagoda 1996), some particular microsatellites serve functional roles as coding or regulatory elements and are evidently under selection (Dermitzakis et al. 1998; Kashi & Soller 1999).
It was recently shown that heterozygosity at a set of allozyme loci, but not heterozygosity at an equal number of microsatellite loci, explained the variation in juvenile condition (length-adjusted weight) in an experimental population of rainbow trout (Oncorhynchus mykiss; Thelen & Allendorf 2001). A similar contrast between the effects of heterozygosity at allozymes and DNA markers has also been found in growth rate in the deep-sea scallop Placopecten magellanicus (Pogson & Zouros 1994). These studies give support to the direct effect hypothesis. This is because the levels of linkage disequilibrium are likely to be similar at the different types of markers, and because allozymes are expected to correlate more weakly with the inbreeding coefficient than do microsatellites. Furthermore, the effects of allozyme heterozygosity on the fitness traits were the result of the relatively small effects of many loci (Pogson & Zouros 1994; Thelen & Allendorf 2001), ruling out the possibility that a few, particularly informative, allozyme loci were selected by chance. But there is still a chance that the allozymes, but not the microsatellites, are linked to fitness loci. This might occur because the exceptionally high mutation rate of microsatellites increases the probability that independent mutation events will cause alleles to be identical by state, that is, homoplasy (Pogson & Zouros 1994; Thelen & Allendorf 2001). If so, individuals that are homozygous at the microsatellite loci may not be homozygous by descent and would therefore be less likely to reflect homozygosity at linked loci. Moreover, there might be systematic differences in where different types of markers are located on the chromosome in relation to fitness loci (Thelen & Allendorf 2001). In the human genome, there are gene-rich and gene-poor regions (Thelen & Allendorf 2001).
If allozyme loci tend to be in gene-rich regions, these loci would be more likely to be in linkage disequilibrium with other loci affecting fitness than would microsatellites. In this case further gene mapping might reveal that associations with very closely located genes cause seemingly direct effects. For example, in the fire ant (Solenopsis invicta), strong selection against homozygotes at the allozyme locus Pgm-3 turned out to be an effect of disequilibrium to the linked Gp-9 locus (Ross 1992; Keller & Ross 1998). For these reasons, associative overdominance caused by linkage disequilibrium cannot be ruled out in the studies by Pogson & Zouros (1994) and Thelen & Allendorf (2001), or in many other multilocus allozyme studies conducted in natural populations (Mitton & Grant 1984; Ledig 1986; Houle 1989; Mitton 1993, 1997; Britten 1996; David 1998).
The local effect hypothesis
In natural populations, significant levels of linkage disequilibrium (also referred to as gametic phase disequilibrium) are expected after certain demographic processes, such as founder events or bottlenecks followed by rapid population expansions, and intermixing of genetically different populations (Ohta 1982; Hästbacka et al. 1992; Reich et al. 2001; Stephens et al. 2001). Linkage disequilibria can also arise as a result of genetic drift in small populations (Houle 1989) and as a result of selection (Rieseberg et al. 1996; Ford-Lloyd et al. 2001). At high levels of linkage disequilibrium, markers will be associated with linked fitness loci, and are hypothesized to show associative overdominance as a result of the fitness consequences of recessive deleterious alleles or overdominance at those linked loci (Frydenberg 1963; Houle 1989; David 1998; Lynch & Walsh 1998).
Hitherto, the common view has been that linkage disequilibria normally extend over only a few kilobases (kb), because recombination drives the population towards equilibrium (Kruglyak 1999; Dunning et al. 2000). As commonly illustrated by a two-locus model with specific recombination rates (Falconer & Mackay 1996; Lynch & Walsh 1998), the decay of linkage disequilibrium is rapid between loci on different chromosomes (unlinked loci) and between loci located far from each other on the same chromosome, but takes a longer time between more closely linked loci (Fig. 1). Within the context of the local effect hypothesis, the effect of decaying disequilibrium is better exemplified by plotting the portion of the chromosome that shows the same state of homozygosity or heterozygosity as the marker locus at a certain time after a specific demographic event (Fig. 1).
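The decay described above can be illustrated with the standard textbook two-locus recursion, D_t = D_0 (1 − r)^t, where r is the recombination fraction between the two loci and t is the number of generations of random mating. The following is a minimal sketch under that model only; the function names and the example recombination fractions are illustrative assumptions and are not taken from the studies cited.

```python
import math


def ld_after(d0, r, generations):
    """Linkage disequilibrium remaining after `generations` of random mating.

    Standard two-locus recursion: D_t = D_0 * (1 - r) ** t, where r is the
    recombination fraction between the loci (r = 0.5 for unlinked loci).
    """
    return d0 * (1.0 - r) ** generations


def generations_to_fraction(r, fraction):
    """Generations needed for D to decline to `fraction` of its initial value."""
    return math.log(fraction) / math.log(1.0 - r)


if __name__ == "__main__":
    # Recombination fractions below are assumed values chosen for illustration.
    examples = [("unlinked", 0.5), ("loosely linked", 0.1), ("tightly linked", 0.001)]
    for label, r in examples:
        remaining = ld_after(1.0, r, 15)        # proportion of D left after 15 generations
        half_life = generations_to_fraction(r, 0.5)  # generations until D is halved
        print(label, remaining, half_life)
```

Under this model the disequilibrium between unlinked loci is halved every generation, whereas for r = 0.001 more than 98% of the initial disequilibrium is still present after 15 generations. This is why, shortly after a bottleneck, markers are expected to reflect mainly loci in their local chromosomal vicinity.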
Importantly, recent empirical studies challenge the view that low levels of linkage disequilibrium should be the norm in natural populations. In humans, high and biologically significant levels of linkage disequilibrium were found in recently bottlenecked-and-expanded populations (Hästbacka et al. 1992; Goldstein 2001; Reich et al. 2001; Stephens et al. 2001). In these populations the linkage disequilibria extended, on average, about 60 kb, but substantial variation existed and in several chromosome regions associations between loci were found at the longest distance measured, that is, 160 kb (Reich et al. 2001). Hence, high levels of linkage disequilibrium are also expected in other recently bottlenecked-and-expanded populations. Such populations will, for example, be found in areas glaciated during the last ice age, that is, in a considerable number of species living in the northern part of the northern hemisphere. Moreover, very recent expansions of previously bottlenecked populations will be found among populations colonizing areas affected by modern man, in introduced species, in populations that have successfully undergone enhancement programmes, and in metapopulations with high local turnover rates. In general, the linkage disequilibrium will be stronger where a bottleneck has been more severe and recent.
Potential support for the local effect hypothesis comes from data on great reed warblers (Acrocephalus arundinaceus) living in a recently founded and expanded metapopulation in Sweden (Hansson et al. 2001; see Table 2). Here, microsatellite heterozygosity was shown to explain the probability of survival among full siblings, that is, individuals of equal inbreeding coefficient. The demographic history of the species in the area, with fewer than 15 generations elapsed since the founder event in the early 1960s (Hansson et al. 2000), makes the occurrence of high levels of linkage disequilibrium, and the effects thereof, very probable (Hansson et al. 2001). The fitness effects were, however, especially strong at two of the loci in the study (Hansson et al. 2001). Even though this is also expected through interlocus variation in the rate by which linkage disequilibrium decays (Goldstein 2001; Stephens et al. 2001; see also variation between replicates in Fig. 1), and even though most microsatellites are considered selectively neutral (Queller et al. 1993; Jarne & Lagoda 1996), functional overdominance effects at these two particular loci cannot be conclusively ruled out (Kashi & Soller 1999). Moreover, at high levels of linkage disequilibrium, loci do not segregate independently along the pedigree, which means that the genome is fragmented into a finite number of chunks of chromosome (Bierne et al. 2000). As a result of random segregation of chromosomes at meiosis, and depending on the number of chunks (which is unknown for most species and populations), there will be variation in the genome-wide level of homozygosity also among individuals of the same inbreeding coefficient (Bierne et al. 2000). Thus, heterozygosity–fitness associations that are detected within pedigrees might still be explained by the general effect hypothesis.
Heterozygosity–fitness correlations within sibling groups have also been found in a few experimental populations, for example, in an allozyme study of rainbow trout (Leary et al. 1987), and in a microsatellite study of the flat oyster Ostrea edulis (Bierne et al. 1998).
The general effect hypothesis
In partially inbred populations, showing significant identity disequilibria, individuals have different inbreeding coefficients, and thus differ in degree of heterozygosity. In such populations, heterozygosity at a set of markers makes it possible to distinguish between (genome-wide homozygous) inbred individuals and (genome-wide heterozygous) outbred individuals, and heterozygosity–fitness correlations might reflect the variation in heterozygosity at genome-wide distributed fitness loci as predicted by the general effect hypothesis (Frydenberg 1963; Ledig et al. 1983; Lynch 1988; David 1998; Lynch & Walsh 1998; Bierne et al. 2000). The general effect hypothesis relies on associative overdominance at genome-wide distributed loci, and can therefore be seen as a test of inbreeding depression in its classical sense.
As a result of variation in identity by descent within and between loci, and variation in identity by state between alleles, the correlation between marker heterozygosity and the inbreeding coefficient is expected to be weak in studies using few markers (Chakraborty 1981; Lynch 1988; Lynch & Walsh 1998). Hence, very large sample sizes are needed in such studies to confirm the general effect hypothesis. This line of argument suggests that heterozygosity–fitness correlations in studies using few markers — for example, as in most allozyme studies — are the result of either direct or local effects.
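The dependence of this correlation on the number of markers can be explored with a small simulation. The sketch below is a deliberately simplified model and not the method of any study cited here: it assumes that each locus is heterozygous independently with probability h0 × (1 − F), where h0 is the expected heterozygosity under random mating and F is the individual inbreeding coefficient. It ignores linkage and homoplasy, and the distribution of F, the value of h0 and all function names are assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_f_mlh_correlation(n_loci, n_individuals=2000, h0=0.7, seed=1):
    """Correlation between inbreeding coefficient F and multilocus heterozygosity.

    Assumed model: 80% of individuals are outbred (F = 0), 10% have F = 0.125
    and 10% have F = 0.25; each of `n_loci` markers is heterozygous
    independently with probability h0 * (1 - F).
    """
    rng = random.Random(seed)
    f_classes = [0.0, 0.125, 0.25]
    f_weights = [0.8, 0.1, 0.1]
    f_values, mlh_values = [], []
    for _ in range(n_individuals):
        f = rng.choices(f_classes, weights=f_weights)[0]
        p_het = h0 * (1.0 - f)
        het_count = sum(rng.random() < p_het for _ in range(n_loci))
        f_values.append(f)
        mlh_values.append(het_count / n_loci)  # proportion of heterozygous loci
    return pearson(f_values, mlh_values)


if __name__ == "__main__":
    for n_loci in (5, 20, 100):
        print(n_loci, simulate_f_mlh_correlation(n_loci))
```

Under these assumptions the correlation is negative and becomes markedly stronger as the number of loci increases, because the sampling noise in multilocus heterozygosity shrinks while the signal from variation in F stays the same. With less variation in F than assumed here, the correlation would be weaker for any given number of markers.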
Support for the general effect hypothesis comes from those studies using a high number of markers (Strauss 1986; Brouwer & Osborn 1997; Coltman et al. 1999; Slate & Pemberton 2002), and from studies of partially selfing populations where a large variation in the inbreeding coefficient is expected (Strauss 1986; Brouwer & Osborn 1997; Weeks et al. 1999). For example, in the study by Slate & Pemberton (2002) on a Scottish red deer (Cervus elaphus) population, heterozygosity at as many as 71 polymorphic microsatellites was positively associated with birth weight of calves (Table 2). In this study it is very likely that marker heterozygosity correlated with variation in the inbreeding coefficient. However, this was not confirmed — probably because detailed pedigrees are difficult to obtain in this as well as in other populations. Moreover, this isolated population was founded by introducing a few animals from at least four different founder populations starting in 1845, and has expanded to a current population size of about 1500 individuals (Pemberton et al. 1988). Since the first founder event, only a few generations have elapsed (about 20 generations assuming a generation time of 7 years, Coulson et al. 1998), suggesting substantial levels of linkage disequilibrium in this population. Hence, the local effect hypothesis might apply to findings of heterozygosity–fitness correlations in this population (Pemberton et al. 1988; Coulson et al. 1998; Slate et al. 2000).
From the fact that a single generation of outcrossing will destroy the associations found under partial inbreeding, Charlesworth (1991) suggested that findings of variation in the strength of the heterozygosity–fitness correlation between populations (Ledig et al. 1983), and between years in the same population (Gaffney 1990), are likely to reflect the effects of variation in identity disequilibrium. The significance of this argument has to be confirmed, because such relationships can be readily explained by spatial and temporal variation in processes generating linkage disequilibrium, as well as in selection pressures.
Few studies give conclusive evidence for any of the three primary hypotheses that currently prevail to explain multilocus heterozygosity–fitness correlations (Table 2). There is therefore a pressing need to design future studies that would evaluate their relative importance in natural populations.
A promising way to proceed is to use data from those populations where detailed pedigrees can be constructed, and examine the strength of the correlation between the inbreeding coefficient and marker heterozygosity, on the one hand, and between these two parameters and fitness on the other. For example, it might be possible to show, as in a study of standardbred horses (Cothran et al. 1986), that variation in marker heterozygosity, but not in inbreeding coefficient, correlates with the trait under study (which is counter to the general effect hypothesis). Another promising approach is to compare the heterozygosity–fitness correlations of different types of markers, specifically allozymes vs. microsatellites, as in the studies by Pogson & Zouros (1994) and Thelen & Allendorf (2001).
It would also be highly informative to perform heterozygosity–fitness studies among full siblings, or within other groups of individuals of equal inbreeding coefficient, and, in this way, exclude effects that are connected to variation in the inbreeding coefficient (Leary et al. 1987; Bierne et al. 1998; Hansson et al. 2001). To disentangle whether possible heterozygosity–fitness correlations in such data sets result from functional overdominance or from effects related to linkage disequilibrium (local effects because of closely linked genes or general effects because of within-pedigree variation in level of homozygosity), such studies could be accompanied by additional temporal analyses. More specifically, a decline in the strength of a positive heterozygosity–fitness correlation over time is predicted in recently bottlenecked-and-expanded populations, as recombination causes linkage disequilibrium to decay. The absence of a temporal decline in the strength of the heterozygosity–fitness correlation does not, however, exclusively support the direct effect hypothesis, because linkage disequilibrium might be recreated (for unknown reasons) in the population.
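One simple way to carry out the comparison among full siblings described above is to express each individual's heterozygosity and fitness as a deviation from its sibship mean before correlating the two, so that differences between families (including differences in the inbreeding coefficient) are removed. The sketch below illustrates this idea only; it is not the analysis used in the cited studies, the function names are hypothetical, and the data are invented purely to show the calculation.

```python
from collections import defaultdict


def center_within_groups(values, groups):
    """Subtract each group's mean from the values of its members."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [value - totals[group] / counts[group]
            for value, group in zip(values, groups)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def within_family_correlation(heterozygosity, fitness, family):
    """Correlate heterozygosity and fitness after removing family means."""
    h_dev = center_within_groups(heterozygosity, family)
    w_dev = center_within_groups(fitness, family)
    return pearson(h_dev, w_dev)


if __name__ == "__main__":
    # Invented example: two sibships (A, B) of three individuals each.
    family = ["A", "A", "A", "B", "B", "B"]
    heterozygosity = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    fitness = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
    print(pearson(heterozygosity, fitness))
    print(within_family_correlation(heterozygosity, fitness, family))
```

In this invented data set the relationship within each sibship is perfectly linear, so the within-family correlation is stronger than the raw correlation, which is diluted by the difference in mean heterozygosity between the two families. In a real analysis the loss of degrees of freedom from estimating one mean per sibship would need to be accounted for when testing significance.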
Furthermore, it is important to report null results as well because the different hypotheses sometimes give different predictions on when heterozygosity–fitness correlations are expected in relation to marker- and population-specific patterns.
The adaptive distance model by Smouse (1986) was designed to distinguish the direct effect hypothesis from the two other hypotheses. Within a locus, a positive correlation between phenotypic expression at the different homozygotes and the respective allele frequencies was hypothesized to favour functional overdominance, and hence give support for the direct effect hypothesis. It was also hoped that the hypotheses could be distinguished in analyses between loci, by assessing the degree of correlation between the average trait difference between heterozygotes and homozygotes and an index of allele frequency evenness (Smouse 1986; Zouros 1993). Unfortunately, these correlations are also expected under associative overdominance, as shown by the work of Zouros (1993) and Houle (1994).
To finally resolve the issue, it might be necessary to perform fitness studies on genetic model species with well-characterized genomes and a large number of available polymorphic markers. For example, studies on such species will potentially reveal whether allozymes are located in gene-rich chromosome regions, and hence are more likely than other markers to be in linkage disequilibrium with fitness loci. The knowledge gained would be of crucial importance when evaluating heterozygosity–fitness correlations in population studies performed with different types of markers (Pogson & Zouros 1994; Thelen & Allendorf 2001). Moreover, a large number of markers will make it possible to quantify the extent of linkage disequilibrium in populations with different demographic histories (Reich et al. 2001; Stephens et al. 2001). This in turn will facilitate the evaluation of effects of closely linked fitness loci and make it possible to estimate the expected variation in homozygosity within pedigrees. Another advantage of the availability of many markers is that the variation in identity disequilibrium could be measured empirically. Also, information from quantitative trait loci analyses of these model species can be used to choose a large set of markers of which single loci are not associated with the trait under study in the population. Such an approach will make an evaluation of the general effect hypothesis possible.
Overall, much empirical work remains to be done before a full picture emerges on why and when significant heterozygosity–fitness correlations are expected in natural populations. It is our belief that this will happen only by taking into consideration marker-, population-, and species-specific patterns and processes.
We thank David Richardson, Håkan Wittzell and an anonymous referee for comments on the manuscript, and Steve Ridgill for linguistic corrections.
The authors are doctoral candidates with a common interest in population ecology and conservation biology.