INVITED REVIEW: Quantitative trait locus mapping in natural populations: progress, caveats and future directions

Authors

  • JON SLATE

    Corresponding author
    1. Department of Animal and Plant Sciences, University of Sheffield, S10 2TN
      J. Slate. Tel.: 44 0114 2220048; Fax: 44 0114 2220002; E-mail: j.slate@sheffield.ac.uk

Abstract

Over the last 15 years quantitative trait locus (QTL) mapping has become a popular method for understanding the genetic basis of continuous variation in a variety of systems. For example, the technique is now an integral tool in medical genetics, livestock production, plant breeding and population genetics of model organisms. Ten years ago, it was suggested that the method could be used to understand continuous variation in natural populations. In this review I: (i) clarify what is meant by natural population in the QTL context, (ii) discuss whether evolutionary biologists have successfully mapped QTL in natural populations, (iii) highlight some of the questions that have been addressed by QTL mapping in natural populations, (iv) describe how QTL mapping can be conducted in unmanipulated natural populations, (v) highlight some of the limitations of QTL mapping and (vi) try to predict some future directions for QTL mapping in natural populations.

Introduction

Evolutionary geneticists have a long-standing interest in phenotypic variation between individuals, either of the same or of different species. Understanding this variation can inform us how species adapt to their environment, and how this adaptation can lead to speciation. For example, to what extent are individual differences due to environmental effects, genetic effects or both? Before and after the synthesis of population genetics during the early 20th century the nature of genetic variation of continuous traits was keenly, sometimes acrimoniously, debated (Provine 1971; Barton & Keightley 2002). Questions that have fascinated evolutionary geneticists over the last century include: ‘How many genes determine quantitative genetic variation?’, ‘Are major genes (those that individually explain a large proportion of phenotypic variance) common?’, ‘Do individual genes explain variation in several traits (pleiotropy)?’, ‘Are the same genes responsible for trait variation across populations, and even across species?’, ‘Is gene action dependent on the environment?’, ‘What are the actual genes responsible for phenotypic variation, adaptation and speciation?’, and perhaps the most challenging of all, ‘What are the evolutionary forces that maintain genetic variation?’ (Barton & Turelli 1989). In the late 1980s, the advent of molecular markers and genetic maps meant it was possible to map the genes that explained continuous variation. While most mapping studies were focused on humans and agriculturally important organisms, evolutionary biologists were quick to initiate quantitative trait locus (QTL) studies, especially in classical model organisms such as Drosophila melanogaster (Shrimpton & Robertson 1988; Mackay 1995).

In this review I will describe how QTL mapping can be used to elucidate the genetic architecture of continuous traits in natural populations. There are already a number of excellent reviews of QTL discovery in Drosophila (Mackay 1995, 2001, 2004), domestic animals (Haley 1995; Haley 1999; Andersson 2001; Andersson & Georges 2004) and plants (Kearsey & Farquhar 1998; Mauricio 2001), as well as others with a broader taxonomic focus (Orr 2001; Barton & Keightley 2002). However, it is almost a decade since the only review of mapping in natural populations (Mitchell-Olds 1995) was published. At that time the major conclusion was that we were largely ignorant of the molecular quantitative genetics of natural populations, and Mitchell-Olds’ review can be largely regarded as a call-to-arms. The purpose of this review is to consider whether evolutionary biologists have been successful in carrying out QTL mapping in natural populations. I deliberately highlight some recent studies that have proved particularly useful in addressing some of the questions posed in the opening paragraph. I also discuss, at some depth, how QTL mapping can be performed in unmanipulated natural populations, as opposed to experimental crosses.

QTL mapping: historical background and underlying principles

I provide only a brief historical and methodological overview of QTL mapping as detailed descriptions are available elsewhere (Liu 1997; Lynch & Walsh 1998; Ott 1999; Weller 2001). The acronym QTL was first coined by Geldermann (1975). However, the underlying concept is older, having originated in the early 1900s (reviewed in Lynch & Walsh 1998). The basic premise that underlies all QTL mapping methods is straightforward. Genetic markers dispersed over an organism’s genome are typed within a mapping population of individuals for whom phenotype data are available. If a marker is in close physical linkage with a QTL, the two will be in linkage disequilibrium within the mapping population, generating a statistically significant association between the marker genotype and trait variation. An excellent overview of experimental designs is given by Lynch & Walsh (1998), who make the distinction between inbred line crosses and outbred (intrapopulation) designs. The latter category can be separated into crosses deliberately created to maximize power to detect QTL (e.g. by creating large families of half-sibs) and those that are conducted on unmanipulated populations. Below, I briefly describe the different types of mapping population.
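The marker-trait association described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a method taken from any of the cited studies: it assumes a backcross with two genotype classes, a single QTL at recombination fraction `r` from the marker, and normally distributed environmental noise. All function names and parameter values are hypothetical.

```python
import random
import statistics

def simulate_backcross(n, effect, r, noise_sd=1.0, seed=1):
    """Simulate n backcross progeny (all values are illustrative assumptions).

    Each individual has marker genotype 0 or 1; its QTL genotype differs
    from the marker genotype with probability r (a recombination event).
    """
    rng = random.Random(seed)
    markers, phenotypes = [], []
    for _ in range(n):
        marker = rng.randint(0, 1)
        # Recombination between marker and QTL flips the QTL genotype.
        qtl = marker if rng.random() >= r else 1 - marker
        markers.append(marker)
        phenotypes.append(effect * qtl + rng.gauss(0.0, noise_sd))
    return markers, phenotypes

def marker_t_statistic(markers, phenotypes):
    """Welch t statistic comparing trait means of the two marker classes."""
    g0 = [p for m, p in zip(markers, phenotypes) if m == 0]
    g1 = [p for m, p in zip(markers, phenotypes) if m == 1]
    se = (statistics.variance(g0) / len(g0)
          + statistics.variance(g1) / len(g1)) ** 0.5
    return (statistics.mean(g1) - statistics.mean(g0)) / se

def expected_contrast(effect, r):
    """Expected difference between marker-class means: effect * (1 - 2r)."""
    return effect * (1.0 - 2.0 * r)
```

Under these assumptions a tightly linked marker (small `r`) retains most of the QTL effect in the contrast between marker classes, whereas an unlinked marker (`r = 0.5`) is expected to show no difference at all.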

Inbred line crosses

The simplest and most efficient way to detect QTL is by using inbred line crosses. When two inbred parental lines, populations or species are crossed, the progeny will exhibit a fixed difference between every marker and trait locus. Thus, all linked loci in the progeny (F1 generation) are in linkage disequilibrium. These F1 progeny are in turn used to create a mapping population. The two most commonly employed mapping populations are the F2 design (whereby F1 individuals are interbred) and the backcross design (whereby F1 individuals are mated to one of the parental populations). The major advantage of the F2 design over the backcross (hereafter BC) is that three genotypes are present at every marker (and QTL) in the mapping population. BC populations only have two possible genotypes at a locus. Thus, the F2 design enables estimation of dominance components of a QTL but the backcross design does not.
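The reason three genotype classes are needed to estimate dominance can be shown with a short calculation. The sketch below assumes the standard single-locus parameterization in which the two homozygotes have means m − a and m + a and the heterozygote has mean m + d. The genotype labels, function names and numbers are hypothetical.

```python
def f2_effects(mean_aa, mean_ab, mean_bb):
    """Additive (a) and dominance (d) effects from the three F2 genotype means."""
    additive = (mean_bb - mean_aa) / 2.0
    dominance = mean_ab - (mean_aa + mean_bb) / 2.0
    return additive, dominance

def backcross_contrast(mean_aa, mean_ab):
    """Only contrast available in a backcross to the AA parent.

    The progeny are either AA or AB, so the difference between the two
    class means equals a + d; the two effects cannot be separated.
    """
    return mean_ab - mean_aa

# Illustrative genotype means (not data from any study).
a, d = f2_effects(10.0, 13.0, 14.0)
bc = backcross_contrast(10.0, 13.0)
```

With these illustrative means the F2 yields a = 2 and d = 1 separately, while the backcross yields only their sum, 3.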

Outbred populations

The use of inbred line crosses to detect QTL in natural populations is unlikely to provide an accurate description of the genetic architecture of the focal trait in the parental lines in their natural environment (see section: What is meant by natural population?). Furthermore, the creation of specific mapping populations may be logistically impossible in some species. Thus, it is sometimes desirable to conduct mapping experiments on outbred populations. This is a difficult, although possible, task. An excellent discussion of the major difficulties, and of how they can be overcome, is provided by Lynch & Walsh (1998). One of the major aims of this review is to illustrate how methods and software commonly employed in domestic livestock and human mapping programs can be of great value to ecologists and evolutionary biologists. Broadly speaking, outbred populations can be mapped using either sibships (half-sibs or full-sibs) or general pedigrees spanning several generations. The distinction between these experimental designs is discussed later in the review.

Regardless of experimental design, three basic requirements must be met to map QTL. The first is a genetic map of variable markers. The second is a pedigree with which to follow the segregation of those markers. The third is phenotypic data (trait measurements) on the individual members of that pedigree. Strategies to obtain all three are discussed (see section: How to map QTL in unmanipulated populations).

What is meant by natural population?

For the purposes of this review, I regard a natural population as one in which the individuals used in a mapping study are descended from recently sampled individuals of a nondomesticated organism. This definition excludes model organisms that have been reared in the laboratory for many generations. I also exclude humans from this review, although I accept that they are regarded as natural populations by some biologists. The definition of natural population I use can be further categorized as those in which the mapping population is created by the experimenter (e.g. by crossing two individuals from different populations/ecotypes/species) and those where an unmanipulated pedigree is used. A further distinction I make between mapping populations is between those in which the population is reared in a laboratory/glasshouse environment and those in which phenotype values are measured in the wild. It is important to make the distinction between the different types of mapping population, because the preferred experimental design very much depends on the question that is being addressed (Fig. 1).

Figure 1.

QTL mapping in natural populations: strategies and examples. A mapping experiment can be conducted in either interspecies or intraspecies populations. Further distinctions between experiments can be made depending on whether they are conducted in the natural environment or in laboratory/glasshouse conditions, and on whether they are conducted in specially created mapping crosses or in unmanipulated pedigrees. The preferred experimental approach depends on the question being addressed, although there is overlap between strategies. For example, an experiment conducted in an interspecies cross in a laboratory would be appropriate to investigate speciation genetics but inappropriate to learn about microevolutionary change in either parental population. 1. Verhoeven et al. (2004); 2. Slate et al. (2002b); 3. Lexer et al. (2003); 4. Gockel et al. (2002), Calboli et al. (2003); 5. Ungerer & Rieseberg (2003); 6. Bradshaw et al. (1995, 1998); 7. Peichel et al. (2001), Colosimo et al. (2004), Shapiro et al. (2004); 8. Hawthorne & Via (2001).

If one is interested in the number and distribution of effect sizes of QTL involved in speciation or reproductive isolation, then it is usually necessary to create a mapping population from two parental species, races or populations. In theory, such an analysis could be performed in an unmanipulated population in a hybrid zone (if one exists), but understanding the genetic basis of species differences is nearly always more tractable in specially created mapping crosses. Similarly, when trying to establish whether the magnitude and direction of QTL effects are constant across environments, it is usually desirable to create replicates of a mapping population that can be studied in heterogeneous environments (either in the laboratory or the natural environment). Again, this type of study requires a specially created mapping population. However, if one is interested in studying the genetic architecture of continuous traits of a particular population, then the use of an artificially created mapping population that is reared in the laboratory presents some difficulties. First, any QTL that are detected represent fixed differences between the parental lines, rather than the standing variation segregating within the parental lines. Second, environmental variance tends to be greater in the wild relative to a carefully controlled laboratory environment, so that heritability (and therefore the proportion of trait variance explained by a QTL) is likely to be lower in the wild than in the laboratory (Hoffmann 2000). Only by measuring QTL magnitude in an ecologically realistic setting can the distribution (and intensity of selection) of allelic effects be estimated.
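The second difficulty can be made concrete with a simple variance ratio. The sketch below assumes that phenotypic variance is just the sum of total genetic variance and environmental variance (no genotype-by-environment interaction or covariance), and the variance values are invented for illustration.

```python
def variance_explained(v_qtl, v_genetic, v_environment):
    """Proportion of phenotypic variance explained by one QTL.

    v_qtl is the genetic variance contributed by the QTL and is part of
    v_genetic; phenotypic variance is assumed to be v_genetic + v_environment.
    """
    return v_qtl / (v_genetic + v_environment)

def heritability(v_genetic, v_environment):
    """Heritability under the same additive partition of variance."""
    return v_genetic / (v_genetic + v_environment)

# Same genetic variances, two hypothetical environments.
lab = variance_explained(1.0, 2.0, 2.0)    # controlled laboratory
wild = variance_explained(1.0, 2.0, 8.0)   # noisier natural environment
```

In this illustration the same QTL accounts for a quarter of the phenotypic variance in the laboratory but only a tenth in the wild, purely because the denominator has grown.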

Although I make the distinction between the different categories of natural population used in QTL mapping I do not intend to present some as being ‘better’ than others. This review does pay particular attention to unmanipulated pedigrees, primarily because this is an area in which relatively little progress has been made.

How common are QTL mapping experiments in natural populations?

I conducted a Web of Knowledge (http://wok.mimas.ac.uk/) search on the keywords ‘QTL’ combined with either ‘natural’ or ‘wild’. By the end of 2003 there were 416 published articles that used these terms in the title, keywords or abstract, which suggests that QTL mapping experiments in natural populations are now commonplace. The number of articles published per annum has risen every year since 1992 and shows no sign of levelling off (Fig. 2). At the time Mitchell-Olds’ 1995 review appeared, just 19 articles had been published. Thus, at face value, the last decade has witnessed a dramatic uptake of QTL mapping to investigate the genetic architecture of continuous traits in natural populations. However, only 15 articles contain these search terms in the title. The vast majority of these studies have been conducted in inbred line crosses, sometimes using parental lines that have been removed from the wild for many generations. The next most common type of study is that in which mapping populations are recent descendants of wild-sampled individuals. A small subset of these mapping populations has been reared in the natural environment. In contrast, I am aware of only one study where QTL have been mapped in an unmanipulated wild population (Slate et al. 2002b). In other words, QTL mapping in natural populations has taken off, but some experimental designs have been largely ignored. The imbalance between studies of experimental and free-living populations presumably reflects the fact that all three requirements for QTL mapping have been, until recently, unavailable for most free-living natural populations. A goal of this review is to encourage more biologists to initiate QTL studies in unmanipulated populations in the natural environment.

Figure 2.

Publications on QTL mapping in natural populations since 1990. The number of publications using the terms ‘QTL’ and either ‘natural’ or ‘wild’ was obtained using Web of Knowledge. Shaded bars are yearly totals and open bars are cumulative totals. Note that the number of publications per annum has risen since the early 1990s.

Recent progress

The second aim of this review is to describe how QTL mapping in natural populations can be used to address some of the questions posed in the opening paragraph. Most of the examples I discuss have been published in the last three years. Older studies have been reviewed elsewhere, or tend to focus on the number and magnitude of QTL. Although the vast majority of studies have been conducted in specially created mapping populations, an obvious way in which these studies can be distinguished from each other is on the grounds of whether parental lines are from the same or from different species.

Inter-species crosses and the genetics of reproductive isolation

‘Historically, however, the most contentious question has concerned whether major genes play a major role in species differences. It is now clear that the answer is yes, sometimes’ (Orr 2001).

One area in which QTL mapping has been particularly illuminating is in understanding the genetic architecture of reproductive isolation. Orr (2001) highlighted some of the main conclusions from interspecies crosses (notably, that some species differences can be explained by major genes, and that the number of QTL underlying differences is highly variable) as well as some of the caveats from this type of study (distinguishing between factors that arose before and after isolation is not straightforward; the importance of comparing the magnitude of QTL effects to phenotypic variation within parental lines). Since Orr's review, the number of QTL studies in natural populations has doubled. The overall conclusions reached by Orr remain unchanged. However, a handful of recent studies have advanced our understanding of the genetics of reproductive isolation, addressing questions beyond the somewhat equivocal one of ‘Are species differences attributable to major effect genes?’. Here, I focus on three systems that have proved particularly informative.

Threespine sticklebacks.  At the end of the last ice age, approximately 15 000 years ago, marine threespine sticklebacks (Gasterosteus aculeatus) colonized new freshwater lakes, and rapidly adapted to a diverse array of new environments. Sympatric speciation has occurred in at least six lakes in British Columbia, Canada. Within these lakes, sympatric species have adapted to different ecological niches: benthic forms are invertebrate feeders found close to the shore and, relative to their marine ancestor, they have reduced body armour, increased body depth and a decreased number of gill rakers used to filter food. Limnetic forms more closely resemble marine forms, with a streamlined body, extensive armour and a large number of gill rakers. Benthic and limnetic forms are reproductively isolated, although fertile F1 crosses can be generated in an aquarium. In a series of beautiful experiments, David Kingsley (Stanford University), Dolph Schluter (University of British Columbia) and colleagues have dissected the genetic architecture of the traits that cause reproductive isolation, many of which have arisen by independent parallel evolution in a number of different populations. The key to this work was the construction of a medium density microsatellite map (Peichel et al. 2001) containing 227 loci at a mean intermarker interval of 4 cM (centimorgans). To achieve this notable feat, the authors sequenced a staggering 3560 clones and identified over 1000 loci before designing primers to amplify the most promising 428 loci. A decade ago, an exercise on this scale would only have been performed in humans (Weissenbach et al. 1992) or model organisms (Dietrich et al. 1994). Microsatellites are the marker of choice for this system because they are variable in different populations, eliminating the need for a new marker set for alternative populations. The mapping population was derived from a benthic female and a limnetic male captured at Priest Lake, British Columbia. An F1 male was then backcrossed to a benthic female to produce a mapping population of 92 progeny. The progeny were measured for several gill raker and body armour traits, and interval mapping revealed major QTL for every trait, each explaining between 17% and 37% of the phenotypic variance observed in the progeny. Of course, in a relatively modest mapping population (n = 92), any QTL that is detected would inevitably explain a large proportion of phenotypic variance to reach statistical significance (see section: Caveats). However, it is the subsequent experiments that reveal the beauty of the stickleback system.
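The point about modest sample sizes can be quantified roughly. The sketch below relies on the regression-based approximation LOD ≈ (n/2) log10(1/(1 − R²)), where R² is the proportion of phenotypic variance explained at the test position, and on a LOD threshold of 3. Both are assumptions made for illustration, not values reported by the stickleback studies.

```python
def min_detectable_r2(n, lod_threshold=3.0):
    """Smallest proportion of variance a QTL must explain to reach the threshold.

    Inverts LOD = (n / 2) * log10(1 / (1 - R^2)), the relationship that holds
    for regression-based interval mapping. The default threshold is an
    illustrative assumption.
    """
    return 1.0 - 10.0 ** (-2.0 * lod_threshold / n)

small_cross = min_detectable_r2(92)    # backcross of 92 progeny
large_cross = min_detectable_r2(360)   # F2 of 360 progeny
```

Under these assumptions a QTL in a population of 92 must explain roughly 14% of the phenotypic variance to be declared significant, compared with about 4% in a population of 360, so small crosses can only report QTL of apparently large effect.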

The observation that genetic variation for a quantitative trait can be explained by several loci, some of which are of relatively large effect, is hardly surprising or even novel. More challenging questions that arise from a mapping experiment of this kind include: (1) ‘Do QTL of large effect actually represent the effects of more than one linked gene?’; (2) ‘Are the same QTL segregating in different populations?’, and (3) ‘What are the actual genes responsible for phenotypic variation in a mapping population?’. Colosimo et al. (2004) created a new stickleback mapping population, using a marine female (from Japan) and a benthic male (from a second British Columbia lake, Lake Paxton) to investigate the genetics of armour plate reduction (benthic fish have reduced armour plates relative to limnetic and marine species). For this paper, an F2 design was used (enabling the measurement of dominance effects at QTL), and the mapping population was relatively large (n = 360). Four lateral plate number QTL were identified, one of which, on linkage group 4, explained 75% of the phenotypic variance. A second cross was then created from benthic and limnetic fish at another geographical location (Lake Frient, California). Three of the QTL (including the major one) were segregating in this second mapping population. The authors also conducted an elegant ‘complementation’ cross to test whether the same gene was responsible for low plate number in benthic fish from the Paxton and Frient populations. At the major QTL, the high plate number allele is dominant over the low plate number allele (i.e. heterozygotes have high plate number). Therefore, F1 fish from a Frient benthic × Paxton benthic cross can only have a low plate number if the same gene is responsible for the QTL in each population. Eighty-four progeny were examined and all had low plate number. Thus, the authors convincingly showed that a single locus can cause a major shift in phenotype, and that the same locus (but not necessarily the same mutation or even the same gene) can explain cases of parallel evolution. It is also notable that one of the other lateral plate number QTL that was identified in both the Frient and Paxton populations was also segregating in the original mapping population from Priest Lake (Peichel et al. 2001). Indeed, the colocation of QTL in different mapping populations appears to be a fairly regular occurrence in the stickleback system, although I am unaware of this being formally tested.

Remarkably, the same Japanese marine fish × Paxton benthic fish mapping population has been used to illustrate a second case of a (different) major gene appearing to cause the parallel evolution of a dramatic morphological change (Shapiro et al. 2004). Here the authors examined the genetic basis of pelvic reduction (the complete loss of pelvic spines and structures) that is observed in some benthic stickleback populations. Again, the phenotypic variance within the F2 was largely explained by one major QTL and a handful of minor QTL. The authors then identified candidate genes from the extensive literature on hindlimb developmental genetics in model systems. One candidate, Pitx1, causes a similar phenotype in mice, most notably asymmetry in pelvic limb reduction, a trait that is also characteristic of benthic sticklebacks. Pitx1 was mapped in sticklebacks to exactly the same location (on linkage group 7) as the major QTL, confirming it as a strong candidate. However, when Pitx1 was sequenced in marine and benthic fish, no amino acid-changing nucleotide differences were observed. The authors next compared gene expression of Pitx1 in developing larvae of marine and benthic fish. In benthic fish, Pitx1 was not expressed in pelvic tissues but expression was normal elsewhere. Marine fish showed normal expression in pelvic tissue. Thus, the QTL is likely to be explained by an unknown regulatory mutation within the noncoding part of Pitx1. Complementation tests indicate that a similar (or the same) mutation is responsible for the loss of pelvic structures in an Icelandic population. In summary, studies of sticklebacks have shown that parallel evolution may frequently be caused by independent mutations of the same gene. Several important lessons can be learned from the stickleback experiments. First, QTL mapping experiments that can be conducted in replicate populations offer considerable power for understanding evolutionary change. Second, candidate gene analyses may help identify the loci responsible for quantitative variation. Finally, QTL are not necessarily determined by mutations that alter the amino acid sequence of a gene. It should, however, be noted that the detection of QTL responsible for evolutionary change is not the same as identifying the genes and mutations that underlie the QTL.

Pea aphids.  Another notable example of QTL mapping helping to elucidate the mechanisms of reproductive isolation comes from pea aphids, Acyrthosiphon pisum pisum (Hawthorne & Via 2001). Pea aphids have two races that are specialized to alfalfa and red clover hosts. The two races are reproductively isolated because mating only occurs on the host plant. Hawthorne & Via (2001) tested the prediction that genetic correlations between mate choice and resource use had promoted speciation. An F2 population was created, and the genetic correlations between fecundity on each host (a surrogate for resource use) and host acceptance (a surrogate for mate choice) were measured by traditional quantitative genetics methods. Positive genetic correlations were observed between resource use and mate choice, while negative genetic correlations were observed between acceptance of one host and fecundity on the other. These genetic correlations are in exactly the direction required to promote reproductive isolation. It might then be asked, why perform a time-consuming and relatively expensive mapping exercise when quantitative genetics can adequately describe important features of the genetic architecture of the focal traits? However, quantitative genetics cannot pinpoint the loci responsible for trait variation, so it cannot reveal how the genetic correlations are maintained. More specifically, are correlations caused by tight linkage or pleiotropy of the QTL, in which case they may be the cause of reproductive isolation? Alternatively, are they explained by gametic disequilibrium between unlinked loci? If the latter is true, these ephemeral associations must be maintained by selection or population structure and are likely to have arisen post reproductive isolation. By typing 194 F2 progeny at 173 AFLP (amplified fragment length polymorphism) loci, Hawthorne & Via (2001) detected colocalized QTL for host acceptance and fecundity, indicating that the former scenario (linkage and/or pleiotropy) was acting in pea aphids and may be an evolutionary force that commonly drives reproductive isolation.
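Why associations between unlinked loci are described as ephemeral can be seen from the standard expectation for the decay of gametic disequilibrium, D_t = D_0 (1 − r)^t, where r is the recombination fraction between the loci. The sketch below assumes random mating and no selection, drift or population structure. The starting value and recombination fractions are illustrative and are not estimates from the pea aphid study.

```python
import math

def ld_after(d0, r, generations):
    """Expected disequilibrium after t generations: D_t = D_0 * (1 - r)^t."""
    return d0 * (1.0 - r) ** generations

def generations_to_fraction(r, fraction):
    """Generations needed for disequilibrium to fall to a fraction of D_0."""
    return math.ceil(math.log(fraction) / math.log(1.0 - r))

unlinked = generations_to_fraction(0.5, 0.01)   # loci on different chromosomes
linked = generations_to_fraction(0.01, 0.01)    # tightly linked loci
```

Under these assumptions an association between unlinked loci falls to 1% of its initial value within 7 generations, whereas loci 1% recombination apart need several hundred generations, which is why a persistent correlation between unlinked loci implies ongoing selection or population structure.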

Monkey flowers.  Perhaps the best known QTL mapping studies of reproductive isolation have been conducted in the monkey flower species Mimulus lewisii and Mimulus cardinalis. M. lewisii has pink flowers with a shape and high nectar concentration that are attractive to its bumblebee pollinator. M. cardinalis has yellow flowers with a tubular corolla and a high nectar volume preferred by its hummingbird pollinator. The two species are found in sympatry in the Sierra Nevada Mountains of California but hybrids are absent (making analysis of unmanipulated populations impossible). Bradshaw and colleagues created an F2 mapping population and a RAPD marker map which they used to identify QTL for eight floral traits. Every trait had a QTL that explained at least 25% of the F2 phenotypic variance (Bradshaw et al. 1995). A follow-up study used a larger mapping population (n = 465) to investigate 12 traits and confirmed the findings of the earlier study (Bradshaw et al. 1998). The largest effect QTL (named the yup locus) explained ∼80% of F2 carotenoid pigment variation, a trait that largely controls flower colour. Recently, the authors have conducted a series of elegant backcrosses and field experiments to examine the effects of an allelic substitution at the yup locus in an otherwise uniform background (Bradshaw & Schemske 2003). M. cardinalis plants with an introgressed M. lewisii yup allele had dark pink rather than yellow flowers and were visited by bees 74 times more frequently than the wild type. Similarly, the reciprocal cross resulted in yellow-orange rather than pink flowers and a 68-fold increase in hummingbird visits. Thus, a single gene substitution (or, more accurately, a single locus substitution, as close linkage of several genes cannot be excluded) can alter phenotype sufficiently to cause a dramatic shift in pollinator preference. However, this adaptive change may have arisen after reproductive isolation, i.e. by reinforcement, and it should not yet be regarded as a ‘speciation gene’. If the causative mutation is eventually identified, its age can be estimated by population genetic methods to determine whether it arose before or after reproductive isolation. Of course, identification of a mutation responsible for a QTL is not trivial in a nonmodel organism, unless a good candidate gene has been isolated in other model species.

Sunflowers.  One example where performing QTL analysis in the natural environment has helped understand the genetics of speciation is within the sunflower genus Helianthus. Helianthus paradoxus is a salt-tolerant homoploid species derived from Helianthus annuus and Helianthus petiolaris. Neither parental species is tolerant of the saline marshes to which the hybrid species is adapted. Lexer et al. (2003) created a backcross mapping population from the parental species and transplanted seedlings to the habitat in which H. paradoxus is found. A number of QTL for mineral ion uptake were identified as well as three survivorship QTL. Co-localization of QTL suggests that survivorship is associated with increased calcium ion uptake and exclusion of sodium ions. Furthermore, QTL alleles associated with survivorship were derived from both parents, indicating that phenotypes that are more extreme than those observed in either parent species are possible in hybrids. Selection coefficients at survivorship QTL were large, suggesting that the homoploid species could have become established rapidly, even in the presence of gene flow from the parent species. Although H. paradoxus is found in the wild, this experiment could only have been conducted by creating a mapping population from the parental lines and transplanting it to the wild. Because of the strong selection acting at the survivorship QTL, they would not be segregating in adapted H. paradoxus populations. Furthermore, in a replicate glasshouse population, only four of the 14 QTL found in the natural environment were detected. Thus, conducting the experiment with an unmanipulated population or in the glasshouse may have yielded fundamentally different results.

Results from intraspecies crosses

QTL mapping is also beginning to illuminate our understanding of the architecture of intraspecific genetic variation. Here I will only focus on studies that have been conducted on mapping populations directly generated from wild-caught or wild-sampled individuals. Mirroring the case of interspecies crosses, a handful of very recent studies have proved particularly illuminating, addressing issues beyond the magnitude of QTL effects. Linda Partridge and colleagues (University College London) have examined the question of whether the same genes are involved in adaptation in different populations, by studying the genetic basis of variation in body size in Drosophila melanogaster, which shows clinal variation in both Australia and South America. Gockel et al. (2002) derived an Australian mapping population from low latitude (small bodied) and high latitude (large bodied) wild-caught flies. Composite interval mapping using 41 microsatellite loci revealed QTL on chromosome 3 and the distal end of chromosome 2. A similar experiment conducted on flies sampled from a South American cline revealed almost identical results (Calboli et al. 2003), indicating that the same loci may have been involved in adaptation to climatic variation on different continents. However, it should be noted that Drosophila have only three major chromosomes and one minor chromosome, and so the probability of observing QTL in similar locations by chance is greater than for species with a larger number of linkage groups. Furthermore, there is increasing evidence that Drosophila QTL are actually explained by multiple linked QTL (Mackay 2004). There is no a priori reason to assume this phenomenon is unique to Drosophila.

A detailed understanding of quantitative genetic variation must also address whether QTL effects are dependent on both the environment and the genetic backgrounds in which they are segregating. Verhoeven et al. (2004) have recently described an extensive QTL mapping study of fitness-related traits in natural populations of the wild barley, Hordeum spontaneum. Crosses were made between plants adapted to coastal and steppe environments, and the fitness of the progeny was measured in both parental environments as well as in a common-garden experiment involving high and low nutrient conditions. For most fitness traits, between 1 and 5 QTL were identified and they explained between 9% and 76% of the interline phenotypic variance. This pattern, an exponential distribution of QTL effects, is similar to that observed in many other studies. Interestingly, a high proportion of QTL were identified in only one environment. Where QTL were identified in both environments, their effect (but not necessarily magnitude) was in the same direction across environments, i.e. adaptive evolution to different environments has not resulted in genetic trade-offs. These findings are broadly similar to those of Lexer et al. (2003), who found that only four out of 14 Helianthus paradoxus mineral tolerance and survivorship QTL that were detected in the natural environment could be detected in the glasshouse. Thus, evidence that QTL effects are dependent on the environment is beginning to accumulate.

Recent experiments in Arabidopsis thaliana have examined whether QTL effects can be influenced by genetic background. Ungerer et al. (2003) created two populations of genetically variable lines derived from different ecotypes and performed a multigenerational selection experiment. One population comprised plants that contained approximately 7/8 of their genome from one ecotype and 1/8 from the other. Ecotype proportions were reversed for the replicate population. By typing 60 randomly selected plants of each population at 31 markers at the end of the selection experiment, it was possible to detect deviations from expected allele frequencies, and hence determine the genomic regions that responded to selection. The same genomic regions showed the greatest response to selection in both genetic backgrounds, indicating that gene by gene interactions (epistasis) were unimportant in this experiment. In a follow-up experiment, Ungerer & Rieseberg (2003) created an F2 population from the same parental ecotypes and measured them for a series of fitness-related traits (e.g. time until flowering, longevity and biomass). Composite interval mapping on 294 progeny indicated that those regions that responded to selection in the initial experiment colocalized with regions where QTL were discovered in the second experiment. For each trait, between 1 and 4 QTL were detected (experiment-wide P < 0.05) that explained between 3% and 51% of the phenotypic variance. Epistasis between the QTL was of limited importance compared to the additive effects of these QTL. In contrast, analyses of recombinant inbred lines of A. thaliana have revealed epistasis in the natural environment (Weinig et al. 2003) and the laboratory (Kearsey et al. 2003), although these populations would not be considered natural under the definition provided earlier. Studies of model organisms, especially Drosophila, suggest that epistatic interactions between QTL are common (Mackay 2004) and further work to address this question in natural populations is required.

In summary, recent experiments such as those described above add an extra layer of sophistication to the relatively straightforward task of finding QTL and estimating their effect size. Thus far, the findings are relatively uncomplicated in the sense that the same QTL might be responsible for adaptive evolution across populations (within a species), and the action of these QTL is not greatly influenced by gene–gene interactions. Whether these interpretations are more broadly applicable requires further testing, and care should be taken not to oversimplify the interpretation of these data. Lessons might be learned from studies of Drosophila, where it is apparent that the last decade of QTL research has revealed an unanticipated complex genetic architecture of continuous traits (Mackay 2004).

Results from unmanipulated populations

Earlier in this review, it was described how most QTL analyses in natural populations have been conducted in specially created mapping crosses. Studies conducted within these populations have undoubtedly enhanced our understanding of adaptation, reproductive isolation and speciation. However, the extent to which they can tell us about microevolution within populations is questionable for several reasons. Crosses invariably have elevated levels of phenotypic and genetic variation relative to parental strains. Even if both parental lines are from the same population and genetic variance is not elevated, the genetic architecture of mapping populations may have been shaped by genetic drift, mutation and adaptation within the laboratory environment (Hoffmann 2000). A further problem is that laboratory populations are often reared in a constant environment, whereas natural populations are typically found in a fluctuating environment, which very often encompasses harsher conditions than are found in the lab. This elevated environmental variance may result in QTL–environment interactions as well as a relative reduction of QTL magnitude (when expressed as a proportion of trait variance explained). In short, we need to know whether conclusions reached from QTL experiments that are performed in unnatural conditions apply to truly natural populations. To date, only one QTL study has been conducted in an unmanipulated wild population, the red deer (Cervus elaphus) on the Scottish island of Rum (Slate et al. 2002b). Three QTL for birth weight (a trait positively correlated with fitness components) were identified in a pedigree containing 365 deer. All QTL were of large effect, and interestingly, one appeared to be paternally silenced, i.e. the QTL effect was only present when inherited from the mother. Because a major goal of this review is to highlight how QTL mapping can be conducted in unmanipulated populations, I will return to this example in some depth. Similar exercises are now underway in other species, although none are yet published.

How to map QTL in unmanipulated populations

In this section, two alternative strategies for QTL detection in unmanipulated pedigrees are presented: mapping within sibships and mapping in general pedigrees. Before describing these approaches, it is useful to reflect on recent developments in quantitative genetic analysis in the wild. The last five years have witnessed the uptake of a favourite tool of plant and animal breeders, the ‘animal model’, to examine microevolutionary change in natural populations. Briefly, the animal model is a mixed effects model in which components of variance (e.g. additive genetic variance, maternal effects, environmental variance) can be estimated from a pedigreed population of individuals of any structure. Variance components are usually estimated by restricted maximum-likelihood (REML). Interested readers are referred to Kruuk (2004) for an excellent review of the application of the animal model to natural populations, and to Lynch & Walsh (1998) for a more general description of the underlying methodology.

Fisher's fundamental theorem of natural selection (Fisher 1958) has commonly been interpreted to mean that selection will deplete additive genetic variation fastest for traits most closely related to lifetime fitness (see also Frank & Slatkin 1992; Walsh & Lynch, unpublished). By extension, fitness traits should be less heritable than others. Initial investigations in natural populations supported this idea (Gustafsson 1986; Mousseau & Roff 1987). More recent analyses using the animal model on long-term data sets of marked individuals find the same pattern but have shown that a low heritability of fitness traits is not necessarily explained by a low additive genetic variance (Kruuk et al. 2000; Merilä & Sheldon 2000). In fact, traits closely related to fitness often have more additive genetic variance than traits less closely related to fitness (Houle 1992; Merilä & Sheldon 1999). However, fitness traits also have a high environmental variance (a component of the denominator when measuring heritabilities), which results in a low heritability (Kruuk et al. 2000). The animal model has subsequently been employed to investigate the causes of evolutionary stasis (Merilä et al. 2001; Kruuk et al. 2002), the importance of maternal effects (Kruuk 2004), and to measure genetic correlations (Coltman et al. 2001; Charmantier et al. 2004). In short, the animal model has greatly enhanced our understanding of the evolutionary quantitative genetics of natural populations. Of course, classical quantitative genetics cannot identify the genes that underlie quantitative variation. However, the animal model can be adapted to identify QTL, as outlined later in this section.

Consider the three requirements to identify QTL in an unmanipulated natural population. The first is a population of individuals of measured phenotype. The second is that the population is pedigreed, and the third is the availability of a genetic map of variable markers (discussed further in succeeding sections). Measuring individual fitness in the wild is notoriously difficult (Endler 1986), especially over several generations of a long-lived organism. However, the growth in animal model analyses (Kruuk 2004), which shares the first two requirements outlined above, indicates that appropriate data sets for QTL mapping are available for an increasingly broad range of taxonomic groups.

Mapping in sibships

Readers of Molecular Ecology will be well aware that the application of microsatellite markers to infer parentage in natural populations has become widespread. In many wild populations, an impressively large number of parents have been genetically assigned, typically using likelihood-based methods implemented in well-known freeware packages (Marshall et al. 1998; Duchesne et al. 2002; Signorovitch & Nielsen 2002; Gerber et al. 2003; reviewed in Jones & Ardren 2003). Often, relatively large half or full-sibships can be constructed, especially in species that exhibit a large variance in male reproductive success. These sibships may be suitable for QTL mapping and can be considered analogous to domestic livestock pedigrees. For example, large paternal half-sibships are well established as suitable mapping populations in domestic sheep (Montgomery et al. 1993) and cattle (Georges et al. 1995). Similarly, full sib families are often used in chicken and pig gene mapping experiments.

Outbred half-sib and full-sib families are similar to BC and F2 populations used in inbred line crosses. The fundamental difference is that F1 parents derived from an inbred line cross are heterozygous at every segregating QTL and marker. Furthermore, if the parental lines exhibit a fixed difference at marker and QTL, the phase between markers and linked QTL is identical in all F1s. Outbred sibships differ in several important respects. First, there is no guarantee that every parent will be heterozygous at both marker loci and QTL. Thus, many families will be uninformative with respect to QTL detection. By using highly variable markers such as microsatellites, the power to detect QTL can be enhanced, but power is always limited by the heterozygosity of the QTL. A second important consideration is that the phase between marker and QTL is not necessarily consistent across families. Thus, marker effects must be considered independently within each sibship (whereas all sibships can be treated as a single large family in inbred line designs), for example in a nested ANOVA design. Note that in inbred line designs QTL effects are usually expressed as a difference in means of different genotypes (a fixed effect), whereas in an outbred design, QTL effects are typically expressed as a proportion of trait variance explained (a random effect). Mapping in outbred sibships commonly uses the Haley-Knott regression, a method that was initially developed for inbred line crosses (Haley & Knott 1992), but has been extended to detect QTL in both half-sib (Knott et al. 1996) and full-sib outbred families (Haley et al. 1994). Haley-Knott regressions can be readily implemented via qtl express (Seaton et al. 2002), a suite of programs available via a web server housed at the University of Edinburgh.
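The need to fit marker effects within each sibship can be illustrated with a minimal Python sketch of a single-marker nested ANOVA for paternal half-sib families. This is not the Haley-Knott interval mapping procedure, only the simpler nested design mentioned above; the function name, the data layout and the toy phenotypes are hypothetical and are not taken from any of the cited studies.

```python
from statistics import mean


def nested_half_sib_f(families):
    """F ratio for a marker effect nested within sire.

    families maps a sire label to a list of (allele, phenotype) pairs,
    where allele is 0 or 1 according to which of the sire's two marker
    alleles the offspring inherited. Sires with only one allele class
    observed are treated as uninformative and skipped.
    """
    ss_marker = 0.0      # sum of squares explained by allele within sire
    ss_residual = 0.0    # sum of squares within sire-by-allele groups
    n_offspring = 0
    n_informative = 0
    for offspring in families.values():
        groups = {0: [], 1: []}
        for allele, phenotype in offspring:
            groups[allele].append(phenotype)
        if not groups[0] or not groups[1]:
            continue  # uninformative sire: no within-family contrast
        family_mean = mean(p for _, p in offspring)
        for values in groups.values():
            group_mean = mean(values)
            ss_marker += len(values) * (group_mean - family_mean) ** 2
            ss_residual += sum((v - group_mean) ** 2 for v in values)
        n_offspring += len(offspring)
        n_informative += 1
    df_marker = n_informative                    # one contrast per sire
    df_residual = n_offspring - 2 * n_informative
    return (ss_marker / df_marker) / (ss_residual / df_residual)


# Hypothetical data: the marker-QTL phase is reversed between the two sires,
# so the allele effect nearly cancels if the families are pooled.
toy_families = {
    "sire_A": [(1, 10.1), (1, 10.3), (1, 9.9), (0, 8.0), (0, 8.2), (0, 7.8)],
    "sire_B": [(1, 8.1), (1, 7.9), (1, 8.0), (0, 10.0), (0, 10.2), (0, 9.8)],
}
f_ratio = nested_half_sib_f(toy_families)
print(f_ratio)
```

In the toy data the pooled difference between allele classes is close to zero, yet the nested F ratio is large, which is the consequence of inconsistent phase across families described in the text.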

Mapping in general pedigrees by variance components

Of course, not every natural population contains sufficiently many or sufficiently large sibships to detect QTL by the Haley-Knott regression. The closest analogy to this sort of population might be a human pedigree used to detect disease QTL. Such a pedigree may span many generations, contain some inbreeding and may even be comprised of a series of related sibships. Fortunately, QTL can still be detected, by estimating variance components, including those associated with a QTL. QTL mapping by variance components in general pedigrees has been developed independently by both the human genetics (Almasy & Blangero 1998) and animal breeding (George et al. 2000) communities. The animal breeders’ method is essentially an extension of the animal model, and is the approach used by Slate et al. (2002b) to detect QTL in a wild red deer population.

Consider the polygenic (‘animal’) model used to estimate additive genetic variance in a pedigreed population. In matrix algebra form, the animal model can be written:

y = Xβ + Za + e

where:

  • y is a vector of phenotypes of all pedigreed individuals

  • β is a vector of fixed effects

  • X is a design matrix relating the appropriate fixed effects to each individual

  • a is a vector of random effects (polygenic additive effects)

  • Z is an incidence matrix relating the appropriate random effects to each individual

  • e is a vector of residual errors.

For any pair of individuals in the pedigree, the genetic covariance between them is a function of 2Θij where Θij is the coefficient of coancestry, the probability that an allele randomly drawn from individual i is identical-by-descent (IBD) with an allele randomly drawn from individual j. Note that the coefficient of coancestry is obtained from the pedigree structure of the individuals concerned, rather than, for example, any marker data.
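As an illustration, the coefficient of coancestry can be computed from the pedigree alone with the standard recursive (tabular) method. The sketch below is a minimal Python version under the assumption that the pedigree is listed with parents before their offspring; the function name and the toy pedigree are hypothetical.

```python
def coancestry_matrix(pedigree):
    """Coefficients of coancestry from pedigree records.

    pedigree is a list of (individual, sire, dam) tuples ordered so that
    parents appear before their offspring; unknown parents are None and
    are assumed to be unrelated, non-inbred founders.
    Returns a dict of dicts theta with theta[i][j] for every pair.
    """
    theta = {}
    processed = []
    for individual, sire, dam in pedigree:
        theta[individual] = {}
        for other in processed:
            # Coancestry with an earlier individual is the average of the
            # parents' coancestries with that individual.
            value = 0.0
            if sire is not None:
                value += 0.5 * theta[sire][other]
            if dam is not None:
                value += 0.5 * theta[dam][other]
            theta[individual][other] = value
            theta[other][individual] = value
        # Self-coancestry is (1 + F) / 2, where the inbreeding coefficient F
        # equals the coancestry of the two parents.
        inbreeding = 0.0
        if sire is not None and dam is not None:
            inbreeding = theta[sire][dam]
        theta[individual][individual] = 0.5 * (1.0 + inbreeding)
        processed.append(individual)
    return theta


# Hypothetical three-generation pedigree: four founders, two parents,
# one grandchild, plus a full sib of individual 5.
toy_pedigree = [
    (1, None, None), (2, None, None), (3, None, None), (4, None, None),
    (5, 1, 2), (6, 3, 4), (7, 1, 2), (10, 5, 6),
]
theta = coancestry_matrix(toy_pedigree)
print(2 * theta[10][1])   # grandparent and grandchild
print(2 * theta[10][5])   # parent and offspring
```

The values 2Θij produced in this way are the pedigree-based expectations; they do not use any marker data.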

This model returns an estimate of the heritability of the trait, in addition to a likelihood value (l0) for the REML solution.

Consider now, a second linear mixed model, containing the same terms as in the first model, plus a QTL effect at a location of interest. This model, termed the ‘polygenic + QTL’ model can be written as:

y = Xβ + Za + Zq + e

where q is a vector of additive QTL effects.

To obtain an REML solution of this model, marker data are used to infer Rij, the proportion of alleles that two individuals i and j actually share IBD at a chromosomal location. The use of all markers on a chromosome to estimate Rij at each location is known as multipoint mapping. Note that Rij is an estimate rather than a probability, and varies at each test location (Fig. 3). Estimating Rij is time-consuming in large pedigrees, especially as it must be estimated for every test location (e.g. every 2 cM) along a chromosome. A number of different programs are now available to estimate Rij (see Table 1).
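The covariance structure implied by the two models can be written out explicitly. The following is a sketch in standard mixed-model notation; the symbols A, Q and the three variance components are introduced here for illustration and are assumptions about notation rather than part of the original presentation.

```latex
% Polygenic ('animal') model: y = X\beta + Za + e
\operatorname{Var}(\mathbf{y}) = \mathbf{Z}\mathbf{A}\mathbf{Z}^{\top}\sigma^{2}_{a} + \mathbf{I}\sigma^{2}_{e},
\qquad A_{ij} = 2\Theta_{ij},
\qquad h^{2} = \frac{\sigma^{2}_{a}}{\sigma^{2}_{a} + \sigma^{2}_{e}}

% Polygenic + QTL model: y = X\beta + Za + Zq + e
\operatorname{Var}(\mathbf{y}) = \mathbf{Z}\mathbf{A}\mathbf{Z}^{\top}\sigma^{2}_{a}
  + \mathbf{Z}\mathbf{Q}\mathbf{Z}^{\top}\sigma^{2}_{q} + \mathbf{I}\sigma^{2}_{e},
\qquad Q_{ij} = R_{ij}
```

Here A is built from the pedigree alone, whereas Q is rebuilt from the marker data at every test location, which is why the second model must be refitted along the chromosome.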

Figure 3.

QTL mapping in general pedigrees. A simplified version of a general pedigree is illustrated. The pedigree contains 10 individuals (1–10), all of whom are typed at a microsatellite locus with four alleles (A–D). 2Θij (twice the coefficient of coancestry) and Rij (the IBD coefficient at the marker) between each pair of individuals are shown. An animal model (variance components) analysis requires the former to estimate additive genetic variance and the latter to estimate the variance component explained by a QTL. Note that individual 10 is the grandchild of individuals 1 to 4. 2Θij is, by definition, 0.25 between a grandparent and grandchild. However, the number of alleles actually shared IBD varies across the genome. In this example, individual 10 shares half of its alleles IBD with grandparents 1 and 4 but no alleles IBD with grandparents 2 and 3.

Table 1.  Software to aid QTL detection in natural populations
Program | URL | Comments

Linkage map construction
mapmaker | ftp://ftp-genome.wi.mit.edu/distribution/software/mapmaker3/ | Linkage map construction and QTL analysis in experimental populations.
crimap | http://compgen.rutgers.edu/multimap/crimap/index.html | Linkage map construction in general pedigrees. Requires UNIX operating system.

QTL mapping in experimental crosses
bqtl | http://hacuna.ucsd.edu/bqtl/ | Bayesian QTL analysis in line crosses. Requires R package (open source).
mapmanager qtx | http://www.mapmanager.org/mmQTX.html | QTL mapping in experimental populations.
mapqtl | http://www.kyazma.nl/index1.php | QTL detection in experimental crosses. Commercial package. joinmap software also available for map construction.
multimapper | http://www.rni.helsinki.fi/~mjs/ | Bayesian QTL analysis in line crosses and outbred sibships. C source code provided.
plabqtl | http://www.uni-hohenheim.de/~ipspwww/soft.html | QTL analysis in line crosses by composite interval mapping. Popular with plant breeders.
qtlcartographer | http://statgen.ncsu.edu/qtlcart/index.php | Interval, composite interval and multi-trait mapping in experimental populations. Very widely used.
qtl express | http://qtl.cap.ed.ac.uk/ | QTL mapping in inbred and outbred experimental crosses. Implements Haley-Knott regression. Data submitted to web server. QTL mapping in general pedigrees expected shortly.
r/qtl | http://www.biostat.jhsph.edu/~kbroman/qtl/ | QTL mapping in experimental crosses. Free add-on package for the R and S languages.

IBD coefficient estimation
simwalk | http://watson.hgen.pitt.edu/docs/simwalk2.html | IBD coefficient estimation and haplotype inference. IBD coefficients can be used in mixed effects model to detect QTL in general pedigrees.
loki | http://loki.homeunix.net/ | IBD coefficient estimation and QTL analysis in general pedigrees. IBD coefficients can be used in mixed effects model to detect QTL in general pedigrees. Requires UNIX or LINUX operating system.

QTL mapping in general pedigrees
solar | http://www.sfbr.org/solar/index.html | QTL mapping by variance components in general populations. IBD coefficient calculation implemented within program. Requires UNIX or LINUX operating system.
merlin | http://csg.sph.umich.edu/pn/index.php?furl=/abecasis/Merlin/index.html | QTL detection and IBD coefficient estimation in general pedigrees.

Pedigree management software
grr | http://qtl.well.ox.ac.uk/GRR/ | Detecting pedigree errors from marker data. Uses identity by state allele sharing.
pedviewer | http://www-personal.une.edu.au/~bkinghor/pedigree.htm | Draws pedigree diagrams. Can also be used to calculate inbreeding coefficients.
pedcheck | http://watson.hgen.pitt.edu/register/docs/pedcheck.html | Detects pedigree errors from family and marker data.

The ‘polygenic + QTL’ model returns estimates of the additive genetic variance, the variance attributable to a QTL at the test location and the likelihood value of the REML solution (l1). A likelihood ratio test statistic (LRT) can be obtained from the two models by:

LRT = −2 ln(l0 / l1) = −2 (ln l0 − ln l1)

Under the null hypothesis of no QTL at the test location, the test statistic follows a 50:50 mixture distribution, where one component is a point mass at 0 and the other is a χ² distribution with 1 degree of freedom (Almasy & Blangero 1998; George et al. 2000). Under the subtly different null hypothesis of no QTL anywhere on the chromosome, the test statistic appears to approximate a χ² distribution with 1 degree of freedom (George et al. 2000).
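The likelihood ratio test at a single test location can be sketched in a few lines of Python. The sketch assumes that the REML software reports log-likelihoods for the two models, and that the non-zero component of the mixture is a chi-squared distribution with 1 degree of freedom; the function names are hypothetical, and the result is a pointwise (not genome-wide) P value.

```python
import math


def lrt_statistic(loglik_polygenic, loglik_polygenic_qtl):
    """Likelihood ratio statistic from the two model log-likelihoods.

    Small negative values caused by numerical error are set to zero,
    because the QTL variance is constrained to be non-negative.
    """
    return max(0.0, -2.0 * (loglik_polygenic - loglik_polygenic_qtl))


def chi2_one_df_upper_tail(x):
    """P(X > x) for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def pointwise_p_value(lrt):
    """P value under a 50:50 mixture of a point mass at 0 and chi-squared (1 d.f.)."""
    if lrt <= 0.0:
        return 1.0
    return 0.5 * chi2_one_df_upper_tail(lrt)


# Hypothetical log-likelihoods for the polygenic and polygenic + QTL models.
lrt = lrt_statistic(-100.0, -95.0)
print(lrt, pointwise_p_value(lrt))
```

Because half of the null distribution sits at zero, the pointwise P value is half of the usual chi-squared tail probability, so a nominal 5% test uses a smaller critical value than an ordinary 1 d.f. test.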

It should be pointed out that QTL detection by this process is time-consuming and does have some other disadvantages. Most notably, statistical significance testing by permutation testing cannot be readily employed in the way that it can in more familiar mapping designs. However, mapping in general pedigrees does utilize more pedigree information than half-sib or full-sib designs, and therefore appears to have greater power to detect QTL (Slate et al. 1999; George et al. 2000). In their study of birth weight QTL in red deer, Slate et al. (2002b) used both a general pedigree and a half-sib approach to identify QTL. Two out of three QTL were detected by both methods. In summary, natural populations for which the animal model has been used to measure additive genetic variance components can also be used to detect QTL (sample size permitting), provided a map of variable markers is available.

Maps and markers

Any mapping project is, of course, impossible without a linkage map of variable markers. The choice of marker largely depends on the type of mapping population. If inbred line crosses are used, then the ideal marker will be biallelic, reflecting a fixed difference between the parental lines. Thus, AFLPs or single nucleotide polymorphisms (SNPs) are appropriate. The advantages of AFLPs are that they can be generated for any organism, and that numerous genotypes can be obtained rapidly and cheaply. Disadvantages of AFLPs are that: (i) bands are usually unique to a particular mapping population, making them inappropriate for comparative studies with other populations; (ii) bands are dominant, meaning heterozygotes cannot be distinguished from one of the homozygote genotypes, e.g. in an F2 population; (iii) AFLPs cannot be identified in a targeted way in the sense that other markers can. By this, I mean that if a particular chromosome has poor marker coverage, identifying additional AFLPs to enhance coverage is not straightforward. SNPs are becoming increasingly widely used in ecology (Morin et al. 2004), although they have yet to be used to map QTL in natural populations. The advantages of SNPs are that they are more abundant than any other marker, they can be obtained in coding or noncoding regions, they can be targeted to particular genes or regions of the genome (Aitken et al. 2004), they are codominant and they may be conserved between mapping populations. Disadvantages include: (i) SNP discovery requires genome sequence data from the study species or a related model organism; (ii) heterozygosity can be low; (iii) SNP discovery is relatively expensive compared to AFLP genotyping, although subsequent genotyping can be cost-effective on some systems (Morin et al. 2004).

For outbred mapping populations, the marker of choice is usually the microsatellite. Unlike populations that are derived from inbred lines, parents are not by definition heterozygous at a marker. Biallelic markers such as SNPs or AFLPs can never have an expected heterozygosity greater than 0.5 in a randomly mating population. In outbred populations, the proportion of parents that are informative at a marker is maximized by genotyping with multiallelic microsatellites. It is notable that many of those species where quantitative genetics has been conducted on pedigreed natural populations already have microsatellite maps. For example, medium density maps already exist for red deer (Slate et al. 2002a), sheep (Maddox et al. 2001) and Atlantic salmon (Gilbey et al. 2004; Moen et al. 2004). Geneticists may also take advantage of the fact that microsatellites are often conserved between related organisms. For example, PCR (polymerase chain reaction) primers for markers developed in livestock, humans, Arabidopsis thaliana and laboratory rodents have all proved useful in natural populations of related species (Morin et al. 1994; van Treuren et al. 1997; Slate et al. 1998; Rogers et al. 2000; Peakall et al. 2003; Kuittinen et al. 2004). Passerine bird species have been the focus of more longitudinal quantitative genetics studies than any other taxonomic group (Kruuk 2004). Approximately 500 passerine microsatellites have been cloned and many of these coamplify in related species (Primmer et al. 1996; Dallimer 1999; Dawson et al. 2000; Richardson et al. 2000). However, the degree of cross-utility declines with time since common ancestry, and in no single species are the majority of available markers useful. Fortunately, the first draft of the chicken genome has recently been made publicly available (see http://www.genome.wustl.edu/projects/chicken/), providing a bioinformatics resource to aid the discovery of SNPs in passerine species. Thus, it should be possible to construct passerine maps using a combination of microsatellites and SNPs.
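The upper limit of 0.5 on the heterozygosity of a biallelic marker follows directly from the usual expression for expected heterozygosity under random mating, one minus the sum of squared allele frequencies. A minimal Python sketch (the function name and the example frequencies are hypothetical):

```python
def expected_heterozygosity(allele_frequencies):
    """Expected heterozygosity under random mating: 1 - sum of p_i squared."""
    if abs(sum(allele_frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# A biallelic marker (SNP or AFLP) is most informative at equal frequencies.
biallelic_best = expected_heterozygosity([0.5, 0.5])
# A hypothetical microsatellite with eight equally frequent alleles.
microsatellite = expected_heterozygosity([0.125] * 8)
print(biallelic_best, microsatellite)
```

With k equally frequent alleles the expression reduces to 1 − 1/k, so each additional allele raises the proportion of parents expected to be informative.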

Having generated a suite of variable markers, constructing a linkage map is deceptively simple. The only requirement is a pedigreed population for which the segregation of marker alleles can be followed. The process of genotyping a mapping population to identify QTL generates exactly the data required to construct a linkage map. Several software packages are readily available to create linkage maps from inbred line crosses, outbred populations and even multigenerational pedigrees (Table 1). The optimal number of markers required to conduct a QTL analysis is a function of genome size, degree of linkage disequilibrium and recombination rate. Depending on the organism, an average intermarker interval of 10 cM can usually be achieved with between 50 and 200 markers. For nonmodel organisms, it is useful to have at least a rudimentary knowledge of the karyotype to establish whether any chromosomes remain unmapped.

Pedigrees of natural populations are inferred by behavioural methods, genetic profiling, or in most cases, a mixture of both. Of course, some cases of inferred parentage are likely to be wrongly assigned. However, once the mapping genotype data has been accumulated, it should be straightforward to identify cases of misassigned parentage due to numerous mismatches between parent and offspring, e.g. Slate et al. (2000). Failure to account for pedigree error will cause map distances to be incorrectly estimated, and may even lead to the linkage being incorrectly assigned (Type I error) or missed entirely (Type II error). Many of the programs listed in Table 1 will fail to run in the presence of parent-offspring mismatch. Pedigree error checking packages are listed in Table 1. When designing mapping experiments, it is prudent to increase the sample size to allow for the subsequent ‘culling’ of misassigned individuals in the pedigree.

Caveats

Having typed a pedigree, built a map and detected QTL, there are a number of further issues to consider. Some are discussed here, but where necessary, the reader is guided to more in-depth discussion of some of the well-known difficulties and biases associated with QTL discovery. It should be remembered that the majority of these issues relate to all QTL mapping experiments, not just those conducted in natural populations.

Are my QTL real?

Any QTL mapping experiment involves a large number of statistical tests. Clearly, using nominal significance levels would result in a large number of false positive QTL. Conversely, very stringent criteria for declaring linkage will cause some QTL to be missed. QTL mappers are well aware of the problem, and guidelines have been proposed to declare linkage (Lander & Kruglyak 1995). For mapping in general pedigrees, the appropriate test statistic thresholds for declaring linkage can be calculated as a function of genome length, the number of chromosomes and the recombination rate. Formulae to obtain appropriate thresholds are provided by Lander & Kruglyak (1995). In line crosses or in outbred sibships, significance testing can be determined by permutation testing (Churchill & Doerge 1994; Doerge & Churchill 1996). This method makes no assumptions about the null distribution of the test statistic, and knowledge of parameters such as the genome length is not required. Thus, permutation testing provides a robust method for significance testing.
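The logic of permutation testing can be sketched as follows. Phenotypes are shuffled relative to the marker genotypes, the largest test statistic across all markers is recorded for each shuffle, and the upper percentile of those maxima is taken as the experiment-wide threshold. This is a minimal illustration in Python, not the implementation of any package in Table 1: the test statistic (an absolute difference in genotype-class means), the function names and the simulated data are all assumptions made for the example.

```python
import random


def marker_statistic(genotypes, phenotypes):
    """Absolute difference in mean phenotype between genotype classes 0 and 1."""
    class0 = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    class1 = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    return abs(sum(class1) / len(class1) - sum(class0) / len(class0))


def genome_wide_max(markers, phenotypes):
    """Largest single-marker statistic across all markers."""
    return max(marker_statistic(m, phenotypes) for m in markers)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of the genome-wide maximum under the null."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)   # breaks any genotype-phenotype association
        null_maxima.append(genome_wide_max(markers, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, int(round((1.0 - alpha) * n_perm)))
    return null_maxima[index]


# Hypothetical data: 64 individuals, 6 unlinked markers, and a simulated
# QTL of 2 residual standard deviations at the third marker.
data_rng = random.Random(42)
toy_markers = [[(i >> k) & 1 for i in range(64)] for k in range(6)]
toy_phenotypes = [data_rng.gauss(0.0, 1.0) + 2.0 * toy_markers[2][i] for i in range(64)]
threshold = permutation_threshold(toy_markers, toy_phenotypes)
observed = marker_statistic(toy_markers[2], toy_phenotypes)
print(observed, threshold)
```

Because the threshold is built from the maximum across markers, it accounts for the number of tests without requiring the genome length or recombination rate to be specified.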

QTL detected in natural populations may often be of marginal significance. Sample sizes and marker density are often modest. The power to detect QTL in unmanipulated populations will be further reduced if environmental sources of variation provide ‘noise’. Repeating experiments on an independent sample may be impractical or impossible in some populations. However, there have been examples of QTL being replicated in independent mapping populations, e.g. in monkey flowers and threespine sticklebacks. Clearly, mappers have to be wary of detecting false positives, especially when independent evidence to verify their conclusions is lacking.

Are my QTL effects overestimated?

Not only can false QTL be spuriously identified and real QTL missed in genome scans, but estimates of QTL magnitude are commonly upwardly biased. The problem was first described by Beavis (1994), and is known as the ‘Beavis Effect’. Consider two QTL of equal magnitude. If environmental variance causes the magnitude of one QTL to be overestimated and the other to be underestimated, the overestimated QTL is more likely to provide a test statistic that exceeds significance thresholds. The problem is exacerbated with smaller sample sizes (e.g. n = 100), although the degree of upward bias is believed to be small when sample size exceeds 500 individuals (Beavis 1994). Further discussion of the Beavis Effect is provided in Roff (1997), Orr (2001), Allison et al. (2002) and Barton & Keightley (2002). It is worth noting that many studies of natural populations have used considerably fewer than 500 individuals. Studies that conclude that individual QTL account for a large proportion of phenotypic variation must be treated with some caution. The problem can be avoided by replicating experiments, or by estimating QTL magnitude in a different sample of individuals to those where the QTL was identified.
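The selection bias behind the Beavis Effect can be reproduced with a small simulation. The Python sketch below repeatedly samples a QTL of fixed true effect, records the estimated effect in every replicate, and then averages only those replicates that pass a significance threshold. The true effect (0.3 residual standard deviations), the critical value and the function name are hypothetical choices made for illustration.

```python
import random


def simulate_beavis(true_effect=0.3, n=100, n_reps=1000, z_crit=3.0, seed=1):
    """Mean estimated QTL effect over all replicates and over significant ones.

    Each replicate has two genotype classes of n // 2 individuals with unit
    residual variance; a replicate is 'significant' when the estimated
    effect divided by its standard error exceeds z_crit.
    """
    rng = random.Random(seed)
    half = n // 2
    standard_error = (2.0 / half) ** 0.5
    all_estimates = []
    significant = []
    for _ in range(n_reps):
        mean0 = sum(rng.gauss(0.0, 1.0) for _ in range(half)) / half
        mean1 = sum(rng.gauss(true_effect, 1.0) for _ in range(half)) / half
        estimate = mean1 - mean0
        all_estimates.append(estimate)
        if estimate / standard_error > z_crit:
            significant.append(estimate)
    mean_all = sum(all_estimates) / n_reps
    mean_significant = (
        sum(significant) / len(significant) if significant else float("nan")
    )
    return mean_all, mean_significant


small_sample = simulate_beavis(n=100)
large_sample = simulate_beavis(n=500)
print(small_sample, large_sample)
```

Averaged over all replicates the estimate is unbiased, but the replicates that reach significance overestimate the effect, and the overestimate shrinks as the sample size increases.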

An additional issue when estimating QTL magnitude is described by Orr (2001). If a mapping population is derived from two inbred lines, or from reproductively isolated populations, the amount of phenotypic and genetic variation within the mapping population (e.g. an F2) is often huge in comparison to the variation within the parental populations. QTL effects are commonly described by the proportion of trait variation they explain in the F2 generation. However, it can sometimes be more informative to describe a QTL effect relative to the variation in the parent population in which a new mutation first appeared (the standing variation). Measuring QTL effects in this way reduces the confounding influence of time since divergence of the parental lines and provides a more accurate reflection of the strength of selection on the QTL (Orr 2001).

How do I find my QTN?

The term QTN refers to quantitative trait nucleotide; in other words the mutation that is responsible for a QTL. Mapping studies in livestock, humans and model organisms are usually regarded as an initial step towards identifying a QTN. A variety of strategies have now been proposed to aid QTN discovery including association mapping (linkage disequilibrium mapping) on very large samples of unpedigreed individuals (Kruglyak 1999; Risch 2000), combining QTL with microarray data on gene expression (Schadt et al. 2003), candidate gene analysis (see for example Galloway et al. 2000; Shapiro et al. 2004), and, in model systems, mutagenesis (Flint & Mott 2001; Mackay 2001). It should be noted, however, that the process of QTN discovery is time-consuming, expensive and thus far has resulted in surprisingly few successes. Clearly, scientists studying these systems have greater resources (financial and technological) than those studying natural populations. At this stage, QTN detection in the wild is unlikely unless very convincing candidate genes are available. However, the integration of microarray and QTL mapping methodologies does appear promising, provided good quality RNA from appropriate tissues can be obtained from individuals in natural populations.

Some QTL mapping studies in natural populations have sought to test specific hypotheses that do not necessarily require the identification of QTN. For example, Hawthorne & Via's pea aphid experiment examined whether genetic correlations were ephemeral (maintained by selection) or were constrained by pleiotropy/linkage disequilibrium. Thus, while QTN detection may be difficult in natural populations, it may not be an overall objective of a QTL mapping study.

Lessons from model organisms?

Clearly, one of the reasons for conducting mapping studies in natural populations is to establish whether findings from earlier studies in model organisms apply more generally. While many of the results described in this review appear to be uncomplicated, it is useful to make a comparison with the experiences of those conducting studies in laboratory populations of Drosophila or other model organisms. Upon reading the experiences of Drosophila QTL mappers, it is apparent that QTL effects are dependent on genetic background (dominance and epistasis are common), environment and sex, and that many mutations have pleiotropic effects (Mackay 2004). A decade earlier it was anticipated that the picture would be less complex (Mackay 1995). A similar picture is emerging from studies of Arabidopsis (Kearsey et al. 2003; Weinig et al. 2003). Given that natural populations will generally harbour more genetic and environmental variance than their equivalent model systems, it seems probable that the situation will be at least as complex in the wild. Thus, results from natural populations should be interpreted with some caution.

Particular caution should be employed when QTL colocalize to the same genomic region. Even in genome scans conducted on very large sample sizes, confidence intervals for QTL location are wide, with tens or hundreds of genes lying beneath a QTL peak. Thus, it is dangerous to assume that a QTL is determined by a single gene or that colocalized QTL are explained by the same gene. In one notable study, an approach termed reciprocal hemizygosity analysis was used to dissect the molecular basis of a QTL in yeast, Saccharomyces cerevisiae (Steinmetz et al. 2002). The QTL was in fact attributable to three tightly linked genes. Similarly, individual QTL in Drosophila have been shown to be caused by several linked genes acting cumulatively (Mackay 2004). At this stage it is too early to say whether this phenomenon is widespread, although data from model organisms suggest that selection has favoured the clustering of genes of related function (Cohen et al. 2000; Pal & Hurst 2003). Given these findings, it would be premature to argue that the colocalization of QTL affecting the same trait in different populations is the product of parallel mutations, or that the colocalization of QTL affecting different traits in the same population is the result of pleiotropy (rather than linkage).

Future directions and conclusions

What does the future hold for QTL mapping in natural populations? Given that the number of mapping experiments in the wild continues to grow exponentially, it seems likely that many more will be conducted in the next few years. One of the most encouraging signs is the formation of consortia to create genetic tools in ecologically interesting organisms. For example, the Daphnia Genomics Consortium (DGC), http://daphnia.cgb.indiana.edu/, has facilitated genome sequencing, linkage map construction and microarray development in Daphnia pulex and related organisms. As more model organisms become mapped and sequenced, it will become easier to create maps in related nonmodel species. For example, mapping studies in red deer and Arabidopsis lyrata have benefited from resources created for cattle and Arabidopsis thaliana. In the UK, the Natural Environment Research Council (NERC) has invested heavily in an Environmental Genomics program (http://www.nerc.ac.uk/funding/thematics/envgen/), which includes QTL mapping studies of natural populations of Soay sheep, wild brassicas, wild relatives of Arabidopsis and Senecio species.

It is apparent that SNP markers are becoming increasingly useful in ecological genetics (Luikart et al. 2003; Morin et al. 2004). The decreasing cost of SNP genotyping is also likely to make genetic map construction feasible in nonmodel organisms, although it should be remembered that SNPs, being biallelic, are individually less informative than multiallelic markers such as microsatellites and are therefore not ideal for mapping in outbred populations. Comparative anchor tagged sequences (CATS) primers may prove to be particularly useful. CATS primers anneal to conserved exonic regions, enabling the amplification and sequencing of adjacent introns. Because the primers are in conserved regions, CATS loci can be useful to identify SNPs for comparative genomics projects. Initial investigations suggest that mammalian CATS primers designed from primates and rodents generate a useful PCR product across a wide range of mammals (Aitken et al. 2004). Cheaper SNP genotyping should also result in the wider uptake of population genomics. Population genomics involves the genotyping of many neutral loci in a large number of unpedigreed individuals, and then estimating population genetic statistics on each locus (Luikart et al. 2003). Outlier loci, those with unusual statistical properties relative to others in the population, are likely to be linked to genes under selection. Population genomics can be conducted in its own right (Wilding et al. 2001; Campbell & Bernatchez 2004) or in tandem with a more conventional QTL approach (Ungerer & Rieseberg 2003). It has the advantage that pedigrees or phenotype data are not required. However, while it may identify loci under selection, it does not relate this to any particular trait or to a proportion of quantitative genetic variation explained by that region of the genome.
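The logic of an outlier scan can be illustrated with a minimal sketch. It assumes biallelic loci, two populations, and known allele frequencies, and it uses a simple heterozygosity-based FST as the per-locus statistic. The function names, the toy frequency table and the percentile cut-off are all hypothetical and are not taken from any of the studies cited above. Published scans generally judge outliers against a neutral expectation rather than a fixed percentile, so the thresholding rule here is a deliberate simplification.

```python
def fst_biallelic(freqs):
    """FST = (HT - HS) / HT for one biallelic locus.

    `freqs` holds the frequency of one allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:                            # locus fixed everywhere
        return 0.0
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def outlier_loci(freq_table, percentile=90):
    """Return loci whose FST lies above the given empirical percentile.

    This percentile rule is a stand-in for a proper neutral null
    distribution and is used here only for illustration.
    """
    fst = {locus: fst_biallelic(f) for locus, f in freq_table.items()}
    ranked = sorted(fst.values())
    # Integer ceiling of percentile * n / 100, converted to a 0-based index.
    idx = (percentile * len(ranked) + 99) // 100 - 1
    threshold = ranked[idx]
    return sorted(locus for locus, value in fst.items() if value > threshold)


# Hypothetical allele frequencies (population 1, population 2) at ten loci.
toy_frequencies = {
    "snp01": (0.50, 0.52), "snp02": (0.30, 0.33), "snp03": (0.71, 0.70),
    "snp04": (0.45, 0.41), "snp05": (0.62, 0.60), "snp06": (0.25, 0.28),
    "snp07": (0.10, 0.90), "snp08": (0.80, 0.78), "snp09": (0.55, 0.58),
    "snp10": (0.40, 0.37),
}

if __name__ == "__main__":
    print(outlier_loci(toy_frequencies, percentile=90))
```

In this toy table only one locus shows strongly divergent frequencies between the two populations, so it is the only one flagged. As the text notes, such a result identifies a candidate region under selection but says nothing about which trait is affected.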

It is anticipated that more unmanipulated natural populations will be subject to QTL analysis. The application of the animal model to quantitative genetic analysis in the wild has seen a dramatic gain in popularity in just a few years. QTL analysis in many of these populations is a logical next step, provided a map is available. Generating passerine linkage maps would be a particularly useful development, because in some passerine species, several populations have been the focus of long-term study (Charmantier et al. 2004). It would therefore be possible to determine whether the same QTL are associated with microevolution in distinct populations of the same species, and to confirm QTL in independent samples.

One of the aims of this review has been to highlight how QTL mapping in natural populations has matured in recent years. Initial studies provided descriptions of the number and magnitude of QTL determining phenotypic variance. However, it could be argued that results were equivocal (Orr 2001; Barton & Keightley 2002), and may even be confounded by problems such as Type I error and biased estimates of QTL magnitude. The extent to which individual studies have dramatically enhanced our understanding of how populations evolve or how species arise is therefore questionable. However, recent studies have begun to explore increasingly sophisticated issues such as genetic correlations, gene-by-environment interactions, epistasis and the adaptive importance of particular genes. As more studies are conducted, a clearer picture should emerge. Furthermore, QTL studies have now begun to be conducted in unmanipulated populations where natural selection is operating, and where genetic architectures may differ from the lab. It seems inevitable that the field will continue to grow when the latest tools to identify QTL and QTN become more readily applicable to natural populations. The future of QTL mapping in natural populations as an approach to understanding microevolution, adaptation and speciation looks rosy.

Acknowledgements

I thank Josephine Pemberton, Allan McRae, Jake Gratten, Gavin Hinten, Harry Smith and two anonymous referees for their insightful comments that greatly improved the manuscript. My interest in QTL mapping in natural populations has been inspired by several years of collaboration and discussion with Terry Burke, Dave Coltman, Allan Crawford, Ken Dodds, Jake Gratten, Loeske Kruuk, John McEwan, Allan McRae, Josephine Pemberton, Mike Tate and Peter Visscher.

Jon Slate is a Lecturer in population genetics at the University of Sheffield. Research interests of the Slate Group are based around the broad theme of evolutionary genetics of natural populations using comparative genomics, linkage analysis and molecular evolution tools. Other interests include the molecular evolution of mitochondrial DNA and the use of molecular markers to infer inbreeding depression.