Keywords:

  • Evolvability;
  • G-matrix;
  • genetic covariance;
  • QTLs

Abstract

  1. Top of page
  2. Abstract
  3. DYNAMICS OF THE G-MATRIX UNDER SELECTION
  4. DIVERSIFICATION OF THE G-MATRIX OVER LONGER TIME SCALES
  5. VARIANCE COMPONENTS AND THE MAINTENANCE OF GENETIC VARIATION
  6. EXPERIMENTAL DESIGN FOR ESTIMATION OF QTL (CO)VARIANCES
  7. HYPOTHESIS TESTING
  8. DIRECT ESTIMATION OF QTL ALLELE FREQUENCIES
  9. SUMMARY AND CONCLUSIONS
  10. ACKNOWLEDGMENTS
  11. LITERATURE CITED
  12. Appendices

Evolutionary quantitative genetics has recently advanced in two distinct streams. Many biologists address evolutionary questions by estimating phenotypic selection and genetic (co)variances (G matrices). Simultaneously, an increasing number of studies have applied quantitative trait locus (QTL) mapping methods to dissect variation. Both conceptual and practical difficulties have isolated these two foci of quantitative genetics. A conceptual integration follows from the recognition that QTL allele frequencies are the essential variables relating the G-matrix to marker-based mapping experiments. Breeding designs initiated from randomly selected parental genotypes can be used to estimate QTL-specific genetic (co)variances. These statistics appropriately distill allelic variation and provide an explicit population context for QTL mapping estimates. Within this framework, one can parse the G-matrix into a set of mutually exclusive genomic components and ask whether these parts are similar or dissimilar in their respective features, for example the magnitude of phenotypic effects and the extent and nature of pleiotropy. As these features are critical determinants of sustained response to selection, the integration of QTL mapping methods into G-matrix estimation can provide a concrete, genetically based experimental program to investigate the evolutionary potential of natural populations.

Over the last 25 years, quantitative genetics has become both a unifying conceptual framework in evolutionary biology and an essential tool for experimental studies. One large research effort has centered on the “G-matrix,” the set of genetic variances and covariances associated with a collection of quantitative traits. This effort has involved at least three interrelated components: (1) an explosion of studies applying the Lande and Arnold (1983) regression methodology to measure selection in natural populations (reviewed by Kingsolver et al. 2001), (2) the continued estimation of genetic (co)variances of traits (e.g., Mezey and Houle 2005), and (3) the development of theoretical models to predict multi-trait evolution under a variety of scenarios (e.g., Kirkpatrick and Lande 1989; Jones et al. 2003; Turelli and Barton 2004). Advances in molecular genetics and the development of high density genetic maps have facilitated a distinct research effort. Quantitative trait locus (QTL) mapping allows the dissection of trait variation into genomic components (Paterson et al. 1988; Lander and Botstein 1989; Tanksley 1993), potentially revealing the underlying genetic details of quantitative variation (Mackay 2004).

Despite the fact that QTL studies examine the same class of characters that populate G-matrices, these two streams of quantitative genetics remain strangely isolated in practice. This is illustrated by manuscript citation patterns. Figure 1 enumerates citations to two standard references: Lande and Arnold (1983) is routinely cited in G-matrix studies, whereas Lander and Botstein (1989) is routinely cited in QTL mapping papers. In the interval from 1984 to 2007, a total of 2343 papers cited Lander and Botstein (1989), whereas 1479 cited Lande and Arnold (1983). Remarkably, only 15 publications reference both papers, and most of these do not relate the two foci of quantitative genetics in any material way.

Figure 1. The number of citations per year to Lande and Arnold (1983, diamonds) and Lander and Botstein (1989, squares) was estimated using Web of Science (Thomson Reuters, Philadelphia, PA). The number of publications citing both papers (triangles) was determined by comparing these lists.

The disconnection between G-matrix and QTL experiments is due in part to the nature of variables typically estimated by each type of study. Genetic variances and covariances are population statistics. These quantities summarize variation in the genotypic values of individuals randomly sampled from some reference population. In statistical terminology, these genotypic values are random effects (Searle et al. 1992, ch. 1). The G-matrix is the parameter set associated with this vector of random effects. In contrast, evolutionary geneticists have generally used line crosses to map QTLs. The parental “lines” are typically chosen because they differ in some notable phenotype or represent distinct taxonomic units, i.e., divergent populations or species (e.g., Bradshaw et al. 1998; Colosimo et al. 2005; Gleason et al. 2005). Under these conditions, QTLs are necessarily estimated as cross-specific fixed effects (Lynch and Walsh 1998, ch. 15).

QTL allele frequencies are the essential variables relating marker-based quantitative genetics to the evolution of demes (contiguous natural populations). If we ignore dominance, epistasis, and linkage disequilibria, and assume two alleles per QTL, the components of the G-matrix can then be written as:

  \[ V_A[x] = \sum_i 2 q_i (1 - q_i)\, a_i[x]^2 \tag{1a} \]
  \[ C_A[x,y] = \sum_i 2 q_i (1 - q_i)\, a_i[x]\, a_i[y] \tag{1b} \]

where VA[x] is the additive genetic variance for trait X, CA[x,y] is the additive genetic covariance between traits X and Y, qi is the frequency of the first allele at QTL i, ai[x] is the additive effect of that allele on trait X, and the summations are taken over all loci affecting the trait (or traits). The aggregate quantity 2qi(1 − qi)ai[x]^2 is the genetic variance contributed by a QTL, whereas 2qi(1 − qi)ai[x]ai[y] is the corresponding covariance. Equations (1) are highly simplified but illustrate a fundamental feature of G-matrix components. Geneticists routinely identify a QTL as “major” if the estimated absolute value for ai[x] is large in a particular cross. However, such a QTL will make a distinctly minor contribution to genetic (co)variation if qi is close to 0 or 1.
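
The contrast between a large allelic effect and a large variance contribution can be made concrete with a small numerical sketch of equations (1). The function names and all allele frequencies and effects below are hypothetical values chosen for illustration. The code assumes, as the equations do, two alleles per QTL and no dominance, epistasis, or linkage disequilibrium.

```python
def qtl_variance(q, a):
    """Additive variance contributed by one biallelic QTL: 2q(1 - q)a^2."""
    return 2.0 * q * (1.0 - q) * a * a


def qtl_covariance(q, a_x, a_y):
    """Additive covariance contributed by one biallelic QTL: 2q(1 - q)a_x a_y."""
    return 2.0 * q * (1.0 - q) * a_x * a_y


def g_matrix(loci):
    """Sum QTL contributions over loci given as (q, a_x, a_y) tuples.

    Returns (VA[x], VA[y], CA[x,y]) following equations (1a) and (1b).
    """
    va_x = sum(qtl_variance(q, a_x) for q, a_x, _ in loci)
    va_y = sum(qtl_variance(q, a_y) for q, _, a_y in loci)
    ca_xy = sum(qtl_covariance(q, a_x, a_y) for q, a_x, a_y in loci)
    return va_x, va_y, ca_xy


# Hypothetical loci: a "major" QTL whose allele is rare, and a "minor" QTL
# whose allele is at intermediate frequency.
major_rare = (0.01, 2.0, 1.0)     # q = 0.01, a[x] = 2.0, a[y] = 1.0
minor_common = (0.50, 0.5, 0.25)  # q = 0.50, a[x] = 0.5, a[y] = 0.25

print("major, rare allele:  ", qtl_variance(major_rare[0], major_rare[1]))
print("minor, common allele:", qtl_variance(minor_common[0], minor_common[1]))
print("VA[x], VA[y], CA[x,y]:", g_matrix([major_rare, minor_common]))
```

With these illustrative numbers, the QTL with the fourfold larger effect on trait X contributes less to VA[x] than the minor QTL, because its allele is rare.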

Evolutionary biologists estimate genetic variances and covariances for a diversity of purposes. Principal objectives are to (1) quantitatively predict multitrait evolution and diversification (Hazel 1943; Lande 1979; Grant and Grant 1995; Steppan 1997), (2) provide a more general and qualitative assessment of evolutionary potential and constraint (Arnold 1992; Houle 1992), and (3) infer the evolutionary forces acting on trait variation (Mitchell-Olds et al. 2007). The inclusion of QTL mapping data can advance each of these objectives, although most effectively if allele frequencies are considered in concert with allelic effects. This can be done by estimating genetic (co)variances at the scale of individual QTL, and ideally, further decomposing these (co)variances into their component parts, allelic effects and frequencies. In the following sections, I first discuss how QTL covariance estimates can advance work toward each of the principal objectives listed above and then review experimental methods to obtain such estimates.

DYNAMICS OF THE G-MATRIX UNDER SELECTION

The most obvious and perhaps compelling reason to estimate G-matrix components is that these “whole-genome statistics” are adequate to predict immediate response to selection under quite general conditions (Lush 1937; Bulmer 1980; Turelli and Barton 1994). Additive genetic (co)variances are useful abstractions regardless of the number of QTL, the distribution of allelic effects, and even interactions (epistasis) or associations (linkage disequilibria) among loci. Beyond immediate response, however, a difficulty emerges. Trait means and genetic (co)variances are functions of the same underlying variables, QTL allele frequencies. Thus, while changes in trait means can be predicted from genetic (co)variances, these predictor variables are also evolving.
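
The prediction of immediate response referred to here is usually written as the multivariate breeder's equation, Δz̄ = Gβ (Lande 1979; Lande and Arnold 1983), where β is the vector of selection gradients. The text does not write this equation out, so the sketch below is an illustration under that standard formulation. The G-matrix and gradient values are hypothetical.

```python
def predicted_response(G, beta):
    """Return the predicted change in trait means, delta_z = G * beta.

    G is a square additive genetic (co)variance matrix given as a list of
    rows; beta is the vector of selection gradients.
    """
    return [sum(g_ij * b_j for g_ij, b_j in zip(row, beta)) for row in G]


# Hypothetical two-trait G-matrix: VA[x] = 1.0, VA[y] = 0.5, CA[x,y] = -0.4.
G = [[1.0, -0.4],
     [-0.4, 0.5]]

# Hypothetical selection gradients: direct selection on trait X only.
beta = [0.2, 0.0]

delta_z = predicted_response(G, beta)
print("predicted change in mean of X:", delta_z[0])
print("predicted change in mean of Y:", delta_z[1])
```

In this example, trait Y changes only as a correlated response through CA[x,y]. The prediction holds for a single generation, since G itself depends on the allele frequencies that selection is changing.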

Lande (1979, p. 405) emphasized that trait means typically change much more rapidly than (co)variances, particularly when variation is due to many minor QTL. However, the separation of time scales for means versus (co)variances begins to break down if there are QTLs with large contributions to (co)variation. Genetic (co)variances will also change rapidly under sustained selection if variation is due to rare alleles, even if all QTLs make small contributions (e.g., table 1 in Kelly 2008). This dependence of (co)variance evolution on the features of individual QTL, and not whole-genome aggregate statistics, motivates a genomic decomposition of the G-matrix.

Table 1. A summary of hypothesis tests appropriate to Known Regions QTLs: x and y index traits whereas q, i, and j index loci. The null hypothesis for I.(3) is specific to the case of two traits (X and Y) and the proportionality constant, τ, must be estimated from the data. For the null hypothesis of II.(2), the sum of λq across all QTLs and the genetic background is 1. See text for definitions of the (co)variance parameters.

I. A single QTL considered in relation to the genetic background
  Question | Null hypothesis
  1. Does a QTL contribute to genetic (co)variation? | Vq[x] = Cq[x,y] = 0 for all x, y
  2. Is a QTL sufficient to explain genetic (co)variation? | C*A[x,y] = 0
  3. Is the genetic (co)variation contributed by a QTL proportional to that of the background genome? | Vq[x] = τ V*A[x], Vq[y] = τ V*A[y], Cq[x,y] = τ C*A[x,y]

II. Multiple QTLs considered in relation to each other
  Question | Null hypothesis
  1. Are QTLs equivalent? | Vi[x] = Vj[x], Ci[x,y] = Cj[x,y] for all i, j, x and y
  2. Are the genetic (co)variances of different QTLs proportional? | Vq[x] = λq VA[x], Cq[x,y] = λq CA[x,y] for all q
  3. Do QTLs share a common correlation structure? | Cq[x,y] / √(Vq[x] Vq[y]) is the same for all q

The quantitative relationship between QTL features and G-matrix evolution is complicated. For a single quantitative trait under selection, the rate of change in allele frequency and mean effect of the QTL under selection should be at least roughly proportional to the QTL variance (Griffing 1960). Clearly, VA[x] will evolve more rapidly if variation in trait X is caused by a few major, rather than many minor, QTLs. However, the magnitude and direction of change in a specific QTL variance depends on the higher moments of the distribution of effects, e.g., skew and kurtosis (Barton and Turelli 1987). For example, a biallelic QTL with additive effects will contribute the same variance regardless of whether the high allele has frequency 0.2 or 0.8. However, selection for higher trait values will increase VA[x] with p = 0.2 (positive skew in the QTL effect distribution), but decrease VA[x] with p = 0.8 (negative skew). Extending to multiple traits, the numerical study of Bohren et al. (1966) suggests that genetic covariances may be even more sensitive than variances to changes in gene frequency brought about by selection. These authors argued that allele frequency asymmetry (sensu Falconer and Mackay 1996, pp. 212–213), and the consequent evolution of VA[x], is the most important cause of asymmetric responses to artificial selection.
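
The frequency 0.2 versus 0.8 example can be checked numerically. The sketch below assumes a single additive biallelic QTL whose variance contribution is 2q(1 − q)a^2, as in equation (1a). The size of the frequency shift is a hypothetical value standing in for one episode of selection favoring the high allele. No explicit selection model is simulated.

```python
def qtl_variance(q, a):
    """Additive variance contributed by a biallelic QTL: 2q(1 - q)a^2."""
    return 2.0 * q * (1.0 - q) * a * a


def variance_change(q, a, dq):
    """Change in the QTL variance when the high allele shifts from q to q + dq."""
    return qtl_variance(q + dq, a) - qtl_variance(q, a)


a = 1.0    # hypothetical additive effect of the high allele
dq = 0.05  # hypothetical increase in frequency caused by upward selection

for q in (0.2, 0.8):
    print("q =", q,
          "variance =", round(qtl_variance(q, a), 4),
          "change after selection =", round(variance_change(q, a, dq), 4))
```

The variance is identical at the two starting frequencies, yet the same upward shift in allele frequency increases it when the high allele is rare and decreases it when the high allele is common.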

When considering multiple traits, unambiguous analytical results are limited (see Turelli 1988). However, some simple qualitative arguments identify the nature and extent of pleiotropy as a critical feature of QTL. If pleiotropy is consistent across QTLs, for example if alleles that increase trait X also invariably increase trait Y, then directional selection on a single trait has predictable effects: CA[x,y] will change in magnitude but not direction as allele frequencies change. The situation is a bit more complicated with multitrait selection, particularly if favored traits are negatively correlated. Here, pleiotropy and selection impose fitness compromises on QTL alleles. Even if pleiotropy is consistent in direction, differences among QTL in the magnitude of allelic effects on traits can cause the high allele for trait X to increase in frequency at some QTL but decrease in frequency at others. These changes have conflicting effects on CA[x,y].

If the direction of pleiotropy is variable across QTL, selection can even change the sign of CA[x,y]; that is, a positive genetic correlation can become negative and vice versa. With this kind of pleiotropy, CA[x,y] is a net balance between QTLs contributing positively and QTLs contributing negatively. As selection alters allele frequencies, this balance shifts. Variable pleiotropy has been documented in QTL mapping of interpopulation crosses (e.g., Hall et al. 2006; Albert et al. 2008) and is also likely to be prevalent for intrapopulation variation (Mackay 1996). Given the central importance of variability among QTL in their respective contributions to (co)variation, I outline a series of simple tests of “QTL consistency” in the Hypothesis Testing section of this essay.
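
A two-locus sketch illustrates how the sign of CA[x,y] can flip when pleiotropy is variable. It uses the additive covariance sum 2q(1 − q)a[x]a[y] over loci (no dominance, epistasis, or linkage disequilibrium). The allelic effects and the "before" and "after" allele frequencies are hypothetical. The frequency change is simply assumed, as might follow selection for higher values of trait X, rather than derived from a selection model.

```python
def genetic_covariance(loci):
    """CA[x,y] as the sum over loci of 2q(1 - q) a_x a_y.

    Each locus is a (q, a_x, a_y) tuple, where q is the frequency of the
    allele that increases trait X.
    """
    return sum(2.0 * q * (1.0 - q) * a_x * a_y for q, a_x, a_y in loci)


# Locus A: the high-X allele also increases Y (positive pleiotropy).
# Locus B: the high-X allele decreases Y (negative pleiotropy).
before = [(0.5, 1.0, 1.0),    # locus A at intermediate frequency
          (0.1, 1.0, -1.0)]   # locus B rare

# Hypothetical frequencies after the high-X alleles have increased at both loci.
after = [(0.9, 1.0, 1.0),     # locus A now near fixation
         (0.5, 1.0, -1.0)]    # locus B now at intermediate frequency

print("CA[x,y] before:", round(genetic_covariance(before), 4))
print("CA[x,y] after: ", round(genetic_covariance(after), 4))
```

Before the frequency change, the positively pleiotropic locus dominates the sum. Afterwards the negatively pleiotropic locus does, so the net covariance changes sign even though no allelic effect has changed.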

At least one important caveat applies to the arguments of the preceding paragraphs. The overall genetic (co)variance is a simple sum across QTL only if there is no epistasis and no linkage disequilibrium. If epistasis is absent, the linkage disequilibria among QTL have tractable effects on genetic (co)variances (e.g., Bulmer 1980, ch. 9). Predictably, few general results are available with epistasis. However, QTL mapping techniques at least provide the opportunity for more detailed characterization of genetic interactions. For example, in a study of Arabidopsis thaliana, Kroymann and Mitchell-Olds (2005) show that the high allele at a QTL for biomass accumulation in one genetic background becomes the low allele when assayed in another genetic background. This sort of epistasis, where the average effects of alleles at a QTL will change as the genetic background evolves, would seem most likely to accelerate changes in G-matrix components. An important challenge for future studies, both theoretical and experimental, is to determine how genomic information on QTL–QTL interactions can be directly brought to bear on the rate and pattern of quantitative character evolution.

DIVERSIFICATION OF THE G-MATRIX OVER LONGER TIME SCALES

The preceding section concerned immediate response to selection based on standing variation. Over longer time scales, we must also consider the input of novel variation via mutation and its loss due to genetic drift. Given the difficulty of directly discriminating the joint effects of mutation, selection, and genetic drift, researchers have resorted to comparative studies of G-matrix evolution. A substantial set of statistical techniques has been developed and subsequently applied to contrast G-matrix estimates from closely related populations and/or species (Steppan et al. 2002; Bégin and Roff 2003).

G-matrix contrasts between populations can be substantially refined by inclusion of QTL mapping data. If alternative alleles at a QTL can be discerned, it is straightforward to decompose interpopulation differences in a genetic (co)variance into a component attributable to the QTL and that owing to the remainder of the genome. This can be accomplished by genotyping each individual in the breeding design that is used to estimate G-matrices at the QTL. This genotyping directly estimates QTL allele frequencies within each population and the cosegregation of QTL variation with phenotypic variation allows allelic effects to be estimated (see below). Given ai[x] for all relevant x, interpopulation differences in allele frequency (qi) combined with equation (1) immediately identify the contribution of QTL i to differences in G-matrices. Clearly, if we find that change at a single QTL substantially alters the genetic correlation, then such a correlation might represent a rather weak constraint on multitrait diversification over longer time scales.
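
The decomposition described in this paragraph can be sketched as follows. The code assumes a biallelic QTL with the same additive effects in both populations (no epistasis), so that its contribution to each G-matrix follows equation (1). The whole-genome (co)variance estimates, allele frequencies, and allelic effects are hypothetical, and the function names are illustrative.

```python
def qtl_contribution(q, a_x, a_y):
    """(Co)variance contributed by a biallelic QTL at frequency q (equation 1)."""
    h = 2.0 * q * (1.0 - q)
    return {"Vx": h * a_x ** 2, "Vy": h * a_y ** 2, "Cxy": h * a_x * a_y}


def partition_difference(g_pop1, g_pop2, q1, q2, a_x, a_y):
    """Split the difference G(pop2) - G(pop1) into QTL and background parts."""
    c1 = qtl_contribution(q1, a_x, a_y)
    c2 = qtl_contribution(q2, a_x, a_y)
    parts = {}
    for key in ("Vx", "Vy", "Cxy"):
        total = g_pop2[key] - g_pop1[key]
        qtl = c2[key] - c1[key]
        parts[key] = {"total": total, "qtl": qtl, "background": total - qtl}
    return parts


# Hypothetical whole-genome estimates for two populations.
g_pop1 = {"Vx": 1.00, "Vy": 0.80, "Cxy": 0.30}
g_pop2 = {"Vx": 1.20, "Vy": 0.85, "Cxy": 0.10}

# Hypothetical QTL: allele frequency 0.1 in population 1 and 0.5 in
# population 2, with antagonistic effects on the two traits.
parts = partition_difference(g_pop1, g_pop2, q1=0.1, q2=0.5, a_x=0.6, a_y=-0.5)

for key, value in parts.items():
    print(key, {name: round(v, 4) for name, v in value.items()})
```

In this made-up case, roughly half of the interpopulation difference in CA[x,y] is attributable to the frequency change at the one QTL. In practice each term carries estimation error that this arithmetic ignores.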

Several complications related to comparing QTL (co)variances among G-matrices merit comment. First, ai[x] can differ between taxa if there is epistasis. Fortunately, this is a testable proposition if the QTL is polymorphic in both populations, or can be made polymorphic by introgression. Then we have a simple test for interaction: Does the difference between QTL genotypes change with genetic background? A more general problem is that most mapping studies do not identify QTL alleles, but instead the genomic region that harbors a QTL (the distinction between Known Alleles and Known Regions is elaborated below in the Hypothesis Testing section). If a particular region contributes to (co)variation in one population but not another, then this QTL clearly contributes to a difference in their G-matrices. However, a QTL can also contribute to (co)variation in both populations while differing between them, and this can happen in several distinct ways. A QTL-specific difference might result from epistasis (the same polymorphism in different genetic backgrounds), from a difference in allele frequency between populations, or because different, but closely linked, loci are polymorphic within each population. Here, the inclusion of an interpopulation cross with associated genotyping of the QTL region may clarify the nature of the difference.

Concurrent with the accumulation of comparative studies, theoretical biologists have been investigating the long-term balance of mutation, selection, and genetic drift on G-matrix evolution using stochastic simulation (e.g., Jones et al. 2003; Revell 2007; see references in Arnold et al. 2008). Arnold et al. (2008) recently reviewed this literature and concluded that important regularities in multitrait evolution can be predicted from “macroscopic statistics” such as the adaptive landscape (Simpson 1944; Lande 1979) and the mutational variance–covariance matrix (the M-matrix, Lande 1980). The proposition that long-term evolution, and consequent phenomena such as adaptive radiation, can be understood without recourse to the dynamics of individual QTL is certainly attractive. However, the extent of this understanding—exactly what can be predicted and how accurately?—as well as the generality of predictions across different models of mutation and selection remains an open question. A second important direction for future simulation studies will be to fully articulate predictions that can be tested with QTL mapping experiments. Mapping can directly address predictions about the genetic basis of segregating variation and divergence among taxa.

VARIANCE COMPONENTS AND THE MAINTENANCE OF GENETIC VARIATION

Most of the interest in the G-matrix centers on its use for prediction. However, other questions focus on the components themselves. Most quantitative characters are linked to fitness in some way, although this relationship may be very complicated and indirect. As a consequence, hypotheses about the maintenance of variation can be broadly classified as either mutation–selection models or balancing selection models. Both categories have mutation as the ultimate source of variation but differ in the role of natural selection. Selection is purifying in mutation–selection models, as it has a net negative effect on variability. By definition, balancing selection actively maintains polymorphism at QTLs through a variety of hypothesized mechanisms, for example simple overdominance, antagonistic pleiotropy, environmental variation combined with G×E interaction, etc. Here, it is worth noting that balancing selection is entirely different from stabilizing selection, even though these terms are often confused in the literature. Because stabilizing selection at the phenotypic scale tends to erode genetic variation, it is more naturally categorized as a mutation–selection model than a balancing selection model.

Mutation–selection and balancing selection models differ in testable predictions about the relative magnitudes of different genetic (co)variances. The integration of QTL data can greatly increase the power of variance-based model tests and thus shed new light on the old question of why organisms vary. For example, most balancing selection models invoke some kind of genetic trade-off. An allele that is advantageous in one environment or with respect to one fitness component is disadvantageous under alternative circumstances (environments or fitness components). If such trade-offs are consistent across loci, then these models can be scaled up to predict negative genetic covariances. This has spurred researchers to estimate genetic covariances between different fitness components (e.g., Rose and Charlesworth 1981) and between the same component across different environments (e.g., Rausher 1984). Unfortunately, a number of practical difficulties confront the use of whole-genome covariances (CA[x,y]) to infer trade-offs at individual loci. For example, variation in general vigor can easily overwhelm the signal of individual QTL (Fry 1993).

The trade-offs hypothesized by balancing selection models are directly evaluated by estimating genetic covariances at the scale of QTL. An appropriate experiment can isolate the contribution of particular QTL, not only from the confounding environmental effects that cause the “general vigor problem,” but also from the effects of other genomic regions. The latter is important because even if balancing selection is maintaining most of the genetic variation in a quantitative trait, it is not likely that the same trade-off applies across QTL. For example, antagonistic pleiotropy might maintain the polymorphism at one locus affecting a trait, whereas G×E interaction maintains variation at another. With variation in the nature of fitness trade-offs across QTL, prevalent balancing selection might still produce only weak whole-genome covariances.

A different approach to the question takes mutation–selection balance as the null model and asks whether it is sufficient to explain variation. Mutation–selection models make direct predictions regarding the absolute magnitudes of additive genetic (co)variances, their magnitudes relative to nonadditive variance components, and the response of populations to inbreeding and/or artificial selection (Morton et al. 1956; Deng and Lynch 1996; Curtsinger and Ming 1997; Kelly 1999). Application of these methods to fitness-related traits in both fruit flies (Charlesworth and Hughes 2000; Charlesworth et al. 2007) and monkeyflowers (Kelly and Willis 2001; Kelly 2003) indicates an excess of additive genetic variation relative to the expectation under mutation–selection balance. The implication is that there are intermediate frequency polymorphisms, as yet undiscovered, affecting these traits and thus worth mapping.

EXPERIMENTAL DESIGN FOR ESTIMATION OF QTL (CO)VARIANCES

QTL (co)variances can be estimated by incorporating molecular marker genotyping into the analysis of breeding designs initiated from randomly selected genotypes. By necessity, human geneticists have developed and subsequently applied a variety of statistical methods to mapping QTL within outbred full-sibling families (Elston and Spence 2006; Visscher et al. 2007). The “variance component method” of QTL detection directly estimates the genetic (co)variance contributed by a QTL (Goldgar 1990; Blangero et al. 2001). Given that these analytical techniques produce estimates that are directly related to the G-matrix, I suggest that they should be used more fully by evolutionary biologists.

In this section, I review the variance component method and discuss how it can be melded with the replicated genotype and/or inbred line procedures that are routinely used in evolutionary genetics. In essence, the general linear model from statistical mathematics is employed to distinguish the genetic covariance among relatives that is caused by a specific genomic region from that due to the remainder of the genome. The phenotypic value is parsed into three components: QTL, background genome, and environmental effect. Written as an equation,

  \[ \mathbf{z} = \mathbf{g}' + \mathbf{g}^{*} + \mathbf{e} \tag{2} \]

where z is the vector of phenotypic values for an individual, g′ is the vector of QTL effects on each trait, g* is the vector of background genotypic values, and e is the vector of environmental deviations. As emphasized by Goldgar (1990), g′ can characterize the aggregate effects of an entire genomic region and not merely a single genetic locus. These regions might be as large as entire chromosomes or chromosome arms, with full haplotypes as the analog of alternative alleles.

Variance–covariance matrices are associated with both g′ and g*, each of which can be decomposed into additive and dominance components (eq. 2 assumes no epistasis). Let G′ denote the additive genetic covariance matrix associated with the QTL. In this matrix, the QTL variances for each trait, Vq[x], are the diagonal elements, and the covariances, Cq[x,y], are on the off-diagonal. Likewise, let G* denote the additive genetic covariance matrix associated with the remainder of the genome: V*A[x] on the diagonal and C*A[x,y] on the off-diagonal. Relating this notation to equation (1), VA[x]=V*A[x]+Vq[x] and CA[x,y]=C*A[x,y]+Cq[x,y]. When fitting equation (2) to data, the (co)variance contribution of the QTL, Vq[x] or Cq[x,y], is diagnosed by closely linked genetic markers that indicate the immediate parentage of alleles. QTL effects elevate the similarity of siblings that share alleles Identical by Descent (see Lynch and Walsh 1998, ch. 16; Blangero et al. 2001).
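
The way identity by descent separates the QTL from the background can be sketched for outbred full sibs. The sketch assumes the usual additive variance-component form, in which the expected genetic covariance between two relatives is (proportion of alleles shared IBD at the QTL) × (QTL (co)variance) plus (expected relatedness) × (background (co)variance), with dominance ignored. This formula is not written out in the text, and the covariances for the Replicated F2 design differ from it. All numerical values and names are hypothetical.

```python
def expected_sib_covariance(pi_ibd, qtl_cov, background_cov, relatedness=0.5):
    """Expected additive genetic covariance between two relatives.

    pi_ibd is the proportion of alleles shared identical by descent at the
    QTL (0, 0.5, or 1 for full sibs), as inferred from linked markers.
    relatedness is the expected genome-wide proportion shared (0.5 for
    full sibs). Dominance and epistasis are ignored in this sketch.
    """
    return pi_ibd * qtl_cov + relatedness * background_cov


# Hypothetical elements of G' (QTL) and G* (background) for traits X and Y.
G_qtl = {"Vx": 0.20, "Vy": 0.10, "Cxy": 0.12}
G_background = {"Vx": 0.80, "Vy": 0.60, "Cxy": 0.15}

# The whole-genome G-matrix is the element-wise sum, e.g. VA[x] = V*A[x] + Vq[x].
G_total = {key: G_qtl[key] + G_background[key] for key in G_qtl}
print("whole-genome G:", {key: round(v, 4) for key, v in G_total.items()})

# Sib pairs sharing more alleles IBD at the QTL are expected to be more similar.
for pi_ibd in (0.0, 0.5, 1.0):
    cov_x = expected_sib_covariance(pi_ibd, G_qtl["Vx"], G_background["Vx"])
    print("IBD proportion", pi_ibd, "-> expected sib covariance for X:", round(cov_x, 4))
```

The slope of sib covariance on IBD sharing is what identifies Vq[x]. The same function applied to the "Cxy" entries gives the expected cross-trait covariance between sibs.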

Equation (2), coupled with a methodology for model fitting such as maximum likelihood, has been applied primarily to outbred populations. However, naturally synthesized outbred families are neither essential nor optimal for estimating QTL (co)variances. Instead, it is the random sampling of parents that is essential. As explained below, experimental designs based on random founders but that employ controlled crosses and/or inbreeding have favorable statistical features. An example is the Replicated F2 design depicted in Figure 2. The Replicated F2 design elaborates the standard inbred line cross: A collection of randomly extracted inbred lines are individually crossed to a common reference line (RL, also fully homozygous). Randomly extracted lines are representative of the background natural population in terms of allele frequencies at QTL. Each cross produces F1 and F2 progeny, which are subsequently genotyped at markers closely linked to the QTL, and measured for the relevant quantitative traits. QTL covariances can then be estimated from the cosegregation of marker alleles with phenotypic variation—the relevant covariances among relatives for this design are described in Appendix 1.

Figure 2. The Replicated F2 mapping design: A collection of n randomly extracted inbred lines is each crossed to a common reference line (RL). Each cross produces F1s that are subsequently intracrossed (or self-fertilized) to produce an F2 family. Arrows represent transmission of a gamete.

One advantage of the Replicated F2 design concerns “marker informativeness,” which is often limiting for QTL mapping in outbred sib families (Lynch and Walsh 1998, ch. 16). As there are four distinct QTL alleles potentially segregating in sib families (two from each parent), a fully informative marker locus requires four discernable alleles within each family. Except with the most highly polymorphic marker loci, this is an infrequent occurrence. In contrast, the F2 families of Figure 2 will have only two alleles segregating at both QTL and linked markers because the parents are fully homozygous. Here, the ideal marker locus is one where the RL is homozygous for an allele that is rare in the entire population. At such a locus, the RL will mismatch the great majority of random lines, and as a consequence, this marker effectively “tags” the QTL in most F2 families. Even without such an ideal marker, a small collection of moderately polymorphic markers in close proximity to the QTL should suffice to estimate Vq[x] and/or Cq[x,y].
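
The value of a rare RL marker allele can be put in rough numbers. Because every random line is fully homozygous, a marker distinguishes the RL from a given line whenever that line carries a different allele, which happens with probability one minus the population frequency of the RL allele. The second function below extends this to several markers under the strong and purely illustrative assumption that they are statistically independent. Closely linked markers will often be in linkage disequilibrium, so that figure should be read as optimistic. All frequencies are hypothetical.

```python
def fraction_tagged(p_rl_allele):
    """Expected fraction of F2 families in which one marker is informative.

    p_rl_allele is the frequency, among random inbred lines, of the marker
    allele carried by the reference line (RL).
    """
    return 1.0 - p_rl_allele


def fraction_tagged_any(p_rl_alleles):
    """Fraction of families tagged by at least one of several markers.

    Assumes the markers are independent (an illustrative simplification that
    ignores linkage disequilibrium among tightly linked markers).
    """
    prob_all_match = 1.0
    for p in p_rl_alleles:
        prob_all_match *= p
    return 1.0 - prob_all_match


# Ideal case: the RL carries an allele that is rare in the population.
print("single rare-allele marker:", round(fraction_tagged(0.05), 4))

# Fallback: three moderately polymorphic markers near the QTL.
print("three moderate markers:   ", round(fraction_tagged_any([0.5, 0.6, 0.4]), 4))
```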

A second important feature of the Replicated F2 design is that three of the four sub-family types are internally homogeneous. Individuals within the RL, within each Random Line, and within each F1 are genetically equivalent. This allows replicated measurements of phenotype for the same genotype, reducing environmental noise, and subsequently increasing power for both estimation and hypothesis testing (Tanksley 1993). Inbreeding also typically increases the genetic variance and hence the signal associated with a QTL. Of course, it also complicates the covariance of relatives when there is dominance (Cockerham and Weir 1984) and mapping methods need to accommodate these complications (e.g., Verhoeven et al. 2006).

The Replicated F2 design provides a nice illustration because it is a natural extension of the standard method for interspecies QTL mapping (e.g., Fishman et al. 2002). However, the basic advantages noted above are not specific to the Replicated F2 design. Other breeding schemes may be more appropriate in specific situations. For example, the Maize Diversity Group has extended the design of Figure 2 by developing Recombinant Inbred Lines from each replicate cross to the RL (http://www.panzea.org; Liu et al. 2003; Yu et al. 2008). Apart from the inability to estimate dominance, this design is almost certainly more powerful than the Replicated F2. Mott and Flint (2002) investigate a design in which an inbred RL is crossed to a large number of outbred individuals from a genetically heterogeneous population. If an RL is not available (or desired), then a mapping population can be produced by intercrossing randomly extracted inbred lines (e.g., Kelly and Arathi 2003; Churchill et al. 2004).

HYPOTHESIS TESTING

This section outlines a simple hypothesis testing framework that can be used to relate genetic estimates to evolutionary questions. For this aim, it is first necessary to delineate two distinct situations, denoted “Known Alleles” and “Known Regions,” that differ in what is known about the QTL. Known Alleles describes cases in which mapping identifies alternative alleles at the QTL. This clearly applies when the QTL has been fine mapped to causal sites or closely linked diagnostic markers (e.g., Frary et al. 2000; Palsson and Gibson 2004; Stinchcombe et al. 2004; Hoekstra et al. 2006). However, it also includes situations in which one can directly score the alternative states of a polymorphic inversion or other chromosomal features with phenotypic effects (Dobzhansky 1970, ch. 5). Simple phenotypic polymorphisms can also be treated as Known Alleles QTL. For example, a flower color polymorphism in the common morning glory has pleiotropic effects on quantitative fitness traits and the alternative genotypes are visually discernible (Coberly and Rausher 2008; see also Levin and Brack 1995; Schemske and Bierzychudek 2001).

For a Known Region QTL, mapping has established a genomic region containing a QTL but markers do not identify alternative QTL alleles. This situation is the most immediate product of most mapping studies. Cosegregation of markers with phenotypes within a line cross identifies a genomic segment (often defined by flanking markers) that contains a QTL. However, the associations between marker alleles and the causal polymorphism are specific to that cross. In the background population from which the line-cross parents were sampled, the same marker allele may be associated with high trait values in some families but low trait values in others. Genotyping of random individuals at marker loci thus does not identify the specific QTL genotype. This is a frequently noted problem with outbred population mapping. It is a primary reason why QTL mapping of intrapopulational variation is more challenging than the characterization of fixed differences between taxa.

The estimation of QTL (co)variances, as well as any hypothesis tests conducted on these estimates, differs between Known Alleles and Known Regions. One can directly estimate the components of QTL (co)variances with Known Alleles, that is allelic effects and frequencies. In model fitting, g′ of equation (2) is actually treated as a vector of fixed effects (the ai[*] from eq. 1). Genotyping of the parents provides a direct estimate of population allele frequencies (qi from eq. 1), as long as the parents are randomly sampled. In contrast, g′ must be treated as a vector of random variables with Known Regions. QTL variances (Vq[x]) and covariances (Cq[x,y]) are directly estimated as parameters (Goldgar 1990; Blangero et al. 2001). They cannot be decomposed into ai[*] and qi without additional information, an idea elaborated in the section Direct estimation of QTL allele frequencies.

For Known Regions, there is already a substantial body of statistical theory focused on optimizing experimental design to determine whether Vq[x] is nonzero for a single quantitative trait (e.g., Knott and Haley 1992; Luo 1993; Muranty 1996). However, several interesting hypotheses are specific to multitrait data. Line-cross studies routinely identify major QTL with pleiotropic effects (e.g., Hall et al. 2006; Albert et al. 2008). One can then ask whether such a polymorphism is sufficient to explain the genetic covariance between traits. In contrast to the QTL-specific tests, this idea is formalized as a null hypothesis by imposing restrictions on the parameters for the genetic background (g* of eq. 2). With two traits (X and Y), the hypothesis test contrasts a model in which C*A[x,y] is constrained to zero with the more general alternative model that allows C*A[x,y] nonzero values (see table 1). If the genetic background does contribute to genetic covariance, one can ask if QTL and background have consistent effects on each trait. In other words, is the QTL contribution to traits proportional to the contribution of the background genome? The parameter constraints for the null model with two traits (X and Y) and one QTL are Vq[x] = τ V*A[x], Vq[y] = τ V*A[y], and Cq[x,y] = τ C*A[x,y]. Here, τ is a constant to be estimated from the data.

Equation (2) is easily generalized to treat multiple QTL (Blangero et al. 2001, p. 155), and although most mapping studies focus on one QTL at a time, evolutionary biology can profit from a broader perspective on hypothesis testing. This can start with very simple models for the entire genetic (co)variance. Perhaps the simplest model one could pose is the so-called “hypergeometric model” (Kondrashov 1985; Zeng 1987; Barton 1992; Shpak and Kondrashov 1999), in which QTLs are equivalent in allele frequencies and genotypic effects. Statistically, this can be imposed by constraining QTL across the genome to equivalent parameter values (table 1, lower panel). Similar constraints can be defined among different QTLs to provide whole-genome models that are less restrictive than the hypergeometric model. For example, one can allow the magnitudes of (co)variances to vary among QTLs, but with constraint on their relative proportions. One step more permissive would be to allow the trait-specific variances to vary among QTLs, but impose a common correlation coefficient. In all cases, the sufficiency of a reduced “null” model is evaluated by comparison to a more general, unconstrained model, using likelihood ratios or other model comparison techniques.
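The comparison of a constrained null model to a more general alternative can be sketched as a likelihood ratio test. The following is a minimal illustration rather than the procedure used for the analyses in this essay: the function names and the log-likelihood values are hypothetical, and the reference distribution is assumed to be a simple chi-square with degrees of freedom equal to the difference in the number of free parameters. For the proportionality null with two traits and one QTL, three QTL parameters are replaced by the single constant τ, which by that count gives two degrees of freedom. When the null places a variance component on the boundary of its parameter space, the simple chi-square reference may be conservative and a mixture distribution may be more appropriate.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed forms for df = 1 and df = 2, then the standard recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), with a = df / 2.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return (statistic, p-value) for nested models fit by maximum likelihood."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for the proportionality null versus the
# unconstrained model (illustrative numbers, not results from the essay).
stat, p_value = likelihood_ratio_test(loglik_null=-1520.4, loglik_full=-1516.1, df=2)
print(round(stat, 2), round(p_value, 4))
```

The same function applies to the other constraints discussed above once the difference in free parameters has been counted for the pair of models being compared.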

The series in table 1 is certainly not exhaustive. It is merely a sampling of simple hypotheses about the “architecture” of quantitative genetic variation. The key difference from previous summaries about how QTL mapping can address questions about genetic architecture is that the parameters in table 1 are population statistics. Vq[x] and Cq[x,y] are the constituents of the G-matrix. They explicitly link allele frequency to phenotypic variation and evolution (eq. 1). Because evolutionary processes (mutation, selection, genetic drift, and migration) are typically characterized at the scale of allele frequency change, this framework provides a means to connect QTL experiments with larger evolutionary questions.

DIRECT ESTIMATION OF QTL ALLELE FREQUENCIES

The preceding sections argue that QTL (co)variances are a worthwhile target for experimental estimation. For some questions, however, we would like to go beyond the QTL covariance and estimate its component parts: allelic effects and population frequencies. For example, the simplest and most general way to distinguish mutation–selection balance from balancing selection is by estimating QTL allele frequencies. Mutation–selection balance predicts that variation is caused mainly by rare alleles, whereas balancing selection predicts intermediate allele frequencies.

With the advent of high-throughput genotyping methods, a new set of techniques has emerged that allow simultaneous estimation of QTL effects and allele frequencies. Advanced intercross methods are similar to line-cross designs except that (1) more than two parents initiate the mapping population, and (2) there are multiple, oftentimes many, generations of intercrossing prior to marker–trait association (e.g., Mott and Flint 2002; Valdar et al. 2006). Macdonald and Long (2007) recently employed an advanced intercross design to map variation in bristle number of fruit flies. Their two mapping populations were each synthesized from eight ancestral inbred lines and subsequently allowed to undergo many rounds of recombination. With a high density of informative markers, these authors were able to determine the ancestor for each chromosomal region and estimate allele frequency by determining what fraction of the original eight chromosomes were high versus low in their effects.

Advanced intercross methods are designed to combine fine mapping with a broad sampling of genetic variation (Churchill et al. 2004). As emphasized in the Introduction, sampling is a critical issue if these methods are to be adopted by evolutionary biologists. Genetic effects are statistical random effects, and thus informative about the G-matrix, only if the founders of a breeding design are randomly sampled from the natural population. If founders are sampled based on location or phenotype, then genotypic differences must be treated as fixed effects and the resulting breeding design will not quantitatively reflect any real population.

Agricultural geneticists have developed procedures to estimate allele frequency without fine mapping (Bovenhuis and Weller 1994; Mackinnon and Weller 1995; Weller et al. 2002). The Replicated F2 design (Fig. 2) also yields simple estimators for both allele frequency and allelic effects when the QTL is known to have two alleles. If the RL allele has population frequency q and additive effect a[x] on trait X, then Vq[x] = 2(1 − q)q a[x]² and h0[x] = a[x](1 − q). The parameter h0[x], the average phenotypic difference between Reference and Random lines owing to the QTL, is estimated as a fixed effect in this design (see Appendix 1). Inverting these equations yields estimators for a[x] and q from Vq[x] and h0[x]. Measuring multiple traits affected by the QTL, and thus also estimating Cq[x,y], improves the accuracy of estimation.
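The inversion can be made explicit. Dividing Vq[x] by h0[x]² gives 2q/(1 − q), so that q = Vq[x]/(Vq[x] + 2h0[x]²) and a[x] = h0[x] + Vq[x]/(2h0[x]). The sketch below applies these expressions to a single trait. The function names are illustrative, and the covariance expression Cq[x,y] = 2q(1 − q)a[x]a[y] is the standard form for an additive biallelic locus, assumed here to match equation (1). In practice the inputs would be estimates with sampling error, so the outputs are point estimates only.

```python
def qtl_covariance(q, a_x, a_y):
    """Additive (co)variance from a biallelic QTL: 2 q (1 - q) a_x a_y.

    With a_x == a_y this is the QTL variance Vq[x] for a single trait.
    """
    return 2.0 * q * (1.0 - q) * a_x * a_y


def invert_replicated_f2(v_q, h0):
    """Recover (q, a) from Vq = 2 q (1 - q) a**2 and h0 = a (1 - q)."""
    if h0 == 0:
        raise ValueError("h0 must be nonzero to separate q from a")
    q = v_q / (v_q + 2.0 * h0 ** 2)
    a = h0 + v_q / (2.0 * h0)
    return q, a


# Round trip with illustrative values: a rare RL allele (q = 0.05)
# with unit additive effect on the trait.
q_true, a_true = 0.05, 1.0
v_q = qtl_covariance(q_true, a_true, a_true)
h0 = a_true * (1.0 - q_true)
print(invert_replicated_f2(v_q, h0))
```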

The efficiency of QTL allele frequency estimation from the Replicated F2 design is illustrated by Figure 3. Data were repeatedly simulated for an experiment with 200 families, each composed of six individuals from the Random Line, six F1s, and 12 F2 individuals (see Fig. 2). Two traits were measured per individual combined with genotyping of fully informative markers for a Known Region QTL. For each simulated dataset, equation (2) was fit by likelihood (methods described in Appendix 2). Figure 3 considers a QTL of moderate effect with two different values for allele frequency: the RL QTL allele is uncommon in the background population (q= 0.05, red bars) or the two alleles are equally frequent (q= 0.5, green bars). With q= 0.05, this polymorphism explains about 1% of the variance in the first trait and 5% of the variance in the second. With q= 0.5, the QTL explains about 6% and 20%, respectively.

Figure 3. The distribution of estimates is given for two distinct sets of simulations of the Replicated F2 design. Each distribution is composed of 200 replicate experiments. Red bars denote results where the true frequency of the reference line (RL) allele is 0.05 whereas green bars are for cases in which q= 0.5.

The substantial gap between the distributions of q estimates in Figure 3 implies that intermediate frequency polymorphisms can be clearly distinguished from rare-allele polymorphisms. Simulations based on only 40 families per replicate (about 1000 individuals) indicate that smaller experiments are sufficient to discriminate intermediate frequency polymorphisms from rare alleles, although there is greater variance in the sampling distributions of estimates (unpublished results). Although these results are clearly promising, the critical assumption for this method is that the target region contains a single biallelic causal polymorphism. The extent to which fine mapping is necessary for allele frequency estimation is an important question for future work.

SUMMARY AND CONCLUSIONS

In this essay, I have argued that QTL allele frequencies are the essential variables relating the G-matrix to QTL mapping experiments. Integration thus requires experimentalists to consider allele frequency. This consideration may be implicit, simply by ensuring appropriate sampling of parents so that a breeding design reflects the genetic composition of a natural population. Ideally, it is explicit, wherein marker–trait associations are used to directly estimate population allele frequencies.

The oft-stated goal of QTL mapping is to identify the genes, and ultimately the nucleotide level differences (QTN), that contribute to variation in quantitative traits (Mackay 2001, p. 319). This is a worthwhile but very difficult objective to achieve. The most immediate consequence of most mapping experiments is to fracture genomic variation into components. Although these components are large relative to individual coding genes (mapped QTL typically contain hundreds of genes), they may be quite small relative to the genome as a whole (usually 2–10% depending on the recombinational map length of the organism). Important questions can be addressed at this intermediate stage of genomic decomposition. Is the whole, that is the genetic (co)variance of traits, a simple sum of its genomic parts? Are these parts similar or dissimilar in their respective genetic features? These features, which include the magnitude of phenotypic effects, the extent and nature of pleiotropy, and the distribution of allele frequencies and dominance relationships, are critical determinants of sustained response to selection. They are essential to predict how the G-matrix will evolve with noninfinitesimal changes in allele frequency. As a consequence, the integration of QTL mapping methods into G-matrix estimation can provide a concrete, genetically based experimental program to investigate the evolvability of natural populations.

Associate Editor: M. Rausher

ACKNOWLEDGMENTS

I thank A. Scoville, S. Macdonald, V. Koelling, J. Mojica, J. Willis, Y. W. Lee, J. Gleason, M. Rausher, S. Arnold, and an anonymous reviewer for constructive and insightful comments on early drafts of this article. J. Sukumaran and M. Holder provided access to computing resources for simulation. This research was supported by NIH grant GM073990 and NSF grant DEB-0543052.

LITERATURE CITED

  • Albert, A. Y. K., S. Sawaya, T. H. Vines, A. K. Knecht, C. T. Miller, B. R. Summers, S. Balabhadra, D. M. Kingsley, and D. Schluter. 2008. The genetics of adaptive shape shift in stickleback: pleiotropy and effect size. Evolution 62:76–85.
  • Arnold, S. 1992. Constraints on phenotypic evolution. Am. Nat. 140:S85–S107.
  • Arnold, S. J., R. Bürger, P. A. Hohenlohe, B. C. Ajie, and A. G. Jones. 2008. Understanding the evolution and stability of the G-matrix. Evolution 62:2451–2461.
  • Barton, N. H. 1992. On the spread of new gene combinations in the 3rd phase of Wright shifting-balance. Evolution 46:551–557.
  • Barton, N. H., and M. Turelli. 1987. Adaptive landscapes, genetic distance and the evolution of quantitative characters. Genet. Res. 49:157–174.
  • Bégin, M., and D. A. Roff. 2003. The constancy of the G matrix through species divergence and the effects of quantitative genetic constraints on phenotypic evolution: a case study in crickets. Evolution 57:1107–1120.
  • Blangero, J., J. T. Williams, and L. Almasy. 2001. Variance component methods for detecting complex trait loci. Adv. Genet. 42:151–181.
  • Bohren, B. B., W. G. Hill, and A. Robertson. 1966. Some observations on asymmetrical correlated responses to selection. Genet. Res. 7:44–&.
  • Bovenhuis, H., and J. I. Weller. 1994. Mapping and analysis of dairy-cattle quantitative trait loci by maximum-likelihood methodology using milk protein genes as genetic-markers. Genetics 137:267–280.
  • Bradshaw, H. D. Jr, K. G. Otto, B. E. Frewen, J. K. McKay, and D. W. Schemske. 1998. Quantitative trait loci affecting differences in floral morphology between two species of monkeyflower (Mimulus). Genetics 149:367–382.
  • Bulmer, M. G. 1980. The mathematical theory of quantitative genetics. Clarendon Press, Oxford.
  • Charlesworth, B., and K. A. Hughes. 2000. The maintenance of genetic variation in life history traits. Pp. 369–392 in R. S. Singh and C. B. Krimbas, eds. Evolutionary genetics from molecules to morphology. Cambridge Univ. Press, Cambridge, U.K.
  • Charlesworth, B., T. Miyo, and H. Borthwick. 2007. Selection responses of means and inbreeding depression for female fecundity in Drosophila melanogaster suggest contributions from intermediate-frequency alleles to quantitative trait variation. Genet. Res. 89:85–91.
  • Churchill, G., D. Airey, H. Allayee, J. Angel, A. Attie, J. Beatty, W. Beavis, et al. 2004. The collaborative cross, a community resource for the genetic analysis of complex traits. 36:1133–1137.
  • Coberly, C. L., and M. D. Rausher. 2008. Pleiotropic effects of an allele producing white flowers in Ipomoea purpurea. Evolution 1076–1085.
  • Cockerham, C. C., and B. S. Weir. 1984. Covariances of relatives stemming from a population undergoing mixed self and random mating. Biometrics 40:157–164.
  • Colosimo, P. F., K. E. Hosemann, S. Balabhadra, G. Villarreal, M. Dickson, J. Grimwood, J. Schmutz, R. M. Myers, D. Schluter, and D. M. Kingsley. 2005. Widespread parallel evolution in sticklebacks by repeated fixation of ectodysplasin alleles. Science 307:1928–1933.
  • Curtsinger, J. W., and R. Ming. 1997. Non-linear selection response in Drosophila: a strategy for testing the rare-alleles model of quantitative genetic variability. Genetica 99:59–66.
  • Deng, H.-W., and M. Lynch. 1996. Estimation of genomic mutation parameters in natural populations. Genetics 144:349–360.
  • Dobzhansky, T. 1970. Genetics of the evolutionary process. Columbia Univ. Press, New York.
  • Elston, R. C., and M. A. Spence. 2006. Advances in statistical human genetics over the last 25 years. Statistics in Medicine 25:3049–3080.
  • Falconer, D. S., and T. F. C. Mackay. 1996. Introduction to quantitative genetics. Prentice Hall, London.
  • Fishman, L., A. J. Kelly, and J. H. Willis. 2002. Minor quantitative trait loci underlie floral traits associated with mating system divergence in Mimulus. Evolution 56:2138–2155.
  • Frary, A., T. C. Nesbitt, S. Grandillo, E. Van Der Knaap, B. Cong, J. Liu, J. Meller, R. Elber, K. B. Alpert, and S. D. Tanksley. 2000. Fw2.2: a quantitative trait locus key to the evolution of tomato fruit size. Science 289:85–88.
  • Fry, J. D. 1993. The “general vigor” problem: can antagonistic pleiotropy be detected when genetic covariances are positive? Evolution 47:327–333.
  • Gleason, J. M., J.-M. Jallon, J.-D. Rouault, and M. G. Ritchie. 2005. Quantitative trait loci for cuticular hydrocarbons associated with sexual isolation between Drosophila simulans and D. sechellia. Genetics 171:1789–1798.
  • Goldgar, D. 1990. Multipoint analysis of human quantitative genetic-variation. Am. J. Hum. Genet. 47:957–967.
  • Grant, P. R., and B. R. Grant. 1995. Predicting microevolutionary responses to directional selection on heritable variation. Evolution 49:241–251.
  • Griffing, B. 1960. Theoretical consequences of truncation selection based on the individual phenotype. Aust. J. Biol. Sci. 13:309–343.
  • Hall, M. C., C. J. Basten, and J. H. Willis. 2006. Pleiotropic quantitative trait loci contribute to population divergence in traits associated with life-history variation in Mimulus guttatus. Genetics 172:1829–1844.
  • Hazel, L. N. 1943. The genetic basis for constructing selection indexes. Genetics 476–490.
  • Hoekstra, H. E., R. J. Hirschmann, R. A. Bundey, P. A. Insel, and J. P. Crossland. 2006. A single amino acid mutation contributes to adaptive beach mouse color pattern. Science 313:101–104.
  • Houle, D. 1992. Comparing evolvability and variability of quantitative traits. Genetics 130:195–204.
  • Ilanko, S., and S. M. Dickinson. 1999. Asymptotic modelling of rigid boundaries and connections in the Rayleigh-Ritz method. J. Sound Vibrat. 219:370–378.
  • Jones, A. G., S. J. Arnold, and R. Bürger. 2003. Stability of the G-matrix in a population experiencing pleiotropic mutation, stabilizing selection, and genetic drift. Evolution 57:1747–1760.
  • Kelly, J. K. 1999. An experimental method for evaluating the contribution of deleterious mutations to quantitative trait variation. Genet. Res. 73:263–273.
  • Kelly, J. K. 2003. Deleterious mutations and the genetic variance of male fitness components in Mimulus guttatus. Genetics 164:1071–1085.
  • Kelly, J. K. 2008. Testing the rare alleles model of quantitative variation by artificial selection. Genetica 132:187–198.
  • Kelly, J. K., and H. S. Arathi. 2003. Inbreeding and the genetic variance of floral traits in Mimulus guttatus. Heredity 90:77–83.
  • Kelly, J. K., and J. H. Willis. 2001. Deleterious mutations and genetic variation for flower size in Mimulus guttatus. Evolution 55:937–942.
  • Kingsolver, J. G., H. E. Hoekstra, J. M. Hoekstra, D. Berrigan, S. N. Vignieri, C. E. Hill, A. Hoang, P. Gibert, and P. Beerli. 2001. The strength of phenotypic selection in natural populations. Am. Nat. 157:245–261.
  • Kirkpatrick, M., and R. Lande. 1989. The evolution of maternal characters. Evolution 43:485–503.
  • Knott, S. A., and C. S. Haley. 1992. Maximum likelihood mapping of quantitative trait loci using full-sib families. Genetics 132:1211–1222.
  • Kondrashov, A. S. 1985. Deleterious mutations as an evolutionary factor. 2. Facultative apomixis and selfing. Genetics 111:635–653.
  • Kroymann, J., and T. Mitchell-Olds. 2005. Epistasis and balanced polymorphism influencing quantitative trait variation. Nature 435:95–98.
  • Lande, R. 1979. Quantitative genetic analysis of multivariate evolution applied to brain:body allometry. Evolution 33:402–416.
  • Lande, R. 1980. The genetic covariance between characters maintained by pleiotropic mutations. Genetics 94:203–215.
  • Lande, R., and S. Arnold. 1983. The measurement of selection on correlated characters. Evolution 37:1210–1226.
  • Lander, E. S., and D. Botstein. 1989. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185–199.
  • Levin, D. A., and E. T. Brack. 1995. Natural selection against white petals in Phlox. Evolution 49:1017–1022.
  • Liu, K., M. Goodman, S. Muse, J. S. Smith, E. Buckler, and J. Doebley. 2003. Genetic structure and diversity among maize inbred lines as inferred from DNA microsatellites. Genetics 165:2117–2128.
  • Luo, Z. W. 1993. The power of two experimental designs for detecting linkage between a marker locus and a locus affecting a quantitative character in a segregating population. Genet. Selection Evol. 25:249–261.
  • Lush, J. L. 1937. Animal breeding plans. Iowa State Press, Ames, Iowa.
  • Lynch, M., and B. Walsh. 1998. Genetics and analysis of quantitative characters. Sinauer Associates, Sunderland, MA.
  • Macdonald, S. J., and A. D. Long. 2007. Joint estimates of quantitative trait locus effect and frequency using synthetic recombinant populations of Drosophila melanogaster. Genetics 176:1261–1281.
  • Mackay, T. F. C. 1996. The nature of quantitative genetic variation revisited: lessons from Drosophila bristles. Bioessays 18:113–121.
  • Mackay, T. F. C. 2001. The genetic architecture of quantitative traits. Annu. Rev. Genet. 35:303–339.
  • Mackay, T. F. C. 2004. Genetic dissection of quantitative traits. Pp. 51–73 in R. S. Singh and M. K. Uyenoyama, eds. The evolution of population biology. Cambridge Univ. Press, Cambridge.
  • Mackinnon, M. J., and J. I. Weller. 1995. Methodology and accuracy of estimation of quantitative trait loci parameters in a half-sib design using maximum-likelihood. Genetics 141:755–770.
  • Mezey, J. G., and D. Houle. 2005. The dimensionality of genetic variation for wing shape in Drosophila melanogaster. Evolution 59:1027–1038.
  • Mitchell-Olds, T., J. H. Willis, and D. B. Goldstein. 2007. Which evolutionary processes influence natural genetic variation for phenotypic traits? Nat. Rev. Genet. 8:845–856.
  • Morton, N. E., J. F. Crow, and H. J. Muller. 1956. An estimate of the mutational damage in man from data on consanguineous marriages. Proc. Natl. Acad. Sci. USA 42:855–863.
  • Mott, R., and J. Flint. 2002. Simultaneous detection and fine mapping of quantitative trait loci in mice using heterogeneous stocks. Genetics 160:1609–1618.
  • Muranty, H. 1996. Power of tests for quantitative trait loci detection using full-sib families in different schemes. Heredity 76:156–165.
  • Palsson, A., and G. Gibson. 2004. Association between nucleotide variation in EGFR and wing shape in Drosophila melanogaster. Genetics 167:1187–1198.
  • Paterson, A. H., E. S. Lander, J. D. Hewitt, S. Peterson, S. E. Lincoln, and S. D. Tanksley. 1988. Resolution of quantitative traits into Mendelian factors by using a complete linkage map of restriction fragment length polymorphisms. 335:721–726.
  • Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. 1992. Numerical recipes in C. Cambridge Univ. Press, Cambridge.
  • Rausher, M. D. 1984. Tradeoffs in performance on different hosts: evidence from within- and between-site variation in the beetle Deloyala guttata. Evolution 38:582–595.
  • Revell, L. J. 2007. The G matrix under fluctuating correlational mutation and selection. Evolution 61:1857–1872.
  • Rose, M. R., and B. Charlesworth. 1981. Genetics of life history in Drosophila melanogaster. I. Sib analysis of adult females. Genetics 91:175–186.
  • Schemske, D. W., and P. Bierzychudek. 2001. Perspective: evolution of flower color in the desert annual Linanthus parryae: Wright revisited. Evolution 55:1269–1282.
  • Searle, S. R., G. Casella, and C. E. McCulloch. 1992. Variance components. John Wiley and Sons Inc., New York.
  • Shpak, M., and A. S. Kondrashov. 1999. Applicability of the hypergeometric phenotypic model to haploid and diploid populations. Evolution 53:600–604.
  • Simpson, G. G. 1944. Tempo and mode in evolution. Columbia Univ. Press, New York.
  • Steppan, S. J. 1997. Phylogenetic analysis of phenotypic covariance structure. II. Reconstructing matrix evolution. Evolution 51:587–594.
  • Steppan, S. J., P. C. Phillips, and D. Houle. 2002. Comparative quantitative genetics: evolution of the G matrix. Trends Ecol. Evol. 17:320–327.
  • Stinchcombe, J. R., C. Weinig, M. Ungerer, K. M. Olsen, C. Mays, S. S. Halldorsdottir, M. D. Purugganan, and J. Schmitt. 2004. A latitudinal cline in flowering time in Arabidopsis thaliana modulated by the flowering time gene Frigida. Proc. Natl. Acad. Sci. USA 101:4712–4717.
  • Tanksley, S. D. 1993. Mapping polygenes. Annu. Rev. Genet. 27:205–233.
  • Turelli, M. 1988. Population genetic models for polygenic variation and evolution in B. S. Weir, E. J. Eisen, M. M. Goodman, and G. Namkoong, eds. Proceedings of the second international conference on quantitative genetics. Sunderland, MA.
  • Turelli, M., and N. H. Barton. 1994. Genetic and statistical analyses of strong selection on polygenic traits: what, me normal? Genetics 138:913–941.
  • Turelli, M., and N. H. Barton. 2004. Polygenic variation maintained by balancing selection: pleiotropy, sex-dependent allelic effects and G x E interactions. Genetics 166:1053–1079.
  • Valdar, W., J. Flint, and R. Mott. 2006. Simulating the collaborative cross: power of quantitative trait loci detection and mapping resolution in large sets of recombinant inbred strains of mice. Genetics 172:1783–1797.
  • Verhoeven, K. J. F., J.-L. Jannink, and L. M. McIntyre. 2006. Using mating designs to uncover QTL and the genetic architecture of complex traits. Heredity 96:139–149.
  • Visscher, P. M., S. Macgregor, B. Benyamin, G. Zhu, et al. 2007. Genome partitioning of genetic variation for height from 11,214 sibling pairs. Am. J. Hum. Genet. 81:1104–1110.
  • Weller, J. I., H. Weller, D. Kliger, and M. Ron. 2002. Estimation of quantitative trait locus allele frequency via a modified granddaughter design. Genetics 162:841–849.
  • Yu, J. M., J. B. Holland, M. D. McMullen, and E. S. Buckler. 2008. Genetic design and statistical power of nested association mapping in maize. Genetics 178:539–551.
  • Zeng, Z.-B. 1987. Genotypic distribution at the limits to natural and artificial selection with mutation. Theor. Popul. Biol. 32:90–113.

Appendices

Appendix 1

This section describes the expectations, variances, and covariances among measurements collected from a Replicated F2 design. I assume additive gene action. The Known Alleles case is treated first, followed by modifications necessary with Known Regions.

THE RESEMBLANCE OF RELATIVES WITH KNOWN ALLELES

With Known Alleles, g′ of equation (2) is estimated as a collection of fixed effects and we have a standard mixed model (Searle et al. 1992). For the Replicated F2 design, there are four different “types”: individuals of the reference line (P0), individuals from each random line (Pi), the F1 hybrids from the cross of random line i to the reference line (F1(i)), and the corresponding F2s (F2(i)). The effect of the genetic background, g*, is random for each Pi but is a fixed effect for P0. For simplicity, the population mean of g* (across Random Lines) is μ[x] for trait X and we define a vector b0 as the set of trait deviations specific to the reference line. Let A0 represent the allele homozygous in the reference line. Allele A0 has additive effect a0[x] on trait X. With these conventions, the conditional expected phenotypic values for trait X are

  • [Equation (A1): image not recovered]
  • [Equation (A2): image not recovered]
  • [Equation (A3): image not recovered]

where k is the number of A0 copies at the QTL in the individual. The background genotypic variances for trait X are:

  • [Equation (A4): image not recovered]
  • [Equation (A5): image not recovered]
  • [Equation (A6): image not recovered]

Here, V*A[x] is additive background variance for trait X and VS[x] is the “segregational variance” associated with gamete production of the RL. If many loci contribute to the genetic background effect, VS[x] ≈ V*A[x]/4. The phenotypic variance for each type is the genotypic variance plus VE[x], the environmental variance for trait X. The genotypic covariances (within an individual between traits) are similar in form to equations (A4–A6), but with the appropriate additive covariance (C*A[x,y]) replacing V*A[x] and the segregational covariance (CS[x,y]) replacing VS[x]. The corresponding phenotypic covariance is augmented by the addition of CE[x,y], the environmental covariance.

Comparisons among relatives can be classified as within-type (e.g., between two F1s from the same cross) or between-type (e.g., between the random line and a descendant F2). The Pi and F1(i) subfamilies are genetically homogeneous (all individuals are genetically identical), and as a consequence, the genetic covariance of distinct individuals equals the corresponding genetic variance (eqs. A4 and A5). In contrast, F2s are internally heterogeneous and

  • image((A7))

where F2.1 and F2.2 are distinct F2 individuals from the same subfamily. The between-type within family covariances are

  • image((A8))
  • image((A9))

The genotypic covariances (for distinct traits of distinct individuals) are similar in form to equations A7–A9 except that the additive covariance of traits X and Y (C*A[x,y]) replaces V*A[x].

THE RESEMBLANCE OF RELATIVES WITH A KNOWN REGION

In this case, g′ of equation (2) must be treated as a vector of random effects. The covariances among relatives are now a function of the number of copies of the QTL allele that is derived from the random line (and not the RL) in each individual. Let j and k denote the number of QTL alleles descended from the random line in two distinct individuals of a family. The genotypic variance of trait X is increased by j²Vq[x]/2 for the first individual and by k²Vq[x]/2 for the second individual. The covariance between these individuals is incremented by jkVq[x]/2. The increments to genotypic covariances among traits are the same except with Cq[x,y] replacing Vq[x]. The formulas for variance components associated with genetic background are the same as in the Known Alleles case. The conditional expectations for trait X are:
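The QTL increments just described can be tabulated with a few lines of code. The following Python sketch is illustrative only; the function name `qtl_increments` and the example value of Vq[x] are hypothetical and do not come from the original analysis.

```python
def qtl_increments(j, k, vq):
    """Increments to genotypic (co)variances owing to the QTL region.

    j, k : number of QTL alleles descended from the random line (0, 1, or 2)
           in two distinct individuals of the same family.
    vq   : the QTL variance Vq[x]; pass Cq[x,y] instead to obtain the
           increments to between-trait covariances.
    """
    var_first = j * j * vq / 2.0   # added to the variance of individual 1
    var_second = k * k * vq / 2.0  # added to the variance of individual 2
    cov_pair = j * k * vq / 2.0    # added to the covariance between them
    return var_first, var_second, cov_pair


# Example: a random-line parent (j = 2) and an F1 (k = 1),
# with an arbitrary illustrative value Vq[x] = 0.4.
print(qtl_increments(2, 1, 0.4))
```

Any individual carrying no random-line allele at the QTL (j = 0 or k = 0) contributes no increment, so the pair's covariance increment vanishes as well.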

  • image((A10))
  • image((A11))
  • image((A12))

where k is the number of QTL allele copies derived from the RL (k= 1 for all F1 individuals). Here, h0[x] is the deviation of the RL from the mean of random lines owing to its genotype at the QTL. The grand mean, μ[x], absorbs the average QTL effect in the background population and thus depends implicitly on a0[x]. To summarize the Replicated F2 Design, the full model for one trait (X) involves three fixed effects (μ[x], h0[x], and b0[x]) and four variance components (VE[x], V*A[x], V*S[x], and V*q[x]). With two traits and one QTL, there are 18 parameters (the seven associated with each trait plus CE[x,y], C*A[x,y], C*S[x,y], and Cq[x,y]).

Appendix 2

This section describes the data simulation and model fitting used to produce Figure 3. A single QTL exists in a Known Region, but the two alleles are not specifically identified by genetic markers. This QTL affects two traits (X and Y). The allele of the RL (A0) has frequency q in the background population. The example of Figure 3 considers q = 0.05 versus q = 0.5, and for both cases the QTL effects are a0[x] = 0.5 and a0[y] = 1.0 (the notation follows from Appendix 1). Data are simulated by first drawing the background genotypic values for each Random Line. Each g* is sampled from a bivariate normal distribution with means zero and the appropriate (co)variance matrix (see eq. A4). Each Random Line is assigned a homozygous QTL genotype randomly: A0/A0 with probability q, A1/A1 with probability 1 − q. Phenotypes for each Random Line individual are produced by adding to the genotypic values a binormal environmental effect with means zero and the (co)variance matrix of environmental effects (VE[x], VE[y], and CE[x,y]).
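The Random Line step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the C program used for Figure 3. The function names are hypothetical. The background (co)variance matrix is passed in as an argument because its value is set by eq. A4, and the values in the usage example are placeholders. The sketch also assumes that each copy of A0 adds a0 to the genotypic value, with A1/A1 as the baseline.

```python
import math
import random


def draw_binormal(vx, vy, cxy, rng):
    """Draw (x, y) from a zero-mean bivariate normal using a 2x2 Cholesky factor.

    Requires vx > 0 and cxy**2 <= vx * vy.
    """
    sx = math.sqrt(vx)
    l21 = cxy / sx
    l22 = math.sqrt(vy - l21 * l21)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return sx * z1, l21 * z1 + l22 * z2


def simulate_random_line(q, a0, bg_cov, env_cov, n_ind, rng):
    """Simulate one Random Line; bg_cov and env_cov are (Vx, Vy, Cxy) tuples."""
    # Background genotypic values g* for traits X and Y.
    gx, gy = draw_binormal(*bg_cov, rng)
    # Homozygous QTL genotype: A0/A0 with probability q, otherwise A1/A1.
    k = 2 if rng.random() < q else 0  # k = number of A0 copies
    # ASSUMPTION: each A0 copy adds a0 to the genotypic value (A1 is baseline).
    gx += k * a0[0]
    gy += k * a0[1]
    # All individuals of an inbred line share the genotype; only the
    # environmental deviations differ.
    phenotypes = []
    for _ in range(n_ind):
        ex, ey = draw_binormal(*env_cov, rng)
        phenotypes.append((gx + ex, gy + ey))
    return k, (gx, gy), phenotypes


rng = random.Random(2024)
# Placeholder background matrix; the actual entries follow from eq. A4.
k, g, ph = simulate_random_line(0.05, (0.5, 1.0), (1.0, 1.0, 0.0),
                                (1.0, 1.0, 0.0), n_ind=4, rng=rng)
print(k, g, ph)
```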

The F1 and F2 individuals for each family are generated conditional on the genotypes of the Random Line and RL. For simplicity, I set b0[x] = b0[y] = 0 for the simulations of Figure 3 and assumed that the segregational (co)variances were equal to one-fourth of the associated additive (co)variances (see Appendix 1). Given this, the F1 genotype for both background and QTL is fully specified by the parental lines. An environmental effect is added to each individual measured in the experiment. Segregation occurs in production of F2 progeny, and as a consequence, random variables are drawn for both background and QTL. For the background, the expected genotypic value of each trait is equal to that of the F1. A residual is drawn from a binormal distribution with means zero and contingent on the segregational (co)variance matrix (eq. A6). If the F1 is A0/A1 at the QTL, then each F2 is assigned a genotype randomly: A0/A0 with probability 0.25, A0/A1 with probability 0.5, and A1/A1 with probability 0.25. Finally, an environmental deviation is added to the genotypic value of all measured F2s by the same procedure described previously. For Figure 3, I also included 96 individuals of the RL in each simulated experiment. The variance component parameters were set to VE[x] = VE[y] = V*A[x] = V*A[y] = 1.0, CE[x,y] = C*A[x,y] = 0.0.
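The F2 step can be sketched in the same way. The Python code below is an illustration under assumptions, not the original implementation. The function names are hypothetical. The residual (co)variance matrix is an argument because its value follows from eq. A6, and the one-fourth values in the usage example are illustrative. As before, each A0 copy is assumed to add a0 to the genotypic value.

```python
import math
import random


def draw_binormal(vx, vy, cxy, rng):
    """Draw (x, y) from a zero-mean bivariate normal (requires vx > 0)."""
    sx = math.sqrt(vx)
    l21 = cxy / sx
    l22 = math.sqrt(vy - l21 * l21)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return sx * z1, l21 * z1 + l22 * z2


def simulate_f2(f1_background, f1_qtl, a0, resid_cov, env_cov, rng):
    """Simulate one F2 individual.

    f1_background : (x, y) background genotypic value of the F1 parent
    f1_qtl        : number of A0 copies carried by the F1 (1 or 2)
    resid_cov     : (Vx, Vy, Cxy) of the segregational residual (see eq. A6)
    env_cov       : (Vx, Vy, Cxy) of environmental effects
    """
    # QTL genotype: Mendelian 1:2:1 segregation if the F1 is A0/A1;
    # a homozygous F1 transmits only one kind of allele.
    if f1_qtl == 1:
        u = rng.random()
        k = 2 if u < 0.25 else (1 if u < 0.75 else 0)
    else:
        k = f1_qtl
    # Background: F1 value plus a segregational residual.
    rx, ry = draw_binormal(*resid_cov, rng)
    # ASSUMPTION: each A0 copy adds a0 to the genotypic value.
    gx = f1_background[0] + rx + k * a0[0]
    gy = f1_background[1] + ry + k * a0[1]
    # Environmental deviation, as for every measured individual.
    ex, ey = draw_binormal(*env_cov, rng)
    return k, (gx, gy), (gx + ex, gy + ey)


rng = random.Random(11)
# Illustrative residual matrix: one-fourth of V*A = 1.0, zero covariance.
print(simulate_f2((0.2, -0.1), 1, (0.5, 1.0), (0.25, 0.25, 0.0),
                  (1.0, 1.0, 0.0), rng))
```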

Given a simulated dataset, the statistical genetic model (eq. (2) and Appendix 1) was fit by likelihood (Searle et al. 1992, ch. 6). To maximize the likelihood, I used a combination of deterministic and stochastic search implemented in a C program (available upon request). Powell's algorithm (Press et al. 1992) was initially applied starting with parameter values corresponding to the “truth,” that is, the set of parameter values from which the data were simulated. After the Powell search converges, the algorithm performs a stochastic search for 4000 steps. If this stochastic search finds even a single likelihood higher than that found by the Powell search, the cycle is repeated from this new maximum. This two-part sequence, Powell then stochastic search, is repeated until the stochastic search fails to find any improvement on the preceding Powell search. When maximization is complete, parameter estimates and the likelihood are stored. For Figure 3, the procedure was applied to 200 simulated datasets for each case.
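The alternation between deterministic and stochastic search can be sketched as below. This Python code is not the C program described above. A simple coordinate search stands in for Powell's algorithm, and the function names, step sizes, perturbation scale, and cycle cap are all hypothetical choices made for illustration. Only the control flow follows the text: a local search, then 4000 stochastic steps, repeated until the stochastic phase finds no improvement.

```python
import random


def local_search(f, x, step=0.5, tol=1e-6):
    """Deterministic coordinate search (a simple stand-in for Powell's method)."""
    x = list(x)
    best = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                val = f(trial)
                if val > best:
                    x, best, improved = trial, val, True
        if not improved:
            step /= 2.0  # refine the step once no move helps
    return x, best


def stochastic_search(f, x, best, rng, n_steps=4000, scale=0.5):
    """Random perturbations around x; returns (None, best) if nothing is better."""
    best_x, best_val = None, best
    for _ in range(n_steps):
        trial = [xi + rng.gauss(0.0, scale) for xi in x]
        val = f(trial)
        if val > best_val:
            best_x, best_val = trial, val
    return best_x, best_val


def maximize(f, start, rng, max_cycles=100):
    """Alternate local and stochastic search until the latter fails to improve."""
    x, best = local_search(f, start)
    for _ in range(max_cycles):  # the cap is a safety guard for this sketch
        cand, cand_val = stochastic_search(f, x, best, rng)
        if cand is None:
            break  # no stochastic step beat the local optimum
        x, best = local_search(f, cand)
    return x, best


# Toy objective standing in for a log-likelihood surface.
def toy(p):
    return -(p[0] - 1.0) ** 2 - (p[1] + 2.0) ** 2


print(maximize(toy, [0.0, 0.0], random.Random(3)))
```

The stochastic phase serves as a check that the deterministic search has not stalled at a point from which random perturbations can still climb.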

Likelihood maximization is subject to feasibility constraints. Variances cannot be negative, and the magnitude of a covariance cannot exceed the geometric mean of the corresponding variances. In the search described above, I used penalty functions (Ilanko and Dickinson 1999): an increment is subtracted from the likelihood if the search steps out of the feasible range for a parameter. These penalties do not affect the final likelihood value. Because Figure 3 actually considers the specific case in which the five QTL parameters (h0[x], h0[y], V*q[x], V*q[y], and Cq[x,y]) are determined by only three quantities (q, a0[x], and a0[y]), I fit the latter directly.
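A penalty of this kind might be coded as in the following Python sketch. The linear form of the penalty, the weight, and the function names are assumptions made for illustration. The source specifies only that an increment is subtracted from the likelihood outside the feasible range.

```python
import math


def feasibility_penalty(vx, vy, cxy, weight=1e6):
    """Zero inside the feasible region; grows with the size of the violation.

    Feasible region: vx >= 0, vy >= 0, and |cxy| <= sqrt(vx * vy).
    ASSUMPTION: a linear penalty with an arbitrary large weight.
    """
    penalty = 0.0
    if vx < 0.0:
        penalty += weight * (-vx)
    if vy < 0.0:
        penalty += weight * (-vy)
    # Geometric mean of the variances, with negative variances treated as zero.
    bound = math.sqrt(max(vx, 0.0) * max(vy, 0.0))
    if abs(cxy) > bound:
        penalty += weight * (abs(cxy) - bound)
    return penalty


def penalized_loglik(loglik, vx, vy, cxy):
    """Subtract the penalty from a log-likelihood value computed elsewhere."""
    return loglik - feasibility_penalty(vx, vy, cxy)


print(penalized_loglik(-120.5, 1.0, 1.0, 0.3))  # feasible: value unchanged
print(penalized_loglik(-120.5, 1.0, 1.0, 1.4))  # infeasible: heavily penalized
```

Because the penalty is exactly zero for feasible parameter values, any maximum located inside the feasible region has the same likelihood with or without it.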