GENE FLOW FROM DOMESTICATED SPECIES TO WILD RELATIVES: MIGRATION LOAD IN A MODEL OF MULTIVARIATE SELECTION

Authors


Abstract

Domesticated species frequently spread their genes into populations of wild relatives through interbreeding. The domestication process often involves artificial selection for economically desirable traits. This can lead to an indirect response in unknown correlated traits and a reduction in fitness of domesticated individuals in the wild. Previous models for the effect of gene flow from domesticated species to wild relatives have assumed that evolution occurs in one dimension. Here, I develop a quantitative genetic model for the balance between migration and multivariate stabilizing selection. Different forms of correlational selection consistent with a given observed ratio between average fitness of domesticated and wild individuals offset the phenotypic means at migration–selection balance away from predictions based on simpler one-dimensional models. For almost all parameter values, correlational selection leads to a reduction in the migration load. For ridge selection, this reduction arises because the distance by which the immigrants deviate from the local optimum is in effect reduced. For realistic parameter values, however, the effect of correlational selection on the load is small, suggesting that simpler one-dimensional models may still be adequate in terms of predicting mean population fitness and viability.

Many domesticated species, through interbreeding, spread their genes back into populations of wild relatives from which they originate. For example, among 13 of the world's most important crop species, 12 interbreed with wild relatives (Ellstrand et al. 1999). Among animals, hybridization occurs between escaped domestic and wild American mink (Kidd et al. 2009) and between dogs and wolves (Verardi et al. 2006). Over the last three decades, domestication and farming of marine species of fish in open net pens, in particular Atlantic salmon but also different species of gadoids, has increased dramatically and now amounts to more than 40% of all fish consumed worldwide (Naylor et al. 2005). Escapes from net pen operations are known to occur both through low-level “leakage” and through episodic events such as storms. This has led to concerns over possible detrimental effects on wild populations (Hindar et al. 1991; Bekkevold et al. 2006).

Domestication of all these species has inevitably led to genetic changes, either as a result of natural selection within the domestic environment or through artificial selection for certain economically desirable traits. An example of the latter is farmed Atlantic salmon, which, since the initiation of the main Norwegian breeding program at the beginning of the 1970s, has undergone more than nine generations of artificial selection, mainly for increased growth performance. Because natural selection in the absence of other evolutionary forces, under quite general conditions, is expected to maximize a population's intrinsic rate of increase and mean relative fitness (Lande 1982; Lande 2007), any change away from a natural optimum, including phenotypic changes resulting from artificial selection, should lead to a reduction in fitness in the wild. For salmon, several studies have indeed found that fish of farmed origin, when released into the wild, suffer from a reduction in survival, reproductive success, and overall fitness relative to wild fish (Fleming et al. 2000; McGinnity et al. 2003). For species with a longer history of domestication, outbreeding depression, including sterility, thought to occur as a result of coadapted gene complexes (Templeton 1986), has been observed (Ellstrand et al. 1999).

With continual migration of domesticated individuals into a population of wild relatives, a balance between natural selection and migration will be established. Because the traits involved are most likely polygenic (Lande 1982), theoretical models of this process cannot be based on simple single-locus population genetic theory (e.g., Crow and Kimura 1970; Slatkin 1987, ch. 6.5), as suggested by Ellstrand et al. (1999, p. 552), but must instead build on quantitative genetic theory. A reasonable starting point is the so-called infinitesimal model (Fisher 1918; Bulmer 1980; Turelli and Barton 1994), which can be seen as a simple approximation of the underlying genetic details, valid when the number of loci is large and suitable for modeling evolution over a small number of generations. Under this model, the within-family distribution of breeding values is normal with a constant variance, although the distribution in the population as a whole may be nonnormal as a result of linkage disequilibrium. The joint effects of selection, immigration, and reproduction can therefore easily be computed numerically (Turelli and Barton 1994; Tufto 2000).

An alternative approach is taken by Wolf et al. (2001), who modeled extinction risk of populations of wild sunflowers through hybridization with domestic relatives using a simple model with frequencies of natives, hybrids, and invaders as state variables and with hybrids as an absorbing class. Similarly, Hindar et al. (2006) developed a model for the effect of farmed salmon escapees using the frequencies of farmed fish, hybrids, and wild fish as state variables, assigning equal halves of hybrid-farm and hybrid-wild backcrosses to the hybrid and farm, and hybrid and wild categories, respectively. These approaches, however, ignore genetic variation and selection within each class (Jonsson et al. 2006, p. 44) and the fact that different backcrosses in subsequent generations are best modeled not as a discrete set but rather as a continuum of genotypes.

Building on the infinitesimal model and the assumption that selection and migration are weak, a joint evolutionary and demographic model was developed by Tufto (2001). A critical parameter in this model is the distance by which the immigrants deviate from the phenotypic optimum in the wild. If this distance exceeds 2.8 genetic standard deviations, and if the immigration rate exceeds a certain threshold m*, the negative effect of maladaptation on the equilibrium size of a wild population can exceed the positive direct supplementation effect of immigrants. Through a positive feedback loop between mean population size and the rate of gene flow, so-called migrational meltdown (see e.g. Lenormand 2002), several equilibria can also emerge for more extreme parameter values. Other similar generic models include Kirkpatrick and Barton's (1997) model on evolution of species range (Holt and Gomulkiewicz 1997; Ronce and Kirkpatrick 2001).

The above models are all based on the assumption that evolution occurs in one dimension only. However, the ubiquity of pleiotropy and associated genetic correlations (Lande 1979; Arnold 1994) implies that artificial selection (as well as other forms of selection) will create an indirect response in an unknown number of other unobserved traits. In farmed salmon, such indirect responses are indeed observed in traits such as survival, age of maturation, behavioral traits, and morphology (Fleming and Einum 1997; Solem et al. 2006). If such indirectly selected genetically correlated traits, or indeed, genetically uncorrelated traits, are involved in determining fitness in the wild, the evolutionary changes as a result of selection and migration may no longer occur along a single dimension but instead along a curved trajectory (as shown by Guillaume and Whitlock 2007, Fig. 3) and correlational selection and genetic correlations may offset the phenotypic means at selection–migration balance.

Figure 3.

The 75%-probability contours of the breeding values before selection at selection–migration balance (solid contours) for m= 0.4, ρ= 0, ρ=−0.5, and ρ=−0.95 (upper, middle, and lower row) and s22/s11= 1 for the SMR and MSR life cycles (left and right column). The dashed contours are 75% probability contours of the breeding value distribution among wild fish unexposed to immigration and farmed escapees, respectively. The ellipses (dotted lines) represent contours of relative fitness in increments of 0.1.

The main objective of this article is to investigate how properties of a population at migration–selection balance depend on these complications. Rather than attempting to predict effects on population size by building a complete evolutionary and demographic model, I will focus only on the reduction in mean relative fitness in the wild population, the migration load. As we will see, for realistic parameter values, as long as the relative mean fitness of the immigrants is kept constant, the effects of different possible forms of multivariate selection on this load, and hence further effects on population size depending on the (largely unknown) details of density regulations, are relatively small.

The model developed should be applicable to any organism that has been subject to mainly artificial selection during the domestication process and for which some degree of gene flow back into populations of wild relatives occurs. I assume a relatively short history of domestication such that outbreeding depression due to coadapted gene complexes is negligible. In the numerical examples, I will focus on Atlantic salmon, in part, because estimates of some important model parameters are available from the literature, and, in part, because of the economic importance and conservation status of Atlantic salmon. The rate of immigration of farmed salmon into many wild populations is also extremely high, reaching levels between 11% and 35% in many Norwegian rivers (Naylor et al. 2005).

Model

RATIONALE

Although an indirect response may have occurred in several traits, I restrict the model to n= 2 traits P1 and P2 and assume that the joint optimum in the wild is at P= (P1, P2)T= (0, 0)T=0. Artificial selection on P1 and indirect selection on the correlated trait P2 within a completely isolated breeding line has changed the mean within the domesticated population from this optimum toward some point inline image. For farmed salmon in particular, the total response in the directly selected trait P1 (growth performance in the case of salmon) inline image, at present is known to be about four genetic standard deviations (I. Olesen, pers. comm.). The total indirect response inline image, however, is an unknown parameter in the model. In the case of no genetic correlation between Z1 and Z2, the total indirect response would be zero. As will be discussed in greater detail later, restricting the model to two dimensions has little impact on the main conclusions.

A parsimonious, discrete-time model of a local fitness optimum in the wild is to assume that multivariate Gaussian selection operates on the vector of phenotypes P= (P1, P2)T. Assuming that the individual phenotype vector is the sum P=Z+E of the vector of breeding values Z and a vector of environmental effects E and that E is multivariate normal and independent of Z (no environment-genotype correlations or interactions), it follows that mean relative fitness of individuals with breeding value Z is also multivariate Gaussian with fitness given by

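$$W(\mathbf{Z}) = \exp\left(-\tfrac{1}{2}\,\mathbf{Z}^{T}\mathbf{S}\,\mathbf{Z}\right) \qquad (1)$$

where

$$\mathbf{S} = \left(\mathbf{S}_P^{-1} + \mathbf{V}\right)^{-1} \qquad (2)$$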

SP describes Gaussian selection of the same form on the phenotypes, and V is the covariance matrix of E. The elements S11 and S22 describe the strength of stabilizing selection along the first and second axis, respectively, whereas S12 represents the degree to which the two traits interact.

Note further that the inverse of S, S−1, can be thought of as a covariance matrix because (1) has the same form as a multivariate normal density. I will therefore refer to the inverse of such matrices as selection covariance matrices and the corresponding correlation coefficient ρ as a selection correlation coefficient. In statistics, the inverse of a covariance matrix is referred to as precision matrix (Wasserman 2004). Building on this terminology, I will refer to matrices such as S as selection precision matrices.
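As a minimal sketch of these definitions, the following Python code computes the selection precision matrix on breeding values from a phenotypic selection matrix and an environmental covariance matrix, and then the selection correlation coefficient ρ from the inverse of S. It assumes that equation (2) has the form S = (S_P⁻¹ + V)⁻¹, which follows from averaging Gaussian phenotypic selection over normally distributed environmental effects. The function names and numerical values are hypothetical and chosen only for illustration.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def add2(x, y):
    """Element-wise sum of two 2x2 matrices."""
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))

def breeding_value_precision(S_P, V):
    """Selection precision matrix on breeding values, assumed S = (S_P^-1 + V)^-1."""
    return inv2(add2(inv2(S_P), V))

def selection_correlation(S):
    """Correlation coefficient rho of the selection covariance matrix S^-1."""
    C = inv2(S)
    return C[0][1] / math.sqrt(C[0][0] * C[1][1])

# Hypothetical values, chosen only for illustration.
S_P = ((0.5, 0.25), (0.25, 0.5))   # selection precision matrix on phenotypes
V = ((0.5, 0.0), (0.0, 0.5))       # covariance matrix of environmental effects
S = breeding_value_precision(S_P, V)
rho = selection_correlation(S)
```

Under this form, environmental variance can only weaken the apparent selection on the breeding values, and a positive off-diagonal element of S corresponds to a negative selection correlation coefficient.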

We will also consider stabilizing selection acting on a single linear combination of P1, P2 only, orthogonally to an axis of maximum fitness, so-called ridge selection (see Fig. 5 for examples). This form of selection arises in the limiting case of inline image such that |ρ| → 1 (keeping in mind that S is not invertible at this limit). More general forms of quadratic selection such as disruptive selection, evolutionary saddle points, and rising ridges (see e.g., Phillips and Arnold 1989) will not be considered here.

Figure 5.

The distribution of breeding values at migration–selection balance for ridge selection (ρ=−1) for s22= 0, s22/s11= 1, and s22/s11= 3. The axis x of evolution in the degenerate one-dimensional model and the effective distance to the optimum inline image are indicated by the arrows.

Also note the somewhat unconventional use of uppercase letters for both vectors and matrices; lowercase letters will be reserved for the scaled, dimensionless variables and parameters introduced in the next subsection.

To model the underlying genetic details, I will rely on an extension of the infinitesimal model to two dimensions throughout this article. The robustness of the infinitesimal model will be dealt with elsewhere (also see Tufto 2000). Just as the genetic variance at linkage equilibrium is constant in the standard one-dimensional infinitesimal model, so is the genetic covariance matrix G at linkage equilibrium when the model is extended to two dimensions. Similarly, the within-family distribution of breeding values becomes multivariate normal with mean vector equal to the mean of the parental breeding value vectors and covariance matrix equal to G/2.

TRUNCATION SELECTION AND SCALING OF VARIABLES

To reduce the number of parameters we need to work with, it will be useful to introduce a linear transformation of the variables. First, however, to motivate our choice of linear transformation, consider the effect of artificial truncation selection. Defining the relative fitness of individuals with phenotypic value P1 greater than some threshold C as 1 and assigning a fitness of zero to all other individuals, the expected fitness of an individual with breeding value Z= (Z1, Z2)T becomes

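$$W(\mathbf{Z}) = \Phi\!\left(\frac{Z_1 - C}{\sqrt{V_{11}}}\right) \qquad (3)$$

where Φ is the standard normal cumulative distribution function and V11 is the environmental variance of the first trait (the first diagonal element of V). Under truncation selection, selection on the breeding values is thus independent of Z2 regardless of any covariance between environmental effects E1 and E2.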
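The independence from Z2 can be checked with a short sketch. It assumes that the expected fitness under truncation selection is the probability that Z1 + E1 exceeds the threshold C, with E1 normal with mean zero and variance V11; the function names and the numerical values are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncation_fitness(Z, C, V11):
    """P(Z1 + E1 > C) for E1 ~ N(0, V11); the second breeding value never enters."""
    Z1, _Z2 = Z
    return normal_cdf((Z1 - C) / math.sqrt(V11))

# Two individuals with the same Z1 but different Z2 (hypothetical values).
w_a = truncation_fitness((2.0, 0.0), C=1.0, V11=0.5)
w_b = truncation_fitness((2.0, 5.0), C=1.0, V11=0.5)
```

Both calls return the same value, because only the marginal distribution of E1 matters for whether P1 exceeds the threshold.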

This implies that a direct response to selection inline image in inline image in any particular generation will be accompanied by an indirect response inline image in inline image determined by the current slope of the regression of Z2 against Z1. Even though truncation selection will generate negative linkage disequilibrium deflating the expressed genetic variance in Z1 (Bulmer 1971) and other elements of the covariance matrix G (Villanueva and Kennedy 1990) (see Fig. 1), the regression slope will not change and the total indirect response in inline image will simply be given by

image(4)

as is shown in Appendix A.

Figure 1.

The 75%-probability contours of the distribution of breeding values (computed numerically) under repeated truncation selection before (left plot) and after (right plot) the transformation of variables given in Appendix A. Although the Bulmer effect implies that the variances of and covariance between the directly and indirectly selected trait Z1 and Z2 become deflated (left plot), the regression of Z2 against Z1, E(Z2|Z1) (dashed lines) remains constant.

This result suggests a simple shearing transformation and subsequent scaling of the breeding values from Z to z, which changes the mean among the immigrants from inline image to inline image (see Fig. 1) and the genetic covariance matrix at linkage equilibrium to the identity matrix I, thus eliminating the unknown indirect response inline image and the genetic covariance matrix from the model (again, see Appendix A). The dimensionless parameter inline image is the direct response to artificial selection measured in genetic standard deviations.

Fitness as function of the vector of transformed breeding values z becomes

$$w(\mathbf{z}) = \exp\left(-\tfrac{1}{2}\,\mathbf{z}^{T}\mathbf{s}\,\mathbf{z}\right) \qquad (5)$$

where s is a dimensionless selection precision matrix determined by S and G (see Appendix A).

It can be noted that if selection acts independently on the transformed variables (s diagonal) this corresponds to a certain form of correlational selection (described by S) on the original traits but the orientation of the selection surface (described by the eigenvectors of S−1, see Phillips and Arnold (1989)) will only to some extent align with the eigenvectors of the genetic covariance matrix G.
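The transformation can be illustrated with a small sketch. Appendix A is not reproduced here, so the construction below is an assumption: Z2 is sheared along its regression on Z1 and both axes are then rescaled to unit variance, which gives a lower-triangular matrix A with A G Aᵀ = I, and s is taken as (A⁻¹)ᵀ S A⁻¹ so that the quadratic form in (1) is preserved. All names and numerical values are hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def transpose(X):
    return ((X[0][0], X[1][0]), (X[0][1], X[1][1]))

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def apply_matrix(A, v):
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

def shear_scale(G):
    """Matrix A with A G A^T = I: shear Z2 along its regression on Z1, then rescale."""
    (G11, G12), (_, G22) = G
    slope = G12 / G11                  # regression slope of Z2 on Z1
    resid = G22 - G12 * G12 / G11      # variance of Z2 not explained by Z1
    a = 1.0 / math.sqrt(G11)
    b = 1.0 / math.sqrt(resid)
    return ((a, 0.0), (-slope * b, b))

def transform_selection(S, A):
    """Selection precision matrix on z = A Z, chosen so that Z'SZ equals z's z."""
    A_inv = inv2(A)
    return matmul(transpose(A_inv), matmul(S, A_inv))

# Hypothetical values for illustration only.
G = ((2.0, 0.6), (0.6, 1.0))
S = ((0.2, 0.05), (0.05, 0.3))
direct = 4.0 * math.sqrt(G[0][0])                       # four genetic standard deviations
immigrant_mean = (direct, G[0][1] / G[0][0] * direct)   # indirect response from the slope

A = shear_scale(G)
s = transform_selection(S, A)
z_imm = apply_matrix(A, immigrant_mean)                 # lies on the first axis
```

With this choice the immigrant mean falls on the first transformed axis at a distance equal to the direct response in genetic standard deviations, whatever the value of the genetic covariance G12.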

LIFE CYCLE AND NUMERICAL METHODS

Relying on the infinitesimal model, the effects of stabilizing selection, reproduction, and migration on the bivariate distribution of breeding values ψ(z) can be computed numerically using the methods given in Turelli and Barton (1994) and Tufto (2000) extended to two dimensions. In practice, rather than working with the continuous distributions referred to below, we instead keep track of the distribution on a discrete grid with a suitable resolution.

Viability selection has a simple effect on the breeding values changing the distribution from ψ(z) to

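$$\psi'(\mathbf{z}) = \frac{w(\mathbf{z})\,\psi(\mathbf{z})}{\int w(\mathbf{z})\,\psi(\mathbf{z})\,d\mathbf{z}} \qquad (6)$$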

Migration at rate m, in turn, changes the distribution from ψ(z) to

image(7)

where inline image, representing the distribution among immigrants, is multivariate normal with mean vector inline image and covariance matrix I if assuming linkage equilibrium among the immigrants.

Under the infinitesimal model, the vector of breeding values z of an offspring, given the parental breeding values z1 and z2, is

$$\mathbf{z} = \tfrac{1}{2}\left(\mathbf{z}_1 + \mathbf{z}_2\right) + \boldsymbol{\varepsilon} \qquad (8)$$

where ε is multivariate normal with zero mean vector and covariance matrix I/2. If we assume random mating, the distribution of breeding values after reproduction, ψ‴(z), is therefore a simple convolution between the bivariate distributions of the terms in (8). In practice, this convolution can be computed efficiently using the method of fast discrete Fourier transforms in two dimensions, for example, the fft function in R (R Development Core Team 2008).

An illustration of the method is given in Figure 2. In general, the distribution at equilibrium between selection and migration can be computed by iterating (6), (7), and (8) until, say, the norm of the change in the grid values used for representing ψ(z) is sufficiently small. The numerical error in the computed load for cases in which mean fitness is known analytically from (9) and (B4) was typically of the order of 10⁻⁷ using a grid resolution of 0.04 (in units of z1 and z2) and a convergence tolerance of 10⁻¹⁶.
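The iteration can be sketched in a deliberately simplified form. The code below is not the two-dimensional FFT-based implementation described above: it works in one dimension only, replaces the fast Fourier transform by direct summation, and uses a coarser grid and a looser tolerance. It assumes Gaussian fitness exp(−s11 z²/2), immigrants distributed as N(z̄, 1), and a within-family variance of 1/2. The function names, the grid limits, and the parameter values (s11 = 0.38, an immigrant mean of 4, m = 0.1) are chosen for illustration only.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def select(psi, w, h):
    """Selection step: weight the density by fitness and renormalise."""
    weighted = [wi * pi for wi, pi in zip(w, psi)]
    mean_w = sum(weighted) * h
    return [v / mean_w for v in weighted], mean_w

def migrate(psi, psi_imm, m):
    """Migration step: mix residents with a fraction m of immigrants."""
    return [(1.0 - m) * p + m * q for p, q in zip(psi, psi_imm)]

def segregation_kernel(x, h):
    """N(0, 1/2) density linking each midparent value to each offspring grid point."""
    n = len(x)
    return [[normal_pdf(xj - (x[0] + 0.5 * k * h), 0.0, 0.5) for k in range(2 * n - 1)]
            for xj in x]

def reproduce(psi, kernel, h):
    """Reproduction step: offspring value = midparent value + segregation noise."""
    n = len(psi)
    conv = [0.0] * (2 * n - 1)             # density of z1 + z2 at 2*x[0] + k*h
    for i, pi in enumerate(psi):
        for j, pj in enumerate(psi):
            conv[i + j] += pi * pj * h
    return [h * sum(c * kv for c, kv in zip(conv, row)) for row in kernel]

def mean_fitness_at_balance(s11, zbar_imm, m, order="SMR", h=0.2, tol=1e-10, max_gen=500):
    """Iterate the life cycle until the density stops changing; return mean fitness."""
    n = int(round(16.0 / h)) + 1
    x = [-6.0 + i * h for i in range(n)]   # grid from -6 to 10
    w = [math.exp(-0.5 * s11 * xi * xi) for xi in x]
    psi_imm = [normal_pdf(xi, zbar_imm, 1.0) for xi in x]
    psi = [normal_pdf(xi, 0.0, 1.0) for xi in x]
    kernel = segregation_kernel(x, h)
    mean_w = 1.0
    for _ in range(max_gen):
        old = psi
        if order == "SMR":
            psi, mean_w = select(psi, w, h)
            psi = migrate(psi, psi_imm, m)
        else:                               # "MSR"
            psi = migrate(psi, psi_imm, m)
            psi, mean_w = select(psi, w, h)
        psi = reproduce(psi, kernel, h)
        if max(abs(a - b) for a, b in zip(psi, old)) < tol:
            break
    return mean_w

w0 = mean_fitness_at_balance(0.38, 4.0, 0.0)     # no migration
load_smr = 1.0 - mean_fitness_at_balance(0.38, 4.0, 0.1, "SMR") / w0
load_msr = 1.0 - mean_fitness_at_balance(0.38, 4.0, 0.1, "MSR") / w0
```

Mean fitness is recorded at the selection step, so the same function serves both orderings of the life cycle; the load is then the fractional reduction relative to the run without migration.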

Figure 2.

The distribution of breeding values across one generation after selection (upper row), migration at rate m= 0.5 and inline image (middle row), and reproduction (lower row). The ellipses (dotted lines) in the plots in the upper row represent contours of relative fitness in increments of 0.1 based on a selection precision matrix given by s11=s22= 0.38 and s12= 0.19.

In discrete time models like this, it is well known that the order of events of the life cycle matters. I will therefore investigate two alternative life cycles: selection followed by migration and reproduction (SMR) and migration followed by selection and reproduction (MSR). In reality, there may be episodes of selection both before and after migration; the two life cycles considered here are therefore two extreme ends of a continuum. We can therefore expect the conclusions that can be drawn from these two special cases to hold also more generally for life cycles with several episodes of selection.

MEAN FITNESS AND DEFINITION OF THE LOAD

The distribution of breeding values will not in general remain multivariate normal (see Fig. 2) and mean fitness in the population Ew(z) must therefore in general be computed numerically. In cases in which we can assume that the breeding values follow a multivariate normal distribution, for example for a population not exposed to immigration or within the breeding program (considered in the next subsection), a useful expression for mean fitness, obtained using (5), is given by

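$$E\,w(\mathbf{z}) = \left|\mathbf{I} + \mathbf{g}\mathbf{s}\right|^{-1/2} \exp\left(-\tfrac{1}{2}\,\bar{\mathbf{z}}^{T}\mathbf{s}\left(\mathbf{I} + \mathbf{g}\mathbf{s}\right)^{-1}\bar{\mathbf{z}}\right) \qquad (9)$$

where $\bar{\mathbf{z}}$ is the vector of mean breeding values and g is the expressed genetic covariance matrix.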

The migration load is now defined as the fraction by which mean relative fitness in the population E w(z) is reduced at migration–selection balance relative to the mean fitness the population would attain if there was no migration, that is,

$$L = 1 - \frac{E\,w(\mathbf{z})}{\left|\mathbf{I} + \mathbf{g}_0\mathbf{s}\right|^{-1/2}} \qquad (10)$$

where g0, the genetic covariance matrix in a population subject to only stabilizing selection and not any migration, is given by (B4).
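For multivariate normal breeding values, mean fitness and the load can be evaluated directly. The sketch below assumes Gaussian fitness w(z) = exp(−zᵀsz/2), for which the mean over a normal distribution with mean vector z̄ and covariance matrix g is |I + gs|^(−1/2) exp(−z̄ᵀ s (I + gs)⁻¹ z̄ / 2), and it assumes that the reference population without migration has mean zero and covariance matrix g0. Equation (B4) for g0 is not reproduced here, so g0 is simply passed in as an argument; the function names are hypothetical.

```python
import math

IDENTITY = ((1.0, 0.0), (0.0, 1.0))

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def mean_fitness(zbar, g, s):
    """E w(z) for z ~ N(zbar, g) and w(z) = exp(-z's z / 2)."""
    gs = matmul(g, s)
    M = ((1.0 + gs[0][0], gs[0][1]), (gs[1][0], 1.0 + gs[1][1]))   # I + g s
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    Q = matmul(s, inv2(M))                                          # s (I + g s)^-1
    quad = sum(zbar[i] * Q[i][j] * zbar[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / math.sqrt(det)

def migration_load(mean_w, g0, s):
    """Fractional reduction of mean fitness relative to a population without migration."""
    return 1.0 - mean_w / mean_fitness((0.0, 0.0), g0, s)

# Selection matrix from the caption of Figure 2; immigrants four genetic standard
# deviations from the optimum; g = g0 = I is an illustrative simplification.
s = ((0.38, 0.19), (0.19, 0.38))
w_imm = mean_fitness((4.0, 0.0), IDENTITY, s)
load_if_all_immigrants = migration_load(w_imm, IDENTITY, s)
```

Writing the exponent with s (I + gs)⁻¹ rather than (s⁻¹ + g)⁻¹ keeps the expression usable when s is singular, as it is under ridge selection.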

SELECTION MATRICES CONSISTENT WITH EMPIRICAL DATA

In general, I will assume that the ratio q between mean fitness among immigrants located at inline image and mean fitness in a wild population located at inline image is given. For example, empirical studies of fitness differences between wild salmon and farmed fish from the fifth generation of the breeding program (Fleming et al. 2000) lead, based on extrapolation, to a present-day fitness ratio of about q= 0.11 for inline image (see Appendix C). Given q and inline image, the three parameters s11, s22, and s12 of the selection matrix s can be chosen with two degrees of freedom. Figure 3 shows three different choices of selection matrix, each consistent with the observed ratio between mean fitness of immigrants and natives.

A method for finding selection matrices consistent with q and inline image is given in Appendix B. To a good approximation, the strength of stabilizing selection along the z1-axis, s11, turns out to be determined to a large extent by q and inline image and is only weakly dependent on the choice of s12 and s22.
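A rough sketch of how such a selection matrix can be found is given below. It is not the method of Appendix B, which is not reproduced here: it makes the simplifying assumption that both the immigrants and the unexposed wild population have covariance matrix I, so that the fitness ratio reduces to exp(−z̄² Q11 / 2), where Q11 is the first diagonal element of s(I + s)⁻¹ and z̄ is the distance of the immigrants from the optimum along the first axis. The sign convention s12 = −ρ√(s11 s22) follows from defining ρ as the correlation of s⁻¹. Function names are hypothetical.

```python
import math

def q11(s11, rho, ratio):
    """Element (1,1) of s (I + s)^-1 = I - (I + s)^-1, with s22 = ratio * s11."""
    s22 = ratio * s11
    s12 = -rho * math.sqrt(s11 * s22)
    det = (1.0 + s11) * (1.0 + s22) - s12 * s12      # determinant of I + s
    return 1.0 - (1.0 + s22) / det

def fitness_ratio(s11, rho, ratio, zbar1):
    """Immigrant-to-wild mean fitness ratio, assuming g = g0 = I."""
    return math.exp(-0.5 * zbar1 ** 2 * q11(s11, rho, ratio))

def solve_s11(q, zbar1, rho=0.0, ratio=1.0):
    """Bisection for the s11 that reproduces the fitness ratio q."""
    lo, hi = 0.0, 1.0
    while fitness_ratio(hi, rho, ratio, zbar1) > q:  # widen the bracket if needed
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("no selection matrix of this form reproduces q")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if fitness_ratio(mid, rho, ratio, zbar1) > q:
            lo = mid                                 # selection still too weak
        else:
            hi = mid
    return 0.5 * (lo + hi)

s11_independent = solve_s11(0.11, 4.0)               # rho = 0
s11_correlated = solve_s11(0.11, 4.0, rho=-0.5)      # rho = -0.5, s22/s11 = 1
```

Under these assumptions, q = 0.11 and a distance of four genetic standard deviations give s11 close to the value 0.38 used in Figure 2 when ρ = 0, and only a slightly larger value when ρ = −0.5, in line with the weak dependence of s11 on s12 and s22 noted above.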

Results

INTRODUCTORY NUMERICAL EXAMPLES

For the values of q and inline image given in the previous subsection, properties of the migration–selection balance become functions of the rate of migration m and the choice of selection matrix s as well as the present-day total response to artificial selection inline image. The contours (solid lines) in Figure 3 represent the distribution of breeding values at migration–selection balance for three different selection matrices, both life cycles, and for increasing values of the selection correlation coefficient ρ. The corresponding parameter values are given in Appendix C.

With correlational selection, evolution can no longer be understood as a one-dimensional process. Instead, selection offsets the balance point in the direction of the gradient of the fitness function. For ρ=−0.5 the effects on inline image and inline image are both moderate (Fig. 3, middle plots). For strong correlational selection, however (ρ=−0.95, Fig. 3, lower plots), the predicted value for inline image is much closer to the mean inline image among immigrants and the predicted value of inline image is offset by as much as minus one genetic standard deviation.

EFFECTS OF CORRELATIONAL SELECTION ON THE LOAD

Relying on the parameter values given in Appendix C, the relationship between the load and the rate of migration m for different values of the selection correlation coefficient ρ, keeping s22/s11= 1, is shown in Figure 4 for the SMR and MSR life cycles (see previous section). Given that the equilibrium phenotypic means are offset further away from the optimum, it is somewhat surprising that there is a small reduction in the load with increasing values of |ρ|. The load is reduced by at most about 10% for ρ= 1. The same effect appears for the MSR life cycle, only to a slightly lesser degree. For more moderate values of ρ up to 0.75, however, the reduction in load is almost negligible. Importantly, although the load is not a strictly nonincreasing function of ρ for all parameter values, for a given value of s22/s11 the limiting case of ρ= 1 appears to generally represent a lower bound on the load.

Figure 4.

Load as function of m and increasing values of the selection correlation coefficient ρ (see legend) and s22/s11= 1 for the SMR (left plot) and MSR (right plot) life cycles.

Further numerical analysis not included here showed that the effect of correlational selection remains small also for other values of inline image, and that the effect of correlational selection, unsurprisingly, is present also in the transient phase before migration–selection balance is attained, although to a somewhat lesser extent in the initial generations.

EQUIVALENCE TO ONE-DIMENSIONAL MODELS

With no correlational selection between z2 and z1 (Fig. 3, upper plots) the phenotypic mean inline image will always evolve along the first axis and the model essentially behaves as a simpler one-dimensional model.

Stabilizing selection on z2, however, will to some degree deflate the genetic variance in z2 below the value of one at linkage equilibrium but this effect will be small relative to the effects of migration and reproduction, which will pull the genetic variance g22 back toward 1.

Stabilizing selection on z2 will also reduce mean fitness in the population by a certain factor but this effect will be present both in a population receiving immigrants and in a population not exposed to any immigration. This suggests that the migration load, as defined by (10), should be nearly independent of the strength of stabilizing selection on z2, s22. This is confirmed by numerical computations similar to the ones in Figure 4 but with the selection correlation coefficient kept constant at ρ= 0 and instead varying s22.

THE LIMIT OF RIDGE SELECTION

Some insights into why the load is reduced by correlational selection can be gained by considering the limiting case of ridge selection (|ρ| = 1). Together with the second constraint imposed by q this implies that the parameters of the selection matrix s can then be chosen with one degree of freedom, by varying the value of, say, s22/s11. Figure 5 shows the distribution of breeding values at migration–selection balance for three different possible selection matrices s.

As is evident from this figure, the load at migration–selection balance can in this case be predicted entirely from a one-dimensional degenerate model of evolution along an axis x orthogonal to the ridge (indicated by the arrows in Fig. 5). From this it can be seen that correlational selection, in the form of ridge selection, in effect reduces the distance of the immigrants from the local optimum (measured in genetic standard deviations) from inline image to an effective distance inline image (represented by the length of the arrows in Fig. 5).

The first of two special cases is that of no selection on z2. The effective distance is then inline image (Fig. 5, upper plot). As noted in the previous subsection, in terms of the load, this case is equivalent to cases with any strength of selection on z2 as long as selection acts independently on z1 and z2 as in Figure 3, upper plots.

In terms of selection in the original two-dimensional model, the constraint imposed on the selection matrix by q leads to an upper bound on possible values of s22/s11 and a corresponding bound on the orientation of the ridge (Fig. 5, lower plot), the second of the two special cases. In the degenerate model of evolution along the axis x, there is a corresponding lower bound on inline image. In the limit, as inline image goes to inline image, the corresponding strength of stabilizing selection consistent with q, acting orthogonally to the ridge, goes to infinity.

In this limit of infinitely strong stabilizing selection, q becomes equal to the ratio between the corresponding probability densities at the point x= 0 prior to selection,

image(11)

which solved for inline image gives

image(12)

The strongest form of ridge selection consistent with q= 0.11, and inline image, then, corresponds to a degenerate one-dimensional model in which the immigrants deviate a distance inline image from the optimum. An expression for the load in this limit, obtained by considering the trimodal distribution of breeding values after migration and reproduction but before selection, is given by equation (D1).
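The limiting distance can be illustrated under explicit assumptions, since equations (11) and (12) are not reproduced here. The sketch assumes that, along the axis x, the immigrants before selection are normal with mean equal to the effective distance and variance 1, and that an unexposed wild population under infinitely strong stabilizing selection is normal with mean 0 and variance 1/2, the within-family variance of the infinitesimal model. The ratio of the two densities at x = 0 is then exp(−x̄²/2)/√2, which can be solved for the distance. The function name is hypothetical.

```python
import math

def min_effective_distance(q):
    """Solve q = exp(-x^2 / 2) / sqrt(2) for x >= 0 (requires q < 1 / sqrt(2))."""
    return math.sqrt(-2.0 * math.log(math.sqrt(2.0) * q))

x_min = min_effective_distance(0.11)
```

Under these assumptions, q = 0.11 corresponds to an effective distance of a little under two genetic standard deviations, roughly half the distance of four in the original two-dimensional model.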

Figure 6 shows the load under ridge selection for possible values of inline image in the range between its minimum and maximum value, inline image. In general the load decreases as the effective distance inline image approaches inline image but the reduction is at most about 15% for intermediate rates of migration.

Figure 6.

The migration load in the degenerate one-dimensional model corresponding to ridge selection as a function of the effective distance inline image by which the immigrants deviate from the optimum and for different rates of migration m.

WEAK SELECTION–MIGRATION APPROXIMATION

So far, we have only considered moderate or strong selection and high rates of migration. This means that recombination will only to some extent have time to break down linkage disequilibrium created by migration. If we consider the extreme case of an asexual model with no recombination, it is clear that the migration load in such a model would only depend on relative mean immigrant fitness and the rate of migration. This raises the question of whether the small effect on the load of correlational selection seen in Figure 4 is mostly a result of the assumption of strong selection and a high rate of migration. To investigate this, I therefore develop an approximation valid for weak selection and migration.

If selection and migration are weak, we can ignore linkage disequilibrium and the genetic covariance matrix g will remain equal to the identity matrix I. Using (5) and (6), we find that selection changes the mean vector from inline image to

image(13)

Migration in turn changes inline image to

image(14)

Setting inline image and solving for inline image we find that the mean vector at equilibrium is

image(15)

Substituting (15) into (9), we can then compute the load from (10).
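The steps of the approximation can be sketched as follows. Equations (13) to (15) are not reproduced here, so the code rests on stated assumptions: with g = I, Gaussian selection maps the mean z̄ to (I + s)⁻¹z̄, migration then mixes this with the immigrant mean in proportions 1 − m and m, and the equilibrium is the fixed point of the two steps (mean taken before selection, with selection preceding migration). The load is computed with g = g0 = I, so that the determinant factors cancel. Names and parameter values are hypothetical.

```python
import math

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matvec(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def equilibrium_mean(s, m, z_imm):
    """Solve zbar = (1 - m) (I + s)^-1 zbar + m z_imm for zbar."""
    B = inv2(((1.0 + s[0][0], s[0][1]), (s[1][0], 1.0 + s[1][1])))   # (I + s)^-1
    K = ((1.0 - (1.0 - m) * B[0][0], -(1.0 - m) * B[0][1]),
         (-(1.0 - m) * B[1][0], 1.0 - (1.0 - m) * B[1][1]))          # I - (1 - m) B
    return matvec(inv2(K), (m * z_imm[0], m * z_imm[1]))

def weak_load(s, m, z_imm):
    """Load 1 - exp(-zbar' s (I + s)^-1 zbar / 2), assuming g = g0 = I."""
    z = equilibrium_mean(s, m, z_imm)
    B = inv2(((1.0 + s[0][0], s[0][1]), (s[1][0], 1.0 + s[1][1])))
    Q = ((1.0 - B[0][0], -B[0][1]), (-B[1][0], 1.0 - B[1][1]))       # s (I + s)^-1
    quad = sum(z[i] * Q[i][j] * z[j] for i in range(2) for j in range(2))
    return 1.0 - math.exp(-0.5 * quad)

# Weak selection and migration; rho = -0.5 corresponds to s12 = 0.5 * s11 here.
s_corr = ((0.01, 0.005), (0.005, 0.01))
z_eq = equilibrium_mean(s_corr, 0.01, (4.0, 0.0))
load = weak_load(s_corr, 0.01, (4.0, 0.0))
```

For a positive s12 the second component of the equilibrium mean is negative even though the immigrants do not deviate from the optimum in the second trait, which is the offset of the balance point described in the Results.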

Figure 7 shows the load as a function of the migration rate m for small m and weak selection (obtained by setting q= 0.995). Again, although the load is reduced by correlational selection, the reduction is at most about 15% for |ρ| ≤ 0.75.

Figure 7.

Same as Figure 4 but with the load computed using the weak selection–migration approximation (15), (10), and (9) and using q= 0.995.

Discussion

MAIN FINDINGS

Relying on the infinitesimal model and using a simple model of Gaussian multivariate selection and keeping relative mean immigrant fitness fixed, we have seen that correlational selection, for relatively strong forms of selection, in almost all cases, reduces the migration load (Fig. 4). A quantitatively similar effect of correlational selection on the load appears to hold also for weak selection and migration (Fig. 7). In general, however, for moderate and probably realistic parameter values (|ρ| ≤ 0.75), the effect on the load is small and limited to about 4% (Fig. 4). Thus, in terms of predicting the consequences of gene flow from domesticated species on mean population fitness and, more generally, viability of populations of wild relatives, simpler models of selection–migration balance in one trait only (see Introduction) will in many cases be adequate. This work thus provides an important validation of these simpler models.

A criticism of earlier work (Tufto 2001) explicitly modeling evolution in one trait only (growth performance), raised by some authors, has been that growth performance is not necessarily in itself an important fitness trait (e.g. Jonsson et al. 2006, p. 46). The relevance of models using this as the only trait undergoing selection may therefore be questionable. These simpler models, however, can be said to implicitly model evolution also in indirectly selected traits, because the simpler one-dimensional models are in fact equivalent to, or a good approximation of, the more general model presented here, provided that stabilizing selection operates independently on each trait (Fig. 3, upper plots) after transformation of variables (see Fig. 1) or that correlational selection on the transformed traits is not too strong.

Although the effects of correlational selection are small, it is interesting that the load in general appears to be reduced. Some intuitive insights into why this occurs can be gained by considering the extreme case of ridge selection (|ρ| = 1). The distance by which the immigrants deviate from an optimal phenotype (at the ridge) is in this case in effect reduced (Fig. 5), making it easier for the population to reach the optimum, which in turn reduces the load. In the more general case (|ρ| < 1), the pattern is less consistent, but a similar reduction in the load occurs in most cases (Fig. 3), although to a lesser extent. Although the phenotypic means are offset away from the global optimum, regions of suboptimal but still high fitness become more accessible. It can also be noted that a minute increase of about 1% in the load appears for small rates of migration and intermediate values of ρ for both the weak migration–selection approximation and the strong migration–selection case.

MODEL ASSUMPTIONS

The strength of stabilizing selection associated with the lower bound on the load in the case of ridge selection is unrealistic for two reasons. First, selection acts on the vector of phenotypic values P; hence, the covariance matrix V of the environmental effects E sets an upper bound on the strength of apparent selection on the breeding values through equation (2). Given that heritabilities of most fitness-related traits are low (Merilä and Sheldon 1999), this implies that the lower bound found here is unlikely to be attained in practice. Second, stabilizing selection cannot be very strong because this would impose an unrealistically high genetic load on the population.

The limited amount of empirical data that are available indicate that some correlational selection occurs between almost all traits and that the effects of correlational selection described here therefore should be equally common. Estimates of coefficients of correlational selection, for studies with sufficiently large sample sizes, are in the range between −0.1 and 0.1 and of the same order of magnitude as estimates of quadratic selection (Kingsolver et al. 2001). From a theoretical point of view, there is also little reason to believe that selection on each trait in an arbitrarily defined set will operate exactly independently.

Although I have focused only on n = 2 traits here, selection may of course act on a large, albeit not unlimited (Orr 2000), number of traits. If correlational selection with respect to several other traits occurs, this raises the question of whether there will be a further reduction in the load below the lower bound in the two-dimensional model. Somewhat surprisingly, however, approximate and incomplete numerical analysis for n ≥ 3 (see Appendix D) suggests that the lower bound on the load remains unchanged. The minimum value appears to be associated with multivariate selection in the form of stabilizing selection acting along a single axis orthogonal to an (n − 1)-dimensional hyperplane of maximum fitness, essentially the same situation as in the lower plot in Figure 5.

THE DOMESTICATION PROCESS

The result that the total indirect response over several generations, inline image, under the infinitesimal model can be predicted entirely from the genetic covariance matrix at linkage equilibrium in the base population and the total response to direct selection inline image(4) is to my knowledge a new result. This is not entirely obvious because truncation selection will create changes in the genetic covariance matrix of the traits that are directly (Bulmer 1971) and indirectly (Bennett and Swiger 1980) selected. This will potentially change the slope in the regression of Z2 against Z1 and hence the direction of the joint direct and indirect response. Relying also on the infinitesimal model, assuming normality, and assuming that the same selection intensity is applied repeatedly, Villanueva and Kennedy (1990) have previously shown that the limiting value of the regression slope takes the same value as in the initial base population and that the indirect response as a proportion of the direct response thus is the same in the limit as in the first generation.

A complication that may invalidate (4) is natural selection within the breeding program environment, possibly creating an additional direct response in traits such as antipredator behavior as well as an indirect response in growth performance (see e.g. Fleming et al. 2002). For farmed salmon, it seems reasonable to assume that the effect of this will be small, however, because artificial selection within the breeding program is likely to be a much stronger force.

In addition, many breeding programs typically do not use simple individual truncation selection considered here but instead employ more elaborate forms of selection by incorporating information from relatives. At least for family selection, that is, selection of families rather than individuals with a phenotypic mean above a certain threshold (Falconer 1981, p. 227), the argument leading to (4) can still be seen to hold, however, by considering the effects of truncation selection, not on distributions of individual breeding, environmental, and phenotypic values, but rather on the distributions of corresponding family means.

In contrast to what has been assumed here, ongoing artificial selection will probably continue to increase the total response in directly selected traits in generations to come so that no selection–migration balance will actually be attained. However, if we are concerned with the sustainability of, for example, fish farming, it is relevant to ask what the long-term consequences of the present-day management regime are if the genetic composition of farmed fish remains unchanged (at, say, inline image) in the foreseeable future. A slowly changing or constant value of inline image may also be more realistic if taking into account that the response to selection most likely has slowed down when the breeding goal was broadened to include more, possibly negatively correlated, traits.

OTHER IMPLICATIONS

Although the fitness effects of correlational selection appear to be small, the approach developed here may have other important implications for the interpretation of data from programs monitoring morphological changes in impacted populations. Solem et al. (2006) note that knowledge about morphological means in impacted populations can potentially be used to assess the degree of introgression of farmed escapees if data on the same morphological means are available for farmed fish and unaffected populations, respectively. The model presented here, however, suggests that the interpretation of such data is not going to be straightforward, because equilibrium values as well as transient values of the phenotypic means in an impacted population (see Fig. 3) will depend not only on observed fitness differences between wild and farmed fish but also on the exact pattern of correlational selection, described by a large number of unknown parameters of multivariate selection.


Associate Editor: C. Goodnight

ACKNOWLEDGMENTS

I thank L-M Chevin, R. Lande, J. Huisman, I. Olesen, K. Hindar, H. Bentsen, C. Goodnight, and four anonymous reviewers for useful criticism and comments and Division of Biology, Imperial College London for its hospitality during my sabbatical leave there in 2008/2009. This study was funded by grant 184007/S30 from the Research Council of Norway's Miljø 2015 program.

Appendices

Appendix A

TRANSFORMATION OF VARIABLES

Result (4) can be proven easily by first transforming Z to Z′ using a shearing transformation parallel to the Z2 axis,

image((A1))

This transforms the covariance matrix (at linkage equilibrium) from G to

image((A2))

Note that truncation selection remains independent of the second trait, Z2 under this transformation and that the total response in inline image. In the absence of any genetic covariance, this implies that the distribution of breeding values about the Z1 axis will remain symmetric throughout the breeding program (see Fig. 1, right plot). Hence, the total indirect response in Z1 to truncation selection on inline image, has to be zero. Transforming back we arrive at (4).

Transforming Z to z using the additional scaling transformation

image((A3))

changes the mean among the escapees to

image((A4))

where the quantity inline image, as already noted, is known from the breeding program. The genetic covariance matrix (at linkage equilibrium) G changes to the identity matrix I.

Letting A denote the matrix product of the transformation matrices in (A1) and (A3) so that z = AZ, we find, using (1), that the relative expected fitness of individuals with breeding value vector z is given by (5), where s = (A^{-1})^T S A^{-1}.
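The effect of the combined transformation can be checked numerically. The Python sketch below is not taken from the original analysis. It assumes that the shear subtracts the genetic regression of Z2 on Z1 from Z2 and that the subsequent scaling gives each sheared axis unit genetic variance, which is one transformation consistent with the description above. The matrices G and S are arbitrary illustrative values, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def whitening_matrix(G):
    """Return A = scaling * shear such that A G A^T is the identity."""
    g11, g12, g22 = G[0][0], G[0][1], G[1][1]
    slope = g12 / g11                  # genetic regression of Z2 on Z1
    resid = g22 - g12 * g12 / g11      # variance of Z2 remaining after the shear
    shear = [[1.0, 0.0], [-slope, 1.0]]   # leaves Z1 unchanged (assumed form of A1)
    scale = [[1.0 / math.sqrt(g11), 0.0],
             [0.0, 1.0 / math.sqrt(resid)]]  # assumed form of A3
    return matmul(scale, shear)


def transform_selection(S, A):
    """Selection matrix on the transformed scale: s = (A^-1)^T S A^-1."""
    a_inv = inv2(A)
    return matmul(matmul(transpose(a_inv), S), a_inv)


def quad_form(m, v):
    """Quadratic form v^T m v for a 2-vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


# Illustrative values only (not estimates from the paper).
G = [[1.0, 0.6], [0.6, 2.0]]
S = [[0.2, 0.05], [0.05, 0.1]]
A = whitening_matrix(G)
s = transform_selection(S, A)
```

Because the quadratic form in the fitness function is unchanged by the transformation, quad_form(S, Z) and quad_form(s, z) agree for any Z with z = AZ, which is a convenient check on the expression for s.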

Appendix B

SELECTION MATRICES CONSISTENT WITH EMPIRICAL DATA

First note that the fitness ratio q defined in the main text refers to differences in mean fitness (9) in the domesticated and wild population, respectively, and not the difference in fitness at the corresponding phenotypic means. Let inline image denote the phenotypic means in the domesticated species at the time of these fitness measurements.

In a population not exposed to any immigration, the mean vector will stay at the optimum inline image and the distribution of breeding values will remain multivariate normal because neither stabilizing selection nor reproduction generates deviations from normality. Using (6) we find that stabilizing selection reduces the expressed genetic covariance matrix from g to g′ given by

image((B1))

the sum of the selection precision matrix and genetic precision matrix before selection. Using (8) and assuming random mating, reproduction in turn increases the covariance matrix from g′ to g″ given by

image((B2))

Setting g″=g and substituting (B1) into (B2) yields a quadratic matrix equation

image((B3))

for the covariance matrix g at equilibrium. The method for solving (B3) is completely analogous to solving an ordinary quadratic equation by the method of completing the square. The relevant solution is

image((B4))

where the above matrix square root function is defined based on the diagonalization of its matrix argument.
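As a numerical cross-check on the closed-form solution, the equilibrium can also be approached by simply iterating the two steps. The Python sketch below rests on two assumptions that are not confirmed by the text shown here: that (B1) reads g′^{-1} = g^{-1} + s, as the sentence following it suggests, and that (B2) takes the standard infinitesimal-model form g″ = g′/2 + I/2 on the transformed scale where the linkage-equilibrium covariance matrix is the identity. Function names are hypothetical; the example selection matrix uses the second row of Table C1.

```python
def inv2(a):
    """Inverse of a 2x2 matrix stored as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def equilibrium_g(s, tol=1e-12, max_iter=10000):
    """Iterate selection and reproduction until the covariance matrix settles."""
    g = IDENTITY
    for _ in range(max_iter):
        # Selection: precision matrices add (assumed form of B1).
        g_sel = inv2(add(inv2(g), s))
        # Reproduction: half the selected covariance plus half the
        # linkage-equilibrium covariance (assumed form of B2).
        g_new = add(scale(0.5, g_sel), scale(0.5, IDENTITY))
        diff = max(abs(g_new[i][j] - g[i][j])
                   for i in range(2) for j in range(2))
        g = g_new
        if diff < tol:
            return g
    raise RuntimeError("covariance recursion did not converge")


# Selection matrix from the second row of Table C1.
s_example = [[0.396, 0.198], [0.198, 0.396]]
g_eq = equilibrium_g(s_example)
```

For a diagonal selection matrix the recursion decouples by trait, and under the same assumptions each variance should equal the positive root of 2s g^2 + (1 − s) g − 1 = 0, which gives a simple scalar check on the iteration.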

Repeated truncation selection during domestication operating independently of z2 (eq. 3) will have created negative linkage disequilibrium only with respect to z1, deflating the genetic variance g11 to some value inline image. For farmed salmon in particular, based on a simulation study by Gjerde et al. (1996), inline image seems realistic for the type of trait considered here with a relatively high heritability up to 0.4. Although truncation selection will also create deviations from normality (Bulmer 1971), it is reasonable, for the present purposes, to rely on (9) as an approximation. We let

image((B5))

then, denote the covariance matrix among farmed fish.

What is known empirically is then the ratio

image((B6))

or, using (9) and (B4),

image((B7))

For given choices of inline image and, say, s12 and s22 or, alternatively, s11/s22 and ρ, equation (B7) can be solved numerically for s11 using some suitable root-finding algorithm, for example, the function uniroot in R.
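The root-finding step can be illustrated with a one-trait analogue. The Python sketch below is not the two-dimensional equation (B7): it uses plain bisection in place of uniroot and assumes a single trait with Gaussian fitness exp(−s z^2/2), a normal distribution of breeding values, and a wild population at the equilibrium variance implied by the scalar recursion g″ = g′/2 + 1/2. The domesticated mean, domesticated variance, and target fitness ratio are illustrative placeholders rather than values from the paper, and all function names are hypothetical.

```python
import math


def wild_equilibrium_variance(s):
    """Positive root of 2 s g^2 + (1 - s) g - 1 = 0 (assumed scalar recursion)."""
    return (-(1.0 - s) + math.sqrt((1.0 - s) ** 2 + 8.0 * s)) / (4.0 * s)


def mean_fitness(s, mean, var):
    """Mean of exp(-s z^2 / 2) when z is normal with the given mean and variance."""
    denom = 1.0 + s * var
    return math.exp(-0.5 * s * mean * mean / denom) / math.sqrt(denom)


def fitness_ratio(s, dom_mean, dom_var):
    """Mean fitness of domesticated relative to wild individuals."""
    wild = mean_fitness(s, 0.0, wild_equilibrium_variance(s))
    return mean_fitness(s, dom_mean, dom_var) / wild


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative placeholders only (not values from the paper).
Q_TARGET = 0.3
DOM_MEAN = 2.4
DOM_VAR = 0.8

s_hat = bisect(lambda s: fitness_ratio(s, DOM_MEAN, DOM_VAR) - Q_TARGET,
               1e-6, 10.0)
```

Bisection is slower than the method used by uniroot but only requires that the bracketing interval contains a sign change, which holds here because the ratio is close to 1 for very weak selection and small for strong selection.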

Appendix C

CURRENT RELATIVE MEAN FITNESS OF FARMED SALMON

Fleming et al. (2000) and McGinnity et al. (2003) provide empirical estimates of mean fitness in the wild of farmed relative to wild salmon (q). In particular, following the spawning success and subsequent offspring survival of 12 farm and nine native salmon released in 1993, Fleming et al. (2000) estimated the lifetime fitness of fifth generation farmed salmon from the main Norwegian breeding line to be 16% of that of wild fish. At this point in time, the total response in growth performance measured in genetic standard deviations was inline image (B. Gjerde, pers. comm.). In terms of the vector of transformed breeding values, the total response to selection in z2 at the same point in time was inline image (see Appendix A). This fitness estimate, however, includes nongenetic effects of the farm rearing environment. Making the admittedly arbitrary assumption that the nongenetic effects are responsible for half the reduction in spawning success of farm relative to wild fish, estimated at an average of 30.0% for both sexes by Fleming et al. (2000), and that the farm rearing environment had no effect on subsequent relative offspring survival for farmed fish (53.3%), gives a fitness estimate of farmed relative to wild fish (due to genetic differences) equal to inline image.

Elements of the selection matrix corresponding to different choices of ρ and with s22/s11 (see Fig. 3) obtained by solving (B7) are given in Table C1. Because artificial selection has changed the total response in growth inline image beyond the value inline image at the time of measuring fitness, present-day relative mean fitness must be computed by extrapolation based on the Gaussian model of stabilizing selection. Although mean fitness inline image along the z1 axis depends largely on s11, which, in turn, depends mostly on the observed value of q and inline image, keeping q given by (B6) constant does not make the relative mean fitness of immigrants (at inline image) completely independent of ρ (see Table C1, last column). To avoid making the conclusions depend on these rather nongeneral details of the estimation procedure, I therefore instead use a value of inline image and inline image in the main text so that immigrant fitness is held truly constant. This only leads to negligible changes in the values of s11, s12, s22 given in Table C1.

Table C1.  Elements of the selection matrix (computed by solving (B7) numerically for s11), the corresponding value of ρ and present-day relative mean immigrant fitness for inline image and inline image.
s11      s12      s22      ρ        Relative mean immigrant fitness
0.37     0        0.37     0        0.1094
0.396    0.198    0.396    −0.5     0.1100
0.513    0.488    0.513    −0.95    0.1129
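The tabulated elements can be checked for internal consistency. The short Python sketch below assumes that ρ is the correlation implied by the inverse of the selection matrix, so that ρ = −s12/√(s11 s22). This sign convention is inferred from the table entries themselves rather than taken from a definition shown here, and the function name is hypothetical.

```python
import math

# Rows of Table C1 as (s11, s12, s22, rho).
TABLE_C1 = [
    (0.37, 0.0, 0.37, 0.0),
    (0.396, 0.198, 0.396, -0.5),
    (0.513, 0.488, 0.513, -0.95),
]


def implied_rho(s11, s12, s22):
    """Correlation implied by the inverse of [[s11, s12], [s12, s22]] (assumed convention)."""
    return -s12 / math.sqrt(s11 * s22)


for s11, s12, s22, rho in TABLE_C1:
    # Tabulated values are rounded, so only agreement to about two decimals is expected.
    print(s11, s12, s22, rho, round(implied_rho(s11, s12, s22), 3))
```

Under this convention the three rows reproduce the tabulated ρ to within rounding of the printed elements.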

Appendix D

MORE THAN n = 2 TRAITS

If we consider the general n-dimensional case with an indirect response having occurred in Z2, Z3, …, Zn to direct artificial truncation selection on Z1, a transformation similar to (A1) and (A3) also involving rotation of Z2, … , Zn around the Z1-axis is available which, again, transforms the genetic covariance to the identity matrix. Tracking the distribution of breeding values on an n-dimensional grid is no longer feasible for n≥ 3; however, if relying on normality as in Tufto (2000), recursion equations for the mean and the covariance matrix of the vector of breeding values are available. Based on iterations of these approximate equations, approximate values for the load at equilibrium can be computed from (9) and (10) for any selection matrix s. Doing this for a large number of random selection matrices (see Fig. 8 for an example) suggests that the load, in the general n-dimensional case, remains above the lower bound

image((D1))

associated with the minimum value of inline image given by (12) in the limiting case of ridge selection.

Figure 8.

The load for a model with evolution in n = 5 traits computed for 200 random selection matrices scaled to be consistent with q = 0.11. The solid line represents the lower bound (D1) for the case of ridge selection.

To ensure that we span important parts of the parameter space, the above random selection matrices were constructed by first simulating randomly directed eigenvectors (computed from a deviate from a Wishart distribution with parameters n and I), then simulating the eigenvalues independently from a gamma distribution with shape parameter equal to 0.1, and finally multiplying the resulting selection matrix by a scaling factor to obtain the correct value of q.
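A construction of this kind can be sketched in a few lines of Python. The version below departs from the description above in one respect: instead of taking eigenvectors from a Wishart deviate, it orthonormalizes independent Gaussian vectors by Gram–Schmidt, which likewise yields randomly oriented orthonormal directions without requiring an eigendecomposition. The gamma scale parameter of 1 is an assumption, the final rescaling to the target q is left as a user-supplied factor because it requires solving for the fitness ratio, and all function names are hypothetical.

```python
import math
import random


def random_orthonormal_basis(n, rng):
    """Return n randomly oriented orthonormal vectors (modified Gram-Schmidt)."""
    basis = []
    while len(basis) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-8:  # discard the rare nearly dependent draw
            basis.append([x / norm for x in v])
    return basis


def random_selection_matrix(n, rng, shape=0.1, scale_factor=1.0):
    """Build a symmetric matrix from random directions and gamma eigenvalues."""
    vectors = random_orthonormal_basis(n, rng)
    # Gamma eigenvalues with a small shape parameter: most are close to zero,
    # a few are large. The scale parameter 1.0 is an assumption.
    eigenvalues = [rng.gammavariate(shape, 1.0) for _ in range(n)]
    matrix = [[scale_factor * sum(eigenvalues[k] * vectors[k][i] * vectors[k][j]
                                  for k in range(n))
               for j in range(n)]
              for i in range(n)]
    return matrix, eigenvalues


rng = random.Random(1)  # fixed seed so that the example is reproducible
s_random, eig = random_selection_matrix(5, rng)
```

A shape parameter well below 1 concentrates most eigenvalues near zero with occasional large values, so the sampled matrices include near-ridge cases in which selection acts mainly along one or a few axes.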

In Figure 8, a discrepancy can be seen between the minimum values of the load and the theoretical lower bound given by (D1). Analysis comparing the load computed using exact and approximate numerical methods in the two-dimensional case, suggests that this discrepancy is most likely a result of the normal approximation used.

Inspection of the eigenvalues of the selection matrices of some of the cases close to the lower bound in Figure 8 and further numerical experiments suggest that the lower bound is associated with strong selection acting along a single axis with all other eigenvalues being close to zero, that is, essentially stabilizing selection acting on a single trait orthogonal to an (n − 1)-dimensional hyperplane of maximum fitness.
