Effect of migration and environmental heterogeneity on the maintenance of quantitative variation: a simulation study

The paradox of high genetic variation observed in traits under stabilizing selection is a longstanding problem in evolutionary theory, as mutation rates are 10-100 times too low to explain observed levels of standing genetic variation under classic models of mutation-selection balance. Here, we use individual-based simulations to explore the effect of various types of environmental heterogeneity on the maintenance of genetic variation (VA) for a quantitative trait under stabilizing selection. We find that VA is maximized at intermediate migration rates in spatially heterogeneous environments, and that the observed patterns are robust to changes in population size. Spatial environmental heterogeneity increased variation by as much as 10-fold over mutation-selection-balance alone, whereas pure temporal environmental heterogeneity increased variance by only 45% at max. Our results show that some combinations of spatial heterogeneity and migration can maintain considerably more variation than mutation-selection balance, potentially reconciling the discrepancy between theoretical predictions and empirical observations. However, given the narrow regions of parameter space required for this effect, this is unlikely to provide a general explanation for the maintenance of variation. Nonetheless, our results suggest that habitat fragmentation may affect the maintenance of VA and thereby reduce the adaptive capacity of populations.

The paradox of high genetic variation observed in traits under stabilizing selection is a 2 longstanding problem in evolutionary theory, as mutation rates are 10-100 times too low to 3 explain observed levels of standing genetic variation under classic models of mutation-selection 4 balance. Here, we use individual-based simulations to explore the effect of various types of 5 environmental heterogeneity on the maintenance of genetic variation (V A ) for a quantitative trait 6 under stabilizing selection. We find that V A is maximized at intermediate migration rates in 7 spatially heterogeneous environments, and that the observed patterns are robust to changes in 8 population size. Spatial environmental heterogeneity increased variation by as much as 10-fold 9 over mutation-selection-balance alone, whereas pure temporal environmental heterogeneity 10 increased variance by only 45% at max. Our results show that some combinations of spatial 11 heterogeneity and migration can maintain considerably more variation than mutation-selection 12 balance, potentially reconciling the discrepancy between theoretical predictions and empirical 13 observations. However, given the narrow regions of parameter space required for this effect, this 14 is unlikely to provide a general explanation for the maintenance of variation. Nonetheless, our 15 results suggest that habitat fragmentation may affect the maintenance of V A and thereby reduce 16 the adaptive capacity of populations. 17 As genetic variation is the fundamental basis upon which evolution acts, it is important to 19 understand how variation is maintained in order to provide a foundation for answering various 20 questions in biology and related fields, such as missing heritability (Maher, 2008), conservation 21 of biodiversity (Cook & Sgrò, 2017), and population potential to respond to change (Houle, 22 1992). And yet, the relative importance of factors that influence variation and the mechanism(s) 23 under which it is maintained are not wholly understood (Barton & Turelli, 1989;Mackay et al., 24 2009). The majority of quantitative traits experience stabilizing selection, which in theory should 25 erode genetic variation. However, high levels of standing variation and heritability of 26 quantitative traits are consistently observed in nature (Johnson & Barton, 2005). This paradox -a 27 high degree of genetic variation maintained in the face of stabilizing selection -remains a 28 longstanding, unsolved problem in evolutionary biology and quantitative genetics theory. 29

30
The most widely studied explanation for this paradox is mutation-selection balance (henceforth 31 referred to as MSB), the appeal of which lies in its intuitive logic: mutation, as the ultimate 32 source of genetic variation, provides enough input to offset the eroding effect of selection, 33 leading to a state of equilibrium. Under such models, stabilizing selection is assumed according 34 to a Gaussian fitness function with parameter V S setting the strength of selection on genotypes 35 (where large values result in weaker selection). Multiple MSB models have been proposed, most 36 notably the continuum-of-alleles model from Kimura and Crow (1964)  The continuum-of-alleles model makes the basic assumption that for a locus with a continuous 40 distribution of possible alleles, mutation can result in a new allele with an effect different from 41 the pre-existing one. In general, the two approximations differ in the way each handles this 42 assumption. For example, for an arbitrary diploid locus i with an allele of effect x i and a mutation 43 with effect y i , under the Gaussian approximation the mutated allele would take on a new value 44 which is conditional on the previous state (x i + y i ). This results in a predicted equilibrium genetic 45 where n is the total number of loci that compose the quantitative 46 trait, μ the per locus mutation rate, and α 2 the variance of the distribution of mutational effects. 47 Under the House-of-Cards approximation, x i may take on any effect independent from x i+1 …x n , 48 and mutation results in the replacement of x i by y i . In this case, the predicted equilibrium 49 variance, is independent of the distribution of mutation effects. In contrast with 50 the continuum-of-alleles model, there is the diallelic model (Bulmer, 1972;Barton, 1986), which 51 assumes only two possible values at a locus with equal forward and backward mutation rates, 52 such that mutation causes a flip between states (e.g., xi = {0,1}). In this case, the resulting 53 equilibrium variance has been shown to be generally comparable to V G (HC). 54 Although these models have been extensively studied, conflicting evidence over their 55 applicability in relation to realistic biologic parameters has left debate open (Barton & Turelli, 56 1989;Johnson & Barton, 2005;Zhang & Hill, 2005). The assumptions and requirements of these 57 models are unrealistic: primarily, they require extreme mutation rates or too many loci per trait 58 (Johnson & Barton, 2005). This can be most simply illustrated by considering two empirical 59 observations: heritability, h 2 , for the majority of traits are relatively high (h 2 = 0.2 ~ 0.6; 60 Mousseau and Roff 1987), and stabilizing selection is typically relatively strong in nature, as 61 evidenced by the median value of estimates of reported quadratic selection gradient (γ = -0.1; 62 (Kingsolver et al., 2001). The value γ = -0.1 implies that the ratio of selection to phenotypic 63 variation is V S / V P = 5 (Kingsolver et al., 2001;Johnson & Barton, 2005), which can be 64 rearranged to account for typical estimates of heritability, yielding an expected empirical range of 65 values for variation maintained in a natural population of approximately 0.04 < V A / V S < 0.12. 66 Putting this in the context of Turelli's (1984) V G (HC) model under the assumption that V G ~ V A , 67 then 0.01 < ݊ ߤ < 0.03. If there are 100 loci underlying a given trait, then this would require per-68 locus mutation rates on the order of ~10 -4 , which is 10 to 100 times higher than most estimates 69 from quantitative genetic studies, typically thought to be in the range of 10 -6 to 10 -5 2000). 71 More complex extensions of these models have been investigated ( B u rger et al., 1989;Zhang 72 & Hill, 2002;Zhang et al., 2002Zhang et al., , 2004, namely by extending to multiple traits, such that 73 'apparent stabilizing selection' is generated through pleiotropic effects. Pure pleiotropy and joint 74 effects models have been studied, as well as other extensions to include factors such as 75 dominance or balancing selection. Nonetheless, these models have yet to provide a sufficient 76 R e s u l t s 155 Pure Spatial Homogeneity, Set 1 (θ 1 = θ 2 = 1) 156 We first describe patterns in the homogeneous environment, which acts as a control 157 demonstrating the effect of finite population MSB processes in a two-patch model, to study the 158 effects of mutation rate (μ), selection strength (V S ), and population size (N), on the maintenance 159 of variation ( Figure 1A). As expected from quantitative genetic theory (Falconer and Mackay 160 1996), variance was somewhat higher in larger populations, although the difference is minimal 161 under most parameters ( Figure S1). Also following expectations, V A increases with μ , with the 162 effect of mutation rate approximately linear, and independent of selection. Similarly, V A 163 increases with V S (selection weakens), which qualitatively matches predictions of Turelli (1984), 164 Burger et al. (1989, and as observed by Yeaman and Guillaume (2009), although some 165 deviations occur due to differences in the mutation model. While from Figure 1 there appears to 166 be an interaction between selection and mutation rate, the relative effect of a ten-fold increase in 167 mutation (from μ = 10 -5 to μ = 10 -4 ) is similar for V S = 2 versus V S = 10 (V A changes by a factor 168 of 8.87 vs 8.59 for V S = 2 vs V S = 10; m = 0.1). 169 Notably, these patterns are qualitatively independent of N ( Figure S1), and so the results below 170 are presented for the subset of N = 1000 only. There is minimal response to rate of migration 171 between the two patches, except for a small increase at low to intermediate m, which is consistent 172 with predictions of Goldstein and Holsinger (1992), and arises due to an interaction between 173 effects of genetic redundancy and genetic drift (hereafter, referred to as the "redundancy-drift 174 effect"; see Discussion). 175 Pure Spatial Heterogeneity, Set 2 (θ 1 = +1; θ 2 = -1) 176 With spatial environmental heterogeneity, two key differences arise when compared to the 177 control case (Set 1): an overall large increase in the amount of V A maintained, and the appearance 178 of a threshold at high migration, above which the divergence among populations and amount of 179 V A maintained both decrease (which is qualitatively consistent with the critical migration 180 threshold predicted by Yeaman and Otto, 2011). While the migration rate at which V A peaks 181 varies as a function of V S , the maximum V A reached under these parameters (≅ 0.32) is similar 182 among levels of V S . Given the number of loci, mutation effect sizes, and mutation rates that we 183 used, spatial environmental heterogeneity generates up to a ~10-fold increase in V A compared to 184 the homogenous case (for the same parameters, V A [homogeneous] = 0.033), as shown in Figure  185 1B. This demonstrates that the redundancy-drift effect described previously by Goldstein and 186 Holsinger (1992) is small relative to the effect of migration and spatial heterogeneity, at least 187 under these conditions. Additionally, the effect of mutation becomes relatively less pronounced 188 under spatial heterogeneity, as most of the variation is maintained by migration in this case. 189 Whereas a ten-fold change in μ (from μ = 10 -5 to μ = 10 -4 ) results in an increase by a factor of 190 8.87 in Figure 1A, this increase only changes V A by a factor of 1.09 under spatial heterogeneity 191 (V S = 2, m = 0.1). 192 Pure Temporal heterogeneity, Set 3 (θ t,1 = θ t,2 ) 193 In contrast to the case with spatial heterogeneity, under a temporally fluctuating optimum there is 194 in general little change in V A in comparison to Set 1, the homogenous case. As shown in Figure  195 2A, a similar qualitative pattern is observed in response to migration rate, with an increase at low 196 to intermediate m and rapid drop with m > 0.001, akin to the redundancy-drift effect observed in 197 the homogeneous case. The resulting maximum peak in V A however is approximately 45% 198 greater than the homogeneous case (V S = 5, μ = 10 -4 , m = 2.5x10 -4 ), suggesting that temporal 199 heterogeneity can only marginally increase the importance of this effect. 200 Multiple lengths for the temporal oscillation of the environment (denoted as P) were simulated to 201 test the influence of the period length on the qualitative patterns observed, and complex 202 interactions were found ( Figure S2.1). Setting P ≤ 100 resulted in a response to migration similar 203 to the homogeneous case and an overall increase in V A with μ , as expected. However, the 204 behavior for P = 100 deviates under very strong selection (V S = 2), with the redundancy-drift 205 effect causing a peak in V A at lower migration rates than for shorter periodicities (see Figure 2A  206 and S2.1). Across changes in all parameter combinations, the redundancy-drift effect causes a 207 maximum increase in V A of 28% relative to the V A maintained under m = 0.1, which was 208 observed for P = 100 and V S = 10. Under a very long cycle (P = 1000 generations), there is 209 minimal response to migration rate (i.e., no effect of redundancy-drift) and very little increase in 210 variance when μ = 10 -6 relative to simulations with shorter periods, but much more increase in 211 variance under long periods at μ = 10 -4 (see Figure S2.1; V A under P = 1000 is 1.3 ~ 2.5-fold 212 higher than P = 100 under μ = 10 -4 ). The degree of increase in V A under temporal heterogeneity 213 relative to homogeneity therefore depends on mutation, period, and selection, but is typically 214 considerably less than under spatial heterogeneity with intermediate migration. 215

Combination of Spatial and Temporal heterogeneity, Set 4 216
For spatial and temporal heterogeneity combined (θ t,1 = θ t,2 + 2; see SI for illustration), the 217 amount of V A and patterns that arise are qualitatively similar to those in the case of pure spatial 218 heterogeneity. Similar threshold behavior is observed, and similar levels of V A are maintained, 219 peaking between migration rates of 0.01 ~ 0.1, again depending on V S ( Figure 2B). Under this 220 form of combined spatial and temporal heterogeneity, there is a max 31% increase in V A 221 compared to pure spatial heterogeneity (at V S = 5, μ = 10 -4 , m = 0.001), which is very small when 222 compared to the 9.75-fold difference in V A for the pure spatial heterogeneity vs. homogeneity 223 under the same parameters V S and μ . 224 In contrast, the complex patterns involving periodicity described above that were observed in the 225 purely temporal set do not appear to carry over under combined spatial and temporal 226 heterogeneity, likely because they are swamped out by the relatively larger effect of migration, 227 which is relatively consistent across different period lengths (see S2.2). There is an overall 228 increase in V A maintained, which appears more similar to the case of pure spatial heterogeneity 229 than pure temporal heterogeneity or homogeneity. This result is perhaps unexpected, as spatial 230 and temporal effects of heterogeneity might be expected to work synergistically. 231 Figure 3 shows a comparison amongst the six different patterns of environmental variation for a 233 subset of the total parameter combinations used. For a given parameter set, there is a general 234 trend of increased V A maintained with an increase in heterogeneity, particularly with the degree 235 of spatial heterogeneity between patches. Temporal heterogeneity on its own does little to 236 increase V A compared to the homogeneous set. When temporal heterogeneity is combined with 237 spatial heterogeneity, the increase in V A is most pronounced in sets where the temporal 238 fluctuations in the two patches are most out of phase ( Figure 3; see SI for relation between θ t,1 and 239 θ t,2 ). Of the three combinations investigated, subset 4B -where θ t,1 ≠ θ t,2 for any t -maintains a 240 considerable higher level of V A compared to subset 4A, which in turn maintained more V A than 241 in 4C, which was most in phase between patches. This further supports the emphasis on spatial 242 heterogeneity over temporal heterogeneity as a stronger force for maintaining variation. 243 To explore whether environmental heterogeneity could yield more realistic ratios of V A / V S (ie, 244 0.04 < V A / V S < 0.12), we compared values for this ratio under homogeneous vs. heterogeneous 245 simulation scenarios with all other parameters held constant (Figure 4). Without any spatial 246 heterogeneity, none of the parameter combinations result in V A that falls within this range for 247 expected typical values (see S4, S5 for simulation sets that did not fall into range). However, with 248 spatial heterogeneity, the combination of high migration (m For a range of migration rates (dependent on V S and μ ) spatial environmental heterogeneity 266 substantially increases the variation a population can maintain. However, migration alone, even 267 between two environmentally similar patches, can increase the variation maintained at very low 268 rates of migration. This result is due to the interaction of genetic redundancy and the stochastic 269 force of genetic drift, as described by Goldstein and Holsinger (1992) sufficiently connected such that new genotypes can be introduced by migration at a high enough 281 rate to affect genetic variance, but not overly connected such that drift is no longer acting 282 (relatively) independently in each patch. This genetic redundancy-drift effect is evident in the 283 control case, where there is an observable -albeit minimal -response to low migration, which 284 persists regardless of population size, rate of mutation, or strength of selection. This effect is 285 somewhat increased under temporal heterogeneity, resulting in larger peaks over the same range 286 of low dispersal as the homogeneous case, particularly under high mutation and strong selection 287 (μ = 0.0001, V S = 2). The increase of this effect may be due to the temporally variable selection 288 within each patch causing different alleles to change in frequency more rapidly than migration 289 homogenizes differences, resulting in a larger pool of differentiated loci than would be generated 290 Nonetheless, this redundancy-drift effect is small relative to the effect of spatial heterogeneity. In 293 this case, selection acts to pull each subpopulation towards its optima, resulting in different allele 294 frequencies within in each patch. With migration, the alleles that would otherwise be purged by 295 selection are persistently reintroduced into the other patch, increasing the V A maintained in the 296 population. However, there is a threshold behavior to this system -a critical migration rate 297 beyond which polymorphism can no longer be maintained (Felsenstein 1976;Yeaman and Otto 298 2001). Past this point, the allele frequencies of each patch are similar enough that migrants are 299 much less likely to re-introduce novel/differentiated alleles, and the effect of migration on V A is 300 reduced. 301 302 Most populations experience some form of temporally variable selection, be it seasonal 303 fluctuations or some otherwise unstable environment. Due to this, there has been much interest 304 around fluctuating selection and its effect(s) on the maintenance of variation (Kondrashov & 305 Yampolsky, 1996;Bürger & Gimelfarb, 2002;Siepielski et al., 2009;Morrissey & Hadfield, 306 2012;Gulisija & Kim, 2015). Difficulties arise when attempting to draw comparisons across such 307 studies due to a variety of different assumptions, such as the way in which the moving optimum 308 is modelled and the nature of the underlying genetic architecture. A simple sinusoidal wave is 309 generally used, but varying parameters of this function (amplitude, center, periodicity) appear to 310 have complex interactions with each other as well as other parameters such as mutation rate and 311 genetic redundancy. 312 Burger and Gimelfarb (2002) found that in a diallelic multi-locus model with recurrent mutation, 313 the relative genetic variance increased with period length (typically ≥ 24 generations), but the 314 magnitude of change and pattern of response to period depended on the amplitude of the 315 oscillations. They observed the highest levels of genetic variance when the amplitude was set 316 such that the optimum cycled between the most extreme possible genotypes, and that greater 317 amplitudes, where the optimum fell beyond the maximum possible genotypic values, resulted in a 318 decline in V A at high period lengths (≥ 52 generations). However, given the small number of loci 319 (2 -6) and scaling of allelic effects, this represents a scenario with highly non-redundant genetics. 320 By contrast, Kondrashov and Yampolsky (1996)  by Kondrashov and Yampolsky (1996) constitute much stronger selection than normally 335 observed. Thus, while they found that temporal heterogeneity could maintain much more 336 variance than observed in our simulations, this was likely a consequence of drastic changes in 337 fitness causing the underlying alleles to cycle rapidly in frequency, with increases in V A as the 338 alleles reached intermediate frequency. Thus, the capacity for temporal heterogeneity to maintain 339 substantial amounts of variance seems to depend upon either very extreme environmental 340 fluctuations or a narrow range of rapid fluctuations with strict assumptions on the amount of 341 genetic redundancy available. As empirical data become available for quantitative analyses of a 342 population undergoing temporal variation, this will provide a realistic standard set of parameter 343 values to be used in theory studies, allowing for a more quantitative assessment of the likely 344 effects of temporal fluctuations, and further evaluation of the relative importance of the above 345 models. 346 347 Spatial v. Temporal v. Combined heterogeneity 348 Interestingly, it appears that spatial heterogeneity has a much stronger effect than temporal, both 349 in isolation and when combined. This is evidenced by the difference in magnitude of V A 350 maintained in Sets 2 vs. 3 (pure spatial and pure temporal heterogeneity, respectively), as well as heterogeneous conditions, and examined the degree to which heterogeneity increased this ratio. 372 The shift towards this window for simulations with spatial heterogeneity suggests that this is a 373 viable mechanism for the maintenance of variation, but is limited to populations experiencing 374 strong selection and high migration rates. 375 The results presented in Figure 4 are intended to illustrate how the maintenance of realistic levels 376 of V A is possible under environmental heterogeneity, rather than to represent the full span of 377 parameter combinations that result in realistic behavior. Little is known about the true values for 378 many parameters involved, and changes in some values, such as the number of loci, are likely to 379 alter the quantitative results. Instead, these results illustrate that for a given set of assumptions 380 about the genetic basis of a trait, considerably more variation can be maintained by migration-381 selection balance than mutation-selection balance. Moreover, the magnitude of this change can be 382 sufficient to explain empirical observations in some cases, indicating environmental 383 heterogeneity is a viable mechanism to consider for the maintenance of variation. However, we 384 stress that given the sensitivity of this result to the balance between selection and migration and 385 the narrowness of the region that maintains significantly more variance, it seems unlikely that 386 environmental heterogeneity will generally resolve the "paradox of variation". 387

Limitations 388
A major limitation to the interpretation of this study is the difficulty in comparing results to 389 established analytic models due to differences in the mutation scheme. The majority of theory 390 and analytic models assume either a single (or few with equal effect) diallelic loci, or quantitative 391 traits with a normally distributed continuum of alleles. Our simulations attempt to combine the 392 two -varying allelic effects across loci (a i ), but with a single |a| per locus, and a Gamma 393 distribution of possible values a i . We chose this approach to reflect a (potentially) more 394 biologically realistic scheme of mutation. For example, evidence suggests that the distribution of 395 allelic effects at a locus is more likely to follow a gamma rather than a normal distribution, with a 396 small number of loci of large effect and many loci of small effect (Orr, 2003). However, this 397 leads to a less straight-forward parameterization of V m or α 2 , which causes challenges when 398 attempting analytical comparisons. 399 Another limiting factor is the difficulty of obtaining measurements of migration rates in natural 400 populations, which is necessary to understand if the range where this effect of environmental 401 heterogeneity occurs is realistic. Evidence of migration maintaining quantitative genetic variation 402 in spatially heterogeneous environments was found in lodgepole pine (Yeaman & Jarvis, 2006), 403 but genomic signatures at individual loci may be difficult to detect. Though we did not 404 specifically quantify this here, presumably there is some effect on heterozygosity, though such 405 effects could be difficult to detect in empirical data. 406

Implications 407
We have shown that spatial environmental heterogeneity and migration has the potential to 408 increase genetic variance substantially, while temporal heterogeneity has a much more modest 409 effect. As local adaptation is common (Hereford, 2009) legitimate] concerns about reduced diversity, which is often used as a proxy measure for the 420 'health' of a population of concern in conservation biology. Here, we showed that a 10-fold 421 change in population size had less effect on V A than a reduction in migration, which suggests that 422 migration may also be an important factor to consider when assessing and managing the genetic 423 health of populations of concern. Indeed, the related concepts of gene flow and connectivity have 424 been previously suggested for consideration and implementation in conservation management, 425 albeit with some contention, as too much migration can result in a reduction of variability (Weeks 426 et al., 2011;Urban et al., 2012;Cook & Sgrò, 2017). Long-term maintenance of evolvability may 427 depend in some cases on an intact and interconnected environment, rather than endogenous 428 generation of variation (mutation-selection), which would depend less on environment. 429 Therefore, a move towards improving connectivity between subpopulations and protection 430 against habitat fragmentation may be a pertinent consideration for the maintenance of 431 quantitative genetic variation, in addition to the well-recognized problems of reducing inbreeding 432 and mitigating demographic fluctuations. 433

434
In conclusion, our results demonstrate a clear effect of migration on the maintenance of variation 435 in all four investigated scenarios, and highlight the potential for environmental heterogeneity to 436 substantially increase V A . However, it seems unlikely that a single mechanism best explains the 437 maintenance of variation that we see in nature; rather the many concepts put forth over the past 438 decades may be viable explanations in only certain scenarios, just as has been demonstrated here 439 with environmental heterogeneity. Further investigation into potential mechanisms, particularly 440 into those scenarios under which they succeed and fail, and how biologically relevant successful 441 scenarios may be, is needed in order to get a complete picture of the fundamental process that is 442 the maintenance of variation. 443 L i t e r a t u r e Barton, N. & Turelli, M. 1989. Evolutionary Quantitative Genetics: How Little Do We Know. 444 Annu. Rev. Genet. 23: 337-370. 445 Barton, N.H. 1986. The maintenance of polygenic variation through a balance between mutation 446 and stabilizing selection. Genet.  Benson, J. Weeks, A.R., Sgro, C.M., Young, A.G., Frankham, R., Mitchell, N.J., Miller, K.A., et al. 2011. 508 Assessing the benefits and risks of translocations in changing environments: A genetic 509