Quantifying natural seasonal variation in mutation parameters with mutation accumulation lines

Abstract Mutations create novel genetic variants, but their contribution to variation in fitness and other phenotypes may depend on environmental conditions. Furthermore, natural environments may be highly heterogeneous. We assessed phenotypes associated with survival and reproductive success in over 30,000 plants representing 100 mutation accumulation lines of Arabidopsis thaliana across four temporal environments at a single field site. In each of the four assays, environmental variance was substantially larger than mutational variance. For some traits, whether mutational variance was significant varied between seasons. The founder genotype had mean trait values near the mean of the distribution of the mutation accumulation lines in all field experiments. New mutations also contributed more phenotypic variation than would be predicted, given phenotypic and sequence‐level divergence among natural populations of A. thaliana. The combination of large environmental variance with a mean effect of mutation near zero suggests that mutations could contribute substantially to standing genetic variation.

Arabidopsis thaliana, they are maintained with the minimum effective population size of one. Consequently, there is minimal bias in the fixation of deleterious versus beneficial mutations within the lines (Lynch & Walsh, 1998). The amount of among-line variance due to genetic sources can be used to estimate the mutation rate for a phenotype, and by assaying the premutation founder simultaneously with the MA lines, one can estimate the distribution of mutation effects and whether they increase or decrease the trait value relative to the founder phenotype (e.g., Bataillon, 2000; Heilbron, Toll-Riera, Kojadinovic, & MacLean, 2014; Lynch et al., 1999; MacKenzie, Saadé, Le, Bureau, & Schoen, 2005; Morgan, Ness, Keightley, & Colegrave, 2014; Rutter et al., 2010; Shaw, Geyer, & Shaw, 2002). It is typically assumed that the vast majority of mutations that affect fitness will decrease fitness relative to the founder (Keightley & Lynch, 2003).
To date, most studies utilizing the MA line approach have been conducted under controlled or laboratory conditions. Despite the elegance of these laboratory studies, many mutations with fitness consequences may be cryptic when factors that typically harm an organism, such as disease or predation, are minimized in controlled conditions. Controlled experiments also imply that environmental conditions are uniform and that their effects are minimized, perhaps leading to overestimates of mutation effects on the phenotype relative to environmental effects. Controlled experiments will also fail to detect season-to-season or site-to-site variation as sources of environmental effects that influence mutation parameters. There is some evidence suggesting that season-to-season variation is an important component of balancing selection (Delph & Kelly, 2014; Schemske & Bierzychudek, 2001) and may be involved in the maintenance of genetic variation and the role of mutations thereof (Connallon & Clark, 2015; Haldane & Jayakar, 1963).
Previously, mutation parameters were quantified in one field and one greenhouse planting for 100 A. thaliana MA lines developed by Shaw et al. (2002) from a Columbia founder (Rutter et al., 2010).
The fates of five of these same 100 MA lines were described from multiple field environments (Rutter et al., 2012). These five lines were also sequenced. Based on the MA line variance and performance relative to the founder, we quantified a high rate of beneficial mutations and a high genomic mutation rate for fitness (haploid mutation rate = 0.12). A large contribution of environment to phenotype resulted in a low estimate of the contribution of mutation to standing genetic variation for fitness ($h^2_m$ = 0.0001) (Rutter et al., 2010). A significant genotype × environment interaction was found for the five MA lines (Rutter et al., 2012). For 50 of these lines grown across two spatial environments, significant variance genotype × environment interactions were also found (Roles et al., 2016).
Here, we extend the previous findings of Rutter et al. (2010, 2012) using the same 100 A. thaliana MA lines, with phenotypes measured across a temporal environmental scale. We present the mutational variances and the distribution of mutation effects across four natural field environments, for fitness and fitness components. We ask: (1) Is the observation of many mutations that are not deleterious found consistently across replicated experiments? (2) How do mutation effects scale to environmental variation? We planted our experiments in two fall and two spring seasons in a nearly identical location to test the effects of mutations. Our planting times correspond to the two life histories exhibited by A. thaliana in its native range: winter annual (fall planting) and spring ephemeral (spring planting).
Here, we focus on variance G × E, that is, on whether the mutation parameters that describe genetic variation introduced by mutation vary among seasons.

| METHODS
The 100 lines of A. thaliana used in these experiments were the products of 25 generations of mutation accumulation. Methods for generating the lines are described previously (Rutter et al., 2010, 2012; Shaw et al., 2002). Briefly, all lines were derived from a single progenitor of the Columbia accession (the progenitor individual differs from the sequenced Col-0 line by a few mutations; Ossowski et al., 2010). Six sublines were created from the progenitor line and were kept in cold storage (4.5°C). Individual MA lines were propagated by choosing a single seed at random to found the next generation. This procedure minimizes the potential for selection for or against mutations during the accumulation period. The lines were propagated for 24 generations at the University of Minnesota and for a 25th generation at the University of Maryland.
After mutation accumulation, field sublines were created for all 100 MA lines, along with field sublines of each progenitor subline.
All field sublines were grown simultaneously in a greenhouse at the University of Maryland. The field subline plants grown in this single event produced seed that were used in all subsequent field experiments. Multiple field sublines were used for each MA line and for the progenitor line to dilute the idiosyncrasies of environmental effects on seed production by a particular maternal plant. Thus, seed from all lines in an experiment was of the same age and from maternal plants grown in the same environment.
Field experiments were carried out in spring 2004, fall 2004, spring 2005, and fall 2005, with the following previously described methods (Rutter et al., 2010, 2012). Before each experiment, seeds were placed on moist potting soil in groups of 20 per germination pot, cold-treated at 4°C, and moved to a greenhouse for germination. Ten days after removal from the cold, germination was assessed in each pot, and seedlings were transplanted from germination pots
Three to five field sublines represented each MA line, and six field sublines were used for each of the six progenitor sublines. However, due to variability in seed production across maternal plants, field subline identity for the MA lines was not identical across experiments, although multiple field sublines represented each MA line in every experiment. This set of 7,504 plants (7,000 MA plants and 504 progenitor plants) was used as a basic framework across all four field experiments. As described below, some additional Col-0 plants were also grown in two experiments.
At approximately 15 days after plants had been removed from the cold, they were transplanted into a field site at Blandy Experimental Farm, located at Boyce, Virginia, USA (39°03′45.1″N, 78°03′30.5″W). At this point, seedlings typically had two to four true leaves and were robust to transplant shock. To simulate the early successional habitat of A. thaliana, the field sites were treated with herbicide 3 months before planting and were plowed 3-6 weeks be-
Biomass, counts of total full and aborted fruits (measuring less than 5 mm), and a sample of fruit size were taken from dried plant material from each plant. The fall 2004 experiment was an exception in which fewer data were collected, as described below. We report analyses on these measures as well as on a cumulative fitness measure that combines survival and full fruits produced by counting any plant that did not survive to reproduce as having zero fitness. In the fall 2004 experiment, the plants were very large and overall fruit numbers were extremely high. In this case, we found that fruit number was highly correlated with biomass ($r^2$ = .96) in a randomly selected subset of the plants (Rutter et al., 2012), and we thus estimated fruit number from biomass for the fall 2004 planting rather than measuring fruit number directly.
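The cumulative fitness measure described above can be illustrated with a minimal sketch. The function name and the example records are hypothetical and are not data from the study; the only rule taken from the text is that a plant that did not survive to reproduce is assigned zero fitness.

```python
def cumulative_fitness(survived_to_reproduce, full_fruit_count):
    """Return the full fruit count for survivors and zero otherwise."""
    return full_fruit_count if survived_to_reproduce else 0


# Hypothetical records: (survived to reproduce, full fruits counted)
plants = [(True, 42), (False, 0), (True, 7), (False, 3)]

fitness_values = [cumulative_fitness(s, f) for s, f in plants]
mean_fitness = sum(fitness_values) / len(fitness_values)
```

Because non-survivors enter the mean as zeros, this measure reflects both survival and fecundity in a single value per plant.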

| Statistical analysis
In this study, mutational variance is estimated by the among-line variance of the MA lines. We used a mixed-modeling approach to estimate and test the significance of mutational variance. Due to unequal sampling of progenitor lines and MA lines, we performed univariate analyses to estimate variance on the MA lines alone.
Similarly, we analyzed each experiment separately. All analyses were
The full model for a given experiment included the random effect of block, the random effect of line, the random effect of field subline nested within line, and residual error. For each response variable, we evaluated which explanatory variables to include in the final MCMCglmm model with a parametric bootstrap using the maximum-likelihood fit; this approach takes advantage of the robust estimates from MCMCglmm while accounting for the instability of such models when using a prior for random effects with low variance. Model fitting of effects allowed us to (1) evaluate the importance of covariates (such as field subline or block) and (2) identify and address instability introduced by some explanatory variables. First, we constructed a series of glmer model pairs with one model including and the other excluding an explanatory variable and we calculated the observed likelihood ratio of the deviance for each model pair. Next, we ran 1,000 replicates of the parametric bootstrap, fitting the simulated data to the model pair and constructing the likelihood ratio of the deviance for each replicate. Finally, we calculated the probability of our observed likelihood ratio as the fraction of replicates in which the simulated likelihood ratio was larger than the observed likelihood ratio. If this p-value was below .05, then we considered the focal variable to have significant explanatory power and retained the variable for the MCMCglmm analysis. Having determined the explanatory variables to retain in the model for each response variable, we then ran that model using MCMCglmm and used the results to estimate our parameters of interest. We used MCMCglmm parameter estimates because these models produce more robust estimates of the random effects for non-Gaussian data (Hadfield, 2010).
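The logic of the parametric bootstrap described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis used in the study: in place of a pair of glmer mixed models it uses a simple Gaussian model pair with closed-form fits (a grand mean only versus one mean per group), and all function names are hypothetical. The steps are the same: compute the observed deviance difference, simulate data from the fitted reduced model, refit both models to each simulated data set, and report the fraction of replicates whose statistic exceeds the observed one.

```python
import numpy as np


def gaussian_deviance(y, fitted):
    """-2 log-likelihood (up to a constant) at the ML variance estimate."""
    n = y.size
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n)


def lr_statistic(y, groups):
    """Deviance of the reduced model minus deviance of the full model."""
    reduced_fit = np.full(y.shape, y.mean())
    _, inverse = np.unique(groups, return_inverse=True)
    group_means = np.bincount(inverse, weights=y) / np.bincount(inverse)
    full_fit = group_means[inverse]
    return gaussian_deviance(y, reduced_fit) - gaussian_deviance(y, full_fit)


def parametric_bootstrap_p(y, groups, n_reps=1000, seed=0):
    """Return (observed statistic, bootstrap p-value) for the group term."""
    rng = np.random.default_rng(seed)
    observed = lr_statistic(y, groups)
    # Simulate from the fitted reduced model (ML mean and standard deviation).
    mu, sigma = y.mean(), y.std()
    simulated = np.array([
        lr_statistic(rng.normal(mu, sigma, size=y.size), groups)
        for _ in range(n_reps)
    ])
    # Fraction of replicates with a larger statistic than the observed one.
    return observed, float(np.mean(simulated > observed))
```

Under this scheme, a term would be retained when the returned p-value falls below .05, mirroring the retention rule stated in the text.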
For each response variable, the per-generation increase in genetic variance due to mutation (mutational variance, $V_m$) was calculated as $V_m = V_l / (2t)$, where $V_l$ is the among-line variance and $t$ is the number of generations of divergence (Lynch & Walsh, 1998). Mutational heritability ($h^2_m$), which is the rate of increase in heritability due to new mutations, was calculated as $h^2_m = V_m / V_e$, where $V_e$ is the environmental variance (Houle, Morikawa, & Lynch, 1996).
We also calculated the mutational coefficient of genetic variation $CV_m$, which is scaled to the mean by computing $CV_m = 100\sqrt{V_m}/\bar{x}$, where $\bar{x}$ is the trait mean for Gaussian response variables (Lande, 1975; Lynch & Walsh, 1998). For Poisson variables, $CV_m$ was calculated as
T tests involving Columbia were restricted to the two plantings for which this accession was included.
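As a worked illustration of these quantities for a Gaussian trait, the sketch below assumes the standard forms $V_m = V_l/(2t)$, $h^2_m = V_m/V_e$, and $CV_m = 100\sqrt{V_m}/\bar{x}$ (Lynch & Walsh, 1998). The function name and the input values are hypothetical and are not estimates from this study; the Poisson case is not covered.

```python
import math


def mutation_parameters(v_line, v_env, trait_mean, generations):
    """Return (V_m, h2_m, CV_m) for a Gaussian trait.

    Assumes V_m = V_l / (2t), h2_m = V_m / V_e, CV_m = 100 * sqrt(V_m) / mean.
    """
    v_m = v_line / (2 * generations)
    h2_m = v_m / v_env
    cv_m = 100 * math.sqrt(v_m) / trait_mean
    return v_m, h2_m, cv_m


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
v_m, h2_m, cv_m = mutation_parameters(
    v_line=0.5, v_env=100.0, trait_mean=20.0, generations=25
)
```

With these made-up inputs, the among-line variance of 0.5 is divided by 50, so $V_m$ is 0.01, and dividing by an environmental variance of 100 gives an $h^2_m$ of about $10^{-4}$.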

| RESULTS
The four planting seasons differed substantially in plant performance. In both spring seasons, adult reproductive plants were small.

| Distribution of mutational effects
There was never a significant difference between founder trait values and the overall mean trait values of the MA lines in any trait in any season. Similar means between the founder and mutant lines are not surprising for the cases for which there was no detectable mutational variance. However, when mutational variance was present, we still found no evidence of a decline in overall mean fitness. These findings were true for traits that are direct components of fitness, such as fruit production and survival. Similar results have been found previously for these mutation accumulation lines (Roles et al., 2016; Rutter et al., 2010; Shaw et al., 2002) and independently derived MA lines from the same founder (MacKenzie et al., 2005). In total, these studies represent a collection of nine assays of lines derived from this founder: six in the field (across two sites and four seasons) and three in greenhouse conditions. Clearly, the similar performance of MA lines and founders is not an idiosyncrasy of a single assay.
As another measure of wild-type performance, Columbia lines that were obtained directly from the Arabidopsis stock center actually had lower trait values for fitness in the field than either the founder line or the MA lines, primarily due to lower survivorship. However, because these plants originated from seed directly provided by the Arabidopsis Biological Resource Center, phenotypic differences might reflect effects of seed age or the seed production environment. The Columbia seeds representing the founder, on the other hand, were generated at the same time as the MA line seeds. In addition, although the founder was generated from the Columbia accession, it does carry several mutations that differ from the Columbia reference (Ossowski et al., 2010). Our findings are an exception to the widely held assumption that nearly all mutations are deleterious (Bataillon, 2003; Keightley & Lynch, 2003). There are other reports of high frequencies of beneficial mutations that are either induced or spontaneous (Hall, Mahmoudizad, Hurd, & Joseph, 2008; Perfeito, Fernandes, Mota, & Gordo, 2007; Schaack, Allen, Latta, Morgan, & Lynch, 2013; Zhang, Azad, & Woodruff, 2011).
Given that our founder genotype originates from Northern Germany and has experienced a series of inbreeding events throughout its laboratory cultivation, and our experiments were conducted in the novel environment of the Virginia Blue Ridge, our observation of a lack of an overall deleterious effect of mutation may be consistent with Fisher's (1930) geometric model. In this case, the assayed population may be far from its optimum in the new environment, increasing the likelihood that a mutation will be beneficial, as has been observed with experimental studies of microorganisms (Burch & Chao, 1999; Khan, Dinh, Schneider, Lenski, & Cooper, 2011; Kryazhimskiy, Rice, Jerison, & Desai, 2014; MacLean, Perron, & Gardner, 2010; Perfeito, Sousa, Bataillon, & Gordo, 2014) and in one field study with A. thaliana (Stearns & Fenster, 2016). It is also possible that there was within-plant selection during the propagation of the mutation accumulation lines (Otto & Orive, 1995). However, new explanations may be required to describe the conditions that lead to the phenomenon of a symmetric distribution of mutational effects.

| Variance G × E
Given that the MA lines shared nearly identical sets of mutations across all of the assay environments, differences in MA line variance are likely due to the different expression of mutations in each assay environment (Latta et al., 2015); for example, epigenetic differences between assays or between MA lines may explain some MA line differences (Jiang et al., 2014). As mutational variance approaches zero, there is decreasing potential for selection to act on the new mutations. Even mutations that are deleterious in some environments are likely to be neutral in environments with very small mutational variance. If environments in which mutations have little effect are common, standing genetic variation could be maintained even when there is strong phenotypic selection.
The contrast between the contributions of mutation and environment to phenotypic variation for the traits in our experiment, including fitness, is striking. Environmental effects typically produced three to four orders of magnitude more variation than mutation did. In this context, mutation effects are very small and mutations are more likely to be maintained within the population as standing genetic variation (Charlesworth & Charlesworth, 2010).
Seasonal variation in environmental variances was much more evident for some traits than others. For example, for biomass, fruit number, the number of flowers, and the number of aborted fruit, the environmental variance differed by over an order of magnitude across seasons and by several orders of magnitude in some cases.
However, for fruit length, seasonal differences in environmental variance were much smaller. Such varied response across traits likely reflects differences in the environmental contribution to trait values.
Fruit length may be little influenced by environmental quality, while total fruit number may depend substantially on the quality of the environment. Trait means may be a useful proxy for describing environmental quality (e.g., higher mean fruit number would be expected in a higher quality or low stress environment), but changes in trait means did not clearly match changes in $h^2_m$, average performance in MA lines, or mutational variances. Our finding is consistent with other broad surveys that indicate that environmental stresses do not change the strength of selection on new mutations in a consistent fashion, that is, by making new mutations more deleterious or more beneficial on average (Agrawal & Whitlock, 2010; Martin & Lenormand, 2006). Poor environments may or may not allow selection to discriminate among new mutations, although there is evidence that the most stressful environments may magnify mutation effects (Kraemer et al., 2016).
Similarly, larger environmental variance may or may not prevent selection from acting on new mutations.

| Calibration of mutational variance with standing genetic variation
At the sequence level, mutation rates have been documented in an increasing number of organisms (e.g., Denver et al., 2012; Sung et al., 2012) and are typically about $1 \times 10^{-8}$ at the nucleotide level and about one new mutation at the gamete level. At the phenotype level, mutation rates vary from about 0.01 to 0.1 for each new gamete, with estimates of mutational heritability, $h^2_m$, ranging from $10^{-4}$ to $10^{-3}$ (Lynch et al., 1999). Comparing the mutational variance with the variance among populations in a species allows us to calibrate the amount of phenotypic variance generated each generation by mutation with an estimate of the total extant phenotypic variance generated over millions of years of evolution in a species (A. thaliana diverged from the closely related Arabidopsis lyrata ~10 million years ago; Hu et al., 2011). In this study, nearly all of our estimates of the per-generation contribution of mutations to heritable genetic variance scaled to environmental variation ($h^2_m$) are around $10^{-4}$. If we scale up our estimates of $V_g$ to 25 generations of MA, then mutations have contributed on the order of $2 \times 10^{-3}$ $V_g$ scaled to $V_e$ for fitness. In contrast, studies of accessions grown in field-like settings, or of RILs generated from an extreme cross of Italian and Swedish A. thaliana populations, that similarly measured fitness or fitness proxies found that $V_g$ scaled to $V_e$ is much higher, on the order of 0.05-0.1 (Ågren, Oakley, McKay, Lovell, & Schemske, 2013; Rutter & Fenster, 2007; Samis et al., 2012; Stearns & Fenster, 2016), as one might expect. However, to put this into the context of mutational contributions to fitness variance, after only 25 generations of MA, the 100 MA lines have diverged such that they have on the order of 25-fold less fitness variation than found in a survey of 21 worldwide A. thaliana accessions grown at a single site (Rutter & Fenster, 2007).
Presumably, the extant genetic variation among populations is attributable to a wider sampling of

TABLE 3 Untransformed means and 95% confidence intervals for the MA lines, founder lines, and Col-0 lines for all traits and experiments

(Mackay, Fry, Lyman, & Nuzhdin, 1994; Mackay, Lyman, & Jackson, 1992) was greater than observed in a worldwide survey of D. melanogaster populations (Capy, Pla, & David, 1993).
We can also scale the contribution of mutations to heritable genetic variation for fitness phenotypes to the amount of sequence variation found among our MA lines as well as among accessions. The amount of sequence variation among the five sequenced MA lines is roughly three orders of magnitude less than among 80 sequenced natural accessions when focused solely on the private mutations (21 mutations versus 22,000 private alleles; Ossowski et al., 2010). Thus, the mutations in our study scale up to express heritable phenotypic variation about 40-fold more than expected based on sequence differences alone (as above, phenotypic differences are about 25 times greater among accessions than among MA lines). Explanations for this observation include selective removal of deleterious mutations in native environments or epigenetic differences among the MA lines (Becker et al., 2011). Notably, similar differences between the amount of genetic and phenotypic variation were also found in a selection experiment with maize (Durand et al., 2015).
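The calibration arithmetic in this section can be retraced with the figures quoted in the text. The sketch below is one reading of how those figures combine, not a reproduction of the authors' calculation; the function and variable names are hypothetical, and the lower end of the 0.05-0.1 range is assumed for the accessions.

```python
def calibration_summary():
    """Retrace the fold differences using only figures quoted in the text."""
    h2_m_per_generation = 1e-4      # typical per-generation estimate
    generations = 25                # generations of mutation accumulation
    # Accumulated V_g / V_e; the text rounds this to "on the order of 2e-3".
    ma_ratio = h2_m_per_generation * generations
    accession_ratio = 0.05          # assumed: lower end of the 0.05-0.1 range

    # Phenotypic scale: accessions versus MA lines, using the rounded 2e-3.
    phenotypic_fold = accession_ratio / 2e-3

    # Sequence scale: private alleles in accessions versus MA mutations.
    sequence_fold = 22000 / 21

    # How much more phenotypic variation the mutations express than
    # sequence divergence alone would predict.
    excess_fold = sequence_fold / phenotypic_fold
    return ma_ratio, phenotypic_fold, sequence_fold, excess_fold


ma_ratio, phenotypic_fold, sequence_fold, excess_fold = calibration_summary()
```

On this reading, the roughly thousand-fold sequence difference divided by the 25-fold phenotypic difference gives a value close to 42, in line with the "about 40-fold" figure in the text.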

| CONCLUSION
Our replicated field studies consistently demonstrate that the mutation accumulation lines have significant variance for life history and fitness characters, but do not differ on average from the founder for these traits. Such a result suggests that there is not an inherent preponderance of deleterious mutations in the A. thaliana Columbia background. Furthermore, while we quantified a relatively high haploid whole genomic mutation rate of 0.12 for fitness in one of the plantings (Rutter et al., 2010), we consistently observe $h^2_m$ to be low, an order of magnitude lower than many studies conducted in the laboratory (Lynch & Walsh, 1998). The likely explanation is that the low $h^2_m$ quantified here reflects elevated $V_e$ under field conditions. Relatively high mutation rates for fitness in the context of large inputs of $V_e$ combined with a high frequency of beneficial mutations all suggest that mutations can contribute substantially to standing genetic variation for fitness.

ACKNOWLEDGMENTS
We thank R. Shaw for providing seeds from the MA lines; R. Strader, A. Royer, and R. Mobarec for help in the field at Blandy; J. Zerfass, T. Huebner, I. Khan, D. Tran, J. Shin, K. Agrawal, and S. Kasuba for help in data collection and greenhouse work. This project was supported by National Science Foundation Grants #0315972 and #1257902 to C.B.F. and #0307180 and #1258053 to M.T.R.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
MTR and CBF designed and conducted the experiments. MTR and AJR performed statistical analyses. MTR, AJR, and CBF wrote the manuscript.