FITNESS OF ARABIDOPSIS THALIANA MUTATION ACCUMULATION LINES WHOSE SPONTANEOUS MUTATIONS ARE KNOWN
Despite the fundamental importance of mutation to the evolutionary process, we have little knowledge of the direct consequences of specific spontaneous mutations to the fitness of the organism. Combining results of whole-genome sequencing with repeated field assays of survival and reproduction, we quantify the combined effects on fitness of spontaneous mutations identified in Arabidopsis thaliana. We demonstrate that the effects are beneficial, deleterious, or neutral depending on the environmental context. Some lines, bearing mutations disrupting known loci, differ strongly in fitness from the founder or premutation genotype. Those effects vary across environments; for example, a line with a major deletion spanning a transcription factor gene expressed lower fitness than the founder under most conditions but exceeded the founder's fitness in one environment. The large contribution of genotype by environment interaction (G × E) to mutation effects on fitness implies spatial and/or temporal variation in selection on new mutations and could contribute to the maintenance of standing genetic variation.
New mutations are the original source of all genetic variation. Mutation has a central role in evolutionary theory (Fisher 1930; Kondrashov 1988; Lynch et al. 1999; Johnson and Barton 2005; Martin and Lenormand 2006) and in conservation biology (Lande 1994; Lynch et al. 1995), but empirical challenges hamper the study of its contribution to adaptation and variation in natural populations. Although effects on the fitness of known mutations singly and in pairs have been evaluated for certain viruses (Sanjuan et al. 2004a,b), the cumulative effects of known spontaneous mutations on the fitness of multicellular organisms have never been directly quantified (Bataillon 2003).
Mutation accumulation (MA) lines, which are independent lines derived from a nearly homozygous founder, can be used to estimate both the rate of mutation and the distribution of their effects on traits and fitness (Mukai 1964; Bataillon 2000; Shaw et al. 2002). Shaw et al. (2002) and Rutter et al. (2010) used a set of MA lines derived from the reference Columbia accession of Arabidopsis thaliana to quantify the mutation rate and distribution of effects on reproductive components of fitness and on lifetime fitness (survivorship × reproduction), respectively. More recently, Ossowski et al. (2010) reported the complete genome sequences of five of the lines for which Rutter et al. (2010) obtained estimates of lifetime fitness in the greenhouse and in one field environment. Here, we present new performance results for five additional field environments; four of the five sequenced lines are represented in all five new environments whereas the fifth sequenced line's performance was measured in three of the five new environments. We synthesize the results obtained via statistical genetics and molecular approaches, respectively, to examine the dependence of the fitness of the five MA lines in different environments on sequence differences.
The methods of the sequence analysis of the five MA lines are presented in Ossowski et al. (2010), whereas the methods of the assessment of mutation rate and effect of mutations on fitness under the conditions of the one field experiment are presented in Rutter et al. (2010). Because results from only one of the six field trials are presented in Rutter et al. (2010), we briefly describe how the field tests were performed in all of the environments.
Using seed derived from the 25th generation of MA, incorporating sublines to account for maternal effects (five sublines per MA line), and employing a randomized block design (N = 14 blocks per experiment), we planted seedlings of 100 MA lines and the premutation genotype at the two-leaf stage, approximately two weeks post germination, into a secondary successional field at Blandy Experimental Farm (BEF) (39°N, 78°W), in the northern Blue Ridge Mountains of Virginia. The Columbia founder premutation genotype had been stored at 4°C while the MA lines were advanced. At the time when sublines of the MA lines were generated for these and other assays, sublines were also produced for the premutation founder by retrieving founder generation seed from cold storage and growing it for two generations. The five MA lines from this set that were sequenced by Ossowski et al. (2010) were included in our field tests. Seedlings were planted in the Spring and Fall of 2004 and again in the Spring and Fall of 2005. The Spring and Fall plantings allow the evaluation of mutation parameters for different life-histories of A. thaliana populations, depending on environment. The Spring-planting environment corresponds to a spring ephemeral life-history, where plants germinate and complete the life-cycle in the spring. The Fall-planting environment corresponds to a winter-annual life-history, with plants germinating in the fall, overwintering as rosettes, and flowering and fruiting in the ensuing spring. In separate experiments, the premutation founder and 50 MA lines, including four of the five sequenced MA lines, were planted as seeds in Fall of 2005, in replicate at BEF and the Kellogg Biological Station (KBS) (42°N, 85°W) in southern Michigan.
For all but one environment, the measure of plant fitness was lifetime fruit production, including zero values for plants that did not survive to reproduce. This is a commonly used measure in fitness assays and is especially appropriate for a predominantly selfing annual plant, although it does not take into account variation in the survival and germination of seeds in the seedbank (seed plantings) or variation in the survival and germination of seeds (seedling plantings). In the Fall-2004 planting, fruit number was estimated from biomass, as biomass was highly correlated with fruit number in a random subset of this dataset (r2 = 0.96) and overall fruit numbers were extremely high. We used aster modeling (Shaw et al. 2008) to compare the fitness of the MA lines and the founder within each of the environments and to quantify genotype by environment interaction (G×E) of performance of the lines across environments. The analysis of G×E was conducted with random block and subline effects. The comparison of the MA line performance with the founder line was performed without the use of random effects.
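The fitness measure described above can be illustrated with a minimal sketch. The plant records below are hypothetical values chosen only for illustration; they are not data from this study, and the sketch computes a simple arithmetic mean rather than the aster model used in the actual analysis. It shows that averaging lifetime fruit production with zeros for non-survivors is equivalent to multiplying survivorship by the mean fruit number of survivors.

```python
from statistics import mean

# Hypothetical records for one line in one environment (NOT study data):
# each tuple is (survived_to_reproduce, fruit_count).
plants = [(False, 0), (True, 12), (False, 0), (True, 30), (True, 18)]

# Lifetime fitness per plant: fruit count, with zero for plants that did not survive.
lifetime_fitness = [fruits if survived else 0 for survived, fruits in plants]
mean_fitness = mean(lifetime_fitness)

# The same quantity decomposed as survivorship x reproduction of survivors.
survivor_fruits = [fruits for survived, fruits in plants if survived]
survivorship = len(survivor_fruits) / len(plants)
mean_fruits_of_survivors = mean(survivor_fruits)

print("mean lifetime fitness:", mean_fitness)
print("survivorship x mean fruits of survivors:",
      survivorship * mean_fruits_of_survivors)
```

Including the zeros keeps mortality in the fitness estimate, so a line with many fruits per survivor but poor survival is not overrated.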
Results and Discussion
The five sequenced A. thaliana MA lines provided a point mutation rate estimate of 0.60 per haploid genome per generation (0.70 including indels; Ossowski et al. 2010), similar to direct sequencing studies of MA lines in Caenorhabditis elegans and Drosophila melanogaster (Denver et al. 2004; Haag-Liautard et al. 2007). The MA lines were sequenced at the 30th generation, and had on average 20 point mutations and 3 indel mutations. Assuming mutations accumulate linearly with generations, the 25th generation MA individuals, which we assayed for fitness, had on average 17 point mutations and 2.5 indel mutations. Because the 25th generation should harbor over 80% of mutations found at the 30th generation, our fitness estimates represent most mutations captured by sequencing. Rutter et al. (2010) estimated the diploid whole genomic mutation rate for fitness measured in the field as 0.23, similar to a previous study of a subset of the same lines when grown in the greenhouse (Shaw et al. 2002), and similar to the diploid mutation rate of nonsynonymous and indel mutations occurring in coding regions, 0.2, estimated by Ossowski et al. (2010). However, unlike Shaw et al. (2002), Rutter et al. (2010) did not detect MA line divergence in the greenhouse.
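The scaling from the sequenced generation to the assayed generation can be checked with a short calculation. This is a sketch of the arithmetic only, using the per-line averages quoted in the text and the stated assumption of linear accumulation; the variable and function names are illustrative.

```python
# Averages per MA line at the sequenced generation, as quoted in the text.
SEQUENCED_GENERATION = 30
ASSAYED_GENERATION = 25
POINT_MUTATIONS_GEN30 = 20
INDEL_MUTATIONS_GEN30 = 3


def scale_linearly(count_at_sequencing):
    """Expected count at the assayed generation, assuming linear accumulation."""
    return count_at_sequencing * ASSAYED_GENERATION / SEQUENCED_GENERATION


fraction_present = ASSAYED_GENERATION / SEQUENCED_GENERATION
point_mutations_gen25 = scale_linearly(POINT_MUTATIONS_GEN30)
indel_mutations_gen25 = scale_linearly(INDEL_MUTATIONS_GEN30)

print(f"fraction of generation-30 mutations expected at generation 25: {fraction_present:.2f}")
print(f"expected point mutations at generation 25: {point_mutations_gen25:.1f}")
print(f"expected indel mutations at generation 25: {indel_mutations_gen25:.1f}")
```

The fraction 25/30 is about 0.83, which is the basis for the statement that the assayed generation should carry over 80% of the sequenced mutations.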
Because many MA lines outperformed the premutation genotype in the field, Rutter et al. (2010) estimated a high proportion of these mutations, 43%, as enhancing fitness, similar to estimates for components of reproduction in Shaw et al. (2002; see also MacKenzie et al. 2005). Converting the 0.23 diploid mutation rate for fitness to a haploid estimate, 0.12, and comparing with the haploid sequence level point mutation rate, we deduce that 20% (0.12/0.60) of the mutations determined by sequencing have fitness effects; 9% (0.43 × 0.20) of the spontaneous mutations are beneficial, 11% (0.57 × 0.20) are deleterious, and the rest, 80%, are neutral or nearly neutral (Nes < 1, Wright 1931). Our estimate of the deleterious mutation rate of 11% or 0.07/haploid generation is less than the value of 0.14/haploid generation derived from sequence comparisons of A. thaliana and A. lyrata in Ossowski et al. (2010). One of many explanations for this difference is that estimates of the whole genomic deleterious rate using sequence comparison approaches assume mutations to be deleterious or neutral, whereas we detected a relatively high proportion of beneficial mutations.
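The proportions in the preceding paragraph follow from a few multiplications, sketched below. The inputs are the values quoted in the text; note that the haploid fitness mutation rate is entered as 0.12, the rounded value the text uses for half of 0.23, so the results match the reported percentages. Variable names are illustrative.

```python
# Values quoted in the text.
HAPLOID_FITNESS_MUTATION_RATE = 0.12   # half of the diploid rate 0.23, as rounded in the text
HAPLOID_POINT_MUTATION_RATE = 0.60     # per haploid genome per generation
PROPORTION_BENEFICIAL = 0.43           # among mutations with fitness effects

# Fraction of sequenced mutations inferred to affect fitness.
fraction_with_effect = HAPLOID_FITNESS_MUTATION_RATE / HAPLOID_POINT_MUTATION_RATE

# Partition of all spontaneous mutations.
fraction_beneficial = PROPORTION_BENEFICIAL * fraction_with_effect
fraction_deleterious = (1 - PROPORTION_BENEFICIAL) * fraction_with_effect
fraction_neutral = 1 - fraction_with_effect

# Deleterious mutation rate per haploid genome per generation.
deleterious_rate = fraction_deleterious * HAPLOID_POINT_MUTATION_RATE

print(f"with fitness effects: {fraction_with_effect:.0%}")
print(f"beneficial: {fraction_beneficial:.0%}")
print(f"deleterious: {fraction_deleterious:.0%}")
print(f"neutral or nearly neutral: {fraction_neutral:.0%}")
print(f"deleterious rate per haploid genome per generation: {deleterious_rate:.2f}")
```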
Supporting the existence of an appreciable proportion of beneficial mutations, the five sequenced MA lines frequently outperformed the founder in five additional field trials under an array of conditions (Table 1). Others have recently detected high rates of beneficial mutations, up to 15% in viruses (Burch and Chao 1999, 2000; Silander et al. 2007), up to 13% in yeast (Hall et al. 2008; Hall and Joseph 2010), and a significant number in D. melanogaster (Azad et al. 2010), counter to the view of Keightley and Lynch (2003).
Table 1. Mean fruit production of the five MA lines and the founder premutation line across a range of environments. Fitness was estimated using an aster model including survival (binomial) and fruit number (Poisson). P-values (*P < 0.05, **P < 0.01, ***P < 0.001) represent MA-founder comparisons using a separate aster analysis for each MA-founder comparison. P-values were calculated by likelihood ratio tests, and validated using a parametric bootstrap. Means in bold represent a significant difference following within-experiment sequential Bonferroni correction (P < 0.05). A summary of the mutations found in each line is presented in the last column. Spring 2004 is part of a larger data set published in Rutter et al. (2010). Site abbreviations are BEF for Blandy Experimental Farm and KBS for Kellogg Biological Station.
|Founder||287.7||19||30.0||0.44||17.7||28.6|| || || |
|29||314.7||16.4|| || ||18.6||26.3||21|| 6||4|
Although previous greenhouse studies varying specific aspects of environment have demonstrated modest G×E for these MA lines (Chang and Shaw 2003; Kavanaugh and Shaw 2005), we found the performance of the MA lines in the field was highly variable among environments, which varied both temporally and spatially (Table 1). Based on the performance of the five sequenced MA lines across the four seedling plantings in BEF, two in the Fall environment and two in the Spring environment, we found a significant interaction between MA line and experiment (P < 0.0025). Comparing MA lines to the founder across all of the environments, each MA line had a higher fitness estimate than the premutation line in at least one environment, and all MA lines (except line 119) had lower fitness than the premutation line in at least one environment (Table 1). Taken as a whole, the effects of the mutations accumulated during the experiment varied across environments, generating a genotype by environment interaction for fitness.
Although we cannot isolate the effects of individual mutations found in any given line, and Ossowski et al. (2010) detected only the majority of all mutations, despite their sequence coverage of 90% of the genome, some general conclusions can be made regarding the mutational composition and the performance of the MA lines (Table 1 and Supporting information). Counting only mutations in UTRs, introns, and coding regions of putative genes, i.e., in or near putative genes, MA line 29 had the fewest nonsynonymous mutations (one), no deletions in coding regions, and performed most similarly to the founder across environments. This finding is consistent with expectations that nonsynonymous mutations and deletions would have the strongest fitness effects and their absence should result in little fitness change from the founder. Unexpectedly, the MA lines with the most mutations overall (line 59) and the most mutations associated with known coding loci (line 119) outperformed the founder in the greatest number of trials. However, this is consistent with our inference of a high rate of beneficial mutations (Shaw et al. 2002; Rutter et al. 2010). Because the five MA lines were a random sample of the 100 lines for which we tested performance, we expect them to be a representative sample of the 100 MA lines across the different environments. Indeed, across the six field environments the lines had intermediate performance, with mean percentile ranks ranging from the 35th to the 75th, where a lower percentile rank indicates greater performance.
We discuss below specific mutations within particular lines in relation to the fitness profiles of those lines, acknowledging that the effects of multiple mutations within a line are confounded (Supporting information).
A mutation in a locus encoding a gene associated with DNA replication and repair (AT4G00070: RING/U box family) found in line 59 may account for this line having the most mutations. We do not have an explanation of why or which mutations were beneficial in the specific environmental context, with one possible exception. MA line 119 outperformed the founder line in all field experiments and had a large (600 bp) deletion in a gypsy-class retrotransposon (AT3G60930). Such a mutation, disrupting a sequence that is itself likely to be deleterious, may be beneficial. In contrast, it is easier to hypothesize which mutations may have deleterious effects, keeping in mind that the significant G×E was often associated with the relative performance of individual MA lines changing with respect to the founder line. Line 49 had the largest deletion, spanning three loci (AT1G77420–450), and was the most consistently poorly performing line, frequently exhibiting the lowest (4 of 6) or second lowest (1 of 6) fitness of any line (including the premutation line), congruent with the expectation that deletions of coding loci are often deleterious. This deletion in line 49 included sequences coding for a hydrolase (AT1G77420), a protein degrading enzyme (AT1G77440), and a DNA binding transcription factor associated with transcriptional regulation (AT1G77450). Which, if any, of these gene deletions have deleterious effects on fitness cannot as yet be determined, but the mutation affecting the transcription factor is intriguing in view of the conjecture that mutations in the coding regions of trans-regulatory loci are more likely to have deleterious pleiotropic consequences than are mutations in other loci (Stern 2000; Carroll 2008).
However, even MA line 49 performed better than the founder in one environment, despite the three-locus deletion (Table 1, Fall-2005 planting), demonstrating that effects of mutations may differ substantially among environments (Kondrashov and Houle 1994; Shabalina et al. 1997; Davies et al. 1999; Yang et al. 2001; Kulheim et al. 2002; Roles and Conner 2008).
Our joint findings indicate that mutations affect fitness, both positively and negatively, and these effects appear to differ with environment. Rutter et al. (2010) found that mutation effects on fitness are small when scaled to environmental variation (h2m of the mutations in one field environment = 0.0001). Together, the environment dependence of mutation effects on fitness (spanning deleterious to beneficial) (Gillespie and Turelli 1989) and the relatively small effects of individual mutations on fitness may contribute to the great degree of standing genetic variation found in natural populations.
Associate Editor: J. Kelly
This work was funded by the Max Planck Society to D.W., NSF grant (DEB 0108354) to J.K.C., NSF grants (DEB 9629457 and DEB 9981891) to R.G.S., and NSF grants (DEB 0315972, DEB 0307180, and DEB 0845413) to M.R. and C.F. The manuscript was improved by comments from M. Dudash, J. Kostyun, and F. Stearns. We thank C. Geyer for consultation on the analysis. The UMD evolutionary genetics group provided additional helpful input.