Analysis of heteroplasmy in bank voles inhabiting the Chernobyl exclusion zone: A commentary on Baker et al. (2017) “Elevated mitochondrial genome variation after 50 generations of radiation exposure in a wild rodent.”

¹Department of Ecology and Genetics, University of Oulu, Oulu, Finland
²CIBIO/InBIO, Research Center in Biodiversity and Genetic Resources, University of Porto, Vairão, Portugal
³Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
⁴Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Orsay Cedex, France
⁵Department of Biological Sciences, University of South Carolina, Columbia, SC, USA


| INTRODUCTION
Exposure to ionizing radiation is a well-established cause of mutation. Given the global problem of accidental release of radionuclides into the environment (Lourenço, Mendo, & Pereira, 2016), it is essential to fully understand the genetic consequences of exposure to radionuclides. On 26 April 1986, a fire and explosion in Reactor 4 of the former nuclear power plant at Chernobyl (CNPP), Ukraine, released more than 9 million terabecquerels (TBq) of radionuclides over much (>200,000 km²) of Europe and eastern Russia (see reviews on the effects, e.g., Mousseau & Møller, 2012; Møller & Mousseau, 2006). The Chernobyl Exclusion Zone (CEZ) was established at about a 30-km radius around the accident site to limit human exposure to radioactive fallout. The CEZ contains elevated levels of persistent radioisotopes, notably strontium-90 (⁹⁰Sr), caesium-137 (¹³⁷Cs) and plutonium-239 (²³⁹Pu), which have half-lives of 28.8, 30.2 and 24,100 years, respectively. Wildlife inhabiting the CEZ provide clear models of the biological consequences of exposure to environmental radionuclides, with many reports of elevated levels of developmental instability, genetic damage and mutation rate associated with inhabiting areas contaminated by radionuclides. Indeed, a meta-analysis revealed a strong effect of radiation upon mutation rate in organisms affected by Chernobyl fallout (data for 30 species in 45 published studies) (Møller & Mousseau, 2015). With this in mind, the report by Baker et al. (2017) in Evolutionary Applications of elevated levels of genetic diversity in bank voles inhabiting the CEZ appears consistent with the putative mutagenic effect of exposure to radionuclides.
The analysis by Baker et al. (2017) is promising for two principal reasons: (i) they have data from two time points and (ii) they use next-generation sequencing (NGS) to identify polymorphisms and thus bring studies of Chernobyl wildlife into the genomics era. Baker et al. (2017) sequenced whole mitochondrial genomes of samples of the bank vole Myodes glareolus to determine whether the bank voles inhabiting the CEZ have accumulated mutations as a consequence of exposure to elevated levels of radionuclides. The bank vole is a small rodent that is common in forest habitats in northern Europe. As this species is common within and around the CEZ, the bank vole has been widely studied as a model of the mammalian response to radionuclides (Boratyński, Lehmann, Mappes, Mousseau, & Møller, 2014; Chesser et al., 2000; Lehmann, Boratyński, Mappes, Mousseau, & Møller, 2016; Meeks, Chesser, Rodgers, Gaschak, & Baker, 2009; Meeks et al., 2007; Rodgers, Wickliffe, Phillips, Chesser, & Baker, 2001). Baker et al. (2017) found greater mitochondrial diversity in samples from two contaminated areas (Red Forest and Glyboke Lake) than in samples of bank voles from three uncontaminated (control) areas (Nedanchychy, Nezamozhnya and Oranoe) (see Figure 1 for sample locations). In the abstract, the authors state [that their data are] "consistent with the possibility that chronic, continuous irradiation resulting from the Chernobyl disaster has produced an accelerated mutation rate in this species over the last 25 years." However, Baker et al. (2017) did not fully discuss three important issues relating to their data: (i) sampling, (ii) bank vole population dynamics and (iii) heteroplasmy.

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2017 The Authors. Evolutionary Applications published by John Wiley & Sons Ltd

| SAMPLING: CORRECTION FOR VARIATION IN SAMPLE SIZES
Sample sizes used by Baker et al. (2017) vary between 11 and 20 (plus one sample of three bank voles at Glyboke Lake), with the two largest samples from the Red Forest (i.e., a contaminated site). Baker et al. (2017) discuss possible effects of unbalanced sample size on their conclusions and largely attempted to correct for uneven sample sizes by dividing estimates of genetic diversity by the sample size (Table 3 in Baker et al. (2017)). This method of correction is not appropriate because genetic diversity and sample size have a nonlinear relationship (see Figure 2); rarefaction is more typically used to compare estimates of genetic diversity among samples that differ in size (Petit, Mousadik, & Pons, 1998; Szpiech, Jakobsson, & Rosenberg, 2008). As an illustration, we obtained data for Baker et al.'s (2017) samples of bank voles for the mitochondrial locus ND4 (1,378 bp that had 60 variable sites) from DRYAD (https://doi.org/10.5061/dryad.j11s7) and calculated the number of haplotypes per sample, corrected for sample size using rarefaction implemented by ADZE (Szpiech et al., 2008). We also found that mitochondrial diversity (as measured by the number of haplotypes) is higher in contaminated than in uncontaminated sites (Figure 2), reinforcing Baker et al.'s (2017) conclusions; moreover, high diversity is apparent in the small (n = 3) Glyboke Lake 2011 sample, although these data were not included in the statistical comparison of population genetic diversity. Hence, the level of mitochondrial diversity is associated with the level of environmental radioactivity. But do these data indicate a high mutation rate?
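The difference between dividing by sample size and rarefying can be illustrated with a minimal sketch. It uses the standard hypergeometric rarefaction formula for the expected number of distinct haplotypes in a random subsample of g individuals; it is not the ADZE program itself, and the haplotype counts below are hypothetical values chosen for illustration, not data from Baker et al. (2017).

```python
from math import comb

def rarefied_richness(haplotype_counts, g):
    """Expected number of distinct haplotypes in a random subsample of g
    individuals drawn without replacement (hypergeometric rarefaction)."""
    n = sum(haplotype_counts)
    if not 0 < g <= n:
        raise ValueError("subsample size g must be between 1 and the sample size")
    total_ways = comb(n, g)
    # For each haplotype, 1 - P(haplotype is absent from the subsample).
    return sum(1 - comb(n - c, g) / total_ways for c in haplotype_counts if c > 0)

# Hypothetical haplotype counts (assumed for illustration only).
large_sample = [8, 5, 3, 2, 1, 1]   # 20 individuals, 6 haplotypes
small_sample = [6, 3, 1, 1]         # 11 individuals, 4 haplotypes

g = min(sum(large_sample), sum(small_sample))  # common subsample size (11)
print("haplotypes / n:", len(large_sample) / 20, len(small_sample) / 11)
print("rarefied to g = %d:" % g,
      round(rarefied_richness(large_sample, g), 2),
      round(rarefied_richness(small_sample, g), 2))
```

In this made-up example, dividing by sample size makes the larger sample look less diverse, whereas rarefying both samples to the same size compares them on an equal footing.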

| SAMPLING: UNCLEAR CHOICE OF CONTROL SITES
Several studies have quantified mitochondrial diversity (at a 291-bp fragment of the control region and some adjacent tRNA) in bank voles inhabiting the CEZ and in uncontaminated sites in Ukraine (Matson, Rodgers, Chesser, & Baker, 2000; Meeks et al., 2007, 2009; Wickliffe et al., 2006): none of these studies concluded that there was a robust association between mutation rate and the level of environmental radionuclides. Rather, studies have highlighted the need for additional sampling (Matson et al., 2000; Wickliffe et al., 2006) or found mitochondrial diversity to be comparable between contaminated and uncontaminated sites that were located close to the CNPP (Meeks et al., 2007); moreover, genetic diversity was heterogeneous among samples of bank voles collected over a large area of Ukraine, with uncontaminated locations containing more unique haplotypes and a higher ratio of unique to total haplotypes (Table 1 in Meeks et al., 2009). Variation in mitochondrial diversity in bank voles has been explained by demographic and ecological processes, rather than exposure to environmental radionuclides (e.g., Meeks et al., 2009).
These studies on bank vole mitochondrial genetic diversity were not addressed in detail by Baker et al. (2017) (Kozakiewicz, Chołuj, & Kozakiewicz, 2007). Choice of appropriate control sites is an important issue that, here, is complicated by a combination of prior information about variation in mitochondrial diversity in potential samples and a lack of knowledge about the population dynamics of wild bank voles.

| BANK VOLE POPULATION DYNAMICS
The possible influence of population history on spatial patterns of mitochondrial diversity should be reconsidered. Baker et al. (2017) imply that their data are inconsistent with the hypothesis of recolonization explaining the observed high mitochondrial diversity in contaminated sites because the surrounding areas (potential sources) have lower mitochondrial diversity. However, an area recolonized by several genetically different sources could exhibit an increase in genetic diversity (as discussed by Matson et al., 2000). The uncontaminated areas to the east of Chernobyl (Nedanchychy and Nezamozhnya) are genetically different from a distant control site (Korostychev) that is south of Chernobyl (Meeks et al., 2009). Also, bank voles from uncontaminated sites Nedanchychy, Krasnove and Paryshev (Figure 1), as well as the contaminated sites within CEZ, have unique (not found at any other site) mitochondrial haplotypes (Meeks et al., 2007, 2009; Wickliffe et al., 2006). Indeed, there were no shared mitochondrial haplotypes among the CEZ and control regions (Figure 1 (Tables 3 and 4 in Baker et al. (2017), also see Figure 2), although there were greater nucleotide differences between temporal samples at the contaminated sites, but small sample size prevented any statistical inference (Table S4 in Baker et al. (2017)). A lack of temporal effect weakens the argument that exposure to low-dose radionuclides simply increases mutation. One implication is that most of the mitochondrial diversity (via mutation) may have accumulated at some point to affect the 1998 sample, but not subsequently. For example, most of the fallout from the Chernobyl accident was iodine-131 (¹³¹I). Exposure to ¹³¹I conceivably might have had some initial, but not contemporary, impact on wildlife as this isotope dissipated rapidly (half-life of 8 days). Initial exposure to radiation can trigger a suite of cellular effects that persist for some time (Mothersill & Seymour, 2006).
Exposure to ¹³¹I is associated with elevated incidence of human thyroid cancers (Cardis et al., 2006), but its long-term effects on wildlife are not known.
Other principal radionuclides (⁹⁰Sr, ¹³⁷Cs and ²³⁹Pu; see Introduction) within the CEZ are more persistent, and their effects are less likely to have dissipated recently. We might speculate that bank voles exhibit some adaptive response to radionuclide contamination, for example, via improved antioxidative measures and/or DNA repair that prevents further accumulation of mutations.
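The contrast between short-lived and persistent isotopes can be made concrete with the standard exponential decay law, N(t)/N0 = 0.5^(t/T), where T is the half-life. The sketch below uses the half-lives quoted in the text and the sampling years 1998 and 2011; it describes physical decay only and, as a simplifying assumption, ignores environmental transport, uptake and biological half-lives.

```python
DAYS_PER_YEAR = 365.25

def fraction_remaining(elapsed_years, half_life_years):
    """Fraction of the original activity left after elapsed_years,
    assuming simple exponential decay with the given half-life."""
    return 0.5 ** (elapsed_years / half_life_years)

# Half-lives as quoted in the text (years); iodine-131 converted from 8 days.
half_lives = {
    "I-131": 8 / DAYS_PER_YEAR,
    "Sr-90": 28.8,
    "Cs-137": 30.2,
    "Pu-239": 24100.0,
}

# Years elapsed between the 1986 accident and the two sampling points.
for sample_year in (1998, 2011):
    elapsed = sample_year - 1986
    for isotope, half_life in half_lives.items():
        frac = fraction_remaining(elapsed, half_life)
        print(f"{sample_year}  {isotope:7s} fraction remaining: {frac:.3g}")
```

Under these assumptions, ¹³¹I is effectively absent by the first sampling point, whereas more than half of the initial ⁹⁰Sr and ¹³⁷Cs activity remains at the second.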

| HETEROPLASMY AS A MARKER OF MUTATION?
Inferring mutation rate from population genetic diversity itself is complicated by processes that determine whether (or not) a mutation is incorporated into the population at a detectable frequency: the probability that a mutation is retained within a population, for example, depends on strength of selection, population size, recombination (e.g., for nuclear DNA) and sample size. A complementary analysis of mutation could focus on mutations occurring within individuals independent of demography. One solution is to quantify heteroplasmy, the occurrence of more than one mitochondrial haplotype within an individual (Li, Schröder, Ni, Madea, & Stoneking, 2015; Li et al., 2010).
Heteroplasmy may be caused by paternal transmission of mitochondrial DNA or reflect mutations produced by DNA replication errors, inefficient DNA repair or oxidative damage (Kmiec, Woloszynska, & Janska, 2006): an increase in heteroplasmy therefore could be a potential signal of exposure to mutagens. Heteroplasmy has been explored as a biomarker of exposure to environmental radioactivity in bank voles from the CEZ, where exposure to radionuclides elicited a nonsignificant increase in heteroplasmy (Wickliffe, Chesser, Rodgers, & Baker, 2002). Next-generation sequencing (NGS) is well suited for detecting heteroplasmy as the high depth of coverage that can be readily achieved when (re)sequencing a small genome (e.g., mitochondrial DNA) allows for robust detection of intra-individual polymorphisms (Li et al., 2010; Tang & Huang, 2010; Wachsmuth, Hübner, Li, Madea, & Stoneking, 2016); for example, at 1,000× coverage, a heteroplasmy occurring at 1% frequency is expected to be visible in about 10 reads, a signal that should be distinct from the number of mismatches expected from the ~0.1% error rate associated with Illumina HiSeq2000 chemistry (Glenn, 2011). Baker et al.'s (2017) NGS data present a powerful opportunity to examine the potential association between exposure to environmental radionuclides and heteroplasmy as (i) heteroplasmy is common in muscle (Li et al., 2015), the source of bank vole genetic material, and (ii) the authors achieved high depth of coverage (average coverage = 3,974, range = 64-7,841; Table S1 in Baker et al. (2017)) over most mitochondrial genomes.
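The detection argument above can be checked with a short calculation. The sketch below computes the expected number of reads carrying a 1% variant at 1,000× coverage and, under a simple binomial model, the probability that sequencing error alone yields at least that many mismatching reads at one site. The binomial model and the assumption that every error produces the same alternate base are simplifications made here for illustration (the latter is conservative); they are not part of the original analysis.

```python
from math import comb

def expected_variant_reads(depth, variant_frequency):
    """Expected number of reads carrying a variant present at the given frequency."""
    return depth * variant_frequency

def prob_errors_at_least(depth, error_rate, k):
    """P(X >= k) for X ~ Binomial(depth, error_rate), computed as one minus
    the lower tail so that only small binomial coefficients are needed."""
    lower_tail = sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - lower_tail

depth = 1000          # coverage used in the example in the text
heteroplasmy = 0.01   # 1% variant frequency
error_rate = 0.001    # ~0.1% per-base error rate quoted in the text

signal = expected_variant_reads(depth, heteroplasmy)
noise_mean = depth * error_rate
p_false = prob_errors_at_least(depth, error_rate, round(signal))

print(f"expected variant reads: {signal:.0f}")
print(f"expected error reads:   {noise_mean:.0f}")
print(f"P(errors alone reach {round(signal)} reads): {p_false:.2e}")
```

With these inputs the expected signal is ten times the expected number of error reads, and the chance of errors alone reaching the signal level at a given site is very small.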
We obtained Baker et al.'s (2017) NGS data from NCBI's sequence read archive (https://www.ncbi.nlm.nih.gov/sra/, project accession SRX2515630). Only 122 of the 131 bank vole samples described by Baker et al. (2017) were archived. Sample information was not provided with the raw read data, so we assigned a putative origin by matching the count of raw reads in each file to the read data information provided in Table S1 by Baker et al. (2017). Potential adaptors and poor quality reads were removed from the raw data using TRIMMOMATIC v.0.35 (Bolger, Lohse, & Usadel, 2014) (minimum length = 90, quality score = 20, sliding window size = 5). Paired reads were mapped to a bank vole mitochondrial genome (GenBank accession NC_024538) using BOWTIE2 v.2.2.9 (Langmead & Salzberg, 2012) (mapping options: -D 5 -R 1 -N 0 -L 22 -i S,0,2.50), as the mitochondrial reference used for mapping by Baker et al. (2017) was not public at the time of analysis. Mapping data were sorted and converted to an MPILEUP file using SAMtools v.1.4 (http://samtools.github.io/hts-specs/SAMv1.pdf). Potential heteroplasmic sites were called using VARSCAN v.2.3.9 (Koboldt et al., 2012) on the basis of a minimum read frequency of 1%, but only when a minimum read depth of 500 was achieved and when at least 10% of the reads mapped to the alternate strand (to reduce numbers of false-positive sites arising from PCR artefacts (Scarcelli et al., 2016)). Variable sites were called between positions 220 and 15,793 of the reference genome due to low coverage at the beginning and end of the reference. This analysis allowed us to quantify heteroplasmy on the basis of (i) whether an individual contained at least one heteroplasmy or not and (ii) the total number of heteroplasmic sites found within an individual's mitochondrial genome.
As seven samples had low sequencing coverage across the entire mitochondrial genome, we had a final sample of 115 individuals (n = 46 and 69 from contaminated and uncontaminated sites, respectively) for analysis of heteroplasmy in bank voles (Table 1).
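The site-level criteria described above can be summarised as a small filter. This is a minimal sketch of the stated thresholds only (frequency, depth, strand representation and position window), not the VARSCAN implementation; the `SiteCall` record, the function names and the example values are hypothetical, and the strand rule is interpreted here as requiring at least 10% of variant-supporting reads on the less-represented strand, which is an assumption.

```python
from dataclasses import dataclass

@dataclass
class SiteCall:
    """Hypothetical per-site summary for one individual."""
    position: int      # coordinate on the reference mitochondrial genome
    depth: int         # total reads covering the site
    alt_forward: int   # variant-supporting reads on the forward strand
    alt_reverse: int   # variant-supporting reads on the reverse strand

def passes_filters(site, min_freq=0.01, min_depth=500,
                   min_strand_fraction=0.10, first_pos=220, last_pos=15793):
    """Apply the thresholds stated in the text to a single candidate site."""
    if not first_pos <= site.position <= last_pos:
        return False                      # outside the well-covered window
    if site.depth < min_depth:
        return False                      # insufficient read depth
    alt_reads = site.alt_forward + site.alt_reverse
    if alt_reads == 0 or alt_reads / site.depth < min_freq:
        return False                      # variant below 1% frequency
    minority_strand = min(site.alt_forward, site.alt_reverse)
    return minority_strand / alt_reads >= min_strand_fraction

def summarise_individual(sites):
    """Return (has at least one heteroplasmy, number of heteroplasmic sites)."""
    n_sites = sum(passes_filters(s) for s in sites)
    return n_sites > 0, n_sites

# Hypothetical calls for one individual (values assumed for illustration).
example = [
    SiteCall(1500, 1000, 8, 6),    # passes all filters
    SiteCall(100, 1000, 10, 10),   # outside positions 220-15,793
    SiteCall(3000, 400, 10, 10),   # depth below 500
    SiteCall(5000, 2000, 29, 1),   # strand-biased support
    SiteCall(7000, 1000, 3, 3),    # frequency below 1%
]
print(summarise_individual(example))
```

The two values returned per individual correspond to the two measures of heteroplasmy used in the analysis: presence or absence, and the count of heteroplasmic sites.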
As in many other animals (Kmiec et al., 2006), heteroplasmy appears to be common in bank voles: we identified 72 (63%) individuals with at least one heteroplasmic site, represented by 28 (61%) and 44 (64%) individuals from the contaminated and uncontaminated sites, respectively (Table 1). Two individuals had an unusually large number of heteroplasmic sites. Read mapping for these two individuals was visually inspected in TABLET v.1.14.11.07 (Milne et al., 2009). The potential heteroplasmic sites were scattered around the mitochondrial genome and represented by pairs of reads that had different insert sizes, implying that the heteroplasmy detection was not simply an artefact of PCR bias. Nonetheless, statistical analyses of variation in heteroplasmy with radionuclide contamination were made with and without the two "outlier" samples (note that both are from uncontaminated sites in 1998). We estimated whether levels of heteroplasmy differed between contaminated and uncontaminated sites using the generalized linear mixed model (GLMM) implemented by the GLMER function in LME4 (Bates, Mächler, Bolker, & Walker, 2015; Table S1). With all data (n = 115), the proportion of individuals with a heteroplasmy was lower in the contaminated sites and also in 2011, although neither effect was significant (p = .63 and .31, respectively; Table 2). The numbers of heteroplasmies in individuals were significantly lower in contaminated sites and in 2011 (p < .001 for both predictors; Table 2). The qualitative pattern of lower heteroplasmy in the contaminated areas and a possible temporal reduction in heteroplasmy (between 1998 and 2011) remains when the two outlier individuals are removed, but with no significant predictor for either measure (presence/absence or count) of heteroplasmy (Tables 1 and 2). Neither the spatial nor the temporal pattern is an expected consequence of a simple, positive association between chronic exposure to environmental radionuclides and the rate of mutation.
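The reported proportions can be recomputed directly from the counts given above. As a rough, unadjusted cross-check, the sketch below also applies a two-sided Fisher's exact test to the 2 × 2 table of contamination status by presence of heteroplasmy. This is a deliberately simpler technique than the GLMM used in the analysis (it ignores sampling year and any grouping structure), so it is an assumption-laden illustration and not a substitute for the reported model.

```python
from math import comb

# Counts taken from the text: individuals with >= 1 heteroplasmic site / total.
contaminated_with, contaminated_total = 28, 46
uncontaminated_with, uncontaminated_total = 44, 69

def percent(part, whole):
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / whole)

def hypergeom_pmf(x, population, successes, draws):
    """P(X = x) when drawing `draws` items without replacement."""
    return (comb(successes, x) * comb(population - successes, draws - x)
            / comb(population, draws))

def fisher_exact_two_sided(a, row1_total, b, row2_total):
    """Two-sided Fisher's exact p-value for a 2 x 2 table with a successes
    out of row1_total and b successes out of row2_total."""
    population = row1_total + row2_total
    successes = a + b
    observed = hypergeom_pmf(a, population, successes, row1_total)
    lo = max(0, row1_total - (population - successes))
    hi = min(successes, row1_total)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (hypergeom_pmf(x, population, successes, row1_total)
                           for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))

print("contaminated:  ", percent(contaminated_with, contaminated_total), "%")
print("uncontaminated:", percent(uncontaminated_with, uncontaminated_total), "%")
p_value = fisher_exact_two_sided(contaminated_with, contaminated_total,
                                 uncontaminated_with, uncontaminated_total)
print("Fisher's exact p (unadjusted):", round(p_value, 3))
```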
TABLE 1 Heteroplasmy (Hp) estimates in each of the samples separately

GLMM was run with all available individuals (n = 115) and a reduced data set with the two outlier individuals from uncontaminated sites removed (n = 113).

| DISCUSSION AND CONCLUSIONS
Understanding the biological effects of exposure to low-dose radiation is an important issue given that numerous human activities have left substantial amounts of radionuclides in the environment (Lourenço et al., 2016): reports of accelerated mutation rate have clear policy implications. While a high rate of mutation is characteristic of diverse taxa affected by Chernobyl fallout (Geras'kin, Fesenko, & Alexakhin, 2008), the specific responses to radionuclide exposure vary between taxa (Møller & Mousseau, 2015) and mammals are comparatively understudied. Application of NGS techniques represents a much needed scientific advance for studies of wildlife inhabiting the CEZ. However, sequence data for whole mitochondrial genomes (from Baker et al., 2017) are also consistent with the results of previous studies of bank vole mitochondrial diversity at the control region with the results being explained by processes other than mutation (e.g., Matson et al., 2000; Meeks et al., 2007, 2009; Wickliffe et al., 2006). Analysis of heteroplasmy in bank voles offers high power to detect low-frequency intra-individual mutations and can circumvent the uncertainty associated with inferring mutation from populations whose demographic histories are unknown. A lack of association between heteroplasmy and contamination by environmental radionuclides is important as occurrence of low-frequency, intra-individual mutations is presumably needed to generate the "raw material" for mutations that are later visible as "population genetic diversity." A recent meta-analysis has revealed an association between mutation rate and environmental radiation exposure in many species from Chernobyl (Møller & Mousseau, 2015).
However, given our discussion about sampling, bank vole population history and heteroplasmy, we suggest that in addition to the report of high mitochondrial diversity in samples of bank voles inhabiting the CEZ, further studies are needed to demonstrate an accelerated mutation rate in this species.