Shifting the paradigm in Evolve and Resequence studies: From analysis of single nucleotide polymorphisms to selected haplotype blocks

Abstract
For almost a decade the combination of whole genome sequencing with experimental evolution (Evolve and Resequence, E&R; Turner, Stewart, Fields, Rice, & Tarone, 2011) has been used to study adaptation in outcrossing organisms. However, complications caused by inversions and hitchhiking variants have prevented this powerful approach from living up to its potential. In this issue of Molecular Ecology, Michalak, Kang, Schou, Garner, and Loeschke (2018) take an important step forward by using a population of Drosophila melanogaster devoid of segregating inversions to identify the genetic basis of resistance to five environmental stressors. They further address the challenge of hitchhiking variants by reconstructing selected haplotype blocks. While it is apparent that the haplotype block reconstruction needs further refinement, their work underscores the potential of E&R studies in Drosophila to address fundamental questions in evolutionary biology.



KEYWORDS
Drosophila, experimental evolution, haplotype-block, linkage disequilibrium, pool-seq

[...] break up the large selected haplotype blocks, the moderate number of recombination events in Drosophila experiments is not enough for this to occur. Thus, the combination of segregating inversions with selection on low-frequency haplotypes could explain the large number of candidate SNPs in D. melanogaster E&R studies (Nuzhdin & Turner, 2014).
Another potential confounding factor contributing to the excessive number of candidate SNPs in E&R studies, which has not yet been studied in detail, is the widespread use of laboratory-adapted founder populations. Such populations have been maintained at rather large census population sizes for many years (e.g. Burke et al., 2010; Turner et al., 2011) to facilitate adaptation to laboratory conditions. While this procedure circumvents the problem of confounding adaptation to laboratory conditions with the adaptive response to the selection treatment, it creates the potential problem of reduced haplotype diversity in the founder population (Figure 1).

FIGURE 1 Reduction of haplotype diversity in populations maintained for many generations without selection. We simulated 1,037,324 SNPs on chromosome 2L in a population of 1,000 diploid individuals for 500 generations using 189 founder haplotypes (Howie et al., 2018) and the D. melanogaster recombination rate (Comeron et al., 2012). Computer simulations were performed using MimicrEE2 (Vlachos & Kofler, 2018). The number of haplotypes in 25-, 50- and 100-kb regions are shown. The reported haplotype diversity is conservative because haplotype blocks differing by only a single SNP are treated as distinct.

Michalak et al. (2018) studied the adaptive response of a freshly collected D. melanogaster population to five different selection treatments (heat shock, heat knockdown, starvation, cold shock and desiccation). Unlike in previous studies, the founder population used by Michalak et al. (2018) [...] Michalak et al., 2018). This confirms that haplotype-based analyses are more informative: rather than hundreds or thousands of putative selected targets, the selection response can be explained by tens to hundreds of adaptive alleles residing on selected haplotypes, as predicted before (Nuzhdin & Turner, 2014). Similar problems have been identified in experimental evolution studies using other species such as yeast and Caenorhabditis elegans.

FIGURE 2 Nonindependence of selected haplotype blocks reconstructed by Michalak et al. (2018). (a,c) Manhattan plots of the negative log10-transformed p-values from CMH tests contrasting five replicate populations at F4 with F65 for (a) heat shock resistance selection (chromosome arm 2L) and (c) heat knockdown resistance selection (chromosome arm 3L). SNPs in reconstructed haplotype blocks (a: blocks 9-12, c: blocks 25, 30 and 32) are shown in block-specific colours. (b,d) Median allele frequency trajectories of SNPs with CMH negative log10-transformed p-value ≥20 (a) or ≥15 (c) in haplotype blocks in panels (a) and (c) (colour code corresponds to panels (a) and (c), respectively) in replicates 1-5. Despite different starting frequencies, the median trajectories of adjacent blocks resemble each other, suggesting linkage disequilibrium and possibly joint selection target(s).
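The loss of haplotype diversity through drift alone (Figure 1) can be illustrated with a deliberately simplified sketch. It is not the MimicrEE2 simulation described in the figure legend: it assumes a single nonrecombining region, no mutation and no selection, and simply tracks how many of the founder haplotypes survive Wright-Fisher resampling. The function name and its parameters are hypothetical; the default values (189 founder haplotypes, 1,000 diploids, 500 generations) are taken from the figure legend.

```python
import random


def simulate_haplotype_counts(n_founders=189, n_diploids=1000,
                              generations=500, seed=1):
    """Count distinct haplotypes per generation under neutral drift.

    Assumptions (simplifications, not the published simulation):
    one nonrecombining region, no mutation, no selection.
    """
    rng = random.Random(seed)
    n_chrom = 2 * n_diploids  # number of chromosomes in a diploid population

    # Assign founder haplotype labels round-robin so that every founder
    # haplotype is present in generation 0.
    population = [i % n_founders for i in range(n_chrom)]
    counts = [len(set(population))]

    for _ in range(generations):
        # Wright-Fisher step: every chromosome of the next generation
        # copies a randomly chosen chromosome of the current generation.
        population = rng.choices(population, k=n_chrom)
        counts.append(len(set(population)))

    return counts


counts = simulate_haplotype_counts()
print("haplotypes at generations 0, 100, 500:",
      counts[0], counts[100], counts[-1])
```

Because recombination is ignored, the sketch cannot create new haplotypes and therefore only shows the direction of the effect; the numbers of haplotypes per 25-, 50- or 100-kb window reported in Figure 1 depend on the recombination rate and are not reproduced here.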
Nevertheless, the haplotype-based analysis of Michalak et al. (2018) requires further improvement: many different haplotype blocks are identified next to each other (figure 5 in Michalak et al., 2018). This problem was also noted by Barghi et al. (2019), who showed that selection targets with higher starting frequencies typically occur on multiple haplotypes. When the clustering is too stringent (i.e. requires a high correlation), multiple haplotype blocks are identified even though they are affected by a single target of selection. Barghi et al. (2019) addressed this with a two-step clustering procedure and confirmed their clustering with experimentally phased haplotypes from evolved populations. We illustrate the possible nonindependence of adjacent haplotype blocks identified in Michalak et al. (2018) by plotting their frequency trajectories in two selection regimes (Figure 2). This analysis shows that SNPs in these haplotype blocks have highly correlated allele frequency trajectories, suggesting that the number of selected targets is potentially considerably lower than implied by the clustering analysis of Michalak et al. (2018) [...] (Bukowicki, Franssen, & Schlötterer, 2016). Third, evolved haplotypes can be phased experimentally by sequencing single F1 individuals from crosses between the target strains and an inbred reference (Barghi et al., 2019; Franssen et al., 2015). Although highly accurate, this method requires live material for crosses. Finally, improving the correlation analysis of Franssen et al. (2017) could potentially increase the accuracy of identified target(s) of selection.
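To make the idea of nonindependent adjacent blocks concrete, the following minimal sketch merges neighbouring haplotype blocks whose median allele frequency trajectories are highly correlated. It is an illustration only, not the clustering of Michalak et al. (2018), the two-step procedure of Barghi et al. (2019) or the correlation analysis of Franssen et al. (2017). The function names, the correlation threshold `min_r` and the toy trajectories are all hypothetical and are not taken from any data set.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant trajectory carries no correlation signal
    return cov / sqrt(var_x * var_y)


def merge_correlated_blocks(trajectories, min_r=0.9):
    """Greedily merge adjacent blocks with correlated trajectories.

    `trajectories` maps block names (in genomic order) to median allele
    frequencies at successive time points. A block joins the current
    group if its trajectory correlates with that of the preceding block
    at r >= min_r (a hypothetical threshold).
    """
    groups = []
    previous = None
    for name, trajectory in trajectories.items():
        if previous is not None and pearson(previous, trajectory) >= min_r:
            groups[-1].append(name)
        else:
            groups.append([name])
        previous = trajectory
    return groups


# Toy trajectories (invented values): blocks A and B rise together from
# different starting frequencies, block C stays flat.
toy = {
    "block_A": [0.10, 0.20, 0.35, 0.55, 0.70],
    "block_B": [0.30, 0.38, 0.52, 0.70, 0.85],
    "block_C": [0.50, 0.48, 0.51, 0.49, 0.50],
}
print(merge_correlated_blocks(toy))
```

In this toy case blocks A and B end up in one group despite their different starting frequencies, which mirrors the pattern described for Figure 2; a real analysis would additionally need to combine replicates and account for sampling noise in Pool-Seq frequency estimates.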
Regardless of the exact methods used in future analyses of E&R studies, the study of Michalak et al. (2018) provides firm evidence that E&R in Drosophila has great potential to provide unprecedented insights into the genetic architecture of adaptation.