Detecting selected haplotype blocks in evolve and resequence experiments

Abstract Shifting from the analysis of single nucleotide polymorphisms to the reconstruction of selected haplotypes greatly facilitates the interpretation of evolve and resequence (E&R) experiments. Merging highly correlated hitchhiker SNPs into haplotype blocks reduces thousands of candidates to few selected regions. Current methods of haplotype reconstruction from Pool‐seq data need a variety of data‐specific parameters that are typically defined ad hoc and require haplotype sequences for validation. Here, we introduce haplovalidate, a tool which detects selected haplotypes in Pool‐seq time series data without the need for sequenced haplotypes. Haplovalidate makes data‐driven choices of two key parameters for the clustering procedure, the minimum correlation between SNPs constituting a cluster and the window size. Applying haplovalidate to simulated E&R data reliably detects selected haplotype blocks with low false discovery rates. Importantly, our analyses identified a restriction of the haplotype block‐based approach to describe the genomic architecture of adaptation. We detected a substantial fraction of haplotypes containing multiple selection targets. These blocks were considered as one region of selection and therefore led to underestimation of the number of selection targets. We demonstrate that the separate analysis of earlier time points can significantly increase the separation of selection targets into individual haplotype blocks. We conclude that the analysis of selected haplotype blocks has great potential for the characterization of the adaptive architecture with E&R experiments.

is to sequence pools of individuals (Pool-seq) instead of individual genomes . While estimating population allele frequencies accurately, Pool-seq does not provide linkage information. Therefore, E&R studies typically treat individual SNPs as independent instead of incorporating the underlying haplotype structure, and frequently report an excess of outlier SNPs responding to selection (Burke et al., 2010;Griffin, Hangartner, Fournier-Level, & Hoffmann, 2017;Jha et al., 2015;Orozco-terWengel et al., 2012;Remolina, Chang, & Leips, 2012;Tobler et al., 2014;Turner et al., 2013;Turner, Stewart, Fields, Rice, & Tarone, 2011), which is not compatible with population genetic theory (Nuzhdin & Turner, 2013). Franssen, Nolte, Tobler, and Schlötterer (2015) shed some light on the excess of candidate loci by jointly analysing Pool-seq data and experimentally phased haplotypes from the same experiment. They pointed out that a high number of the candidate SNPs in Drosophila melanogaster studies were either located in large segregating inversions (Kapun, Van Schalkwyk, McAllister, Flatt, & Schlötterer, 2014) which suppress recombination or in genomic regions with reduced recombination rates. Another factor contributing to the large number of candidate SNPs is selection on low-frequency alleles. The moderate number of recombination events during the experiment is not sufficient to break up the association between the target of selection and linked neutral variants that were private to the selected low-frequency haplotype.
These results show that understanding the genomic architecture of adaptation is a very challenging task and individual haplotypes from evolved populations greatly facilitate it by providing linkage information.
A different approach to reconstruct selected haplotype blocks without information about the founder haplotypes was proposed by Franssen, Barton, and Schlötterer (2017). This approach uses window-based correlation analysis of allele frequency data across replicates and time points combined with hierarchical clustering.
Each cluster of SNPs corresponds to a selected haplotype block. Franssen, Barton, et al. (2017) focused on haplotype blocks starting from low allele frequencies (≤0.03), and marker SNPs (i.e. correlated SNPs increasing in frequency), which are mostly private to them, can be identified by strongly correlated allele frequency changes.
This approach successfully identified selected haplotype blocks up to several Mb in simulated and empirical Pool-seq data. Extending the approach of Franssen, Barton, et al. (2017) to haplotypes by including also alleles with higher starting frequencies,  successfully reduced over 50,000 outlier SNPs to 99 reconstructed haplotype blocks responding to selection in experimentally evolved Drosophila simulans populations. Both,  and Franssen, Barton, et al. (2017) relied on experimentally phased haplotypes of evolved populations to validate their results. Without sequences of evolved and ancestral haplotypes, the validation of reconstructed blocks is challenging, as haplotype reconstruction requires ad hoc choices of key parameters which can change the outcome dramatically and are highly dependent on the data set.
Here, we propose a new approach to define the haplotype reconstruction criteria to detect independent genomic regions with selection signatures. It is suitable for most E&R experiments, avoids ad hoc choices of clustering parameters and does not depend on the availability of phased haplotype data. Our approach takes advantage of the full genomic data to distinguish between statistically significant clustering, most likely caused by directional selection, and random associations. It is implemented in the r package haplovalidate.

| Haplotype reconstruction
The haplotype reconstruction approach applied by haplovalidate was proposed by Franssen, Barton, et al. (2017) and implemented in the r package haploReconstruct. It is based on the idea that SNPs on the same haplotype block should behave similarly over time, that is, have highly correlated allele frequency trajectories. The boundaries of haplotype blocks on the chromosomes are formed by the interplay of recombination and evolutionary forces. Selected haplotype blocks increasing in frequency can be monitored by following the trajectories of their marker SNPs. This approach is especially suitable for E&R experiments, as multiple time points and population replicates increase the power of detecting correlated signals (for an example see Figure 1). Franssen, Barton, et al. (2017) proposed to analyse the correlation of SNPs across replicates and time points within overlapping genomic windows and apply hierarchical clustering. Different groups are then separated by applying a correlation cut-off, with each cluster of SNPs corresponding to a reconstructed haplotype block.

| Haplovalidate
The results of an E&R experiment depend greatly on the underlying adaptive architecture, for example selection scenario, number of selected loci, initial starting frequencies of selected alleles, strength of selection and linkage structure of the founder population. One selected haplotype block can consist of a group of haplotypes with some sequence variation, but all haplotypes are sharing the same selected allele(s). As a consequence, the correlation of SNPs in the haplotype block depends on the extent they are shared among selected haplotypes and nonselected ones. This results in variation of the correlation of shared marker SNPs within one selected haplotype block. Haplovalidate specifically estimates two key parameters of the haplotype reconstruction from the data, that is minimal cluster correlation and window size. Other haplotype reconstruction parameters were not inferred from the data, but chosen to fit to a broad range of data sets (see Table 1).

| Detailed procedure
Haplovalidate consists of five different steps (see Figure 2). Please find details on parameter choice and candidate SNP identification in the sections below.

| Step 0: Identify candidate SNPs and select window size
Candidate SNPs (i.e. SNPs that change more in frequency than expected under neutrality) were identified by chi-square test and Cochran-Mantel-Haenszel (CMH) test comparing the most evolved to the founder population, and the optimal window size is calculated using the MNCS (median normalized CMH score sum) approach, which is explained in detail in the section Variable parameters below.

| Step 1: Reconstruct haplotype blocks
Haplotype blocks are reconstructed using parameters as described below and minimum cluster correlations ranging from 0.9 to 0.3 in 0.1 steps. The reconstruction with most reconstructed blocks is used as starting point to determine focal and background block correlations.

| Step 2: Determine focal and background block correlations
We normalize allele frequency data by using arcsine-square-roottransformation followed by centring and scaling. Block correlations are then calculated for SNP allele frequency trajectories of any two haplotype block marker SNPs of two blocks (pairwise correlations), and the median is taken from all estimates of the two block comparison. The two blocks can be either blocks within the window used for haplotype reconstruction (focal block correlations) or blocks on F I G U R E 1 Example of correlated allele frequency trajectories in a reconstructed haplotype block (simulated data). The simulations were performed over 60 generations (x-axis) and allele frequencies (y-axis) were detected in 10 replicates. Each green line is a haplotype block marker SNP, and each box is a different population replicate.

| Step 3: Evaluate block correlation distributions
Block correlations are normalized using the Fisher transformation (Fisher, 1915). The difference in focal and background correlations is determined by a one-sided t test. In the case of a significant difference between chromosome and background (p-value <.025), the procedure is repeated with step 1 and a less stringent minimum cluster correlation (0.01 steps). If there is no significant difference between chromosome and background (p-value ≥.025), haplovalidate uses the last significant haplotype reconstruction, therefore reducing nonindependent haplotype blocks to a minimum. If fewer than three values are available for focal or background block correlation, haplovalidate returns no result.

| Step 4: Filter for dominant blocks
If a selection target is present in several haplotype blocks, they will be identified as independent, overlapping blocks. To identify the dominating block per selection target, we filter genomic regions with overlapping blocks for the block with the most significant allele frequency change (CMH test/chi-square test; Spitzer, Pelizzola, & Futschik, 2020).

| Variable parameters
The parameter 'minimal cluster correlation' determines the cut-off for hierarchical clustering and is among the most important parameters for haplotype reconstruction. Using too high correlation coefficients splits a selected haplotype block into smaller regions, giving the impression of multiple selection targets, rather than one. If the correlation coefficients are too low, independent selected haplotype blocks are combined and consequently the number of selection targets is underestimated (Franssen, Barton, et al., 2017). However, the definition of too high and too low varies between different data sets as the F I G U R E 2 Overview of the iterative procedure to define haplotype blocks with haplovalidate. After identifying candidate SNPs and calculating the window size (step 0), the haplotype reconstruction starts with stringent parameters (step 1). The allele frequency trajectory correlation of SNPs from different blocks on the same chromosome (focal block correlation) is compared to the correlation of SNPs from blocks located on different chromosomes and tested for significant differences (step 2, 3). If focal block correlations are higher than the background correlations, this indicates that a too stringent correlation cut-off was used; and haplotype reconstruction is repeated using less stringent parameter (back to step 1). If focal block correlations are similar to background correlations, the least significant haplotype reconstruction is used and regions with overlapping blocks are filtered for the most dominant block (step 4) [Colour figure can be viewed at wileyonlinelibrary.com] The ad hoc reconstruction parameters were taken from a standard configuration used by Franssen, Barton, et al. (2017).

Reconstruction parameter Definition
TA B L E 1 haploReconstruct parameters and the values used for the ad hoc analysis and haplovalidate correlation between marker SNPs of a selected haplotype block depends on the strength of selection and the initial frequency of marker SNPs. Haplovalidate infers the optimal haplotype block reconstruction using data-specific minimal cluster correlations. This is achieved by using a cluster correlation, which is just above random correlations.
More specifically, we calculate the median correlation between two marker SNPs from two haplotype blocks on the same chromosome (focal block correlation) and two haplotype block marker SNPs between chromosomes (background block correlation) for each minimal cluster correlation tested. Because different chromosomes are not physically linked, background block correlations are on average an estimate of random associations and serve as a null distribution. Focal block correlations higher than background correlation (p-value <.025) suggest that the two blocks should be joined. In contrast, the focal block correlation of reconstructed haplotype blocks belonging to independent selection targets will not differ from the null distribution. We caution that LD for short blocks tends to be higher than background LD between chromosomes. This may cause a more liberal combination of haplotype blocks in very close distance. Since reconstructed haplotype blocks of simulated and experimental data are typically considerably larger than the regions of elevated LD (see Figures S1 and S2), we do not consider this a major problem of our approach.
The parameter window size determines the upper length of reconstructed haplotype blocks. Small windows result in shorter blocks (Franssen, Barton, et al., 2017), which facilitate the separation of independently selected regions. However, reliable haplotype block reconstruction strongly relies on the number and effect size of the candidate SNPs. As the number and effect size of candidate SNPs are highly data-specific, we do not only consider the number and effect size of SNPs in a given window, but also take the total number of candidate SNPs into account. Hence, we use the same fraction of total candidate SNP effect size per window and therefore make haplotype reconstruction comparable, even for data sets that differ in candidate SNP number and/or effect size distribution.
Consequently, a few moderately significant candidate SNPs may be sufficient to define a haplotype block when no other candidate SNPs are present on the chromosome. Alternatively, with many candidate SNPs, larger windows with more and highly significant candidate SNPs are needed to reconstruct a haplotype block. This is achieved with the median normalized CMH score sum (MNCS), a normalized measure combining effect sizes (based on the CMH test) and SNP numbers associated with window size (see also Equation (1) where p is the p-value of the CMH test and −log(p) the CMH score): We split the candidate SNPs in windows of a given size (0.

| Fixed parameters
We fixed the haplotype reconstruction parameters such that alleles starting from any frequency (starting allele frequency between 0 and 1) in at least one replicate are included. The allele frequency change threshold parameter aims to focus on SNPs changing more than expected under drift. As we used a modified chi-square test and CMH test that have a null hypothesis adapted for drift and pool sequencing noise (Spitzer et al., 2020), we set the allele frequency change threshold to 0. Following , we required at least 20 SNPs for each cluster and only clusters sharing at least four SNPs could be merged. Here, allele frequency shifts between the starting and the focal generation were used to estimate standardized variance (Jónás, Taus, Kosiol, Schlötterer, & Futschik, 2016

| Identification of multiple-target haplotype blocks
We identified multiple-target haplotype blocks by comparing hap-  Figure 9) using arcsine-square-root-transformation and Welch's t test.

| Simulations
We performed 1,000 genome-scale forward simulations covering the two main autosomes of Drosophila simulans using mimicree2 version 206 (Vlachos & Kofler, 2018). We simulated a diploid sexual

| Bottleneck simulations
As a decrease in population size can severely influence haplotype structure in a population, we tested the performance of haplovalidate on E&R simulations with two bottleneck scenarios. We repeated 100 simulations from the scenario with 32 targets and used the same simulation parameters but reduced the population size from generation 20 to 30 to 20% or 10%.

| Performance
An approach for reconstructing selected haplotype blocks without information about the founder haplotypes was proposed by Franssen, Barton, et al. (2017) and implemented in the r package haploReconstruct. This approach uses window-based correlation analysis of allele frequency data across replicates and time points combined with hierarchical clustering. Each cluster of SNPs corresponds to a selected haplotype block. Franssen, Barton, et al. (2017) successfully identified selected haplotype blocks from simulated and experimental E&R data. The reconstructed haplotype blocks were validated using haplotype sequence data. Most importantly, the approach of Franssen, Barton, et al. (2017) requires a low starting frequency of the selected haplotype block. While this is typically the case, also higher starting frequencies need to be considered . To test the generality of the approach and parameters proposed by Franssen, Barton, et al. (2017), we applied them to E&R data simulated with a broader range of parameters. We used a set of 189 sequenced D. simulans haplotypes (chromosome 2 and 3) to create a founder population (N = 1,000) with realistic linkage disequilibrium. Selection coefficients and starting allele frequencies were matched to a Drosophila E&R experiment by .
We tested four different numbers of selected loci (16, 32, 64 or 128 loci per chromosome), randomly sampling SNPs with the frequency in the founder population matching . Simulations were conducted over 60 generations in 10 replicates using a selective sweep scenario.
Using the reconstruction parameters for selection targets with low starting frequencies as recommended by Franssen, Barton, et al. with multiple targets can be substantial (median 38% to −91%, see Figure 4c) with a given haplotype block containing a median of 7%-14% of the targets (see Figure 4d). The issue of multiple selection targets being present on a single haplotype block becomes increasingly important with the number of selected loci in a simulation.
In addition, we compared the performance of haplovalidate in simulations with and without bottleneck. We analysed two different bottleneck scenarios, reducing the population size to 20% or 10% for ten generations. Although the resulting differences in effective population size were rather large ( Figure 6 panel h), the total true-positive identification rate was still above 90% for both bottleneck scenarios.
Overall, we identified slightly less selection targets (Wilcoxon ranksum test p-value < .001, Figure 6 panel a), which reflects a weaker selection signature due to the loss of selected haplotype blocks

F I G U R E 6
The performance of haplovalidate is robust to fluctuations in population size. 100 sweep simulations with 32 targets were performed with constant population size or containing a bottleneck which reduces the initial population size to 20% or 10% for 10 generations. in the bottleneck (Figure 6 panel h). This also resulted in a slightly higher false-positive rate (Figure 6 panel e, Wilcoxon rank-sum test p-value < .001 for 10% and p-value < .05 for 20%). Interestingly, the stronger bottleneck scenario resulted in an increase of single target and a decrease of multiple-target identifications (Figure 6 panel b,c), indicating that haplotype blocks with multiple selection targets are more frequently lost due to their lower initial frequency in the population (see below). We conclude that haplovalidate is robust to changes in population size, reliably detecting selected haplotype blocks even after severe population reductions.

| Using early generations to identify blocks with multiple selection targets
The allele frequency trajectory of multiple selected SNPs in high linkage disequilibrium (LD) cannot be easily distinguished from a single selection target. Because of the additive effect of the selection targets, highly correlated allele frequency trajectories will be obtained (see Figure 7) even if the selected SNPs are not in high LD at the beginning of the experiment: The haplotype block with the largest number of selection targets will increase most strongly and LD will increase during the experiment (see Figure 8). Even if the selected alleles have a low starting frequency and are not linked, recombination during the experiment may generate haplotype blocks with multiple selection targets, which will experience a stronger selective advantage, resulting again in increased LD. Thus, haplovalidate is very likely to reconstruct haplotype blocks with multiple targets, in particular for simulations with many selected alleles and high starting frequencies.
Across all simulations 37,846 selected alleles (82%) were located on a haplotype with multiple selected alleles at generation 60. 95% of these alleles share haplotypes already in the founder population. This is significantly more often than random pairs of 35,000 SNPs having the same physical distance (chi-square test, p-value < .001). This result indicates that haplotype structure in the founder population predetermines the occurrence of multiple-target haplotype blocks.
Given that LD between selection targets increases during the experiment, we reasoned that haplotype block reconstruction at earlier generations may be more powerful to distinguish independent selection targets that have high LD at later stages of the experi-  Figure 9). This was also observed when simulations with different numbers of selection targets were analysed separately (see Appendix S3).

| Experimental data
We applied haplovalidate to two different D. simulans E&R data sets which differ in their selection response to a new temperature regime.  found 88 selected regions on the autosomes whereas Mallard et al. (2018) highlighted five selection targets.
Because the haplotypes reconstructed by  were validated with experimentally derived haplotypes from founder and evolved generations, we consider these results as a gold stan-   overlap at least for 50% with haplovalidate (see Figure 10). In both cases, the overlap is significantly higher than expected by chance (chi-square test p-value < .001).
Interestingly, on chromosome 2 the reconstruction of haplovalidate is almost indistinguishable from . On chromosome 3, more independent haplotype blocks are inferred by haplovalidate.
Based on the currently available data, it is not possible to distinguish whether haplovalidate is more powerful to detect independent selection targets, or whether some of the additional haplotype blocks are false positives not containing a selection target.
We also applied haplovalidate to the data of Mallard et al. (2018).
Based on the 1,000 most significant SNPs for each chromosome, haplovalidate identified all five regions detected in the original study ( Figure 11), which is significantly more than expected by chance (chi-square test p-value = .03). In addition, we also identified two additional haplotype blocks, which are, however, not containing candidate SNPs reaching the significance threshold applied by Mallard et al. (2018).

| D ISCUSS I ON & OUTLOOK
Moving from a SNP-centric analysis to the identification of selected haplotype blocks provides a significant advancement of E&R studies . Introducing haplovalidate, we provide a tool to make the reconstruction of selected haplotype blocks a routine method that does not rely on the F I G U R E 8 F Haplotype blocks containing multiple selected alleles behave as single target of selection (see Figure 7 for the corresponding allele frequency trajectories, simulated data  the most commonly used out-crossing model in E&R studies, we matched our simulation parameters to D. simulans, which is better suited for E&R studies than D. melanogaster (Barghi, Tobler, Nolte, & Schlötterer, 2017). We anticipate that haplovalidate can be also applied to E&R studies using other organisms with a different recombination landscape, such as Caenorhabditis remanei (Teotónio, Estes, Phillips, & Baer, 2017), because haplovalidate accounts for linkage by comparing the correlation within blocks and between blocks.
Our study also demonstrated the limits of a haplotype blockbased analysis of the adaptive architecture. We found that a high fraction of the reconstructed haplotype blocks contained multiple selected alleles. Interestingly, a similar observation was made Interestingly, most multiple-target haplotype blocks were already present in the founder population indicating that the initial haplotype structure is an important factor for shaping the genomic signatures of adaptation, a result also supported by a theoretical study of Weissman and Barton (2012). More work is needed to understand how the haplotype composition of the founder population in combination with the number of founder haplotypes affects the power of E&R studies.

ACK N OWLED G EM ENTS
Special thanks to Susanne Franssen for her help with haploReconstruct and her feedback. The authors also thank the members of the Institute of Population Genetics for discussion and support on the