The classical hitchhiking model with continuous mutational pressure and purifying selection

Abstract Detecting selective sweeps driven by strong positive selection and localizing the targets of selection in the genome play a major role in modern population genetics and genomics. Most of these analyses are based on the classical model of genetic hitchhiking proposed by Maynard Smith and Haigh (1974, Genetical Research, 23, 23). Here, we consider extensions of the classical two‐locus model. Introducing mutation at the strongly selected site, we analyze the conditions under which soft sweeps may arise. We identify a new parameter (the ratio of the beneficial mutation rate to the selection coefficient) that characterizes the occurrence of multiple‐origin soft sweeps. Furthermore, we quantify the hitchhiking effect when the polymorphism at the linked locus is not neutral but maintained in a mutation‐selection balance. In this case, we find a smaller relative reduction of heterozygosity at the linked site than for a neutral polymorphism. In our analysis, we use a semi‐deterministic approach; i.e., we analyze the frequency process of the beneficial allele in an infinitely large population when its frequency is above a certain threshold; however, for very small frequencies in the initial phase after the onset of selection we rely on diffusion theory.


| INTRODUCTION
When a selectively favored mutation occurs in a population and is subsequently fixed, it is inevitable that the frequency of linked neutral variants will be altered. In a seminal paper, Maynard Smith and Haigh (1974) described this process and termed it genetic hitchhiking. They showed that in large populations a single hitchhiking event may temporarily reduce neutral genetic variation around the site of selection. In recombining organisms the size of the region of reduced variation depends critically on the ratio of the recombination rate and the selection coefficient of the favorable mutation and may be limited to a relatively small fraction of the genome. In nonrecombining organisms such as bacteria, however, variation on entire chromosomes may be eliminated by genetic hitchhiking.
The hitchhiking model was revisited in the late 1980s to describe patterns of reduced variation in DNA polymorphism data, which were found in genomic regions of low recombination rates around centromeres and telomeres of Drosophila (Aguade et al., 1989; Begun & Aquadro, 1992; Stephan & Langley, 1989) and also on the fourth chromosome (Berry et al., 1991). Begun and Aquadro further showed that levels of DNA variation correlate with recombination rates across much of the Drosophila melanogaster genome, whereas average divergence to its sibling species Drosophila simulans was hardly affected by recombination. Given these data, the deterministic hitchhiking model of Maynard Smith and Haigh (1974) was extended by Kaplan et al. (1989), who analyzed a stochastic version of the process (including genetic drift) by means of coalescent theory. Furthermore, Stephan et al. (1992) studied genetic hitchhiking using the diffusion equation method. Alternative approximations of the hitchhiking model were provided by Barton (1998) and Gillespie (2000).
In population genetics, the concept of "genetic hitchhiking" is now used more broadly than it was around 1990 and describes any situation in which changes in allele frequencies caused by relatively strong selection affect the frequencies of neutral or weakly selected variants at linked sites in the genome. This includes, for instance, the case of balancing selection (Kaplan et al., 1988) and also background selection (Charlesworth et al., 1993). At the same time, and more specifically, for genetic hitchhiking caused by positive directional selection (as considered by Maynard Smith and Haigh), the term selective sweep, which was introduced by Berry et al. (1991), is now generally used.
Several controversies have surrounded the application of the selective sweep model to data. Charlesworth et al. (1993) have explained the observed reduction of nucleotide variation in genomic regions of reduced recombination rates by background selection. According to this model, the level of neutral (or nearly neutral) variation can be reduced below classical neutral expectation by selection against the steady input of deleterious mutations. Furthermore, it has been difficult to distinguish the effect of selective sweeps from that of specific demographic scenarios, in particular bottlenecks (Pavlidis et al., 2010). Another controversy arose between selective sweeps and so-called soft sweeps (Jensen, 2014). The latter may be caused by positive directional selection on standing genetic variation after an environmental change or by multiple beneficial mutations segregating simultaneously in a population (Hermisson & Pennings, 2005, 2017; Innan & Kim, 2004). Despite substantial efforts from many theorists and empiricists, fundamental questions on the relationship of demography, selective sweeps, soft sweeps, and background selection with regard to data analysis are still open. However, since these issues are not a focus of this study, the reader is referred to the work of Li and Stephan (2006), Elyashiv et al. (2016), Comeron [...]
This article is devoted almost exclusively to the modeling efforts of selective sweeps by extending the classical hitchhiking model. We begin by formulating the model of Maynard Smith and Haigh (1974) more generally as a two-locus two-allele model with additive fitness.
Besides strong positive directional selection at the selected locus, we allow for weak purifying selection at the linked locus. Furthermore, we introduce mutation at both loci. This allows us to address the following topics: First, following Maynard Smith and Haigh we perform a deterministic analysis of the extended hitchhiking model. This analysis is valid after the trajectory of the strongly advantageous allele has reached a certain threshold frequency. Second, we derive analytical results that show under which conditions soft sweeps caused by multiple beneficial mutations segregating in a population (so-called multiple-origin soft sweeps) are predicted by our extended hitchhiking model and identify a new parameter characterizing the occurrence of this type of soft sweep. Third, we quantify the hitchhiking effect (i.e., the degree of reduction of variation) under the assumption that the polymorphism at the linked locus is not neutral but in a mutation-selection balance. Fourth, we analyze the initial phase of the frequency process of strongly beneficial alleles after the onset of positive selection (until it reaches $x_0$) by diffusion theory. This allows us to derive initial conditions for our deterministic analyses mentioned above.

| DETERMINISTIC HITCHHIKING MODEL
To extend the classical hitchhiking model (Maynard Smith & Haigh, 1974), it is convenient to start from a diploid, two-locus two-allele model with additive fitness (Bürger, 2000, Chapter II.1). In this model, selection at both loci may be introduced in a straightforward way as well as mutation and recombination between both loci.
Calling the alleles at the first locus A and a, where A is the major allele, and those at the second locus B and b, we denote the possible gametes as AB, aB, Ab, and ab, and the relative frequencies of these gametes are $x_1$, $x_2$, $x_3$, and $x_4$. They add up to 1. Wiehe (1995, Chapter 4) derived equations for this model including viability selection and two-way mutation at both loci and recombination between loci. The ordinary differential equations (ODEs) of this model are given by Eqs. (1)-(3).
In these ODEs, a dot denotes differentiation with respect to time. $\mu_A$ is the mutation rate from allele a to A, and $\nu_A$ that in the opposite direction. Similarly, $\mu_B$ denotes the mutation rate from b to B. To maintain the property of the original model that the positively selected mutation at the second locus gets fixed at the end of a sweep, we put $\nu_B = 0$. The selection coefficients at the first and second locus are given by $s_1$ and $s_2$, respectively. We assume that the absolute value of $s_1$ is generally (much) smaller than $s_2$, which is positive and characterizes the fitness advantage of the beneficial allele. The recombination fraction between the two loci is $r$, and $D = x_1 x_4 - x_2 x_3$ measures linkage disequilibrium (LD).
The model described by the above equations is different from the model proposed by Maynard Smith and Haigh (1974) as it allows for mutation at both loci and variation at the first locus may deviate from neutrality. We will explore next to what extent this more general model can be treated analytically. Subsequently, because the deterministic model is not valid for very small frequencies of the beneficial allele (Kaplan et al., 1989), we analyze the initial phase of the adaptive process stochastically. This allows us to specify the state of the above variables at time $t_0$ at which the deterministic phase begins.

| ANALYSIS OF THE DETERMINISTIC PHASE
Following Maynard Smith and Haigh (1974), we introduce the coordinates $p_1$, the frequency of A alleles in chromosomes containing B, and $p_2$, the frequency of A in b-chromosomes. Thus, assuming $x$ is the frequency of the selected allele B, we have $p_1 = x_1/x$ and $p_2 = x_3/(1 - x)$. A consequence of this variable change is that we can analyze the model only in the interval $0 < x < 1$. As explained above, this is not a severe limitation as our deterministic treatment is not valid very close to the boundary 0 anyway (Kaplan et al., 1989). With this transformation of variables, the ODEs (1)-(3) become Eqs. (4)-(6). Eq. (4) results from adding ODEs (1) and (2) and putting $x = x_1 + x_2$, while Eqs. (5) and (6) exploit the equality $D = x(1 - x)(p_1 - p_2)$ (Eq. (7)). Eq. (4) indicates that the beneficial allele B may be driven by three forces: positive directional selection at the second locus, mutation at the second locus, and selection at the first locus (via LD between the first and second locus; see Eq. (7)).
In the following, we analyze the behavior of the deterministic model in several distinct parameter ranges. In each case, strongly positive directional selection at the second locus is assumed to be present.
It is informative to first consider the effect of mutation and strong directional selection at the second locus on neutral variation (at the first locus) alone. Using the assumption that all parameter values are zero, with the exception of $\mu_B, s_2 > 0$, it follows from the transformed ODEs that the dynamics are described by Eqs. (8)-(10).
Separation of variables then yields Eq. (11), where the integration constant is given by Eq. (12). The initial values (at $t = t_0$) of the variables $x$, $p_1$, and $p_2$ are denoted by the index 0. Simulations by Kaplan et al. (1989) suggest that $x_0$ should be at least as high as $5/\alpha$, where $\alpha$ is given by $2Ns_2$ in diploid populations of size $N$.
Eqs. (11) and (12) can be used to calculate the gamete frequencies $x_1$ and $x_2$ as a function of $x$ (or alternatively as a function of $t$ by solving Eq. (8)). From Eq. (6) it follows that $p_2$ does not depend on mutation at the second locus. Therefore, $p_2$ remains at its initial value $p_{20}$. The frequencies of the AB and aB gametes are then given by $x_1 = p_1 x$ and $x_2 = (1 - p_1)x$, respectively.
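The neutral case above can be checked numerically. The following is a minimal sketch, not the author's code: it assumes that Eq. (8) has the form $\dot{x} = (1 - x)(s_2 x + \mu_B)$ and that $p_1$ obeys $\dot{p}_1 = \mu_B \frac{1 - x}{x}(p_2 - p_1)$ with $p_2 = p_{20}$ constant. Both forms are reconstructed here from the gamete definitions rather than copied from the displayed equations, and the closed form in `p1_closed_form` is what separation of variables gives under these assumptions. All function names and parameter values are hypothetical.

```python
def derivatives(x, p1, p20, s2, mu_b):
    """Right-hand sides of the ASSUMED ODEs for x and p1 (p2 stays at p20)."""
    dx = (1.0 - x) * (s2 * x + mu_b)          # selection plus recurrent mutation b -> B
    dp1 = mu_b * (1.0 - x) / x * (p20 - p1)   # new B copies arise on b-chromosomes
    return dx, dp1


def integrate_sweep(x0, p10, p20, s2, mu_b, x_end, dt=1.0):
    """Fourth-order Runge-Kutta integration from x0 until x first reaches x_end."""
    x, p1 = x0, p10
    while x < x_end:
        k1x, k1p = derivatives(x, p1, p20, s2, mu_b)
        k2x, k2p = derivatives(x + 0.5 * dt * k1x, p1 + 0.5 * dt * k1p, p20, s2, mu_b)
        k3x, k3p = derivatives(x + 0.5 * dt * k2x, p1 + 0.5 * dt * k2p, p20, s2, mu_b)
        k4x, k4p = derivatives(x + dt * k3x, p1 + dt * k3p, p20, s2, mu_b)
        x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        p1 += dt / 6.0 * (k1p + 2.0 * k2p + 2.0 * k3p + k4p)
    return x, p1


def p1_closed_form(x, x0, p10, p20, s2, mu_b):
    """Solution of dp1/dx = mu_b (p20 - p1) / (x (s2 x + mu_b)) with p1(x0) = p10."""
    factor = x0 * (s2 * x + mu_b) / (x * (s2 * x0 + mu_b))
    return p20 + (p10 - p20) * factor


# Hypothetical parameter values, chosen only for illustration (mu_b / s2 > x0).
S2, MU_B = 0.01, 1e-4
X0, P10, P20 = 0.005, 0.2, 0.8
x_num, p1_num = integrate_sweep(X0, P10, P20, S2, MU_B, x_end=0.99)
p1_exact = p1_closed_form(x_num, X0, P10, P20, S2, MU_B)
print(f"x = {x_num:.4f}, p1 (numerical) = {p1_num:.6f}, p1 (closed form) = {p1_exact:.6f}")
```

In this sketch $p_1$ moves from $p_{10}$ toward $p_{20}$ only while $x$ is of order $\mu_B/s_2$ or smaller; once $x \gg \mu_B/s_2$ the factor in `p1_closed_form` becomes nearly constant, which is the sense in which the ratio $\mu_B/s_2$ governs the outcome.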
Based on these results, we can address the question whether multiple-origin soft sweeps (Hermisson & Pennings, 2005, 2017) are [...] that arose on an A-chromosome is on its way to fixation, the probability that a second beneficial mutation arises on an a-chromosome and substantially increases in frequency becomes smaller.

Note that we have $\mu_B/s_2 < x_0 = 5/\alpha$ for realistic values of population size and beneficial nucleotide mutation rate. Therefore, $\mu_B/s_2$ likely falls into the interval in which a stochastic treatment of the $x$ process is required. Nonetheless, the above argument, which is derived from Eq. (8), holds. The reason is that the right-hand side of Eq. (8) is identical to the drift coefficient of the diffusion equation (except for the scaling factor $2N$; see Eqs. (39) and (41) below).
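The comparison between $\mu_B/s_2$ and $x_0 = 5/\alpha$ is simple arithmetic and can be sketched as follows. The function name and the parameter values are hypothetical and serve only as an illustration; the definitions $\alpha = 2Ns_2$ and $\theta = 4N\mu_B$ are those used in the text. Note that $\mu_B/s_2 < 5/\alpha$ is algebraically the same condition as $2N\mu_B < 5$, i.e., $\theta < 10$, so the selection coefficient cancels out of the inequality.

```python
def soft_sweep_scales(pop_size, s2, mu_b):
    """Return the scaled parameters and thresholds discussed in the text."""
    alpha = 2.0 * pop_size * s2      # scaled selection coefficient
    theta = 4.0 * pop_size * mu_b    # scaled beneficial mutation rate
    x0 = 5.0 / alpha                 # threshold frequency for the deterministic phase
    mu_over_s = mu_b / s2            # ratio of mutation rate to selection coefficient
    return {"alpha": alpha, "theta": theta, "x0": x0, "mu_over_s": mu_over_s}


# Hypothetical values: N = 10^6, s2 = 0.01, mu_B = 10^-8 per generation.
scales = soft_sweep_scales(pop_size=1_000_000, s2=0.01, mu_b=1e-8)
below_threshold = scales["mu_over_s"] < scales["x0"]
print(scales, below_threshold)
```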
Our analysis adds a new piece to the theory of Hermisson and Pennings (2005, 2017), who studied the occurrence of soft sweeps in a population of finite size. In their approach, the probability of mutation-based soft sweeps largely depends on a single parameter Θ, which is a scaled beneficial mutation rate that accounts for many short-term processes going on in a population (see the detailed discussion of short-term effective population size and the target size of beneficial mutations in Hermisson and Pennings (2017)).
Very recently, Feder et al. (2021) reported simulations of a model that is virtually identical to the hitchhiking model described here, except that it is haploid and consists of only a single locus at which beneficial mutations were allowed to arise such that each mutation created a new allele. In their simulations, the mutation rate $\mu$ was fixed, while the selection coefficient $s$ and population size $N$ varied, as did $\theta = N\mu$. For many parameter combinations, they ran forward simulations and recorded the percentage of runs in which the sum of the frequencies of all mutations reached 50% by generation 30. In this case, a run was counted as a sweep. If a sweep occurred, they also checked whether more than one allele was at frequency >5%, which was counted as a multiple-origin soft sweep. In their Figure 3C, they show that indeed for all values of $s$ increasing $\theta$ led to a higher percentage of soft sweeps, whereas $Ns$ had almost no effect. These observations are consistent with the theory of Hermisson and Pennings (2017). A strong effect was also found for selection. Increasing $s$ led to a remarkable reduction of the percentage of soft sweeps, which is in qualitative agreement with our analysis.
Finally, we discuss the role of $p_{10}$ and $p_{20}$ in the detection of soft sweeps. As we show in the stochastic analysis below, $p_{20}$ is approximately given by the frequency $x_3^*$ of the major allele A at the onset of selection, while $p_{10}$ may be small due to the relatively large variance of $x$ and $x_1$ in the initial phase. If $p_{10}$ lies below the detection threshold of a soft sweep (conditional on a sweep occurring), AB gametes may remain undetected if the mutation rate is too small (see Appendix A).
We first analyze the joint effects of mutation at the second locus and recombination between both loci. Eqs. (9) and (10) are then modified by recombination terms, and the latter ODE may be integrated in a similar way as Eq. (10). Since $r/s_2 \ll 1$, which appears to be biologically realistic, Eq. (19) is nearly identical to Eq. (11) and Eq. (22) is very similar to Eq. (13).
Therefore, in the presence of strong selection and mutation at the second locus recombination has only a very weak effect on the dynamics of the frequencies of the AB and aB gametes. This may be surprising, given the distinct effect of recombination in the presence of strong selection on heterozygosity at the neutral locus in the study of Maynard Smith and Haigh (1974). The critical difference between our model and the original one by Maynard Smith and Haigh, however, is mutation. Without mutation at the selected locus, our approach would lead to the same predictions as that of Maynard Smith and Haigh. In other words, the presence of mutation alters the typical sweep (hitchhiking) effect of the Maynard Smith-Haigh model by generating more than one haplotype with a selected allele in the initial phase that may lead to partial parallel sweeps.
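To see why recombination matters so little here, the integration can be sketched explicitly. The block below is an illustration under stated assumptions, not a copy of the displayed equations: it assumes $\dot{x} = (1-x)(s_2 x + \mu_B)$, $\dot{p}_1 = \mu_B\frac{1-x}{x}(p_2 - p_1) - r(1-x)(p_1 - p_2)$, and $\dot{p}_2 = rx(p_1 - p_2)$ for a neutral first locus, and the symbol $\Delta$ is introduced here for convenience.

```latex
\begin{align}
\Delta &:= p_1 - p_2, \qquad
\dot{\Delta} = -\left[\frac{\mu_B (1-x)}{x} + r\right]\Delta, \\
\frac{d \ln \Delta}{dx} &= -\frac{\mu_B}{x\,(s_2 x + \mu_B)}
  - \frac{r}{(1-x)(s_2 x + \mu_B)}, \\
\Delta(x) &= \Delta_0\,
  \frac{x_0\,(s_2 x + \mu_B)}{x\,(s_2 x_0 + \mu_B)}
  \left[\frac{(1-x)(s_2 x_0 + \mu_B)}{(1-x_0)(s_2 x + \mu_B)}\right]^{r/(s_2 + \mu_B)} .
\end{align}
```

Under these assumptions recombination enters only through the exponent $r/(s_2 + \mu_B)$. When this exponent is small, the bracketed factor stays close to 1 until $x$ is very near 1, so the solution without recombination is nearly recovered.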
Finally, we discuss the effect of mutation at the first locus in conjunction with recombination, mutation, and selection at the second locus. In this case, the difference between ODEs (5) and (6) is formally identical to Eq. (17) and can be integrated as shown above in Eqs. (18)-(20), assuming that the selection coefficient is much larger than the mutation rates at the first locus and $r$. In a similar way as above, $p_2$ can be calculated.
Here, we analyze the case studied by Maynard Smith and Haigh (1974), except that the polymorphism at the first locus is not neutral, but allele A is deleterious (maintained in a mutation-selection balance). Thus, in this subsection A is not the major allele. From Eqs. (4)-(6), we get the ODEs (23)-(25); in particular, Eq. (25) reads $\dot{p}_2 = \mu_A(1 - p_2) + s_1 p_2(1 - p_2) + rx(p_1 - p_2)$. The frequency of A in the mutation-selection balance at the first locus is given by $x_{30} = \mu_A/|s_1|$. Since the frequency of A is assumed to be small, reverse mutation from A to a is neglected.
A general analytical solution of this system of ODEs is difficult to obtain, but we may approximate these equations under the assumption that the frequency $x_{30}$ of the deleterious allele A is relatively small such that a strongly advantageous mutation occurring at the second locus at $t = 0$ hits a chromosome carrying allele a with high probability. In other words, we consider the initial conditions (26) and (27) at $t = t_0$. Furthermore, we assume that both $r$ and $|s_1| \ll s_2$. Under these assumptions, the quantities $p_1$ and $p_2$ remain small (compared to 1), while the strongly selected allele B increases from $x_0$ to $1 - x_0$ (i.e., near fixation). Then, from ODEs (23)-(25) we obtain Eqs. (28) and (29), which can be readily integrated using the initial conditions (26) and (27) [...] (Maynard Smith & Haigh, 1974; Stephan et al., 1992; Wiehe, 1995).
Based on Eq. (36), we can immediately predict the effect of strong selection at the second locus on heterozygosity at the first locus. Heterozygosity is an average over two events, as allele B arises with probability $p_{20}$ on an A-carrying chromosome or with probability $1 - p_{20}$ on an a-chromosome (Kaplan et al., 1989; Stephan et al., 1992). However, since in a mutation-selection balance $p_{20}$ is small, we may neglect the first event. From the resulting expression we obtain the ratio of heterozygosity after the sweep (at $t = \hat{t}$) to heterozygosity before the sweep (at $t = 0$). Thus, heterozygosity at the first locus is reduced after a sweep caused by strong selection at a linked second locus. The relative reduction of variation, i.e., the hitchhiking effect, is, however, less pronounced than in the case of a neutral polymorphism at the first locus.
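The qualitative claim of this subsection can be illustrated numerically. The sketch below is not the author's calculation and rests on several assumptions that should be read as such: the beneficial allele follows the logistic ODE $\dot{x} = s_2 x(1 - x)$ (the weak effect of $s_1$ on $x$ via LD is ignored); $p_2$ follows the form recovered for Eq. (25); $p_1$ is assumed to obey the mirror-image equation with recombination term $-r(1 - x)(p_1 - p_2)$; the beneficial mutation is assumed to arise on an a-chromosome ($p_{10} = 0$); and heterozygosity is computed from the overall frequency of A. All names and parameter values are hypothetical.

```python
def heterozygosity_ratio(s2, s1, mu_a, r, x0, p20, dt=0.5):
    """Integrate the ASSUMED ODEs from x0 to 1 - x0; return H_after / H_before."""

    def deriv(state):
        x, p1, p2 = state
        dx = s2 * x * (1.0 - x)  # logistic sweep; LD feedback from s1 neglected
        dp1 = mu_a * (1.0 - p1) + s1 * p1 * (1.0 - p1) - r * (1.0 - x) * (p1 - p2)
        dp2 = mu_a * (1.0 - p2) + s1 * p2 * (1.0 - p2) + r * x * (p1 - p2)
        return (dx, dp1, dp2)

    def shifted(state, slope, h):
        return tuple(v + h * dv for v, dv in zip(state, slope))

    state = (x0, 0.0, p20)  # p10 = 0: B arose on an a-chromosome
    while state[0] < 1.0 - x0:
        # Classical fourth-order Runge-Kutta step.
        k1 = deriv(state)
        k2 = deriv(shifted(state, k1, 0.5 * dt))
        k3 = deriv(shifted(state, k2, 0.5 * dt))
        k4 = deriv(shifted(state, k3, dt))
        state = tuple(
            v + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for v, a, b, c, d in zip(state, k1, k2, k3, k4)
        )

    x, p1, p2 = state
    q_after = x * p1 + (1.0 - x) * p2          # overall frequency of A after the sweep
    h_before = 2.0 * p20 * (1.0 - p20)
    h_after = 2.0 * q_after * (1.0 - q_after)
    return h_after / h_before


# Hypothetical parameters: |s1| << s2, r << s2, x0 = 5 / alpha with N = 10^5.
S2, R, X0 = 0.01, 1e-4, 0.0025
MU_A, S1 = 1e-5, -1e-3
P20 = MU_A / abs(S1)  # mutation-selection balance frequency of A

ratio_msb = heterozygosity_ratio(S2, S1, MU_A, R, X0, P20)
ratio_neutral = heterozygosity_ratio(S2, 0.0, 0.0, R, X0, P20)
print(f"mutation-selection balance: {ratio_msb:.3f}, neutral: {ratio_neutral:.3f}")
```

With these illustrative values the heterozygosity ratio is larger under mutation-selection balance than in the neutral comparison, because mutation replenishes A on B-chromosomes during the sweep. This is the direction stated in the text, but the numbers depend on the assumed equations and parameters.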

| STOCHASTIC ANALYSIS OF INITIAL PHASE
As mentioned above, the dynamics of the beneficial allele at very low frequency ($x \le x_0$) cannot be treated deterministically. Instead, we will use a diffusion approach. Assuming $s_2 > 0$, $\mu_B > 0$, $r = 0$, $\mu_A = \nu_A = 0$, and $s_1 = 0$, we will derive an appropriate diffusion equation for a diploid population of constant size $N$ and then calculate the first and second moments of this diffusion.
We will first consider the frequency process of the beneficial allele and put $z = x$ as diffusion variable. From Eq. (8), we find the drift coefficient (Eq. (39)), where $\alpha = 2Ns_2$ and $\theta = 4N\mu_B$. The selection term is linear in $z$, as the frequency of the beneficial allele in the initial phase is very low. In the initial phase, the diffusion coefficient is also linear in $z$ (Eq. (40)). Thus, we have the Kolmogorov forward equation (Eq. (41)) describing a one-dimensional diffusion in the initial phase (Ewens, 2004, Chapter 4). [...] However, close inspection shows that for the biologically relevant parameter range ($\alpha > 100$, $\theta > 0.005$), the integrand in Eq. (53) can be approximated by $e^{-\alpha t'}$ for $t' \le 2t_0$, where $t_0$ is defined below. This analysis takes the variance of the $x$ diffusion (Eq. (47)) into account.
Using this approximation, we get Eq. (54). Here, the mean time $t_0$ until the beneficial allele B reaches the threshold frequency $x = x_0$ under the influence of drift, directional selection, and mutation (starting from frequency 0) is given by Eq. (55). Using the same approximation as in the derivation of Eq. (54), we obtain Eq. (56) for the second moment of the $x_1$ process. Finally, we are able to determine the initial conditions for the deterministic phase. Eqs. (54) and (55) allow us to calculate the value of $x$ at time $t_0$, i.e., at the beginning of the deterministic phase. We find $p_1(t_0) = p_{10} \approx x_3^*$. However, since the variances of the diffusions $x$ and $x_1$, for which we got simple analytical formulas (see Eqs. (47) and (56)), are relatively large, $p_1$ may not be well predicted by the first moments of $x$ and $x_1$ at time $t_0$. In contrast, the value of $p_2(t_0)$, which is defined as a ratio of two relatively large quantities (≥0.5) shortly after the onset of selection, is evidently better predicted.
Using Eq. (50), we get an expression for $p_2(t_0)$. Thus, $p_2(t_0)$ is close to its value $x_3^*$ at $t = 0$, which is expected.
On the other hand, if indeed $p_{10} = p_{20}$, as expected, we get the interesting result that at the time of fixation $x_1 = p_{10}$ (see Eqs. (12) and (14)). That means that the ratio of the frequency of AB gametes to the frequency of B alleles is constant during the selective phase from time $t_0$ to fixation. The effect of mutation during this phase is therefore negligible. In other words, the competition between mutation and selection (and drift) is expected to occur exclusively during the initial phase.
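For orientation, the first moment of the linearized $x$ diffusion in the initial phase can be sketched as follows. This block is an illustration under assumptions and is not taken from the displayed equations: it assumes a drift coefficient $a(z) = \alpha z + \theta/2$ (the right-hand side of the assumed Eq. (8), $s_2 z + \mu_B$ for small $z$, scaled by $2N$), a diffusion coefficient $b(z) = z$, and time $\tau$ measured in units of $2N$ generations. The symbols $a$, $b$, $\phi$, and $\tau$ are introduced here.

```latex
\begin{align}
a(z) &= \alpha z + \frac{\theta}{2}, \qquad b(z) = z, \\
\frac{\partial \phi(z,\tau)}{\partial \tau}
  &= -\frac{\partial}{\partial z}\bigl[a(z)\,\phi(z,\tau)\bigr]
   + \frac{1}{2}\frac{\partial^2}{\partial z^2}\bigl[b(z)\,\phi(z,\tau)\bigr], \\
\frac{d\,E[z]}{d\tau} &= \alpha\,E[z] + \frac{\theta}{2}
  \quad\Longrightarrow\quad
  E[z(\tau)] = \frac{\theta}{2\alpha}\left(e^{\alpha\tau} - 1\right),
  \qquad z(0) = 0 .
\end{align}
```

In generations $t = 2N\tau$, this reads $E[x(t)] = (\mu_B/s_2)(e^{s_2 t} - 1)$, since $\theta/(2\alpha) = \mu_B/s_2$ and $\alpha\tau = s_2 t$. Under these assumptions the ratio $\mu_B/s_2$ sets the scale of the mean frequency in the initial phase.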

| DISCUSSION
We extended the classical two-locus two-allele hitchhiking model of Maynard Smith and Haigh (1974) [...] who proposed a simulation model to explain certain features of HIV evolution. In their Figure 3C, they show for a fixed beneficial mutation rate that increasing the selection coefficient leads to a strong reduction of the percentage of multiple-origin soft sweeps, which is in qualitative agreement with our analysis.
We also analyzed the initial phase when, after the onset of strong positive selection, the frequency of the beneficial allele is very small ($\ll 1$). Based on diffusion theory, we calculated the first and second moments of the frequencies $x$ (beneficial allele) and $x_1$ (gamete AB). This helped us to quantify the initial conditions of the deterministic ODEs, which we needed to analyze our extended deterministic hitchhiking model. Our approach is based on the same biological assumptions as that of Martin and Lambert (2015), who analyzed the frequency process of the beneficial allele of the original hitchhiking model (i.e., without mutation at the selected locus). They used the (linear) Feller diffusion process, for which more short-term results can be obtained explicitly than for a Wright-Fisher diffusion.
In the theoretical analysis of selective sweeps, several questions have not been satisfactorily addressed (Stephan, 2019). A major one concerns the traffic model. Although this model was proposed 25 years ago (Barton, 1995; Kirby & Stephan, 1996), not much progress has been made in analyzing it. Most analyses still assume that selective sweeps along the genome occur sequentially, without interfering with each other. However, imagine a model with two partially linked loci at which beneficial mutations may enter a population independently. An interesting scenario arises when a second mutation B with higher fitness occurs, while the first one (A) is on its way to fixation. If A and B can recombine at some rate, there is a chance that the double beneficial mutant AB forms and eventually fixes. Basic questions such as the fixation probability of AB and its fixation time have been addressed in a series of mathematical papers (Bossert & Pfaffelhuber, 2018; Cuthbertson et al., 2012; Yu & Etheridge, 2010). However, the pattern of variation in genetic data for such a model of competing sweeps is largely unknown.
The only report on patterns of variation in recombining genomic regions has been published by Chevin et al. (2008). They modeled the case of two partially linked loci with positive directional selection at both of them and one neutral locus for an infinitely large population using ordinary differential equations. Solving these equations numerically, they found that the hitchhiking effect is weaker in this model than for a single sweep of comparable selection strength.
Furthermore, the interference of both sweeps may lead to an excess of intermediate-frequency variants in the genomic region between the selected sites, a signature that may be falsely interpreted as a sign of balancing selection. More work is needed to understand such a model.
Similarly, selective sweep models from the quantitative genetics literature, such as the work of Santiago and Caballero (1995, 1998), have been relatively neglected by the population genetics community.
These authors developed a quantitative genetic theory of effective population size and polymorphism of linked neutral loci in populations under directional selection and continuous mutation pressure.
Interestingly, they were able to apply the principles of their theory to the recurrent hitchhiking case by considering a steady input of weakly beneficial mutations instead of rare, strongly favorable ones, as is usually assumed in the model of recurrent selective sweeps (Kaplan et al., 1989;Wiehe & Stephan, 1993).

ACKNOWLEDGMENTS
I thank Jeffrey Jensen for drawing my attention to the paper by Feder et al. (2021), two anonymous reviewers for their valuable suggestions, and Thomas Wiehe for providing me access to his doctoral thesis.

CONFLICT OF INTEREST
None declared.