Moderating the neutralist–selectionist debate: exactly which propositions are we debating, and which arguments are valid?

Half a century after its foundation, the neutral theory of molecular evolution continues to attract controversy. The debate has been hampered by the coexistence of different interpretations of the core proposition of the neutral theory, the ‘neutral mutation–random drift’ hypothesis. In this review, we trace the origins of these ambiguities and suggest potential solutions. We highlight the difference between the original, the revised and the nearly neutral hypothesis, and re‐emphasise that none of them equates to the null hypothesis of strict neutrality. We distinguish the neutral hypothesis of protein evolution, the main focus of the ongoing debate, from the neutral hypotheses of genomic and functional DNA evolution, which for many species are generally accepted. We advocate a further distinction between a narrow and an extended neutral hypothesis (of which the latter posits that random non‐conservative amino acid substitutions can cause non‐ecological phenotypic divergence), and we discuss the implications for evolutionary biology beyond the domain of molecular evolution. We furthermore point out that the debate has widened from its initial focus on point mutations, and also concerns the fitness effects of large‐scale mutations, which can alter the dosage of genes and regulatory sequences. We evaluate the validity of neutralist and selectionist arguments and find that the tested predictions, apart from being sensitive to violation of underlying assumptions, are often derived from the null hypothesis of strict neutrality, or equally consistent with the opposing selectionist hypothesis, except when assuming molecular panselectionism. Our review aims to facilitate a constructive neutralist–selectionist debate, and thereby to contribute to answering a key question of evolutionary biology: what proportions of amino acid and nucleotide substitutions and polymorphisms are adaptive?


I. INTRODUCTION
In the 1960s, analysis of molecular data sets generated with the newly developed techniques of protein sequencing (Zuckerkandl & Pauling, 1965; de Chadarevian, 1999) and gel electrophoresis (Harris, 1966; Hubby & Lewontin, 1966; Lewontin & Hubby, 1966) led to the foundation of the 'neutral mutation-random drift theory' (Kimura & Ohta, 1971c), now better known as the 'neutral theory of molecular evolution' (Dietrich, 1994). Half a century and exabytes of sequencing data later, the debate between its proponents and opponents continues unabated (Kern & Hahn, 2018; Jensen et al., 2019).
The neutral theory of molecular evolution is a set of ideas built around a single hypothesis (Fig. 1), known in full as the 'neutral mutation-random drift hypothesis of molecular evolution and polymorphism' (Kimura, 1976). Apart from this main hypothesis, the neutral theory comprises a number of arguments thought to support the hypothesis (Figs 2 and 3). These arguments are of the hypothetico-deductive type: predictions derived from the neutral mutation-drift hypothesis are compared to empirical findings. The neutralist-selectionist debate questions the validity of this shell of supporting arguments (Figs 2 and 3). The debate has many aspects to it, but the recurring question is: are empirical observations consistent with the neutral mutation-drift hypothesis, or instead with the opposing selectionist hypothesis?
In this review, we argue that the neutralist-selectionist debate has been hampered by the co-existence of different interpretations of the meaning and implications of the neutral mutation-drift hypothesis. We highlight four aspects of the hypothesis that are subject to multiple interpretations (Fig. 4), namely: the extent to which the hypothesis acknowledges natural selection (Section II); the measurement unit in which the hypothesis is defined (Section III); the definition of selective neutrality (Section IV); and the functional importance of neutral molecular changes, along with its implications for phenotypic divergence (cladogenesis, Section V) and evolutionary optimisation (anagenesis, Section VI). Lastly, we discuss how the ambiguous nature of the neutral hypothesis has complicated the process of deducing and evaluating testable predictions (Sections VII-IX).
Validation or falsification of the neutral hypothesis cannot occur in a meaningful manner if the proposition under scrutiny is ambiguous and open to different interpretations (Fig. 1B). This review is meant as an objective, impartial treatise which aims to contribute to a consensus interpretation of the neutral hypothesis and thereby to facilitate a constructive neutralist-selectionist debate. Our review is not meant to advocate a dogmatic stance. On the contrary, we emphasise that in essence neutralists and selectionists agree on many aspects, except on the frequency of positive selection events (Fig. 5). Estimating this frequency is still as relevant today as it was when the neutral theory was founded.
For brevity, throughout this review we will refer to the neutral mutation-random drift hypothesis as simply the neutral hypothesis. We will use this as an umbrella term, which includes three versions of the neutral hypothesis (i.e. the 'original', 'revised' and 'nearly' neutral hypothesis), but which, crucially, excludes the 'strict neutrality' hypothesis (Fig. 1A).

II. TO WHAT EXTENT DOES THE NEUTRAL HYPOTHESIS ACKNOWLEDGE NATURAL SELECTION?
(1) A misnomer: the neutral hypothesis does not deny selection
The controversy between neutralists and selectionists concerns the question of what proportions of mutations, polymorphisms and substitutions are adaptive (Fig. 1). In this context, and throughout this review, mutations are defined as de novo alleles that have been introduced by recent mutation events. Polymorphisms are defined as alleles with a certain arbitrary minimum allele frequency (e.g. MAF > 0.05) and substitutions as alleles that have reached fixation (Kimura, 1991a).
The neutral hypothesis states that most mutations are not adaptive but instead neutral or deleterious, and that most polymorphisms and substitutions are neutral, resulting from segregation and fixation of (nearly) neutral mutations through random genetic drift (Fig. 1) (Crow, 1972a; Hughes, 2008; Hahn, 2008; Razeto-Barry, Díaz & Vásquez, 2012; Jensen et al., 2019). A selectionist, on the other hand, believes that most polymorphisms and substitutions are adaptive, and result from balancing selection and positive selection acting on adaptive mutations (Fig. 1). Kimura (1977, p. 275), focusing on substitutions, wrote: 'According to the neutral mutation-random drift hypothesis of molecular evolution and polymorphism, most mutant substitutions detected through comparative studies of homologous proteins (and the nucleotide sequences) are the result of random fixation of selectively neutral or nearly neutral mutations. This is in sharp contrast to the orthodox neo-Darwinian view that practically all mutant substitutions occurring within species in the course of evolution are caused by positive Darwinian selection.'
Fig. 1. The (nearly) neutral hypotheses. (A) Schematic overview of the nearly neutral hypotheses and competing hypotheses. Colours represent relative proportions of five categories: deleterious (red), slightly deleterious (light-red), neutral (grey), positively selected alleles (green) and alleles under balancing selection (orange). (B) Different approaches to estimate the importance of positive selection yield vastly different results. α, proportion of substitutions resulting from positive selection; β, proportion of polymorphisms resulting from balancing selection; d, uncorrected genetic distance; f0, proportion of mutations that are selectively neutral (functional constraint); He, heterozygosity per site; Ne, effective population size; q, proportion of deleterious mutations that reach fixation; t, time to most recent common ancestor; μ, mutation rate.
The difference between the neutral hypothesis and the opposing selectionist view is subtle. Although selectionists and neutralists disagree on the importance of positive selection and balancing selection, they agree on the importance of purifying selection (Hughes, 2008). This consensus was reached soon after Kimura (1968a) first proposed the neutral hypothesis, following an almost immediate and 'important revision' (Kimura & Ohta, 1971c, p. 469) which was first suggested by King & Jukes (1969) and subsequently adopted by Kimura & Ohta (1971c), and which acknowledged that many mutations are deleterious. In the words of Kimura (1976, p. 248): 'I must emphasise that there is no fundamental disagreement between the neutralists and the selectionists with respect to the prevalence of negative selection; both agree that deleterious mutations are numerous and these mutants tend to be eliminated by negative selection'. Furthermore, the neutral hypothesis does not deny the occurrence of adaptive substitutions or polymorphisms, but instead posits that they are relatively rare. Kimura & Ohta (1974, p. 2851) stated: 'Adaptive changes due to positive Darwinian selection do no doubt occur at the molecular level, but we believe that definitely advantageous mutant substitutions are a minority when compared with a relatively large number of "non-Darwinian" type mutant substitutions, that is, fixations of mutant alleles in the population through the process of random drift of gene frequency'.
Neutralists even admit that many or even most of the alleles that are effectively neutral (i.e. randomly drifting in the population) may not be truly neutral, but instead have a small fitness effect. The nearly neutral hypothesis, which represents a second revision of the neutral hypothesis (Ohta & Kimura, 1971c; Ohta, 1972a,c, 1973, 2003; Akashi, Osada & Ohta, 2012), was introduced to formally acknowledge the implications, in particular that small populations are expected to accumulate not only neutral but also slightly deleterious substitutions (Fig. 1). Kimura (1976) argued that the nearly neutral theory accentuated the difference between neutralists and selectionists. He wrote (p. 249): 'If we adopt it, we deviate still further from the "selectionist" camp. In fact, if the neutral theory is "non-Darwinism," Ohta's theory amounts to "anti-Darwinian" to the extent that random fixation of very slightly deleterious mutations occurs frequently in evolution'. But if so, the second revision only marginally undid the concession of the first revision. By conceding the importance of purifying selection, neutralists had moved a considerable distance towards the selectionist standpoint (Lewontin & Hubby, 1966; Lanks & Kitchin, 1970; Clarke, 1970a,b). From the perspective of a selectionist, acknowledging that even randomly drifting alleles are not truly neutral was just another step of neutralists accepting the ubiquity of natural selection (Kimura, 1976).
Fig. 3. Systematic classification of neutral theory predictions, and associated neutralist-selectionist arguments. The neutral theory assumes that sequence dissimilarity between and within populations (genetic distance, d, and heterozygosity, He, respectively) depends on the proportion of neutral mutations (functional constraint, f0), migration rate (m), effective population size (Ne), recombination rate (r), split time (t), mutation rate (μ), and the functional importance of a nucleotide site [e.g. synonymous (S) versus non-synonymous (N)]. Colour coding indicates usefulness of the tested predictions. Red: tested prediction is not useful because it does not necessarily follow from the neutral hypothesis. Grey: tested prediction is indiscriminative, meaning it may (possibly) also be deduced from the selectionist hypothesis. (For considerations relating to the molecular clock argument, see Section IV.2). Green: correct, discriminative prediction. In general, discriminative power increases when both dependent variables (d and He) are examined in combination. Here, levels of population differentiation (FST) are defined as (d − He)/d. Not depicted: interaction between the explanatory factors (e.g. Ne and r versus functional importance of a site) may explain additional variation, such as variation of the ratio between non-synonymous (N) and synonymous (S) substitutions (dN/dS divergence) and polymorphisms (dN/dS polymorphism) across demes and loci.
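As a minimal illustration of the definition given in the Fig. 3 caption, population differentiation can be computed from the two summary statistics as FST = (d − He)/d. The function name and the example values below are hypothetical and serve only to show the arithmetic; they are not taken from any data set discussed in this review.

```python
def population_differentiation(d, he):
    """Return F_ST defined as (d - He) / d, following the Fig. 3 caption.

    d  : genetic distance between populations (must be positive)
    he : heterozygosity per site within populations
    """
    if d <= 0:
        raise ValueError("genetic distance d must be positive")
    return (d - he) / d


# Hypothetical values, chosen only for illustration.
print(population_differentiation(d=0.02, he=0.005))
```

Under this definition, FST is zero when within-population heterozygosity equals between-population distance, and approaches one when almost all dissimilarity lies between populations.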
Whereas the outcome of the second revision became known as the 'nearly neutral theory' or the 'slightly deleterious theory', no new names were introduced during the first revision. The labels 'neutral theory' and 'neutralists' persisted, even though they had become misnomers. Lewontin (1974, pp. 197-198) wrote: 'The suggestion that most, if not all, of the molecular variation in natural populations is selectively neutral has unfortunately led to the widespread use of the terms "neutral mutation theory" and "neutralists" to describe the theory and its proponents […]. But these rubrics put the emphasis in just the wrong place […]. It is not claimed that nearly all mutations are neutral or that evolution proceeds without natural selection, chiefly by the random fixation of neutral mutations. […] On the contrary, the claim is that many mutations are subject to natural selection, but these are almost exclusively removed from the population. […] In addition, the theory allows for the rare favourable mutation, which will be fixed by natural selection, since after all adaptive evolution does occur. But it supposes this event to be uncommon'.
Fig. 5. Flowchart depicting (dis)agreements between selectionists and neutralists. Selectionists and neutralists agree on many aspects of molecular evolution, except on the question of whether most substitutions and polymorphisms in proteins and functional DNA are caused by genetic drift or instead by positive and balancing selection. Ne denotes effective population size, and s denotes the selection coefficient.
(2) Testing the neutralist strawman of strict neutrality
None of the three versions of the neutral hypothesis (original, revised, and nearly neutral) excludes selection events entirely (Fig. 1). This is in sharp contrast to a fourth hypothesis called 'strict neutrality', which assumes complete absence of selective forces (Fig. 1). While most researchers will be well aware of this distinction, it is nevertheless frequently overlooked when interpreting empirical data.
Because the strict neutrality hypothesis assumes stochastic processes only, it can be translated into a statistical null model, and therefore offers practical convenience for testing purposes (Kreitman, 1996; Hey, 1999). However, as with all null models, interpretations need to be in the context of the simplifying assumptions made. Clearly, rejection of strict neutrality predictions does not equate to rejection of the revised neutral or nearly neutral hypothesis, and not even to rejection of the original neutral hypothesis. The neutral hypothesis is, like the selectionist hypothesis, an alternative hypothesis which opposes the null hypothesis of strict neutrality. While rejection of strict neutrality predictions is indicative of selection, it does not reveal the nature of the selection pressures (purifying or positive), and hence it cannot serve to decide between the neutral hypothesis and the selectionist hypothesis. As commented by Kimura & Ohta (1974, p. 2851): 'The existence of selective constraint, often inferred from non-randomness in amino acid or nucleotide sequences, does not contradict the neutral mutation-random drift hypothesis'. And in the words of Lewontin (1974, p. 198): 'The neoclassical theory cannot be refuted by erecting a neutralist strawman and refuting that'. Even so, many popular arguments against the neutral hypothesis do just that: they refute strict neutrality, not one of the versions of the neutral hypothesis (see Section VIII).
(3) The neutral hypothesis acknowledges purifying selection, and thus background selection
The neutralist-selectionist debate originally concerned the proportion of adaptive amino acid substitutions, but the debate gradually shifted to the proportion of nucleotide substitutions, first within genes, later also outside genes. The arrival of next-generation sequencing data stretched the neutral theory to its limits. Even though the effects of linked selection were investigated concurrently with the foundation of the neutral theory (Hill & Robertson, 1966; Maynard-Smith & Haigh, 1974), predictions from the neutral hypothesis were originally derived for individual loci in isolation from the rest of the genome (Charlesworth & Charlesworth, 2018). In the words of Hey (1999, p. 36): 'The theory that had been constructed on the basis of allelic data was not up to the task of revealing the role of natural selection in shaping the pattern of this new kind of variation'.
Against this historical backdrop, it is perhaps no surprise that challenging the neutral hypothesis based on falsification of strict neutrality predictions has become especially commonplace in the study of genome-wide genetic variation. Evidence for linked selection (i.e. allele frequency changes due to linkage to a nearby allele under selection) is routinely referred to as evidence against the neutral hypothesis, even if the nature (positive or purifying) of the selection events has not been established. For example, it has been stated that the neutral hypothesis is violated by the finding that 'many loci reveal nonneutral local patterns of variation' (Hey, 1999, p. 37), by the finding that 'natural selection plays a dominant role in shaping patterns of neutral molecular variation in the genome' (Corbett-Detig, Hartl & Sackton, 2015, p. 13) and by the finding that 'at almost every locus, there has recently been a selected allele nearby (whether advantageous or deleterious)' (Kern & Hahn, 2018, p. 1368).
This interpretation of genome-wide variation assumes that the neutral hypothesis denies linked selection. For example, Hahn (2008, p. 257) writes: 'The linked selection claim of the neutral theory is that linked selection does not affect a vast majority of loci, and therefore that variation in nature reflects the predictions of neutral models'. Sella et al. (2009, p. 2) claim: 'Neutral theory […] states that the effects of both positive and negative selection at linked loci on the dynamics of neutral alleles can be ignored'. Similarly, Phung, Huber & Lohmueller (2016, p. 4) claim: 'Selection affecting linked neutral diversity and divergence is at odds with the neutral and nearly neutral theories'.
These claims are misleading. It follows logically that a hypothesis that advocates the prevalence of purifying selection also advocates the prevalence of background selection (i.e. allele frequency changes due to linkage to a nearby deleterious allele). Thus, neutralists can readily accept background selection as the null model for explaining levels of genome-wide genetic variation without having to modify their main proposition (Booker, Jackson & Keightley, 2017; Comeron, 2017).
(4) Selectionists do not advocate panselectionism
Early papers on the neutral theory challenged the idea of panselectionism. For instance, King & Jukes (1969) explicitly disagreed with Simpson (1964, p. 1537), who wrote that 'it seems highly improbable that proteins, supposedly fully determined by genes, should have non-functional parts'. King & Jukes (1969, p. 789) argued: 'To hold that selectively neutral iso-alleles cannot occur is equivalent to maintaining that there is one and only one optimal form for every gene at any point in evolutionary time'. Kimura (1979b) also contrasted the neutral hypothesis to molecular panselectionism. He wrote (p. 98): 'The Darwinian theory of evolution through natural selection is firmly established among biologists. […] In this view any mutant allele, or mutated form of a gene, is either adaptive or less adaptive than the allele from which it derived'. Jukes (1991, p. 484) explicitly stated: 'The real choice is between panselectionism and near-neutrality'.
However, opposing the neutral theory does not automatically make one a panselectionist (Ayala, 1974). The view that most amino acid substitutions are adaptive is compatible with the view that certain amino acids can be replaced without altering protein function. Therefore, claiming that selectionists believe that 'each amino acid has a unique survival value in the phenotype of the organism' (Jukes, 1991, p. 480) is almost as inaccurate as mistaking the neutral hypothesis for strict neutrality. Just as the neutralist hypothesis cannot be refuted by refuting strict neutrality (Section VIII), the selectionist hypothesis cannot be refuted by refuting panselectionism. Unfortunately, many observations thought to support the neutral hypothesis are in reality incompatible with panselectionism, but not with the selectionist hypothesis (Section IX).
(5) Neutrality during stabilising and directional selection
Given that neutralists acknowledge that many mutations are deleterious, it is, in fact, self-evident that neutralists also accept that some mutations are adaptive. By incorporating purifying selection into their theory (King & Jukes, 1969; Kimura & Ohta, 1971c), neutralists implicitly acknowledged the occurrence of positive selection events. As pointed out by Goodman (1981, p. 114), whenever purifying selection operates, 'somewhere in the past Darwinian selection must have spread the amino acid substitutions which are now preserved'. In the words of Gillespie (1994, p. 943): 'Natural populations evolve to the point where most mutations are deleterious through the substitution of advantageous mutations. At such time when all mutationally accessible advantageous alleles are exhausted, all newly arising mutations will be deleterious'. Wagner (2008, p. 970) similarly remarked: 'The effect of a mutation exists only in the context of the mutations preceding it'.
Once a gene or protein has converged to its optimum configuration, functionally important sites are expected to experience an equal number of slightly deleterious substitutions and slightly beneficial back-substitutions or compensatory substitutions (Fig. 6, Table 1) (Charlesworth & Eyre-Walker, 2007), a process which has been referred to as 'selection without adaptation' (Hartl & Taubes, 1996). In the words of Gillespie (1994, p. 943): 'While the […] majority of borderline mutations are deleterious, of those that fix, precisely half are deleterious and half are advantageous'. Meanwhile, functionally less-important sites will continue to accumulate neutral substitutions, turning adaptive substitutions into a minority (Table 1).
It has been argued that neutral mutations, like deleterious mutations, ultimately result from past Darwinian selection. This idea, known as the drift-barrier hypothesis (Sung et al., 2012) or 'selection-mutation-drift' hypothesis (Bulmer, 1991), holds that a phenotypic trait under Darwinian selection will converge towards its optimum, but may never reach it, because the final selective increments are so small that the causative mutations are effectively neutral (Fig. 6). This would be especially true if the relationship between fitness score and trait value is best described by a plateauing function in which trait increments are associated with diminishing fitness returns (Hartl, Dykhuizen & Dean, 1985; Akashi et al., 2012). The inevitable outcome of the adaptation process would be near-neutrality of subsequent mutations. Hartl et al. (1985, p. 669) wrote: 'We infer from this assumption that natural selection will continue until an enzyme activity is achieved beyond which an additional increase in enzyme activity results in a negligible increase in fitness. At this point the fate of new mutations that have small effects on enzyme activity will be determined principally by the effects of random genetic drift. Thus, we deduce that evolution by means of the random genetic drift of neutral or nearly neutral mutations is not only consistent with the Darwinian theory of natural selection but also that the existence of nearly neutral alleles follows as a consequence of long continued natural selection of an enzyme gene. In this somewhat ironic sense, the greater the selectionist one might be, the greater the neutralist one should become'.
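The idea that small selective increments are 'effectively neutral' can be made concrete with a short numerical sketch. It relies on the standard diffusion approximation for the fixation probability of a new mutation, which is not spelled out in the text above and is brought in here as an assumption: genic (semidominant) selection with coefficient s, a diploid population, and a single new copy at initial frequency 1/(2N). The function names are hypothetical, and census size is assumed to equal effective size unless stated otherwise.

```python
import math


def fixation_probability(s, ne, n=None):
    """Diffusion approximation for the fixation probability of a new mutation.

    s  : selection coefficient (genic selection assumed)
    ne : effective population size (diploid)
    n  : census population size; assumed equal to ne when omitted
    """
    if n is None:
        n = ne
    p0 = 1.0 / (2 * n)  # initial frequency of a single new copy
    if s == 0:
        return p0  # neutral limit of the formula
    # expm1 keeps precision when the exponents are close to zero
    return math.expm1(-4 * ne * s * p0) / math.expm1(-4 * ne * s)


def relative_to_neutral(s, ne):
    """Fixation probability divided by that of a strictly neutral mutation."""
    return fixation_probability(s, ne) / fixation_probability(0.0, ne)


# The same small advantage behaves very differently across population sizes.
s = 1e-5
for ne in (1_000, 100_000, 10_000_000):
    print(ne, relative_to_neutral(s, ne))
```

In this sketch, a mutation whose ratio stays close to one is effectively neutral: its fate is governed by drift even though s is not zero. The same s yields a ratio far above one in a sufficiently large population, which is one way to picture why the magnitude of the drift barrier depends on population size.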
These considerations do not, however, imply that the neutralist hypothesis is valid only during the phase of stabilising selection, once a phenotypic trait, and therewith the underlying genetic architecture, has converged to its optimum (Fig. 6). In reality, and depending on the number of critical sites in a protein or regulatory sequence, it is theoretically very much possible that even during the adaptation process itself (i.e. the directional selection phase), most amino acid substitutions are neutral (Table 1).

III. IN WHICH MEASUREMENT UNIT IS THE NEUTRAL HYPOTHESIS DEFINED?
(1) Proportion of adaptive substitutions or adaptive polymorphisms?
The neutral hypothesis makes statements about two evolutionary processes: the maintenance of genetic variation within demes, and the genetic divergence of demes. Prior to the neutral theory, it was believed that these two processes were mainly regulated by different mechanisms, namely balancing selection and positive selection (Dietrich, 1994; Ohta, 2003). When viewed in the light of the neutral theory, these two processes reflect to a large extent two sides of the same coin: genetic drift of (nearly) neutral alleles. In the words of Kimura & Ohta (1971c, p. 469): 'Protein polymorphism and molecular evolution are not two separate phenomena, but merely two aspects of a single phenomenon caused by random frequency drift of neutral mutants in finite populations'.
Fig. 6. Evolutionary optimisation. (A) Directional selection on a certain trait (e.g. protein stability or binding specificity) is expected to enhance protein performance over time through the serial replacement of higher-fitness alleles. However, the search through the multi-dimensional sequence space is constrained by the availability of neighbouring alleles which are single-step mutations away. If all mutations have a fitness effect (panselectionism), the protein configuration may become trapped in a local optimum and remain suboptimal. Neutral mutations provide neutral pathways through the sequence space, providing an escape route out of local optima. Because small fitness increments are effectively neutral, a protein can only converge towards, yet never fully reach, the global optimum. The magnitude of this drift barrier depends on the population size. In theory, the fixation of slightly deleterious alleles (nearly neutral hypothesis) can in small populations initiate a mutational meltdown. Larger populations can avoid this through the fixation of back and/or compensatory mutations, resulting in an equal number of deleterious and advantageous substitutions during the stabilising selection phase. (B) The selective values of all alleles in sequence space (here arbitrarily depicted as being normally distributed) are relative to the allele currently present in the population (dashed line). Once the protein has converged to its optimum state (stabilising selection phase), all potential mutations are either deleterious or effectively neutral.
Table 1. Estimation of alpha. Simplified, hypothetical example of protein evolution, illustrating different methods to estimate the proportion of adaptive substitutions (α, in bold): a per-generation estimate, a cumulative estimate, and a relative estimate (here relative to start sequence). The protein consists of five amino acids, in which the letters S, N, D, and A refer to start, neutral, deleterious and adaptive base respectively. The colours grey, red and green refer to substitution events where the new base is neutral, deleterious or adaptive respectively, assuming a stable environment. Owing to the occurrence of multiple substitutions per site, the relative estimate will overestimate the cumulative proportion of adaptive substitutions. Once the protein has reached its optimal composition/configuration (asterisk), the evolution of the protein reaches a new phase of stabilising selection in which the number of deleterious substitutions at functionally important sites is expected to equal the number of adaptive back-substitutions. Meanwhile, functionally less important sites (site 1) will continue to accumulate neutral substitutions, thereby further decreasing the cumulative proportion of adaptive substitutions. Here, a substitution with a deleterious allele is counted as a neutral event.
The potential of the neutral theory to unite two seemingly unrelated processes (genetic divergence and genetic variability) into a single theoretical framework is one of its great appeals (Higgins, 2004). In the words of Maynard Smith (1970b, p. 231): 'A hypothesis which kills two birds with one stone […] is attractive'. However, it does mean that neutralists and selectionists are fighting a battle on two fronts, involving slightly different claims (Fig. 1). Selectionists stress the importance of positive selection when it comes to genetic divergence, but the importance of balancing selection when it comes to polymorphisms. Because positively selected mutations reach fixation rapidly, they will constitute a minor proportion of observed polymorphisms, even if they occur frequently. Furthermore, the rapid fixation of adaptive alleles causes reduction of genetic variation, which conflicts with the observed high levels of genetic diversity, the trigger of the neutralist-selectionist debate on polymorphism data (Hubby & Lewontin, 1966; Kimura & Ohta, 1971c; Dietrich, 1994). Mutations under balancing selection, on the other hand, can cause a strong increase in polymorphism levels (Kimura, 1960b; Dietrich, 1994; Roff, 1998).
Thus, the yardstick for the neutralist-selectionist debate on genetic divergence is the proportion of substitutions resulting from positive selection (known as alpha, α) (Booker et al., 2017), whereas the yardstick for the debate on polymorphism data is the proportion of polymorphisms resulting from balancing selection (here denoted as beta, β), not the proportion of polymorphisms that are under purifying or positive selection.
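The caption of Table 1 notes that an estimate of α made relative to the start sequence will overestimate the cumulative proportion of adaptive substitutions when sites are hit more than once. The toy sketch below illustrates that point under assumptions of our own: the event encoding, the function names and the substitution history are hypothetical and do not reproduce the data of Table 1.

```python
def cumulative_alpha(events):
    """Proportion of all substitution events in which the new state was adaptive."""
    if not events:
        raise ValueError("no substitution events")
    adaptive = sum(1 for _site, state in events if state == "A")
    return adaptive / len(events)


def relative_alpha(start, events):
    """Proportion of adaptive states among sites that differ from the start sequence."""
    current = list(start)
    for site, state in events:
        current[site] = state  # later substitutions overwrite earlier ones
    changed = [state for state in current if state != "S"]
    if not changed:
        raise ValueError("sequence is identical to the start sequence")
    return changed.count("A") / len(changed)


# Hypothetical history for a five-site protein: (site index, new state).
# States: S = start, N = neutral, A = adaptive.
start = "SSSSS"
events = [(0, "N"), (1, "A"), (0, "N"), (2, "A"), (0, "N"), (3, "N")]

print(cumulative_alpha(events))       # 2 adaptive events out of 6 events
print(relative_alpha(start, events))  # 2 adaptive sites out of 4 changed sites
```

Because the three neutral substitutions at the first site leave only one visible difference from the start sequence, the relative estimate exceeds the cumulative one in this example.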
(2) Does the neutral hypothesis refer to protein evolution or DNA evolution?
The neutral hypothesis was originally proposed to explain patterns in protein data, namely the rough constancy of amino acid substitutions inferred from protein sequence data sets, the great diversity of primary structure of homologous proteins across species, and the unexpectedly high levels of polymorphisms in protein gel electrophoresis data sets (Kimura, 1968a; King & Jukes, 1969; Maynard-Smith, 1970b). One could therefore argue that originally the hypothesis that most substitutions are neutral pertained to amino acid sequences (Gillespie, 1995). This agrees with early formulations of the neutral hypothesis in the literature, such as that 'most of the changes that have occurred in evolution at the level of protein sequences might be meaningless noise rather than adaptive change' (King, 1983, p. 25), that 'a majority of amino acid substitutions that occurred in these proteins are the result of random fixation of selectively neutral or nearly neutral mutations' (Kimura, 1969b, p. 1181), that 'a considerable proportion of amino acid substitutions in proteins are selectively neutral' (Maynard-Smith, 1970b, p. 231), and that 'random drift of neutral mutations in finite populations can account for observed protein polymorphisms' (Kimura & Ohta, 1971c, p. 467).
On the other hand, throughout the 1960s, and prior to the onset of the neutralist-selectionist debate, Kimura repeatedly discussed genetic diversity at the level of nucleotide sequences (Kimura & Crow, 1964; Kimura, 1968b). He continued this line of reasoning in his papers on the neutral theory, including his landmark paper on Haldane's dilemma (Kimura, 1968a), where he attempted to convert amino acid substitution rates into genome-wide nucleotide substitution rates. In a second paper released in the same year, synonymous mutations played a central role in his argument that many mutations must be neutral (Kimura, 1968b), and they would continue to do so (Kimura, 1977, 1980, 1991a). For instance, Kimura (1991b, p. 374) recalled: 'The first truly favourable evidence for the neutral theory was the finding that synonymous base substitutions, which do not cause amino acid changes, occur almost always at much higher rates than non-synonymous, that is, amino-acid altering substitutions'.
Thus, the neutral theory of molecular evolution can be considered an umbrella term for two parallel theories: the neutral theory of DNA evolution and the neutral theory of protein evolution (Crow, 1972a). The neutral hypothesis of DNA evolution holds that most nucleotide substitutions are fixed by random drift, whereas the neutral hypothesis of protein evolution holds that most amino acid substitutions are fixed by random drift (Fig. 5). The neutral hypothesis of DNA evolution can be further subdivided into the neutral hypothesis of genomic DNA evolution and the neutral hypothesis of functional DNA evolution, previously referred to as the 'weak' and 'strong' version, respectively (Hahn, 2018). The former holds that most nucleotide substitutions across the entire genome are fixed by random drift. The latter holds that this claim remains true even when only considering coding and regulatory regions (Fig. 5).
Inevitably, this ambiguous nature of the neutral theory has clouded the neutralist-selectionist debate. This confusion surfaces mainly during discussions of silent sites and non-coding DNA. Substitutions and polymorphism in these regions are relevant to the neutral hypothesis of DNA evolution, but irrelevant when estimating the proportion of adaptive amino acid substitutions (Crow, 1972a). It may be argued that by steering the discussion to the high substitution rates in silent sites and non-coding regions (Kimura, 1991a; Ohta & Gillespie, 1996), neutralists unintentionally committed a red herring fallacy. Hey (1999, p. 36) reflected: 'The finding that nonfunctional sequences evolved fastest and harboured lots of DNA sequence variation was definitely a neutralist prediction, and books written by neutralists in the middle 1980s proclaimed victory to varying degrees. To a neutralist from this time […], it would not have seemed fair to label these findings a distraction from the debate, but that is what they seemed to selectionists. Even if one did accept that junk DNA could have neutral mutations, nonfunctional mutations in nonfunctional DNA did not bear on the original questions about natural selection on protein variation'.
The recent debate between Kern & Hahn (2018) and Jensen et al. (2019) illustrates that this ambiguity continues to cause apparent disagreements today. Kern & Hahn (2018, p. 1367) write: 'Application of the MK test to data from protein-coding genes has revealed a predominant role for adaptive natural selection'. In reply, Jensen et al. (2019, p. 112) accuse Kern & Hahn (2018) of circular reasoning because the 'inferred frequencies of adaptive substitutions mostly concern only the small fraction of the genome that codes for proteins'. This disagreement can be resolved by identifying whether arguments are meant to evaluate the neutral hypothesis of protein evolution, the neutral hypothesis of genomic DNA evolution, or the neutral hypothesis of functional DNA evolution.
Biological Reviews 99 (2024) 23-55 © 2023 The Authors
In this context it is worth noting that the founders of the neutral theory, despite often considering the entire genome, did not believe that the neutral hypothesis stands or falls by the proportion of mutations in non-functional DNA (Kimura, 1991a). Although King & Jukes (1969) did discuss the presence of non-coding DNA, they did so merely as a correction to Kimura's 'cost of selection' calculations (Okazaki et al., 2021). In a subsequent study, Ohta & Kimura (1971c, p. 24) wrote: 'We are quite sure that the adaptive gene substitutions constitute a small fraction of the total substitutions. This is true even if we restrict our consideration to informational part of the DNA. Since nucleotide substitutions in the non-informational part must be largely non-adaptive, the fraction of adaptive substitutions becomes still smaller if we consider the total DNA'. And to quote Kimura (1991b, p. 370): 'If the neutral theory is valid, a large fraction of evolutionary nucleotide substitutions occurring at functionally important parts of the genome are also selectively neutral […]. Thus, neutral evolution is by no means restricted to "junk" part of the genome'.
The ongoing debate mostly concerns the original question, the verity of the neutral hypothesis of protein evolution. In the words of Crow (1997, p. 262), who used a slightly different terminology: 'The neutral theory can be thought of in weak and strong forms. The weak form asserts that most non-coding and nonregulatory DNA evolves by random drift. The strong form asserts that amino acid changes are also dominated by random drift. I think it is fair to say that the weak form is widely accepted. […] As for the strong form, the jury is out'.
Even so, neither of the two neutral hypotheses of DNA evolution is undisputed. The majority of substitutions within genes may be synonymous, but codon usage bias suggests that at least some synonymous substitutions are adaptive. With regard to the neutral hypothesis of genomic DNA evolution, it is thought that mutational load puts an upper limit of 20-25% on the functional fraction of the human genome, challenging (along with other considerations) claims of the ENCODE project that this proportion could be as high as 80% (Dunham et al., 2012; Eddy, 2012; Graur et al., 2013; Rands et al., 2014; Graur, 2017). But for species with much higher reproductive capacities, mutational load is less of a limiting factor (Elliott & Gregory, 2015), and in many branches of the tree of life the proportion of functional DNA (protein-coding or regulatory) may outweigh the proportion of non-functional DNA (Halligan & Keightley, 2006; Land et al., 2015). For these organisms, the verity of the neutral hypothesis of genomic DNA evolution is not self-evident, and instead depends on the frequency of adaptive and non-adaptive substitutions within functional regions.
When Kimura (1968a) compared the empirical substitution rate to Haldane's theoretical upper limit, he expressed both estimates in nucleotide base pairs per year. More specifically, he calculated that 'one nucleotide pair has been substituted in the population roughly every 2 year' (Kimura, 1968a, p. 625). Based on similar considerations, Ohta & Kimura (1971c, p. 18) provided what may be regarded a quantitative formulation of the neutral hypothesis, by stating that 'approximately 10% of the amino acid substitutions of average cistrons [i.e. genes] might be genetics'. Ohta made a similar argument with regard to regulatory elements (Ohta, 2002, 2003).
While α estimates are still most reliably obtained from data on single-base substitutions, the most comprehensive picture of the relative contributions of neutral and selective forces on genetic variation is obtained when considering all mutation types (replacements, insertions, deletions, inversions, translocations), ranging in size from single bases to entire chromosomes.

IV. WHEN SHOULD A SUBSTITUTION OR POLYMORPHISM BE CONSIDERED NEUTRAL?
(1) Asking the wrong question: do truly neutral mutations exist?
The neutral hypothesis states that most polymorphisms and substitutions are neutral, but what exactly is meant by 'neutral'? Completely neutral mutations do not alter protein structure, function or regulation in any way and hence do not have a fitness effect (Perutz, 1984; Bowie et al., 1990). Nearly neutral mutations, on the other hand, may affect protein structure and function, but with negligible fitness effects. In the words of Kimura (1991b, p. 370): 'By selectively neutral I mean selectively equivalent: namely, mutant forms can do the job equally well in terms of survival and reproduction of the animals possessing them'.
The co-existence of two definitions of selective neutrality [i.e. completely neutral ($s = 0$) versus nearly neutral ($s \to 0$), with $s$ denoting the selection coefficient (Table 2A)], is yet another potential source of confusion in the neutralist-selectionist debate (Stoltzfus, 1999). For example, according to Hahn (2008, p. 256), the neutral theory claims that 'the vast majority of polymorphism within species and fixed differences between species have no effect on fitness – that is, there is no direct selection on them'. By contrast, Jensen et al. (2019, p. 111) write: 'The neutral theory of molecular evolution asserts that most de novo mutations […] are under such weak selection that they may become fixed as a result of genetic drift'. These statements suggest that Hahn (2008) refers to completely neutral mutations, whereas Jensen et al. (2019) refer to nearly neutral mutations.
The ambiguous interpretation of selective neutrality dates back to the foundation of the neutral theory. Kimura (1968a, p. 625) explicitly referred to 'neutral or nearly neutral mutations', and stressed that population-genetic theory predicts that the behaviour of these two mutation types is indiscernible (Kimura, 1968a,b, 1991a). By contrast, King & Jukes (1969) focused their discussion on 'truly neutral' mutations. While Kimura (1968b, p. 260) remarked that 'probably not all synonymous mutations are neutral, even if most of them are nearly so', King & Jukes (1969, p. 789) wrote instead: 'As far as is known, synonymous mutations are truly neutral with respect to natural selection'. They furthermore explicitly disagreed with Simpson (1964, p. 1537), who had stated: 'The consensus is that completely neutral genes or alleles must be very rare if they exist at all'.
Owing to the attempt of King & Jukes (1969) to identify truly neutral genetic variants, the neutralist-selectionist debate headed off in the wrong direction. Proponents and opponents got bogged down in discussions on whether silent and conservative mutations are truly neutral or not, and whether iso-alleles do exist. Critics pointed out, rightfully, that the selective neutrality of a mutation is not determined by its effect on protein structure and protein function only, but also by other factors such as translation efficiency, and that therefore it is premature to assume that silent and conservative mutations are truly neutral (Richmond, 1970; Clarke, 1970a).
These points were all valid, but the relevant question was, and is, not if and how many alleles have a fitness effect, but rather how many alleles have a fitness effect large enough for selection to overcome the stochastic factors controlling the fate of alleles (i.e. genetic drift) (Crow, 1972a). Theory predicts that this depends on the effective population size ($N_e$), which is why population geneticists assume that 'neutrality depends not only on $s$ but also on $N_e$' (Kimura, 1968b, p. 262). In the words of Kreitman (1996, p. 680): 'It is not so much a question of whether strictly neutral mutations exist – probably no mutation is absolutely neutral – but whether the strength of selection for or against that mutation is much smaller or greater than the strength of genetic drift'. The importance of $N_e$ cuts both ways. As stressed by Lanks & Kitchin (1970, p. 754), in large populations even the smallest selection coefficients can make a difference: 'An arbitrarily small selective advantage or disadvantage may be undetectable in a single individual but quite apparent in a sufficiently large population. […] It is certainly not valid to maintain that a mutated protein is equivalent to the wildtype on the basis of such gross evaluation as enzyme activity assays or the absence of clinical symptoms'. But the opposite is equally true: in small populations strong fitness effects can be overruled by genetic drift. In the words of Kimura & Ohta (1974, p. 2851): 'Difference in function at the molecular level does not necessarily lead to effective natural selection at the levels of individuals within a population'.
Confusingly, in spite of explicit statements that neutrality depends on $s$ and $N_e$ (Kimura, 1968b), and in spite of explicit references to 'neutral and nearly neutral mutations' (Kimura, 1968a, p. 625), neutral theory predictions have been derived assuming neutral mutations only (see Section VII). This inconsistency has been corrected in the nearly neutral theory, but persists to date in the neutral theory. In the words of Kimura (1991b, p. 377): 'The neutral theory assumes that the mutations can be classified into two distinct groups, namely, the completely neutral class (with the fraction $f_0$), and the definitely deleterious class (fraction $1 - f_0$)'. Kreitman (1996) used this criterion to distinguish between the 'completely neutral theory' (i.e. original and revised neutral hypothesis) and the 'slightly deleterious model' (i.e. nearly neutral hypothesis), not to be confused with the distinction between the neutral hypothesis and strict neutrality (Fig. 1). He wrote (p. 678): 'The issue is whether the neutral theory applies to those models in which there is absolutely no selection (completely neutral mutations only), or to models in which genetic drift dominates the fate of a mutation (effectively neutral mutations). The distinction is not trivial, as this slight semantic change in the theory has been shown to have severe consequences on the predicted patterns of genetic variation and the substitution rate'.
Traditionally, the proportion of adaptive substitutions and polymorphisms has been calculated without considering genetic hitchhiking (Ohta & Kimura, 1971a; Crow, 1972b). However, Kern & Hahn (2018, p. 1367) ask the question: 'How much of the genome is directly or indirectly influenced by adaptive natural selection?'. The phrase 'directly or indirectly' suggests that in their opinion hitchhiking alleles should be factored into the equation. This is also how the following statement of Wagner (2008, p. 965) could be interpreted: 'According to selectionism, beneficial mutations are abundant: most mutations that go to fixation in a population would be beneficial, or are at least linked to abundantly occurring beneficial mutations'. Clearly, including hitchhiking sites in the estimate of α will lead to very different conclusions, especially regarding the validity of the neutral hypothesis of genomic DNA evolution.
Although the question posed by Kern & Hahn (2018) is indeed relevant for other theoretical questions and practical applications, such as demographic inference (Gillespie, 2001; Pouyet et al., 2018; Harris, 2018; Johri et al., 2021; Buffalo, 2021), we argue that when evaluating the neutral hypothesis, the proportion of adaptive substitutions should be evaluated considering direct selection only. This is historically more correct, since the statement that most polymorphisms and substitutions are selectively neutral has traditionally always been about direct selection only. It is also theoretically more correct, because the effects of linked selection (also known as 'genetic draft') mimic the effects of genetic drift: linked selection decreases the fixation probabilities of adaptive alleles, while increasing the fixation probabilities of deleterious alleles (Hill & Robertson, 1966; Birky & Walsh, 1988; Gillespie, 2001; Jensen et al., 2019). Thus, classifying hitchhiking alleles as alleles under positive selection would produce a biased, inflated estimate of α.
It has been estimated that up to 85% of the human genome is affected by direct or linked natural selection (Pouyet et al., 2018). Harris (2018, p. 2) commented: 'Superficially, this might seem like a death knell for the neutral theory, but it is nothing of the kind'. We agree. The implications of such a value for the neutral hypothesis depend on two factors: (i) what number or proportion of amino acid (or nucleotide) substitutions under direct selection is responsible for the indirect genome-wide effects; and (ii) how many of these substitutions result from positive selection (causing genetic hitchhiking) and how many from negative selection (causing background selection). As long as these two factors are unknown, the implications for the neutral hypothesis cannot be determined.
(3) Delimiting nearly neutral mutations
Neutralists consider mutations neutral if their 'selection coefficients are so small that their behaviour may not be very different from the strictly neutral mutants' (Ohta & Kimura, 1971c, p. 21). This raises the question: is there an objective threshold? Or, in the words of Crow (1972b, p. 307): 'How similar in fitness must two genes be to be regarded as selectively equivalent?' Kimura (1968b, p. 262) posited: 'A mutant gene may be called almost neutral if $|2N_e s|$ is much smaller than unity'. As pointed out by Kreitman (1996), Kimura did not apply a consistent definition, using the phrases 'much smaller' and 'less than', as well as the symbols '≪' and '<', interchangeably. Ohta (1976, p. 258) used a different threshold, writing: 'Generally, when the population size is small, very slightly deleterious alleles become effectively neutral ($N_e s \ll 1$) and the strict neutral theory is practically valid, whereas when the population size gets large such that $N_e s \gg 1$, these alleles are selected against'. Ohta (2002, p. 16134) used another threshold still: 'The nearly neutral mutations are defined such that their fate in the population depends on both selection and drift. Thus, the absolute value of the product, $N_e s$, should be small, e.g. not larger than 2'. In yet another study, Kimura & Ohta (1971b) assumed that mutations are selectively nearly neutral when their absolute selection coefficient is below $1/(4N_e)$ (Kimura & Ohta, 1971a; Ohta & Kimura, 1971c; Hartl et al., 1985; Wagner, 2008).
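To make the inconsistency between these thresholds concrete, the following minimal Python sketch applies four of the quoted criteria to the same mutation. It is an illustration under stated assumptions, not a method from the cited studies: the function name `classify` and the dictionary labels are hypothetical, every 'much smaller than' (≪) is read as a plain '<', and the criterion of Ohta (1976), stated for deleterious alleles, is applied to the absolute value of the product.

```python
# Four published near-neutrality criteria, each expressed as a rule on
# the effective population size (ne) and the selection coefficient (s).
# Assumption: every "much smaller than" is treated as a strict "<".
THRESHOLDS = {
    "Kimura 1968b: |2*Ne*s| < 1": lambda ne, s: abs(2 * ne * s) < 1,
    "Ohta 1976: |Ne*s| < 1": lambda ne, s: abs(ne * s) < 1,
    "Ohta 2002: |Ne*s| <= 2": lambda ne, s: abs(ne * s) <= 2,
    "Kimura & Ohta 1971b: |s| < 1/(4*Ne)": lambda ne, s: abs(s) < 1 / (4 * ne),
}


def classify(ne, s):
    """Return, for each criterion, whether (ne, s) counts as nearly neutral."""
    return {label: rule(ne, s) for label, rule in THRESHOLDS.items()}


if __name__ == "__main__":
    # Two hypothetical mutations in a population with Ne = 1000.
    for s in (-0.0004, 0.0015):
        print(f"s = {s}")
        for label, verdict in classify(1000, s).items():
            print(f"  {label}: {verdict}")
```

With these illustrative values, the same mutation is classified as nearly neutral under some criteria and as selected under others, which is the inconsistency at issue.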

Apart from their inconsistency, all of these proposed ranges for delimiting nearly neutral mutations share two problems. First, the set limits are arbitrary and in need of justification. In previous studies, Kimura derived mathematically that the fixation probability ($u$) of a novel mutation in a diploid population is given by the formula $u = (1 - e^{-2s})/(1 - e^{-4N_e s})$, which in the case of a small selection coefficient ($s$) simplifies to $u = 2s/(1 - e^{-4N_e s})$ (Kimura, 1957, 1962). Kimura (1962, p. 716) commented: 'For a positive $s$ and very large $N_e$, we obtain the known result that the probability of ultimate survival of an advantageous mutant gene is approximately twice the selection coefficient. On the other hand, if we let $s \to 0$, we obtain $u = 1/(2N_e)$, the result known for a neutral gene'. However, it remains arbitrary at what point fixation probabilities differ sufficiently from those of neutral mutations to no longer be considered nearly neutral. Kimura (1968a, p. 625) stated: 'In the special case of $|2Ns| \ll 1$ […], the probability of fixation […] is roughly equal to its initial frequency'. But without an objective criterion, the same could be said to be true for any of the other suggested thresholds.
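The two limits quoted from Kimura (1962) can be checked numerically. The sketch below is a minimal illustration that implements the fixation probability exactly as written above; the function name `fixation_probability`, the overflow guard and the chosen parameter values are our own assumptions.

```python
import math


def fixation_probability(s, n_e):
    """Fixation probability u = (1 - e^(-2s)) / (1 - e^(-4*Ne*s)) of a new mutant."""
    if s == 0.0:
        return 1.0 / (2.0 * n_e)  # neutral limit, u = 1/(2*Ne)
    x = -4.0 * n_e * s
    if x > 700.0:
        # Strongly deleterious case: e^x would overflow a float, and u is
        # vanishingly small, so it is reported as zero (a pragmatic guard).
        return 0.0
    # expm1(y) = e^y - 1 stays accurate for tiny y; the two minus signs cancel.
    return math.expm1(-2.0 * s) / math.expm1(x)


if __name__ == "__main__":
    n_e = 1000
    neutral = fixation_probability(0.0, n_e)
    # Express u relative to the neutral expectation for a range of Ne*s values.
    for ne_s in (-2.0, -1.0, -0.25, 0.0, 0.25, 1.0, 2.0):
        u = fixation_probability(ne_s / n_e, n_e)
        print(f"Ne*s = {ne_s:+.2f}  u / u_neutral = {u / neutral:.3f}")
```

Printing the ratio to the neutral expectation shows a smooth change across the range of $N_e s$ values rather than a step, which is consistent with the point that any cut-off on this scale is a matter of convention.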
Second, one may ask how biologically meaningful these suggested narrow ranges are. Theory predicts that the average time to fixation for a nearly neutral mutation is close to $4N_e$ generations (Kimura & Ohta, 1969). Given the spatiotemporal variation of selective pressures (Wallace, 1975; Reznick, 2016), fluctuations of the selection coefficient during these prolonged periods are likely to be of more importance to the fate of the allele than the negligible mean value itself (Ohta & Kimura, 1971c; Ohta, 1972b; Nei, 2005b). Fisher (1931, p. 220) commented: 'The neutral zone of selective advantage in the neighbourhood of zero is so narrow that changes in the environment, and in the genetic constitution of species, must cause this zone to be crossed and perhaps recrossed relatively rapidly in the course of evolutionary change, so that many possible gene substitutions may have a fluctuating history of advance and regression before the final balance of selective advantage is determined'. For this reason, near-neutrality is better defined as $N_e \sigma < 1$ (instead of $|N_e s| < 1$), in which $\sigma$ denotes the standard deviation of the selection coefficient (Ohta, 1972b; Ohta & Tachida, 1990).
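The difference between the two definitions can be illustrated with a toy example. The sketch below is hypothetical throughout: the function names, the alternating series of selection coefficients and the use of the population (rather than sample) standard deviation are assumptions made for illustration only.

```python
import statistics


def near_neutral_by_mean(ne, s_values):
    """Criterion |Ne * mean(s)| < 1, based on the mean selection coefficient."""
    return abs(ne * statistics.fmean(s_values)) < 1


def near_neutral_by_sd(ne, s_values):
    """Criterion Ne * sigma < 1, with sigma the standard deviation of s."""
    # Assumption: the population standard deviation (pstdev) is used.
    return ne * statistics.pstdev(s_values) < 1


if __name__ == "__main__":
    # Hypothetical allele whose selection coefficient flips sign every
    # generation: the mean is zero, but the fluctuations are large.
    s_values = [0.01, -0.01] * 50
    print("by mean:", near_neutral_by_mean(1000, s_values))
    print("by sd:  ", near_neutral_by_sd(1000, s_values))
```

In this toy case the mean-based criterion treats the allele as nearly neutral, whereas the criterion based on the standard deviation does not, because it registers the fluctuations that the mean conceals.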
Defining the limits of near-neutrality is not a mere semantic or technical issue. As we will discuss next, the implications of the neutral theory for phenotypic evolution depend critically on the selective values (i.e. functional consequences) of the molecular changes assumed to be governed by random genetic drift.

V. WHAT ARE THE NEUTRALITY PREDICTIONS FOR PHENOTYPIC EVOLUTION?
(1) The nature of neutral mutations
The neutral hypothesis raises a fundamental question for evolutionary biologists: if most differences between species at the molecular level result from genetic drift, then what are the implications for evolution at the phenotypic level? Does it mean that most observed phenotypic differences within and between species are, likewise, neutral?
The neutral theory of molecular evolution applies primarily to the genotype (including protein sequences) while the Darwinian theory of evolution by natural selection applies primarily to the phenotype (Table 2B). To assess whether the neutral theory and Darwinian theory are compatible or in conflict, their main propositions need to be translated into the same unit of measurement. That is, either Darwinian theory must be expressed in terms of genotypic change, or alternatively the neutral theory must be expressed in terms of phenotypic change. For the latter approach, one first needs to know the exact nature (or functional importance) of the molecular changes which neutralists consider to be neutral. King & Jukes (1969) argued that protein-coding DNA contains two types of substitutions which do not affect higher protein structure and function and hence can be considered neutral: synonymous (also known as silent) and conservative substitutions. They wrote (p. 797) that 'most proteins contain regions where substitutions of many amino acids can be made, without producing appreciable changes in protein function'. In line with this reasoning, Kimura (1991b, p. 380) wrote: 'A replacement of an amino acid within a protein often preserves the activity of that protein unaltered'. Jukes (1991, p. 477) similarly remarked: 'Enzymes have a small active center containing a few key amino acids, and the rest of the molecule supplies bulk, size and other general properties that are more or less nonspecific. This larger region can accommodate many changes without losing its structure and function'.
But do silent and conservative substitutions represent all that is at stake in the neutralist-selectionist debate? If the neutral hypothesis is taken to mean that many substitutions within protein-coding DNA are silent and many others conservative, or that certain nucleotides within regulatory sequences can be replaced without affecting protein regulation, one might understand the opinion of one of the reviewers of the manuscript of King & Jukes (1969), according to which the 'idea was obviously true and therefore trivial' (King, 1983, p. 25). No biologist, selectionist or neutralist alike, will disagree with the statement that synonymous sites accumulate more substitutions than non-synonymous sites, or that alteration of amino acid sequence does not necessarily imply a change of protein function.
If neutrality is meant to refer not only to completely neutral mutations but also to effectively neutral mutations, it is evident that the neutral hypothesis is not just about the prevalence of silent and conservative polymorphism and substitutions. It is also about the prevalence of non-conservative (i.e. radical) polymorphisms and substitutions, that is, changes which do affect protein function and/or regulation (Stoltzfus, 1999). This, in turn, has implications for phenotypic evolution, both for cladogenesis (discussed in this section) and for anagenesis (see Section VI).
between species are caused by a minority of adaptive substitutions (Ayala, 1974; Perutz, 1984; Komiyama et al., 1995; Nei, 2005b; Zhang, 2018). This decoupling of genomic and phenotypic differences was emphasised by King & Jukes (1969). From their introductory remarks it is evident that although they argued that most nucleotide substitutions are neutral, they believed that most species differences at the phenotypic level are adaptive. They wrote (p. 788): 'Evolutionary change at the morphological, functional and behavioural levels results from the process of natural selection, operating through adaptive changes in DNA. It does not necessarily follow that all or most evolutionary change in DNA is due to the action of Darwinian natural selection. There appears to be considerable latitude at the molecular level for random genetic changes that have no effect upon the fitness of the organism'. Kimura (1976, p. 249) also emphasised this decoupling between genomic and phenotypic evolution: 'When we come to the phenotypic level far removed from the molecular constitution of genes, I have little doubt that Darwin's principle of natural selection will prevail in the evolutionary changes of form and function. What I am claiming is that, deep down at the level of the internal structure of genetic material, there is a great deal of evolutionary change propelled by random genetic drift'.
The belief that molecular evolution is mostly neutral whereas phenotypic evolution is mostly adaptive rests on the assumption that the neutral hypothesis refers solely, or at least predominantly, to the occurrence of non-coding, silent and conservative substitutions. Protein sequences and DNA sequences contain a great deal of redundant information, which means that many nucleotides and even amino acids can be replaced by random genetic drift without affecting protein function or regulation, and hence without substantially affecting phenotype (Perutz, 1984; Bowie et al., 1990; Komiyama et al., 1995; Bloom & Arnold, 2009).
If we accept this view, the neutral theory of molecular evolution is complementary, or 'supplementary' (Kimura, 1976), to Darwin's concept of ecological speciation. Darwinian theory describes evolutionary change at the phenotypic level, whereas the neutral theory of molecular evolution describes meaningless evolutionary change ('noise') at the molecular level (Ayala, 1974). In the words of Kimura (1979b, p. 126): 'Darwinian, or positive, selection cares little how […] phenotypes are determined by genotypes. […] Even if Darwin's principle of natural selection prevails in determining evolution at the phenotypic level, down at the level of the internal structure of the genetic material a great deal of evolutionary change is propelled by random drift'. Kimura (1983, p. ix) also supported the complementary perspective, and wrote: 'The neutral theory is not antagonistic to the cherished view that evolution of form and function is guided by Darwinian selection, but it brings out another facet of the evolutionary process by emphasizing the much greater role of mutation pressure and random drift at the molecular level'.
(3) Neutral phenotypic divergence
An alternative interpretation is that the neutral theory of molecular evolution opposes the concept of ecological divergence, at least to a certain extent, namely by implying the existence of phenotypic differences between species that result from random genetic drift instead of positive selection. This interpretation posits that the random fixation of non-conservative substitutions has far-reaching consequences for phenotypic divergence. Once we admit that genetic drift can cause differential fixation or loss of alleles that alter protein function and regulation, and thus cause functional changes, there is no way back from admitting that disconnected populations can diverge without adaptation, through the accumulation of non-conservative substitutions resulting from random genetic drift. Thus, we are left with having to accept that at least part of the phenotypic differences observed between species can result from genetic drift rather than from natural selection (Lande, 1976; Lynch & Hill, 1986; Stoltzfus, 1999). If we take a final additional step, by assuming that the random fixation of non-conservative mutations in disconnected populations can cause reproductive barriers, then the neutral theory can even be regarded as advocating the reality of neutral allopatric speciation, as an alternative to ecological speciation (Hedges et al., 2015).
Kimura contemplated this possibility in his last papers on the subject. Kimura (1991a, p. 5972) asked: 'How can we understand evolution at two levels – that is, molecular and phenotypic – in a unified way? It is generally believed that, in contrast to the neutralist view of molecular evolution, evolutionary changes at the phenotypic level are almost exclusively adaptive and caused by Darwinian positive selection. However, I think that even at the phenotypic level, there must be many changes that are so nearly neutral that random drift plays a significant role, particularly with respect to quantitative characters'. Kimura (1991a, p. 5972) concluded: 'If the neutral theory is valid so that a great majority of evolutionary changes at the molecular level are controlled by random genetic drift under continued input of mutations, it is likely that selectively neutral changes have played an important role […] in phenotypic evolution'.
Based on the above discussion, we propose here that a distinction can be made between two types of the neutral hypothesis: a narrow and an extended version (Fig. 5). The narrow neutral hypothesis holds that even though most substitutions within proteins, genes and regulatory factors result from random genetic drift, these are solely or predominantly silent or conservative substitutions that do not affect protein function and hence do not affect the phenotype (and have no fitness consequences either). The extended neutral hypothesis, by contrast, which may be regarded a 'generalisation of the neutral theory to the phenotypic level' (Lynch & Hill, 1986, p. 915), holds that genetic drift can cause fixation of nearly neutral, non-conservative mutations (which do affect protein function and regulation and hence do affect phenotype), implying that accumulation of these types of substitutions can cause between-species phenotypic differences.
the time to most recent common ancestor (Khaitovich et al., 2004). In terms of variability of divergence rates across traits, the neutral hypothesis predicts lower phenotypic divergence for functionally important (i.e. conserved) traits compared to less important traits (Ho, Ohya & Zhang, 2017). If, instead, phenotypic divergence is caused solely by a few (large-effect) selected mutations (as posited by the narrow neutral hypothesis) (Nei, 2007), we expect to find no correlation between levels of molecular and phenotypic divergence (Kimura & Ohta, 1971a).
Zuckerkandl & Pauling (1965, p. 148) leaned towards the last view, stating: 'There is no reason to expect that the extent of functional change in a polypeptide chain is proportional to the number of amino acid substitutions in the chain. Many such substitutions may lead to relatively little functional change, whereas at other times the replacement of one single amino acid residue by another may lead to a radical functional change'. Kimura (1969b, p. 1187) similarly commented: 'If amino acid changes are often due to chance, then these should be established as frequently in evolutionary conservative species as in those that undergo rapid changes in morphology. […] It would support the hypothesis of this paper if haemoglobins and other proteins show the same rate of amino acid substitution in […] living fossils as in rapidly evolving species'.
Under strict neutrality conditions (i.e. when the genetic variance underlying a phenotypic quantitative trait is determined solely by neutral mutations and genetic drift), levels of phenotypic variation within and between lineages are predictable, conditional on a hypothetical distribution of fitness effects (DFE) (Lynch & Hill, 1986; Lynch, 1988; Hodgins-Davis, Rice & Townsend, 2015). The actual variation observed both within and between lineages, as for instance measured in terms of gene expression levels, is typically found to be smaller than predicted by the strictly neutral model of phenotypic evolution (Lemos et al., 2005; Denver et al., 2005; Hodgins-Davis et al., 2015). As discussed in Section II, such violations of strict neutrality predictions may result from purifying selection, and hence do not necessarily refute the neutral hypothesis.

VI. WHAT ARE THE NEUTRALITY PREDICTIONS FOR FITNESS LEVELS?
(1) Neutral mutations may facilitate adaptation
Despite the designation 'non-Darwinian evolution' (King & Jukes, 1969), the founders of the neutral theory did not deny the importance of Darwinian selection for evolutionary innovations. In the words of Kimura & Ohta (1974, p. 2851): 'There is not a slightest doubt that the marvellous adaptations of all the living forms to their environment have been brought about by positive Darwinian selection'. And in the words of Lewontin (1974, p. 199): 'The neoclassical theory [i.e. the neutral theory] cannot be disposed of by pointing to the elephant's trunk and the camel's hump. The theory does not deny adaptive evolution but only that the vast quantity of molecular variation within populations and, consequently, much of the molecular evolution among species, has anything to do with that adaptive process'.
For this reason, it has been argued that the neutral hypothesis may just as well be ignored by evolutionary biologists interested in phenotypic evolution. In the words of Crow (1972a, p. 2): 'A biologist may well say that if these changes are so nearly neutral as to be governed by chance […] they are not really of much interest. He is more interested in processes that affect the organism's ability to survive and reproduce, and which have brought about such exquisite adaptations to diverse environments'.
However, even though neutral mutations do not affect fitness directly, this does not mean they cannot affect the adaptation process indirectly. Theoretical considerations suggest that random mutations may indirectly facilitate, or even enable, the adaptation process. This relevance of neutral mutations is generally accepted in the context of environmental change. According to this 'adaptive potential' hypothesis, the genetic variation accumulated by the random drift of neutral mutations may become adaptive following an environmental change, allowing populations to react promptly to new environmental demands (Haldane, 1957; Kimura, 1960a; Teixeira & Huber, 2021).
Moreover, the indirect importance of neutral mutations for the adaptation process is not limited to environmental change scenarios alone; it also applies to populations occurring in a stable environment, without selection coefficient fluctuations. The reasoning is that in the absence of neutral changes (i.e. if all mutations have a certain fitness effect) populations are likely to become trapped in local optima (Fig. 6). Neutral mutations, on the other hand, allow populations to wander the fitness landscape freely (Wright, 1932; Crow, 1972b; Kauffman & Levin, 1987; Gavrilets, 1997). As an analogy, the countless conceivable series of neutral mutations can be thought of as a dense network of raised mutational pathways crisscrossing the fitness landscape and connecting high-fitness regions (Huynen, 1996; van Nimwegen & Crutchfield, 2000; Poelwijk et al., 2007; Lenormand, Roze & Rousset, 2009). In the words of Sewall Wright: 'Changes in wholly non-functional parts of the molecule would be the most frequent ones but would be unimportant, unless they occasionally give a basis for later changes which improve function in the species in question which would then become established by selection' (Huynen, 1996, p. 165).
This conceptual idea of (near-)neutral substitutions facilitating the evolutionary process is supported by protein network modelling and protein engineering studies. These studies suggest that evolution by means of natural selection may be severely impeded if most amino acid substitutions have a strong phenotypic effect, because this would prohibit an explorative 'random walk' through the hyperdimensional space of all possible protein sequences (Maynard-Smith, 1970a; Gillespie, 1984a; Lipman & Wilbur, 1991; Wagner, 2008; Bloom & Arnold, 2009). They also suggest that neutrality dictates the pace and even the direction of evolution, as the search through genotype space is determined by the available neutral paths (i.e. series of single-step mutations involving amino acids with similar physicochemical properties) (van Nimwegen & Crutchfield, 2000). A random walk may accidentally hit upon an entry into an adaptive pathway, the evolutionary starting point for the next adaptation event (Bloom & Arnold, 2009).
Others have gone even a step further, by arguing that complexity can also arise solely through random fixation of neutral mutations, without Darwinian selection (Stoltzfus, 1999; Muñoz-Gómez et al., 2021). This hypothesis, known as 'constructive neutral evolution', posits that 'a novel attribute appears initially as an excess capacity and later becomes a contributor to fitness, due to a neutral change at some other locus that creates a dependency on it' (Stoltzfus, 1999, p. 176).
Kimura (1979b) initially argued that the scientific value of his theory should not be weighed by the biological relevance of neutral mutations. He wrote (p. 126): 'People have told me, directly and indirectly, that the neutral theory is not important biologically because neutral genes are not involved in adaptation. My own view is that what is important is to find the truth, and that if the neutral theory is a valid investigative hypothesis, then to establish the theory, test it against the data and defend it is a worthwhile scientific enterprise'. Later, however, Kimura (1991b) argued that the neutral variation accumulated through random genetic drift may provide the raw material needed for adaptation following an environmental change. Kimura (1991b, p. 383) stated: 'No one would be able to say then, that neutral changes are by definition not concerned with adaptation, and that therefore the neutral theory is biologically not very important'.
The random fixation of neutral mutations allows populations to traverse between high-fitness peaks in a cost-free way. The fixation of slightly deleterious mutations may also reduce the possibility of remaining trapped in local optima, but this is a riskier strategy. Before accidentally reaching the base of a higher adaptive peak, populations first have to descend into a fitness valley. In the words of Lande (1976, p. 320): 'Genetic drift can thus be thought of as a process of random exploration of the adaptive zones in a temporarily maladaptive way, on the chance that a new phenotype may be found which will be better adapted'.
In theory, a population risks never finding the way up again and entering a 'slippery slope'. Continuous accumulation of slightly deleterious mutations could draw populations into a vicious circle known as the 'extinction vortex' or 'mutational meltdown'. A positive feedback loop between population size and the probability of fixation of deleterious mutations could see the population heading for extinction (Lynch, Conery & Burger, 1995). In the words of Kimura & Ohta (1974, p. 2851): 'Accumulation of very slightly deleterious mutations by random drift is essentially equivalent to the deterioration of environment, and definitely adaptive gene substitutions must occur from time to time to save the species from extinction'.
In reality, empirical evidence for mutational meltdowns in sexually reproducing populations is sparse (Whitlock, Griswold & Peters, 2003; Teixeira & Huber, 2021). This may simply indicate that mutational meltdowns are usually averted by compensatory mutations (Lande, 1998; Poon & Otto, 2000; Whitlock et al., 2003), but some authors regard this discrepancy as potential evidence against the nearly neutral hypothesis. For instance, Crow (1997, p. 262) commented: 'An unappealing aspect of the nearly neutral theory is its implication that slightly deleterious mutations accumulate with a consequent (very slow) deterioration of the population'. Nei (2005b, p. 2325) agreed: 'A problem with Ohta's theory is that if deleterious mutations accumulate in a gene, the gene gradually deteriorates and eventually loses its function. If this event occurs in many important genes, the population or species would become extinct'.
(3) Genetic load is not an optimality measure
Fitness differences are typically quantified in terms of the genetic load, which denotes the mean difference in fitness in a population relative to the most fit genotype present at a given time (Muller, 1950; Kimura, 1960a; Brues, 1964, 1969; Graur, 2017; Bertorelle et al., 2022).
The genetic load consists of several components (Brues, 1969; Bertorelle et al., 2022), but its definition was originally formulated to measure one particular type of genetic load, the 'mutational load' (Muller, 1950). This load results from a mutation-purifying selection balance in which 'selection tends to eliminate alternative alleles but mutation restores them' (Lewontin & Hubby, 1966, p. 606). Because neutralists and selectionists both acknowledge that many mutations are deleterious, they agree that populations carry such a mutational load. However, they disagree about other components of the genetic load, in particular drift load, segregation load and substitutional load.
Considering all components, the predictions of neutralists imply a lower genetic load than the predictions of selectionists do. This seemingly contradictory outcome is due to the definition of genetic load, which, following the classical definition, does not measure the difference between the observed genotypes and the theoretical optimal genotype, but instead the difference between the observed genotypes and the fittest genotype present in the population (Brues, 1964). This definition leads to counterintuitive estimates in certain evolutionary contexts (Van Valen, 1963; Brues, 1964, 1969).
For instance, whenever an adaptive mutation occurs, a genetic load (i.e. 'substitutional load') is created, even though the new allele causes individuals in the population to be better adapted. The substitutional load peaks immediately after the mutation event and disappears only when the beneficial allele has spread throughout the entire population (Van Valen, 1963; Brues, 1964; Kimura, 1968a). In the words of Brues (1969, p. 1135): 'The genetic load involved in the substitutional situation is in fact an artifact […] if we adhere to the definition of fitness in terms of the optimum genotype. The appearance of a new advantageous gene in even the smallest numbers creates a new optimum genotype in relation to which the formerly optimum genotype is demoted, accused of contributing a large amount of load, and blamed for a loss of population fitness. Actually, the appearance and multiplication of a new and advantageous gene leads to an increase in population fitness'.
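The substitutional-load artefact can be illustrated with a small numerical sketch. It assumes the conventional formalisation of load as (w_max - w_mean) / w_max, a haploid population with two alleles, and hypothetical fitness values (1.00 for the resident allele, 1.05 for the new allele). The function names and all numbers are illustrative and are not taken from the cited sources.

```python
def mean_fitness(freqs, fitnesses):
    """Population mean fitness for genotype frequencies that sum to 1."""
    return sum(p * w for p, w in zip(freqs, fitnesses))


def genetic_load(freqs, fitnesses):
    """Load relative to the fittest genotype PRESENT: (w_max - w_mean) / w_max."""
    present = [w for p, w in zip(freqs, fitnesses) if p > 0]
    w_max = max(present)
    return (w_max - mean_fitness(freqs, fitnesses)) / w_max


# Hypothetical fitnesses: resident allele 1.00, new advantageous allele 1.05.
FITNESSES = (1.00, 1.05)

if __name__ == "__main__":
    # Follow the new allele from absence, through rarity, to fixation.
    for p_new in (0.0, 0.001, 0.5, 1.0):
        freqs = (1.0 - p_new, p_new)
        print(f"p_new={p_new:<6} "
              f"mean fitness={mean_fitness(freqs, FITNESSES):.5f} "
              f"load={genetic_load(freqs, FITNESSES):.5f}")
```

Under these assumptions the load jumps from zero to just under 5% as soon as the first copies of the new allele appear and then declines to zero at fixation, while mean fitness rises throughout.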

The above considerations explain why the selectionist's model, in which most polymorphisms and substitutions result from balancing and positive selection respectively, may imply a higher genetic load than the neutralist hypothesis. Indeed, one of the original arguments underlying the foundation of the neutral theory, the 'cost of selection' argument, was that the selectionist's model would predict an intolerably high genetic load and therefore cannot be true (Kimura & Crow, 1964; Hubby & Lewontin, 1966; Dietrich, 1994).
(4) Reasons underlying suboptimality
Genetic load quantifies the fitness differences between observed genotypes, but more relevant for the neutralist-selectionist debate is the question of how observed genotypes, and the phenotypic traits they underlie, compare to the theoretical optimum (i.e. maximised fitness) (Fig. 6). This theoretical optimum reflects a trade-off at which, given all abiotic and biotic environmental conditions, the total costs are minimised and, hence, the net fitness effect maximised. This trade-off is evident at various levels of biological organisation. For instance, the optimum limb morphology of an aquatic mammal reflects a trade-off between mobility in water and mobility on land, while the optimum sprint speed of a cheetah reflects a trade-off between hunting success rate and its ability to defend a catch against other predators. On the molecular level, the optimum protein configuration reflects a trade-off between protein functionality, stability, production and regulation, while the optimum codon usage represents a trade-off between factors such as translation efficiency, metabolic costs and messenger RNA (mRNA) stability (Hey, 1999; Akashi & Gojobori, 2002; Scott et al., 2012; Kessler & Dean, 2014).
The complexity of these trade-offs makes it difficult to determine whether observed trait values are close to optimality. At any point in time the realised genotype or trait value might be suboptimal for a number of reasons. One obvious explanation is that the trait is still evolving (under directional selection) and will reach its optimum if given enough time. Another possible reason, relating to the phase of stabilising selection, is practical limitations: even though, in theory, higher sprint speeds will increase a cheetah's hunting success rate, these predators may have already reached the limit of what is physically attainable for quadrupedal locomotion. A third possible explanation, also relating to the phase of stabilising selection, is the drift-barrier hypothesis (Sung et al., 2012): the cheetah's sprint speed may have converged so close to its optimum value that the effects of further adaptive mutations are too small to overcome the effects of genetic drift (Fig. 6). Once this point has been reached, genetic drift will prevent the spread of slightly beneficial alleles (which would reduce the distance between achieved and optimum value), and promote the fixation of slightly deleterious alleles (which move the trait further from its optimum value) (Fig. 6).
The 'cost of fidelity' hypothesis states that mutation rates are at their optimum values. This optimum value reflects the trade-off between the costs and benefits associated with increasing or decreasing the mutation rate. The cost of lowering the mutation rate is not thought to be the loss of adaptive potential (because natural selection only cares about the here and now, not about the future), but the time and energy needed to maintain this fidelity. Kimura (1967, p. 31) states: 'An elaborate apparatus that must be developed for checking and eliminating errors in replication might be physiologically so costly relative to the gain thereby achieved that it did not pay in adaptive evolution'. Thus, the 'cost of fidelity' hypothesis holds that genomic mutation rates will tend to evolve towards the equilibrium point with minimum total cost of the mutation rate (which is the sum of the cost of fidelity and the cost of deleterious mutations).
The remaining two hypotheses hold that mutation rates in nature may be suboptimal, but assume different causes. The 'physical limit' hypothesis states that selection only strives to lower the mutation rates (i.e. zero is the optimal value), but that the mutation rate gets stuck at a suboptimal value above zero, namely at the lower limit that is practically attainable. Kimura (1967, p. 31) suggested that 'mutation as the replication error of the genetic material cannot entirely be eliminated because of physical or physiological limitations'.
The 'drift-barrier' hypothesis (Lynch, 2010; Sung et al., 2012) suggests that mutation rate may be suboptimal due to genetic drift. It asserts that the lower limit is not determined by practical feasibility, but instead by the efficacy of natural selection given the effective population size (i.e. the magnitude of genetic drift). In the words of Lynch et al. (2016, p. 712): 'Selection typically operates to minimise the mutation rate, with the efficiency of such downward movement being eventually overcome by the power of genetic drift. […] The drift barrier is typically encountered before any insurmountable biophysical or biochemical limits to replication fidelity'. This hypothesis is consistent with the empirical finding that species-specific mutation rates are negatively correlated with effective population size (Lynch, 2010; Sung et al., 2012).
It is evident that the drift-barrier hypothesis fits into the scheme of the nearly neutral hypothesis: namely, by assuming that the suboptimal (above zero) mutation rates result from repeated random fixation of slightly deleterious mutations, in this particular case specifically in DNA polymerase and repair genes which compromise the fidelity of these enzymes. But what about the other two hypotheses, which assert that mutation rates are kept at their practical or even theoretical optimum? While these hypotheses, if true, might seem more compatible with the selectionist view than with the neutral hypothesis, there are two reasons why this conclusion could be premature.
First, and as discussed above, even proteins that have been fine-tuned to their optimal configuration (in this case DNA polymerase) are not necessarily at odds with the neutral hypothesis. It may still be the case that the majority of amino acid substitutions during the evolution of a protein did not (substantially) alter the protein function. Once the protein has reached its optimal configuration, the protein is kept in this optimal state by the removal of deleterious alleles through purifying selection, a process which is compatible with both neutralist and selectionist views.
Second, while the performance of DNA polymerase affects global mutation rates across the entire genome, these proteins make up a small subset of the entire proteome. Thus, even if most substitutions in these particular proteins during a considerable time span and in a wide range of organisms have been adaptive, the neutral hypothesis does not stand or fall with these proteins alone. The proportion of adaptive substitutions should be considered across all proteins, or at least a substantial proportion of all proteins. Similar considerations apply to the adjustment of local mutation rates through any other hypothetical mechanisms, such as modification of epigenomic features in genomic regions enriched with functionally constrained genes (Monroe et al., 2022).
(6) The (sub)optimality of codon usage
The frequency of triplets underlying amino acids deviates significantly from neutral expectations, even after correcting for GC content (Sharp & Li, 1986; Li et al., 2015). This finding indicates that synonymous mutations are not completely neutral but instead subject to weak selection (Richmond, 1970; Clarke, 1970a; Ikemura, 1981; Akashi, 1995), contrary to the claim of King & Jukes (1969), but as correctly predicted by Kimura (1968b) and Kimura & Ohta (1974). As a naïve prediction, one could postulate that the ultimate outcome of this adaptation process is a genome in which each amino acid is encoded by the preferred codon exclusively. In reality, relative synonymous codon usage (RSCU) values are mostly below 1.5, indicating that the preferred codon is only slightly more frequent than expected under strict neutrality.
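To make the RSCU statistic concrete, the sketch below uses its conventional definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The counts for the four alanine codons are hypothetical, and the function name is illustrative only.

```python
def rscu(codon_counts):
    """Relative synonymous codon usage for ONE amino acid.

    codon_counts maps each synonymous codon to its observed count.
    A value of 1.0 means the codon is used exactly as often as expected
    under equal usage of all synonyms.
    """
    total = sum(codon_counts.values())
    if total == 0:
        raise ValueError("amino acid not observed; RSCU is undefined")
    expected = total / len(codon_counts)
    return {codon: count / expected for codon, count in codon_counts.items()}


# Hypothetical counts for the four alanine codons (not real data).
ALANINE_COUNTS = {"GCT": 28, "GCC": 35, "GCA": 22, "GCG": 15}

if __name__ == "__main__":
    for codon, value in rscu(ALANINE_COUNTS).items():
        print(codon, round(value, 2))
```

With these illustrative counts the preferred codon reaches an RSCU of 1.4, whereas exclusive use of one codon would give the maximum value of 4 for a fourfold-degenerate amino acid.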
One possible explanation is that the observed codon usage is in fact (nearly) optimal, for instance because the preferred codon may differ across genes and across sites depending on certain unknown factors. An alternative explanation is that codon usage is suboptimal, and that 'codon usage patterns result from the balance in a finite population between selection favouring an optimal codon for each amino acid, and mutation together with drift allowing the persistence of nonoptimal codons' (Bulmer, 1991, p. 897). This hypothesis is consistent with the finding that codon usage bias is less pronounced in genes with low expression levels, which is indicative of a drift barrier, where the weaker purifying selection on less-used genes is more frequently overruled by genetic drift (Bulmer, 1991; Akashi et al., 2012).
Ultimately, what matters for the neutralist-selectionist debate are the proportions of adaptive, neutral and deleterious substitutions at degenerate sites. While codon usage bias is indicative of pervasive selection, it does not necessarily imply that most substitutions at degenerate sites are or have been adaptive (see Table 1).
(7) The (sub)optimality of the genetic code
King & Jukes (1969) pointed out that the variation in the relative frequencies of each amino acid type is largely explained by the redundancy of the genetic code (King & Jukes, 1969; Gilis et al., 2001). For instance, the amino acid serine, which is encoded by six DNA codons (TCT, TCC, TCA, TCG, AGT, AGC), is about three times more abundant than the amino acid tyrosine, which is encoded by two DNA codons (TAT, TAC) (King & Jukes, 1969). To this, selectionists replied that the genetic code is a product of natural selection itself, moulded to provide the most codons for the amino acids in most demand (Richmond, 1970; Clarke, 1970a; Kimura & Ohta, 1974; Xia & Li, 1998; Gilis et al., 2001). However, even if this hypothesis is correct, the question of how the universal genetic code evolved during the origin of life and whether it reached an optimal configuration is very different from the question of which evolutionary processes have shaped the primary structure of proteins in vertebrates many millions of years later. Evaluating amino acid type frequencies, in the context of the available genetic code, and how these frequencies affect substitution rates (Graur, 1985), provides insight into the latter question without challenging the claim that the genetic code itself is adaptive.

VII. IS THE NEUTRAL HYPOTHESIS FALSIFIABLE?
(1) Strict neutrality predictions
The neutral theory has been praised for its usefulness as a null model, and neutralists and selectionists agree that as such, regardless of its validity, it has greatly benefited the field of molecular biology (Ayala, 1974; Higgins, 2004). Kimura (1979b, p. 126) wrote: 'Because our theory is quantitative it is testable and therefore much more susceptible to refutation when it is wrong than are selectionist theories, which can invoke special kinds of selection to fit special circumstances and usually fail to make quantitative predictions'. Ayala (1974, p. 694) […] pattern of protein polymorphisms in populations and of protein differences between populations'. Maynard-Smith (1978, p. 37) wrote: 'The neutral hypothesis is a good "Popperian" one; if it is false, it should be possible to show it'. Kreitman (1996, p. 678) famously stated: 'The neutral theory is dead. Long live the neutral theory!' He argued (p. 678): 'It has provided empiricists with a strong set of testable predictions and hence, a useful null hypothesis against which to test for the presence of selection'. Crow (1997, p. 262) wrote: 'It has had great heuristic value. Being concrete, mathematical and simple, it leads to testable predictions, and this has been one of its greatest merits'.
However, while there can be little doubt that the neutral hypothesis is more parsimonious than the selectionist hypothesis, and while the above comments might be mostly true for strict neutrality, they are somewhat flattering for the neutral hypotheses, in particular the nearly neutral hypothesis.
In the complete absence of selective pressures, coalescent theory predicts that the genetic distance ($d$) between two haplotypes is given by $d = 2\mu T$, in which $\mu$ and $T$ denote the mutation rate and the time to coalescence, respectively. For two haplotypes drawn from two distantly related species, the expected coalescence time roughly equals the divergence time ($t$), implying $d = 2\mu t$ (Figs 2 and 3) (Gillespie, 2001). For two haplotypes drawn from the same population, the expected coalescence time equals $2N_e$, and thus $d = 2\mu(2N_e) = 4N_e\mu$. In the case of panmixia, the mean genetic distance between any two haplotypes within a population (also known as nucleotide diversity, $\pi$) equals heterozygosity, and hence $H_e = 4N_e\mu$ (Figs 2 and 3). Kimura, who used a different approach, concluded $H_e = 4N_e\mu / (1 + 4N_e\mu)$, which for realistic values of $N_e$ and $\mu$ returns very similar estimates (Kimura & Crow, 1964; Kimura, 1969a, 1991b; Tajima, 1996).
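The strict neutrality expectations in this paragraph can be evaluated numerically. The following is a minimal sketch with hypothetical parameter values (a mutation rate of 1e-8 per site per generation, an effective population size of 10,000 and a divergence time of one million generations); the function names are illustrative and the numbers are not estimates for any particular species.

```python
def divergence(mu, t):
    """Expected distance between two haplotypes with coalescence time t (generations)."""
    return 2.0 * mu * t


def nucleotide_diversity(ne, mu):
    """Expected within-population distance: 2 * mu * (2 * Ne) = 4 * Ne * mu."""
    return 4.0 * ne * mu


def heterozygosity_kimura(ne, mu):
    """Alternative expectation quoted in the text: 4*Ne*mu / (1 + 4*Ne*mu)."""
    theta = 4.0 * ne * mu
    return theta / (1.0 + theta)


# Hypothetical parameter values, chosen only for illustration.
MU = 1e-8        # mutations per site per generation
NE = 10_000      # effective population size
T = 1_000_000    # divergence time in generations

if __name__ == "__main__":
    print("between-species distance:", divergence(MU, T))
    print("4*Ne*mu:", nucleotide_diversity(NE, MU))
    print("4*Ne*mu / (1 + 4*Ne*mu):", heterozygosity_kimura(NE, MU))
```

For small values of the product of effective population size and mutation rate, as here, the two heterozygosity expressions differ by well under 1%; they diverge only when that product approaches or exceeds one.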
One difficulty is that the exact values of the parameters $t$, $\mu$ and $N_e$ are typically unknown (Maynard-Smith, 1978). Therefore, neutrality predictions can often only be evaluated globally ($d \propto t$ instead of $d = 2\mu t$, and $H_e \propto N_e$ instead of $H_e = 4N_e\mu$), and in the latter case only if assuming that certain species characteristics, such as range size, body size, generation time or IUCN status, may serve as $N_e$ proxies (Bolívar et al., 2019) (Fig. 2).
A further complication is that each function contains multiple explanatory variables, meaning that variation of dependent variables may be attributed to different factors, plus potential interaction effects, leaving room for interpretation. For instance, variation in genetic distances between lineages may not only be attributed to the time to most recent common ancestor, but also to mutation pressure (Kimura, 1991b). Similarly, variation in genetic diversity across lineages may reflect differences in effective population size, but also differences in global mutation rates (Hodgkinson & Eyre-Walker, 2011), or an interaction effect between these two factors (Kimura & Ohta, 1971c).
Another complication is that discrepancies between strict neutrality predictions and actual observations might not refute the hypothesis, but instead be attributable to confounding factors. In the words of Kimura (1991b, p. 379): 'The real biological world […] is very complicated, containing some disturbing or complicating factors that make actual observations depart from the neutrality predictions'. The most prominent confounder is linked selection. Strict neutrality predictions are derived for individual loci in isolation from the rest of the genome, and hence ignore the effects of linked selection (Charlesworth & Charlesworth, 2018). While linked selection does not affect substitution rates (Birky & Walsh, 1988), it is able to cause genetic diversity to deviate from neutral theory predictions (Gillespie, 2001; Buffalo, 2021).
Another example of a confounder is directional mutation pressure: variation in GC content across species and across loci may result from differences in conversion rates from AT bases to GC bases and vice versa (Sueoka, 1962, 1988; Freese, 1962; Jukes, 1991). A process with similar effect is GC-biased gene conversion (BGC), where during recombination events GC bases are favoured over AT bases, regardless of their selective values (Galtier et al., 2001; Kern & Begun, 2005; Pouyet et al., 2018; Harris, 2018). As a consequence, codon frequencies might deviate from strict neutrality predictions even in the absence of selective pressure, and hence codon usage bias can only be interpreted as such after correcting for GC content (Sharp & Li, 1986; Li et al., 2015).
The greatest difficulty for testing the strict neutrality model, however, is violation of underlying assumptions. Strict neutrality predictions are based on the Wright-Fisher model, which assumes constant population sizes and non-overlapping generations (Akashi et al., 2012). A mismatch between predicted and observed values could be interpreted as evidence against the strict neutrality model, but may alternatively simply reflect violation of Wright-Fisher model assumptions, particularly demographic non-equilibrium (Nei, Maruyama & Chakraborty, 1975; Pollak, 1982; Nei, 2005b; Balloux & Lehmann, 2012; Akashi et al., 2012; Müller, Kaj & Mugal, 2022).
For instance, the predicted correlation between heterozygosity and effective population size ($H_e \propto N_e$) holds only in the case of long-term constant population sizes, when the gain and loss of alleles has reached an equilibrium, a prerequisite which in reality will rarely be met (Nei et al., 1975; Nei, 2005a). In the words of Kimura (1979b, p. 120): 'One can show mathematically that the genetic variability due to neutral alleles can be greatly reduced by a population bottleneck from time to time, after which it takes millions of generations for the variability to build up again to the theoretical level characteristic of a very large population maintained constantly over a long period'. While, in theory, $N_e$ may be considered the harmonic mean of historic population sizes, in practice this interpretation makes it almost impossible to evaluate the relationship between genetic diversity and $N_e$ proxies ($H_e \propto N_e$), as the latter are usually based on contemporary indicators.
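The sensitivity of the harmonic mean to bottlenecks can be shown with a short sketch. The population history below is hypothetical (99 generations at 10,000 individuals and a single generation at 100), and the function name is illustrative.

```python
from statistics import harmonic_mean, mean


def long_term_ne(sizes):
    """Long-term effective size taken as the harmonic mean of per-generation sizes."""
    return harmonic_mean(sizes)


# Hypothetical history: 99 generations at 10,000 and one bottleneck generation at 100.
HISTORY = [10_000] * 99 + [100]

if __name__ == "__main__":
    print("arithmetic mean size:", mean(HISTORY))
    print("harmonic mean size:  ", round(long_term_ne(HISTORY), 1))
```

In this example a single bottleneck generation roughly halves the harmonic mean relative to the contemporary size of 10,000, although the arithmetic mean barely changes, which illustrates why proxies based on contemporary indicators can be poor stand-ins for the long-term effective size.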
This caveat to the neutral theory concerns not only predictions on genetic diversity, but also predictions on genetic divergence from standing variation. In the absence of selective pressures, allele frequencies between disconnected sister populations are expected eventually to become uncorrelated. However, it may take genetic drift many generations to remove all traces of shared ancestry and to establish the new migration-drift equilibrium (Ayala, 1974; Li & Nei, 1977).
(2) Neutral theory predictions
The neutral hypothesis adds another factor to the equation: functional constraint $f_0$, which is defined as the proportion of neutral mutations. In the absence of positive and balancing selection, or at least if assuming 'they are so rare that they may be neglected from our consideration' (Kimura, 1991b, p. 371), the expected genetic distances and heterozygosity levels are now given by $d = 2\mu f_0 t$ and $H_e = 4N_e\mu f_0$ (Figs 1-3), with $\mu f_0$ denoting the neutral mutation rate ($\mu_0$) (Kimura, 1977, 1991a).
The addition of an extra parameter opens up room for interpretation. Owing to potential differences in functional constraint across lineages, the theory now allows for substitution rate variation. Similarly, variation of polymorphism levels across loci may now not only be attributed to mutation rate variation or differences in effective population size (as a result of linked selection) (Hill & Robertson, 1966; Birky & Walsh, 1988), but also to functional constraint (Fig. 3). Consequently, the positive relationship $d \propto H_e$, which according to the strict neutrality hypothesis arises from local mutation rate variation, could according to the neutral hypothesis also reflect variation in $f_0$ across loci (Hudson, Kreitman & Aguadé, 1987).
Another question is how to estimate $f_0$. Among the factors traditionally thought to determine functional constraint are functional density (i.e. the proportion of critical sites) (Dickerson, 1971; Zuckerkandl, 1976) and the physicochemical differences between amino acid sequences and their 'single-mutation derivatives' (Epstein, 1967; Grantham, 1974; Kimura, 1983; Graur, 1985; Tang et al., 2004; Chen et al., 2019). In practice, the functional constraint of a protein can be measured based on the second criterion (e.g. by calculating physicochemical distances to all conceivable single-mutation derivatives), but it is rarely feasible to determine the proportion of critical sites. As commented by Graur (1985, p. 53), 'the "importance" of a protein or site is frequently inferred from its rate of evolution, and the argument thus becomes a circular one'. Kimura & Ohta (1974) suggested that functional constraint may also depend on 'functional importance', which could be interpreted to mean 'indispensability': the probability that an organism can survive and reproduce without the given protein (Wilson, Carlson & White, 1977). However, gene knockout experiments have provided little empirical support for this hypothesis, and indicate that essential genes have similar substitution rates to non-essential genes (Hurst & Smith, 1999; Zhang & Yang, 2015). It has since emerged that protein substitution rates correlate with expression levels, but the underlying mechanisms are yet to be understood (Zhang & Yang, 2015). Functional constraint is still a vague catch-all term, which compromises the falsifiability of neutral theory predictions.
(3) Nearly neutral theory predictions
The 'remarkable simplicity' (Kimura, 1991b, p. 371) of neutral theory equations depends critically on the unrealistic assumption that deleterious mutations are either so detrimental that they are immediately removed from the population, or alternatively so nearly neutral that they behave like completely neutral mutations (Gillespie, 1995; Akashi et al., 2012). In the words of Kimura (1991a, p. 5969): '$(1 - f_0)$ represents the fraction of definitely deleterious mutants that are eliminated from the population without contributing either to evolution or polymorphism, even if the selective disadvantages involved may be very small in the ordinary sense'. This simplification causes a disparity between the proposition (which is claimed to consider completely neutral as well as nearly neutral mutations) and the tested predictions ($d = 2\mu f_0 t$ and $H_e = 4N_e\mu f_0$), which hold true if, and only if, nearly neutral mutations are ignored.
In reality, as pointed out by Ohta (1972a), nearly neutral, slightly deleterious mutations will segregate in the population and occasionally reach fixation, and thereby increase polymorphism levels and substitution rates. Taking both components (neutral and deleterious mutations) into account, the expected genetic distance is given by $d = 2\mu t(f_0 + q(1 - f_0))$ (Ohta, 1972a, 1973), with $q$ denoting the proportion of deleterious mutations that reach fixation, a proportion which in itself depends on $N_e$. Similarly, polymorphism rates are a composite of polymorphisms in mutation-drift equilibrium ($H_e = 4N_e\mu f_0$) and polymorphisms occurring in mutation-purifying selection balance ($H_e \propto 1/N_e$). This second part of the equation is more difficult to predict, but it is likely incorrect to assume that the proportion of sites in mutation-drift equilibrium largely outweighs the proportion occurring in mutation-selection equilibrium (Ohta, 2003). Although deleterious variants can be relatively rapidly purged by selection, the high frequency with which they arise through mutation means they could constitute a substantial, perhaps even the largest, proportion of total polymorphisms (Lewontin, 1974; Akashi & Schaeffer, 1997; Bertorelle et al., 2022). Thus, substitution rates and polymorphism levels are much less predictable when slightly deleterious alleles are included in the equations (Kimura, 1979a).
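As a rough numerical illustration of the two-component expectation for genetic distance, the Python sketch below evaluates $d = 2\mu t(f_0 + q(1 - f_0))$ for populations of different size. It rests on assumptions that go beyond the text: all deleterious mutations are given a single selection coefficient `s`, and `q` is approximated by the diffusion result for the fixation probability of a mutation relative to a neutral one, $S/(1 - e^{-S})$ with $S = 4N_e s$. The function names and parameter values are hypothetical.

```python
import math


def relative_fixation_probability(ne, s):
    """Fixation probability of a new mutation relative to a neutral one.

    Assumption (not from the text): diffusion approximation with census
    size equal to effective size, giving S / (1 - exp(-S)), S = 4 * Ne * s.
    """
    big_s = 4.0 * ne * s
    if abs(big_s) < 1e-12:
        return 1.0          # effectively neutral
    if big_s < -700.0:
        return 0.0          # avoids overflow in exp(); fixation is negligible
    return big_s / (1.0 - math.exp(-big_s))


def expected_distance(mu, t, f0, q=0.0):
    """d = 2 * mu * t * (f0 + q * (1 - f0)); q = 0 leaves only the neutral term."""
    return 2.0 * mu * t * (f0 + q * (1.0 - f0))


if __name__ == "__main__":
    mu, t, f0, s = 1e-9, 1e7, 0.2, -1e-4   # illustrative values only
    for ne in (1e3, 1e4, 1e5):
        q = relative_fixation_probability(ne, s)
        print(f"Ne={ne:.0e}  q={q:.4f}  d={expected_distance(mu, t, f0, q):.5f}")
```

Under these assumptions `q`, and with it the expected distance, shrinks as `ne` grows, which is the dependence on $N_e$ that makes the nearly neutral prediction harder to pin down than the simplified one.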
Gradual refinement of the neutral hypothesis has generated a multi-parameter hypothesis (the nearly neutral hypothesis) which is difficult to falsify on empirical grounds, as almost any observation can be brought into accordance by adjusting one or multiple parameters (particularly $f_0$ and $N_e$, the latter determining $q$) (Akashi et al., 2012). In the words of Kreitman (1996, p. 683): 'The battle [between neutralists and selectionists] may have shifted slightly from complete neutrality to near neutrality. But this slight shift is a quantum leap in terms of the difficulty in distinguishing between the nearly neutral model and the stronger selection models. […] One can almost always propose a particular history of changes in population size that will account for almost any pattern of molecular variation or change. Thus, unlike the strictly neutral theory, the slightly deleterious model cannot be easily falsified'. Crow (1997, p. 262) agreed: 'It has been remarkably difficult to distinguish experimentally or analytically between the nearly neutral theory and evolution by incorporation of favourable mutations in a fluctuating environment'. Kimura held 'that the nearly neutral theory might be more realistic than the […] neutral theory, but that the latter was certainly more useful than the former' (Ohta, 2003, p. 376).
This drawback of the nearly neutral hypothesis illustrates the delicate trade-off between simplicity and realism of a model (Higgins, 2004). The assumption of reduced efficiency of purifying selection in small populations (Ohta, 1972c), which makes the nearly neutral hypothesis less 'attractive as a null model' (Kreitman, 1996, p. 678), is supported by empirical data (Hughes, 2008). For instance, comparisons across species with a range of population sizes have revealed a negative relationship between $N_e$ and amino acid substitution rates (Akashi et al., 2012; Galtier, 2016; Moutinho, Bataillon & Dutheil, 2020), between $N_e$ and the ratio of radical to conservative substitutions (Kr/Kc) (Weber et al., 2014; Weber & Whelan, 2019), between $N_e$ and the ratio of non-synonymous to synonymous substitutions (Popadin et al., 2007; Hughes, 2008; Kim & Yi, 2008; Weber et al., 2014; Galtier, 2016; Figuet et al., 2016; Bolívar et al., 2019), between $N_e$ and the difference between expected and observed heterozygosity (Corbett-Detig et al., 2015), and between generation time (proxy for $N_e$) and extent of codon usage bias (Subramanian, 2008).
(4) Predictions based on a continuous DFE
For simplicity, the neutral theory assigns mutations artificially into discrete categories, namely: (slightly) deleterious, neutral, and beneficial. In reality, of course, the fitness effects of mutations are better described by a continuous distribution, ranging from lethal to highly beneficial (Crow, 1972a; Ohta, 1977; Kimura, 1979a; Ohta & Gillespie, 1996; Keightley & Eyre-Walker, 2010; Razeto-Barry et al., 2012), meaning that 'the borderline between deleterious and neutral mutations is vague' (Ohta & Kimura, 1971b, p. 393). Ohta (1977) and Kimura (1979a) used mathematical modelling to derive neutralist predictions which are not based on the erroneous assumption of discrete categories, but instead on a more realistic, continuous distribution of fitness effects (DFE). This increases the precision of neutral theory predictions, but only if the hypothesised underlying DFE is accurate. Ohta (1977) assumed the DFE is best described by an exponential distribution, while Kimura (1979a) argued in favour of a gamma distribution.
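To make concrete how strongly such predictions hinge on the assumed DFE, the sketch below compares the fraction of deleterious mutations that would count as effectively neutral under an exponential and under a gamma distribution of selection coefficients. It is an illustration under stated assumptions, not a reconstruction of either author's model: effective neutrality is taken to mean $N_e|s| < z$ with a hypothetical threshold `z`, the gamma case is evaluated by simple Monte Carlo sampling, and all parameter values are invented.

```python
import math
import random


def neutral_fraction_exponential(ne, mean_s, z=1.0):
    """Fraction of mutations with Ne*|s| < z when |s| ~ Exponential(mean_s)."""
    return 1.0 - math.exp(-z / (ne * mean_s))


def neutral_fraction_gamma(ne, mean_s, shape, z=1.0, draws=100_000, seed=1):
    """Same fraction when |s| ~ Gamma(shape, scale = mean_s / shape).

    Estimated by sampling; shape = 1 corresponds to the exponential case.
    """
    rng = random.Random(seed)
    scale = mean_s / shape
    threshold = z / ne
    hits = sum(1 for _ in range(draws) if rng.gammavariate(shape, scale) < threshold)
    return hits / draws


if __name__ == "__main__":
    ne, mean_s = 1_000, 0.01   # illustrative values only
    print("exponential:", neutral_fraction_exponential(ne, mean_s))
    for shape in (0.25, 0.5, 1.0):
        print(f"gamma, shape={shape}:", neutral_fraction_gamma(ne, mean_s, shape))
```

With the same mean effect, a gamma distribution with shape below 1 places more mass near zero, so the inferred share of effectively neutral mutations depends on a shape parameter that is itself hard to estimate.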
On the downside, not much remains of the elegant simplicity of the neutral theory once its equations have been rewritten to derive predictions from a continuous distribution. The complexity of this type of model is illustrated by a conceptual error which was uncovered by Gillespie (1995). Both Ohta (1977) and Kimura (1979a) originally defined the fitness effects of novel mutations in their models relative to the dominant allele, and as a consequence the DFE shifted every time the dominant allele was substituted. To overcome the shortcomings of these 'shift models', Ohta & Tachida (1990, p. 220) introduced a 'fixed model' (also known as house-of-cards model) (Gillespie, 1995; Razeto-Barry et al., 2012), in which 'the distribution of fitness coefficients is fixed regardless of the allelic state occupying the population'. However, Gillespie (1995) discovered that this model, which was supposed to represent Ohta's nearly neutral hypothesis, in fact equated to a selectionist hypothesis: it predicted half of all substitutions to be advantageous.
(5) A theory under construction
The difficulty of falsifying the neutral hypothesis is illustrated by the history of the neutralist-selectionist debate, which is characterised by repeated attempts by neutralists to reconcile neutrality predictions with empirical observations through adjustment of the four parameters $f_0$, $\mu$, $N_e$, and $q$ (and their interdependencies).
For instance, adjusting the parameter $f_0$ (functional constraint, or proportion of neutral mutations) from large to small (first revision) served to bring the neutral hypothesis into agreement with lines of evidence for pervasive purifying selection. For example, the observation that chemically dissimilar amino acids replace each other less frequently than chemically similar amino acids (Epstein, 1967), a fingerprint of purifying selection, was initially cited against the neutral hypothesis (Clarke, 1970b; Grantham, 1974), but after the revision was cited as support for it (Kimura & Ohta, 1971b). Lewontin (1974, p. 228) complained: 'Thus the neoclassicists have the best of both worlds. Both randomness and non-randomness are interpreted as evidence in their favour. They do not tell us what observations might not confirm the theory'.
As a second example, King & Jukes (1969) argued that the correlation between codon degeneracy and amino acid type frequencies indicated stochasticity, but they were unable to explain the observed deviations (Gilis et al., 2001). These deviations from strict neutrality expectations were seized upon by selectionists as evidence against the neutral theory (Richmond, 1970; Clarke, 1970a), until modelling studies suggested that they could also arise from purifying selection (Kimura & Ohta, 1971b; Ohta & Kimura, 1971b). As a final example, King & Jukes (1969) claimed that the number of amino acid substitutions at variant sites within proteins follows a Poisson distribution. This claim, rejected by Clarke (1970a), Richmond (1970) and Fitch & Markowitz (1970), was tacitly retracted following the first revision of the neutral hypothesis (Kimura & Ohta, 1971c), which from then on correctly predicted fewer substitutions in functionally important protein regions (Kimura & Ohta, 1974; Cooper et al., 2005; Hughes, 2008).
A purported inverse relationship between the parameter $\mu$ (mutation rate per generation) and $N_e$ (effective population size) served to bring the prediction for protein polymorphism (i.e. $H_e = 4N_e\mu/(1 + 4N_e\mu)$) into accordance with the observation that heterozygosity is relatively constant across species. Kimura & Ohta (1971c) speculated that differences in effective population size are cancelled out by differences in mutation rate (measured per generation, $\mu_g$). They wrote (p. 469): 'The species with short generation time [and hence lower $\mu$] tends to have small body size and attain a large population number, while the species […]

[…] heterogeneity differs significantly from neutral theory predictions (Wilson et al., 1977). Unlike the strict neutrality model, the neutral hypothesis predicts that the substitution rate depends on the proportion of neutral and deleterious mutations ($f_0$), which can differ between environments and thus between lineages. The nearly neutral hypothesis adds that the substitution rate depends as well on the efficacy of purifying selection, which varies across lineages with $N_e$. Therefore, the neutral hypotheses allow for a certain degree of rate heterogeneity across lineages greater than that predicted by a simple stochastic process (Wilson et al., 1977; Kimura, 1977). This prediction has been confirmed by modelling and simulation studies, which have indicated that rate heterogeneity can indeed occur in the absence of positive selection and hence does not contradict the neutral hypothesis (Cutler, 2000; Bastolla, Vendruscolo & Roman, 2000). In other words, rejection of a strict molecular clock does not necessarily imply rejection of the neutral theory (Nei, Suzuki & Nozawa, 2010). Wilson et al. (1977, p. 615) commented: 'The use of the Poisson distribution as the probabilistic model for the neutral-mutation hypothesis may have been an oversimplification. It seems reasonable to propose, for example, that the rate of occurrence of neutral mutations [the factor "$\mu f_0$"] might itself be subject to fluctuations. […] Thus, although the variation of the evolutionary clock is greater than that of a Poisson process, this does not invalidate the neutral hypothesis'. Kimura (1991a, p. 5971) agreed: 'A universally valid and exact molecular evolutionary clock would exist only if, for a given molecule, the mutation rate for neutral alleles [$\mu f_0$] were exactly equal among all organisms at all times (which is rather unlikely in nature). […] In other words, the variance of the evolutionary rates among different lineages for a given molecule may tend to become larger than expected from the simple Poisson distribution'.
(2) Lewontin's paradox
Arguing against the neutral hypothesis based on Lewontin's paradox of variation is yet another example of erecting and refuting the neutralist strawman of strict neutrality. Strict neutrality predicts a mutation-drift equilibrium in which heterozygosity depends on $N_e$, as given by $H_e = 4N_e\mu$ (Kimura & Crow, 1964; Kimura, 1969a, 1991a). In reality, levels of genetic variation appear to differ much less among species with varying $N_e$ than predicted by this equation (Maynard-Smith, 1970b; Lewontin, 1974; Nei, 2005b; Buffalo, 2021). This violation of strict neutrality predictions is regularly equated with violation of neutral hypothesis predictions, cultivating claims that Lewontin's paradox is incompatible with the neutral hypothesis (Hahn, 2008; Corbett-Detig et al., 2015; Kern & Hahn, 2018). However, as discussed in Section VII, the neutral hypothesis predicts some alleles to be in mutation-drift equilibrium and others in mutation-selection equilibrium, causing deviations of heterozygosity levels (Kimura, 1979b; Gillespie, 2000; Charlesworth & Charlesworth, 2018). Furthermore, even if all mutations are neutral, deviations from strict neutrality predictions may still be expected due to the confounding effects of linked selection (Gillespie, 2001; Buffalo, 2021), as well as due to violation of the assumption of population equilibrium conditions (Nei et al., 1975; Nei, 2005b) (see Section VII).
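The scale of the mismatch behind Lewontin's paradox can be illustrated with a few lines of Python. The sketch below evaluates the strict neutrality expectation in its linear form, $4N_e\mu$, and in the bounded form $4N_e\mu/(1 + 4N_e\mu)$ quoted earlier for protein polymorphism, across several orders of magnitude of $N_e$. The mutation rate and population sizes are hypothetical values chosen only to show the predicted spread.

```python
def equilibrium_heterozygosity(ne, mu):
    """Strict neutrality expectation He = theta / (1 + theta), theta = 4 * Ne * mu."""
    theta = 4.0 * ne * mu
    return theta / (1.0 + theta)


def small_theta_approximation(ne, mu):
    """Linear form He = 4 * Ne * mu, adequate only while 4 * Ne * mu is small."""
    return 4.0 * ne * mu


if __name__ == "__main__":
    mu = 1e-7   # hypothetical per-locus mutation rate per generation
    for ne in (1e3, 1e4, 1e5, 1e6, 1e7, 1e8):
        print(f"Ne={ne:.0e}  "
              f"4*Ne*mu={small_theta_approximation(ne, mu):.4f}  "
              f"He={equilibrium_heterozygosity(ne, mu):.4f}")
```

Under these assumed values the predicted heterozygosity spans several orders of magnitude, whereas the observed range among species is far narrower; as the text argues, this contradicts strict neutrality rather than the neutral hypothesis itself.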
(3) Polymorphism levels versus recombination rates
A third example of arguing against the neutral hypothesis based on rejection of strict neutrality predictions concerns the observed correlation between recombination rates and polymorphism levels. Linked selection reduces the local effective population size and therefore allows the prediction $H_e \propto N_e$ to be tested by comparing polymorphism levels across loci. More specifically, proteins or genes occurring in genomic regions with low recombination rates should harbour less genetic variation than proteins or genes in regions with high recombination rates (Begun & Aquadro, 1992; Begun et al., 2007; Hughes, 2008; Wagner, 2008; Lohmueller et al., 2011; Cutter & Payseur, 2013; Charlesworth & Jensen, 2021). Originally, it was argued that such a positive correlation between recombination rates and polymorphism levels is a hallmark of linked positive selection (Kaplan, Hudson & Langley, 1989; Begun & Aquadro, 1992; Hahn, 2008; Kern & Hahn, 2018). However, it was soon recognised that this correlation can also result from linked negative selection (Charlesworth, Morgan & Charlesworth, 1993; Hudson & Kaplan, 1995; Lohmueller et al., 2011; Frankham, 2012; Charlesworth & Jensen, 2021).
Thus, the observed correlation between recombination rates and polymorphism levels is of little use in assessing the validity of the neutral hypothesis. In the words of Kreitman (1996, p. 682): 'An alternative model of hitchhiking has now been proposed in which neutral mutations are eliminated by virtue of their linkage to frequently occurring deleterious mutation, and the data are now thought to be largely consistent with this theory. Therefore, what was initially thought to be incontrovertible evidence for positive selection is now explained by deleterious selection'. Kreitman (1996, p. 682) concluded: 'The unexpected strong correlation of nucleotide polymorphism levels and recombination means that selection and hitchhiking must be operating to shape patterns of variation […], but we may not be able to resolve whether that selection is mostly positive, mostly negative, or a mixture of the two'.
(4) Similar allele frequencies across populations
An early argument against the neutral theory was the claim that the observed similarity of allele frequencies in geographically separated populations suggested balancing selection (Stone et al., 1968; Prakash, Lewontin & Hubby, 1969; Maynard-Smith, 1970b; Clarke, 1970b; Ayala, 1974; Ayala & Gilpin, 1974; Chakraborty, Fuerst & Nei, 1978). It was thought that the neutral hypothesis predicts high allele frequency differences between isolated populations, due to the random segregation of alleles. Li & Nei (1977) showed that this prediction only holds true for a system of populations in migration-drift equilibrium, which may take many generations to set in. Recently disconnected populations can retain the same alleles at similar frequencies for prolonged periods of time, provided that population sizes are large. They concluded (p. 912): 'Recent electrophoretic surveys of proteins have shown that two related species often share many common alleles. Some investigators interpreted this as an indication of the selective maintenance of the alleles. […] The present study, however, shows that two related species may share common alleles for a long time after their separation even if there is no selection'.
Because the level of population differentiation ($F_{ST}$ value) depends on many, often unknown, demographic factors (i.e. split time, effective population size, migration rate), allele frequency differences do not in themselves allow the neutral hypothesis to be readily tested. But there is a potential work-around: evaluating the variation of $F_{ST}$ values across loci. Genetic drift, as a function of demographic changes, affects all loci, unlike natural selection (Cavalli-Sforza, 1966; Lewontin & Krakauer, 1973). Therefore, strict neutrality predicts that all loci should have roughly similar $F_{ST}$ values, with the neutral hypothesis allowing for a few loci to stand out from this neutral distribution (de Jong, Lovatt & Hoelzel, 2021).
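The logic of comparing $F_{ST}$ across loci can be sketched as follows. The example computes a per-locus $F_{ST}$ for biallelic loci in two equally weighted populations from heterozygosities, $(H_T - H_S)/H_T$, and flags loci that lie far above the genome-wide median. The cut-off (a fixed multiple of the median) is an arbitrary choice made for illustration and is not a formal outlier test; the allele frequencies are invented.

```python
import statistics


def fst_biallelic(p1, p2):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus in two populations.

    p1 and p2 are the frequencies of the same allele in each population;
    both populations are weighted equally (a simplifying assumption).
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)              # total expected heterozygosity
    if h_t == 0.0:
        return 0.0                                 # locus is monomorphic overall
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


def flag_outlier_loci(fst_values, factor=10.0):
    """Indices of loci whose F_ST exceeds `factor` times the median F_ST."""
    cutoff = factor * statistics.median(fst_values)
    return [i for i, value in enumerate(fst_values) if value > cutoff]


if __name__ == "__main__":
    # Hypothetical allele frequencies (population 1, population 2) per locus.
    loci = [(0.50, 0.55), (0.30, 0.35), (0.60, 0.62), (0.10, 0.90)]
    fst_values = [fst_biallelic(p1, p2) for p1, p2 in loci]
    print(fst_values)
    print("candidate outliers:", flag_outlier_loci(fst_values))
```

A real analysis would need a neutral expectation for the spread of $F_{ST}$ values, for instance from simulations under a fitted demographic model, before any locus could be called an outlier.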
(5) Polymorphism levels versus substitution rates
The neutral hypothesis makes a testable prediction regarding the covariance of divergence and polymorphisms across loci (e.g. proteins or genes), namely 'that molecules or parts of one molecule which are more important in function, and which therefore evolve more slowly, will show a lower level of heterozygosity' (Hudson et al., 1987, p. 153).
A locus that is free to evolve without positive or negative selection will have high levels of variation (e.g. high nucleotide diversity) within populations, as well as high levels of divergence (e.g. high sequence dissimilarity) between populations. A locus that is functionally constrained experiences purifying selection and is expected to have low levels of variation within populations, as well as low levels of divergence between populations (i.e. low $H_e$, low $d$) (Chen, Wang & Cohen, 2007). By contrast, a locus under positive selection will contain low variation within the selected population due to rapid fixation of the adaptive allele, but will differ from homologous loci in sister populations (i.e. low $H_e$, high $d$), assuming the locus is selected for in only one population. Furthermore, a locus under balancing selection will exhibit high levels of genetic variation within populations, but relatively low levels of genetic variation between populations (i.e. high $H_e$, low $d$). In summary, genetic drift and purifying selection cause a positive relationship between genetic distance and polymorphism levels (albeit for different reasons), whereas positive and balancing selection cause a negative relationship between (or 'uncoupling of') genetic distance and polymorphism levels. These different predictions permit the neutral hypothesis and the selectionist hypothesis to be put to the test (Berry, Ajioka & Kreitman, 1991).

Begun et al. (2007) tested this prediction on genomic data from two closely related Drosophila species, and found a negative relationship between divergence and polymorphism levels. It has been claimed that this finding is inconsistent with the neutral theory (Begun et al., 2007; Hahn, 2008; Wagner, 2008). However, the reasoning underlying the predicted polymorphism-divergence correlation applies only to direct selection, not to linked selection. Background selection and selective sweeps reduce intraspecies genetic variation, but do not alter substitution rates of linked neutral alleles (Birky & Walsh, 1988; Berry et al., 1991; Phung et al., 2016). Because linked selection does not affect substitution rates, genome-wide polymorphism-divergence correlations (Begun et al., 2007) cannot distinguish between the neutral hypothesis and the selectionist hypothesis (Jensen et al., 2019). Thus, the predicted positive correlation between polymorphism levels and divergence levels applies to protein or gene data only (Skibinski & Ward, 1982; Chakraborty & Hedrick, 1983; Ward & Skibinski, 1985; Hudson et al., 1987). Challenging the neutral hypothesis based on the absence of such a relationship in genome-wide data is another example of testing a prediction that does not logically follow from the neutral hypothesis (Begun & Aquadro, 1991; Berry et al., 1991; Begun et al., 2007).
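The four qualitative expectations for within-population variation and between-population divergence can be summarised as a small lookup, sketched below in Python. The sketch simply dichotomises heterozygosity and divergence against reference values and reads off the regime described in the prose (balancing selection being the case of high variation within and low divergence between populations). The reference thresholds and the function name are hypothetical, and a binary split is a deliberate simplification; it ignores linked selection, which is exactly the complication raised for genome-wide data.

```python
def classify_locus(he, d, he_ref, d_ref):
    """Map a locus to the regime suggested by its polymorphism and divergence.

    he, d          : observed heterozygosity and divergence at the locus
    he_ref, d_ref  : hypothetical reference values separating 'low' from 'high'
    """
    high_he = he >= he_ref
    high_d = d >= d_ref
    if high_he and high_d:
        return "unconstrained (drift)"       # high He, high d
    if not high_he and not high_d:
        return "purifying selection"         # low He, low d
    if not high_he and high_d:
        return "positive selection"          # low He, high d
    return "balancing selection"             # high He, low d


if __name__ == "__main__":
    # Invented (He, d) pairs, classified against invented reference values.
    for he, d in [(0.30, 0.08), (0.02, 0.01), (0.03, 0.09), (0.40, 0.01)]:
        print(he, d, "->", classify_locus(he, d, he_ref=0.10, d_ref=0.05))
```

The first two regimes couple polymorphism and divergence positively, while the last two uncouple them, which is why the sign of the across-locus correlation was proposed as a discriminating test.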

IX. INVALID ARGUMENTS AGAINST THE SELECTIONIST HYPOTHESIS
Useful predictions are discriminative: they logically follow from only one of the competing hypotheses. In the words of Gillespie (1984b, p. 733): 'It is not enough to argue that the neutral theory is compatible with the observations; it must also be shown that selection is incompatible'. In this last section we will question four observations from molecular data for which neutralists claim that they are consistent with the neutral hypothesis and inconsistent with the selectionist hypothesis.
(1) Cost of selection
One of the earliest arguments against the selectionist hypothesis was the 'cost of selection' argument. Kimura (1968a) calculated from protein sequence alignment data that substitution rates in nature greatly exceed Haldane's theoretical upper limit of sustainable adaptive substitutions (one in every 300 generations), as inferred from substitutional load considerations. Kimura (1968a) argued that this discrepancy implies that most substitutions cannot be adaptive, and instead must result from the fixation of neutral mutations through random genetic drift.
Similarly, considerations on segregation load appear to suggest an upper limit on the number of balanced polymorphisms (Kimura & Crow, 1964; Hubby & Lewontin, 1966; Dietrich, 1994). In theory, the fitness differences between individuals in a population with many polymorphisms under balancing selection would vary beyond reproductive capacities. Kimura (1968a) argued that the high levels of polymorphisms observed by Hubby & Lewontin (1966) therefore implied that most polymorphisms had to be neutral.
However, the 'cost of selection' argument has been challenged on theoretical grounds by numerous studies (Sved, 1968; Maynard-Smith, 1968; Brues, 1969; […]

Biological Reviews 99 (2024) 23-55 © 2023 The Authors. Biological Reviews published by John Wiley & Sons Ltd on behalf of Cambridge Philosophical Society.
However, this work-around is not flawless. Because deleterious alleles are unlikely to fix but do segregate within a population before being removed, the inferred dN/dS threshold is possibly an overestimate of the true corrected dN/dS threshold (albeit less severely than in the case of the threshold of dN/dS = 1). The outcome would be consistent underestimation of the proportion of adaptive substitutions (Fay, Wyckoff & Wu, 2002; Eyre-Walker, 2006; Charlesworth & Eyre-Walker, 2008; Andolfatto, 2008; Parsch, Zhang & Baines, 2009; Nei et al., 2010; Galtier, 2016; Booker et al., 2017; Murga-Moreno et al., 2019). Violation of the assumption of constant population sizes could occasionally cause further bias, in either direction. Overestimation of α could occur in the case of recent population expansion (if slightly deleterious mutations previously were fixed but recently became more effectively removed), whereas underestimation would occur in the case of recent population reduction (if slightly deleterious mutations previously were removed but recently became less effectively removed) (McDonald & Kreitman, 1991; Fay et al., 2002; Eyre-Walker, 2002; Parsch et al., 2009; Nei et al., 2010).
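The direction of the first bias can be illustrated with the widely used McDonald-Kreitman-type estimator of the proportion of adaptive substitutions, $\alpha = 1 - (D_s P_n)/(D_n P_s)$, where $D$ and $P$ are counts of fixed differences and polymorphisms at non-synonymous ($n$) and synonymous ($s$) sites. The Python sketch below is an illustration only: this passage does not spell out the estimator, the function name is hypothetical, and the counts are invented. It shows how segregating slightly deleterious non-synonymous variants, which inflate $P_n$ without contributing to $D_n$, pull the estimate downwards.

```python
def alpha_mk(dn, ds, pn, ps):
    """Proportion of adaptive substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps).

    dn, ds : non-synonymous and synonymous fixed differences between species
    pn, ps : non-synonymous and synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Invented counts: neutral non-synonymous polymorphisms only.
    print("without deleterious polymorphisms:", alpha_mk(dn=100, ds=200, pn=25, ps=100))
    # Same divergence, but 25 extra segregating slightly deleterious variants.
    print("with deleterious polymorphisms:   ", alpha_mk(dn=100, ds=200, pn=50, ps=100))
```

In this toy case the additional deleterious polymorphisms lower the estimate even though the divergence counts are unchanged, which is the underestimation described above.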

X. CONCLUSIONS
(1) The history of the neutralist-selectionist debate is characterised by continuous adjustment of four key parameters (i.e. effective population size, mutation rate, selective constraint, and efficiency of selection). These corrections, meant to reconcile neutrality predictions with empirical observations, have compromised the falsifiability of the neutral hypothesis.
(2) Two adjustments led to official revisions of the neutral hypothesis, as originally proposed by Kimura (1968a). The first and lesser-known revision acknowledged that many mutations are deleterious and are removed by purifying selection (King & Jukes, 1969; Kimura & Ohta, 1971c). The second revision acknowledged that in small populations deleterious mutations may contribute to polymorphism levels and substitution rates (Ohta & Kimura, 1971c; Ohta, 1972a).
(3) The neutral hypothesis of protein evolution is a statement about amino acid proportions, whereas the neutral hypothesis of genomic DNA evolution and the neutral hypothesis of functional DNA evolution are statements on nucleotide proportions across the entire genome and in functional regions, respectively.
(4) The neutralist-selectionist debate is a battle on two fronts: the proportion of substitutions resulting from positive selection (α) and the proportion of polymorphisms resulting from balancing selection (β). It does not concern the proportions of polymorphisms under directional (positive or purifying) selection.
(5) The neutralist-selectionist debate does not concern the question of whether truly neutral mutations exist, but the question of how often the selective value of a mutation permits genetic drift to override selection. If we acknowledge that genetic drift can govern the fate of non-conservative mutations, the neutral hypothesis has implications beyond the domain of molecular evolution, and predicts that certain phenotypic differences between species may be neutral.
(6) The phrase 'non-Darwinian evolution' is a misnomer. Neutral mutations indirectly facilitate the adaptation process, by providing an escape route out of local fitness optima. Evidence that certain traits have been fine-tuned by natural selection (such as codon usage or local mutation rates) should not be mistaken for evidence against the neutral theory.
(7) Many arguments against the neutral hypothesis, including arguments relating to the molecular clock and Lewontin's paradox, refute the null hypothesis of strict neutrality rather than any version of the neutral hypothesis. Vice versa, observations claimed to support the neutral hypothesis, such as the high substitution rates in non-functional regions and synonymous sites, are often equally consistent with the selectionist hypothesis.
(8) The neutral hypothesis of protein evolution and the hypothesis of functional DNA evolution are most clearly interpreted through meta-analyses of protein-coding genes and regulatory regions, rather than from indirect inferences obtained through evaluation of genome-wide patterns.

XI. ACKNOWLEDGEMENTS
Open Access funding enabled and organized by Projekt DEAL.

XII. AUTHOR CONTRIBUTIONS
This review has been written by MJ, under the supervision of CO, RH and AJ.

Fig. 2. Schematic overview of the neutral theory and the neutralist-selectionist debate. Green text presents arguments in favour of the neutral hypothesis, red text lists opposing views. The arguments shown relate to the neutral hypothesis of protein evolution (i.e. proportion of adaptive amino acid substitutions). The figure does not include arguments relating to the neutral hypothesis of DNA evolution (i.e. proportion of adaptive nucleotide substitutions), such as the prevalence of silent and non-coding substitutions. The validity of arguments is evaluated with respect to the revised neutral hypothesis, which acknowledges purifying selection.

Fig. 4. The neutral hypothesis can be interpreted in various ways. Grey shading indicates one possible interpretation, namely: 'Most substitutions involving single-base replacements in functional DNA are nearly neutral'. The Roman numerals denote the sections of this review in which each potential source of ambiguity is discussed.

Fig. 5. Flowchart depicting (dis)agreements between selectionists and neutralists. Selectionists and neutralists agree on many aspects of molecular evolution, except on the question of whether most substitutions and polymorphisms in proteins and functional DNA are caused by genetic drift or instead by positive and balancing selection. $N_e$ denotes effective population size, and $s$ denotes the selection coefficient.

(2) Ecological phenotypic divergence
In theory, the neutral hypothesis leaves open the possibility that all or most of the phenotypic differences observed […]

(4) Testing for neutral phenotypic divergence
If the accumulation of nearly neutral substitutions indeed causes phenotypic divergence (as posited by the extended neutral hypothesis), then levels of molecular and phenotypic evolution are expected to correlate: both would depend on […]
[…] commented: 'The neutrality theory is a hypothesis with rich empirical content, according to Popper's criterion of falsifiability. It makes precise predictions about the nature and […]
[…] agreed: 'It has been remarkably difficult to distinguish experimentally or analytically between the nearly neutral theory and evolution […]

Table 2. (A) Three different definitions of neutrality. $N_e$ denotes effective population size, $s$ denotes the selection coefficient, and $z$ denotes an arbitrary threshold value, typically assumed to range between 0 and 2. (B) The neutral theory of molecular evolution and the Darwinian theory of evolution by natural selection are not directly comparable because they apply to different levels of biological organisation.
[…] speculated that differences in effective population size are cancelled out by differences in mutation rate (measured per generation, $\mu_g$). They wrote (p. 469): 'The species with short generation time [and hence lower $\mu$] tends to have small body size and attain a large population number, while the species […]