The evolutionary impact of population size, mutation rate and virulence on pathogen niche width

Understanding the evolution of pathogen niche width is important for predicting disease spread and the probability that pathogens can emerge in novel hosts. Findings from previous theoretical studies often suggest that pathogens will evolve to be specialists in specific host environments. However, several of these studies make unrealistic assumptions regarding demographic stochasticity and the ability of pathogens to select their hosts. Here, an individual‐based model was used to predict how population size, virulence and pathogen mutation rate affect the evolution of niche specialism in pathogens. Pathogen specialism evolved regardless of virulence or population size; thus, the findings of this study are somewhat consistent with those of previous work. However, because specialist pathogens had only a weak selective advantage over generalist pathogens, high mutation rates caused random trait variation to accumulate, preventing the evolution of specialism. Mutation rate varies greatly across different species and strains of pathogen. By showing that high mutation rates may prevent pathogen specialism evolving, this study highlights an intrinsic pathogen trait that may influence the evolution of pathogen niche width.


| INTRODUCTION
The breadth of an organism's niche defines how generalist or specialist that organism is (Futuyma & Moreno, 1988). In the context of pathogens, a generalist could be described as a pathogen that has low variance in infectivity across different host phenotypes (Leggett et al., 2013). As generalist pathogens are able to infect a range of host phenotypes, it follows that generalist pathogens are more likely than specialists to be able to infect novel hosts (Antonovics et al., 2013).
Thus, because many infectious diseases emerge as a result of pathogens being transmitted to new species, identifying conditions that lead to the evolution of pathogen generalism may help to predict future disease outbreaks.
There are evolutionary advantages to being able to exploit a wide range of habitats, such as an improved ability to survive in a highly variable environment (Gilchrist, 1995). However, although most known pathogens are capable of infecting more than one host species, pathogens are typically less likely to infect and cause disease in novel hosts (Kuiken et al., 2006; Woolhouse et al., 2001). This may be because there is often an evolutionary trade-off associated with generalism (Whitlock, 1996). Trade-off theory predicts that generalists will be able to exploit a wider range of habitats than specialists, but specialists can exploit a narrower subset of habitats more successfully (MacArthur, 1984; Stearns, 1989). In pathogens, this may mean that generalists are able to infect a larger range of host phenotypes than specialists. However, specialist pathogens may have a higher fitness than generalists when infecting specific host phenotypes. Although directly measuring trade-offs is difficult (Fry, 2003), there is some evidence to suggest that trade-offs associated with generalism do occur in pathogens (Bono et al., 2017; Cooper & Lenski, 2000; Visher & Boots, 2020). As such, pathogen generalism may only be selected under specific circumstances where the benefits of generalism outweigh the costs.
There is a large body of evidence suggesting that high environmental stochasticity selects for generalist species whose fitness is more consistent across a range of habitat conditions than that of a specialist (Davey et al., 2012; Wilson & Yoshimura, 1994). For commensal animals, climate change and anthropogenic disturbance are likely to be major sources of environmental stochasticity. For parasites and pathogens, environmental stochasticity may be largely determined by variation across host environments. However, comparatively little is known about how environmental stochasticity affects the evolution of generalism in pathogens, or which characteristics of host populations make them more likely to constitute a stochastic environment. Ecological theory predicts that small populations are more likely to experience higher levels of phenotypic stochasticity through time than large populations (Dennis, 2002; Fisher et al., 2020; Møller & Legendre, 2001). It is possible that increased phenotypic stochasticity creates a more variable host environment for pathogens. As such, small host population size may favour the evolution of generalist pathogens that have more consistent success across fluctuating host environments than specialists. The theory that exposure to a more varied host environment selects for generalist pathogens is supported by an empirical and theoretical study, in which exposure to tissue from multiple species led to the evolution of RNA viruses that were more successful on novel tissue types (Ogbunugafor et al., 2010; Turner et al., 2010). However, there is contrasting evidence from several theoretical studies which predict that specialism should evolve in pathogens even when the host population is variable (Kawecki, 1998; Papaïx et al., 2014).
In addition to the extrinsic factors that may affect the evolution of generalism, there may also be intrinsic pathogen characteristics that affect the evolution of generalism. Very high mutation rates, as are common in RNA viruses, can increase genetic diversity in a population. High genetic diversity increases the adaptive potential of a population and has previously been suggested to lead to adaptation to specific hosts in pathogens (Greischar & Koskella, 2007; Morgan et al., 2005). However, whether there is a causal link between mutation rate and pathogen niche width remains unclear. As well as mutation rate, virulence may play an important role in the evolution of pathogen specialism. Both theoretical and laboratory studies have shown that high virulence can drive antagonistic co-evolutionary cycles whereby hosts continuously evolve to be less vulnerable to infection (Luijckx et al., 2013; Morgan et al., 2005). This can lead to rapid fluctuations in the genetic and phenotypic composition of host populations (Paterson et al., 2010). Rapidly changing host genotype frequencies will create a more variable host environment, and this may select for pathogens that have consistent infection success across a range of host environments. However, it is not known whether rapid changes to host phenotype frequencies brought about by high pathogen virulence select for pathogen generalism.
Theoretical models have been extremely useful for providing predictions about conditions that can lead to the evolution of important epidemiological traits (Day et al., 2020; Gandon & Michalakis, 2002). Moreover, there are several theoretical studies which analyse the evolutionary ecology of niche width in pathogens and parasites. Despite providing valuable insights, many of these studies fail to incorporate several potentially important ecological details into their models; these include demographic stochasticity, continuous variation in host and pathogen phenotypes, and the inability of pathogens to actively select their preferred hosts. In this study, an individual-based co-evolutionary model was developed to explore how population size, pathogen virulence and pathogen mutation rate affect the evolution of inter-host generalism where there is a fitness trade-off associated with generalism. It was predicted that (a) demographic stochasticity would increase at smaller population sizes, causing pathogens to evolve to be more generalist; (b) high virulence would cause large fluctuations in the phenotypic composition of host populations and select for generalist pathogens; and (c) high pathogen mutation rates would select for specialist pathogens by increasing phenotypic diversity in the pathogen population, allowing pathogens to rapidly adapt to specific host phenotypes, removing the need to be generalist. (Table 1).

| Infection probability
The probability that a pathogen can infect a host is negatively related to the numerical difference between the host susceptibility and pathogen infectivity traits (|h − p|). As such, a particular pathogen is most likely to infect a host when |h − p| = 0 and least likely to infect when |h − p| = 1. In this sense, the model is similar to a matching-allele model in that infection probability is determined by compatibility between hosts and pathogens (Gandon & Michalakis, 2002; Luijckx et al., 2013). Each pathogen is also assigned a second trait (α) which can be 0, 0.5 or 1. The α trait modulates the strength of the effect that |h − p| has on the probability that the pathogen will infect a particular host. In other words, α is used to determine the degree of host specificity displayed by the pathogen and therefore how generalist or specialist the pathogen is. The shape of the curve describing the relationship between infection probability and host phenotype has been shown to have a significant effect on pathogen evolution (Gudelj et al., 2004; Lievens et al., 2020; Regoes et al., 2000). To this end, the model was run under two scenarios: one in which it was assumed that there was a negative sigmoidal relationship between |h − p| and the probability that the pathogen will infect the host (μ), and another in which a linear relationship was assumed (Figure 1). For the sigmoidal scenario, the probability that pathogen i will infect host j is [equation missing], where φ is the gradient coefficient that determines the shape of the curve describing the relationship between |h − p| and μ. The value of φ was kept constant at 3 so that the relationship between |h − p| and μ was substantially different from linear when α = 1 (Figure 1a). The linear relationship (Figure 1b) between |h − p| and μ was formulated as [equation missing]. For both the sigmoidal and linear scenarios, when α = 1, μ varies from 1 to 0 as |h − p| varies from 0 to 1; however, when α = 0, μ remains constant at 0.5.
A pathogen with an α of 1 would have very high host specificity and would only be likely to infect phenotypically similar hosts (specialist), whereas a pathogen with an α of 0 would be equally infective to all hosts (generalist).
Thus, in this study, a commonly accepted definition of ecological generalism is used (Davey et al., 2012;Devictor et al., 2008;Julliard et al., 2006), and a generalist pathogen is defined as a pathogen that has lower variation than a specialist in its infection probability (μ) across a range of host phenotypes.
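The two infection scenarios can be illustrated with a short sketch. The functional forms below are assumptions, chosen only because they reproduce the properties stated in the text (μ runs from 1 to 0 as |h − p| runs from 0 to 1 when α = 1, and μ stays at 0.5 when α = 0); they are not the model's own equations. The function names `mu_linear` and `mu_sigmoid` are hypothetical.

```python
import math


def mu_linear(h, p, alpha):
    """Assumed linear form: mu = 0.5 + alpha * (0.5 - |h - p|)."""
    return 0.5 + alpha * (0.5 - abs(h - p))


def _logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def mu_sigmoid(h, p, alpha, phi=3.0):
    """Assumed sigmoidal form (illustrative only).

    A logistic curve in |h - p| centred on 0.5 is rescaled so that it
    equals exactly 1 at |h - p| = 0 and 0 at |h - p| = 1, then pulled
    towards 0.5 as alpha decreases.
    """
    d = abs(h - p)
    raw = _logistic(phi * (1.0 - 2.0 * d))       # decreasing in d
    low, high = _logistic(-phi), _logistic(phi)  # values at d = 1 and d = 0
    scaled = (raw - low) / (high - low)
    return 0.5 + alpha * (scaled - 0.5)


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        row = [round(mu_sigmoid(0.0, d, alpha), 3) for d in (0.0, 0.5, 1.0)]
        print(alpha, row)
```

Under either assumed form, a pathogen with α = 0.5 sits between the two extremes: its infection probability still declines with |h − p|, but over a narrower range around 0.5.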

| Pathogen fitness
Only pathogens in the inter-host environment are infectious.
Pathogens in the inter-host environment encounter hosts randomly, as is the case with airborne and waterborne diseases. It is assumed [text missing]. After infecting a host, pathogens may reproduce asexually within the host. The probability that a pathogen will reproduce (r_p) is proportional to the fitness of that pathogen relative to the fitness of all other pathogens in the population. Thus, r_p of pathogen i is [equation missing]. Because pathogens must infect a host before reproducing, the pathogen population size scales directly with the number of infected hosts such that P_{t+1} = I_t. New pathogens are selected from a multinomial distribution such that the pathogen population at time point [equation missing]. Pathogen offspring are either clones of their parents or mutants. The p and α traits mutate independently with probability m_p. Thus, if p mutates it is replaced by a random value from 0 to 1, and if α mutates it is replaced by a randomly selected value of 0, 0.5 or 1. After reproduction, pathogen offspring are released into the inter-host environment and the parent pathogens will either (a) be removed from the population if the host dies or (b) remain inside the host until the host recovers. Thus, parent pathogens cannot re-enter the inter-host environment, meaning that the infectious pathogen population is replaced at each time point.
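The reproduction step described above can be sketched as follows. This is a minimal illustration, not the published code: it assumes that each pathogen's reproduction probability is its fitness divided by the summed fitness of all pathogens, and it takes the fitness values as given because their definition is not recoverable from this text. The function name `reproduce` and the tuple representation of a pathogen are hypothetical.

```python
import random

ALPHA_VALUES = (0.0, 0.5, 1.0)


def reproduce(pathogens, fitness, n_infected, m_p, rng):
    """Return the next infectious pathogen population.

    pathogens  : list of (p, alpha) tuples for the current population
    fitness    : list of non-negative fitness values, one per pathogen
    n_infected : number of infected hosts, which sets the next population size
    m_p        : per-trait mutation probability
    rng        : a random.Random instance
    """
    total = sum(fitness)
    r_p = [w / total for w in fitness]  # assumed relative-fitness weights
    # Sampling parents with replacement is a multinomial draw of offspring counts.
    parents = rng.choices(pathogens, weights=r_p, k=n_infected)
    offspring = []
    for p, alpha in parents:
        if rng.random() < m_p:   # p mutates independently of alpha
            p = rng.random()     # replaced by a random value in [0, 1)
        if rng.random() < m_p:   # alpha mutates independently of p
            alpha = rng.choice(ALPHA_VALUES)
        offspring.append((p, alpha))
    return offspring


if __name__ == "__main__":
    rng = random.Random(42)
    population = [(0.2, 1.0), (0.8, 0.0), (0.5, 0.5)]
    print(reproduce(population, [0.6, 0.3, 0.1], 5, 0.05, rng))
```

Because the whole population is redrawn from the parents at each step, the sketch mirrors the statement that the infectious pathogen population is replaced at every time point.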

| Host fitness
Hosts remain infected for one time point, after which they either die or recover. The infection period is deliberately kept short so that the host population evolves in response to fitness consequences imposed by infectious pathogens that exist within the same time point as the host. Host mortality and fecundity are assumed to have a hyperbolic relationship with infection load (n_p) as described by the formula n_p / (n_p + θ), where θ determines the gradient of the curve between n_p and host mortality (Figure S1). Hence, when θ is small, infection by a pathogen has a greater effect on the probability of host death than when θ is large. As such, smaller θ values correspond to more virulent pathogens. The probability that a host will die is positively [text missing]. This study aims to investigate long-term evolutionary dynamics between pathogens and their long-term hosts, not the impact of disease on extinction. As such, x was assigned a value that allowed populations to fluctuate while ensuring the long-term survival of the host population under a variety of conditions (Figure S2).
Given the huge natural variation present in reproductive rates across species, the specific value of x is arbitrary.
The probability of a reproductive opportunity being successful is negatively associated with host pathogen load (n_p) as described by the formula 1 − n_p / (n_p + θ) (Figure S1). The size of the host population at time t + 1 is therefore [equation missing]. The offspring of hosts are either clones of their parents or mutants. Mutant offspring are produced with probability m_h. When the offspring of a host mutates, its h value is replaced by a value selected randomly from 0 to 1. For all of the simulations, host mutation probability remained constant at 0.01 (1%).
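A small worked example may help to show how θ maps onto virulence. The sketch below reads the hyperbolic term as n_p / (n_p + θ) and its complement as the probability that a reproductive opportunity succeeds; this reading, and the function names `infection_cost` and `reproduction_success`, are assumptions. The full host mortality probability includes further terms that are not recoverable from this text, so only the hyperbolic component is shown.

```python
def infection_cost(n_p, theta):
    """Hyperbolic effect of infection load n_p; rises faster when theta is small."""
    return n_p / (n_p + theta)


def reproduction_success(n_p, theta):
    """Probability that a reproductive opportunity succeeds (assumed complement)."""
    return 1.0 - infection_cost(n_p, theta)


if __name__ == "__main__":
    n_p = 100
    for theta in (300, 200, 100):  # the theta values used in the figure panels
        print(theta, infection_cost(n_p, theta), reproduction_success(n_p, theta))
```

With an infection load of 100, the cost term is 0.25 at θ = 300 and 0.5 at θ = 100, which is why smaller θ values correspond to more virulent pathogens.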

| RESULTS
The results of the simulations did not change depending on whether the linear or negative sigmoidal trade-off curves were used.

| The dynamics of h and p
In all simulations, coevolution between the host susceptibility trait (h) and pathogen infectivity trait (p) was cyclical (Figure 2), which is a common symptom of antagonistic coevolution between hosts and pathogens (Kawecki, 1998; Sasaki, 2000). Variation in pathogen and host population size had no obvious effect on the magnitude and frequency of co-evolutionary cycling between the h and p trait values (Figure 2a-c). Increasing the pathogen mutation probability (m_p) had a dampening effect on the co-evolutionary cycles between h and p (Figure 2d-f). As the value of θ was reduced (and pathogen virulence increased), the magnitude and frequency of co-evolutionary cycles between h and p increased. As such, when θ was at its smallest value (which corresponds to the highest level of pathogen virulence), temporal variation in h and p was at its highest. The graphical observations of h and p dynamics are supported by the formal quantification of phenotypic change in h and p through time (Figure S5).

| The evolution and fitness of specialist pathogens
Across all simulations, the proportion of specialists (α = 1) in the pathogen population was higher than the proportion of medium (α = 0.5) and generalist (α = 0) pathogens (Figure 3). Moreover, when pathogen mutation probability was at its lowest value (0.01) the pathogen population was composed almost exclusively of specialists. High frequencies of specialists were maintained in the pathogen population regardless of host and pathogen population size (H and P, respectively) and regardless of θ (Figure 3a-c, g-i). However, as pathogen mutation probability (m_p) increased, the proportion of specialists in the pathogen population decreased towards 0.5 (Figure 3d-f).
FIGURE 2 Coevolutionary cycling between the mean pathogen infectivity (p) and host susceptibility (h) traits through time. In panels (a-c), host population carrying capacity (K) varied between 250 (a), 500 (b) and 1,000 (c). In panels (d-f), pathogen mutation probability (m_p) varied between 0.01 (d), 0.05 (e) and 0.1 (f). In panels (g-i), θ varied between 300 (g), 200 (h) and 100 (i). Default values for K, m_p, host mutation probability (m_h) and θ were 1,000, 0.01, 0.01 and 100, respectively.
For simulations where α was kept constant (unable to evolve), there was variation in pathogen fitness at the early time points, with specialist (α = 1) and medium (α = 0.5) pathogens having higher fitness than generalist pathogens (α = 0). However, at the later time points (approximately t > 50), pathogen fitness converged between the different α variants (Figure 4). Fitness variation between the different α strains was not sensitive to variation in pathogen or host population size, variation in pathogen virulence (θ) or variation in pathogen mutation probability (m_p) (Figure 4).

| Interactions between m_p, θ and population size
The relationship between pathogen mutation probability (m_p) and the frequency of specialists in the pathogen population was not sensitive to variation in pathogen virulence (θ) or variation in pathogen and host population size (Figure 5). However, at lower population sizes there were a greater number of instances in which the frequency of specialists in the pathogen population fell below 0.4, meaning that the majority of the pathogen population was composed of nonspecialists.

| DISCUSSION
The results of this study largely support the idea that pathogens evolve to be specialists; that is, rather than evolving to be equally infectious across a range of host phenotypes, pathogens evolve to be highly infectious to a small subset of host phenotypes. The evolutionary dynamics between pathogen infectivity and host susceptibility traits were sensitive to changes in pathogen virulence and pathogen mutation probability. However, variation in the evolutionary dynamics between pathogen infectivity and host susceptibility did not appear to affect the evolution of pathogen specialism.
Being a specialist only earned pathogens a small fitness advantage over generalists, meaning that selection for specialism was weak. As such, the evolution of pathogen specialism was not robust to large amounts of phenotypic change brought about by high pathogen mutation probability.
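The balance between weak selection and frequent mutation can be illustrated with a deterministic toy recursion. This is not the individual-based model: it assumes an infinite population, a constant relative fitness advantage s for specialists (a hypothetical parameter), and mutation that redraws α uniformly from its three values with probability m. The function name `specialist_equilibrium` is likewise hypothetical.

```python
def specialist_equilibrium(m, s, generations=2000):
    """Iterate selection followed by mutation for three alpha classes.

    The first class is the specialist with relative fitness 1 + s; the
    other two classes have relative fitness 1 (an assumption made only
    for illustration). Returns the specialist frequency after iterating.
    """
    freqs = [1 / 3, 1 / 3, 1 / 3]
    fitness = [1.0 + s, 1.0, 1.0]
    for _ in range(generations):
        mean_w = sum(f * w for f, w in zip(freqs, fitness))
        selected = [f * w / mean_w for f, w in zip(freqs, fitness)]
        # A mutated alpha is redrawn uniformly from the three possible values.
        freqs = [(1 - m) * f + m / 3 for f in selected]
    return freqs[0]


if __name__ == "__main__":
    for m in (0.01, 0.05, 0.1):
        print(m, round(specialist_equilibrium(m, s=0.05), 3))
```

In this toy setting, raising m pulls the specialist frequency towards the neutral value of one third whenever s is small, which is the qualitative pattern described above; the numbers themselves should not be read as predictions of the model.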
Coevolutionary cycles between two species or strains are typically formed when the success of one species or strain negatively affects the success of the other (Van Valen, 1973). When this negative association is particularly strong, co-evolutionary selection pressure can lead to accelerated rates of phenotypic and genetic change through time (Nair et al., 2019; Sasaki & Godfray, 1999). In this study, it was predicted that high virulence would lead to high variation in host phenotypes and therefore select for generalist pathogens.
Although high virulence did lead to greater variation in host phenotype through time (Figure 2g-i; Figure S5), specialist pathogens were still selected for (Figure 3g-i). This is likely to be due to the ability of the pathogens in this model to evolve such that their infectivity traits tightly tracked the evolutionary path of host susceptibility traits (Figure 2). Also, because there was a trade-off associated with generalism in this model, being equally infective to multiple host phenotypes would be maladaptive if pathogens can continuously evolve to be highly infectious to the host population. In nature, the adaptive potential of a pathogen is typically higher than that of its host owing to much shorter generation times. It is possible that this adaptive potential allows for tight coevolution between pathogens and hosts in natural systems and therefore selects for host specialism (Gandon & Michalakis, 2002; Morgan et al., 2005). Although specialist pathogens are less likely to be able to infect novel hosts, local adaptation to specific host phenotypes will likely increase evolutionary divergence among pathogen strains and thus increase pathogen diversity. This may be particularly true in populations where genetic and phenotypic diversity among hosts is high (Kang et al., 2009).
The ability of pathogens in this model to rapidly adapt to changes in the phenotypic structure of the host population could also explain why any potential differences in phenotypic stochasticity brought about by variation in population size did not affect the evolution of pathogen specialism (Figure 3a-c). However, rare instances in which the proportion of specialists in the pathogen population fell below 0.4 occurred more frequently at small population sizes (Figure 5). This phenomenon is likely to be due to the simple fact that at small population sizes, each random variant represents a larger proportion of the population. As such, in small populations, the probability that a high proportion of the population will be composed of random variants is higher than in a large population.
Although this mechanism for stochasticity may seem mathematically obvious, it could provide insights into how the probability of the emergence of rare pathogen variants scales with host and pathogen population size.
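The scaling argument can be made concrete with a binomial calculation. The sketch assumes, purely for illustration, that each of n pathogens independently carries a random (mutated) α value with probability m, and asks how likely it is that such variants make up at least a given fraction of the population. The function names and the example threshold of 0.15 are hypothetical choices, not quantities from the model.

```python
import math


def binom_tail(n, m, k_min):
    """P(X >= k_min) for X ~ Binomial(n, m), with 0 < m < 1.

    Probabilities are accumulated from log-space terms so that large n
    does not overflow.
    """
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(m) + (n - k) * math.log(1.0 - m)
        )
        total += math.exp(log_pmf)
    return total


def prob_variant_fraction_at_least(n, m, fraction):
    """Probability that random variants form at least `fraction` of n individuals."""
    return binom_tail(n, m, math.ceil(fraction * n))


if __name__ == "__main__":
    for n in (125, 250, 500, 1000):  # the carrying capacities used in Figure 5
        print(n, prob_variant_fraction_at_least(n, m=0.1, fraction=0.15))
```

Under these assumptions the tail probability shrinks rapidly as n grows, because the variance of the variant fraction scales as m(1 − m)/n.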
Traits evolve when the effect of selection on the frequency of a trait is stronger than random changes to trait frequency. As such, when random variation is frequently introduced into a population, selection must be sufficiently strong for directional evolution to take place. For example, very high rates of migration in viruses can homogenize gene frequencies and prevent local adaptation (Vogwill et al., 2008). In this study, specialist pathogen strains had substantially higher fitness than generalist strains at the early time points (t < 50). This early advantage of specialism is likely due to high levels of random preselection phenotypic variation in the host population.
In this scenario, the majority of pathogens would, by chance, be phenotypically similar to a subset of hosts, reducing the benefit of being equally infectious to all hosts. However, at the later time points, once co-evolutionary cycling had begun and hosts had started to 'evolve away' from pathogens, the fitness advantage of specialism decreased such that specialists were only marginally fitter than generalists (Figure 4). Pathogen specialism was therefore under weak selection. As a result, high variation in trait value brought about by high mutation probability prevented selection from propagating specialist trait values. This led to a decreased proportion of specialists in the pathogen population (Figure 3d-f). There is strong theoretical evidence to suggest a selective advantage of fast mutation rates for disease-causing pathogens (André & Godelle, 2006; M'Gonigle et al., 2009). This evidence is supported by the fact that high mutation rates are well documented in many disease-causing pathogens, particularly RNA viruses (Sniegowski et al., 1997; Woolhouse et al., 2001). Moreover, many RNA viruses are known to be able to infect multiple host species and are considered to be more likely to cause disease in novel hosts than DNA viruses. Previously, it was thought that high genetic variation caused by high mutation rates may be the evolutionary mechanism that allows many RNA viruses to infect novel hosts. Results from this model suggest that high mutation rates may increase pathogen generalism by limiting adaptation to specific host phenotypes, thereby potentially increasing the ability of pathogens to emerge in novel hosts.
FIGURE 3 Variation in the frequency of specialist (solid line, α = 1), medium (dashed line, α = 0.5) and generalist (dotted line, α = 0) pathogens through time. In panels (a-c), host population carrying capacity (K) varied between 250 (a), 500 (b) and 1,000 (c). In panels (d-f), pathogen mutation probability (m_p) varied between 0.01 (d), 0.05 (e) and 0.1 (f). In panels (g-i), θ varied between 300 (g), 200 (h) and 100 (i). Default values for K, m_p, host mutation probability (m_h) and θ were 1,000, 0.01, 0.01 and 100, respectively. Lines indicate mean values from 100 simulations.
FIGURE 4 Variation in pathogen fitness (w_p) for specialist (solid line, α = 1), medium (dashed line, α = 0.5) and generalist (dotted line, α = 0) pathogens through time. In panels (a-c), host population carrying capacity (K) varied between 250 (a), 500 (b) and 1,000 (c). In panels (d-f), pathogen mutation probability (m_p) varied between 0.01 (d), 0.05 (e) and 0.1 (f). In panels (g-i), θ varied between 300 (g), 200 (h) and 100 (i). Default values for K, m_p, host mutation probability (m_h) and θ were 1,000, 0.01, 0.01 and 100, respectively. Lines indicate mean values from 100 simulations.
The model presented in this study makes several assumptions which may limit how much the results reflect natural scenarios.
For example, it was assumed that there was no migration in or out of the host or pathogen population. In nature, migration between adjacent sub-populations is common in host and pathogen populations and has previously been shown to be important for disease spread and pathogen evolution (Gandon & Michalakis, 2002; Vogwill et al., 2008). Moreover, host migratory patterns vary greatly across taxa; this variation could have profound epidemiological effects. Flying animals may be of particular interest as they have the ability to transport pathogens over great distances to genetically distinct hosts. Frequent mixing of genetically distinct hosts may strongly select for generalist pathogens. Birds may prove to be an ideal study system for this question as they often form mixed-species flocks for the purposes of foraging and are known to harbour several zoonotic diseases (Liu et al., 2005; Olsen et al., 2006; Reed et al., 2003). As well as assuming no migration, this study also assumed that mutation in hosts and pathogens was unbounded (i.e. any phenotype could mutate into any other possible phenotype). However, there is some evidence from RNA viruses that suggests genetic diversity created from mutation may be limited by a 'mutational neighbourhood' (Burch & Chao, 2000). As such, this model may overestimate the amount of diversity that is created by high mutation rates. Finally, this model assumes that the trade-off between the probability of infection and host-pathogen compatibility is either linear or sigmoidal (Figure 1). Measuring the shape of a trade-off is very difficult to do empirically (though see Lievens et al., 2020); hence, it is not known what shape most accurately reflects natural trade-offs. Previous theoretical studies have shown that the shape of a trade-off can have large implications for the evolution of pathogen generalism (Gudelj et al., 2004; Regoes et al., 2000).
Thus, a more informed understanding of natural trade-offs would allow for more reliable host-pathogen trade-off models to be constructed.
In recent years, humans have modified landscapes at an unprecedented rate, causing an increase in the spatial overlap between animal populations (Hulme, 2017). As a result, the risk of diseases being spread to novel hosts has also increased (Gibb et al., 2020).
The emergence and spread of new diseases can have huge negative implications for both human and animal populations (Brearley et al., 2013; Williamson et al., 2020). To this end, identifying what makes pathogens more likely to evolve to be able to infect new hosts is becoming increasingly important. This study has shown that higher mutation rates may prevent specialism evolving in pathogens, resulting in pathogens that are more infectious to a wider range of hosts. These findings are supported by the current knowledge that pathogens which can infect multiple hosts are often RNA viruses with very high mutation rates. There are still gaps in the knowledge regarding what drives the evolution of pathogen generalism. For example, although the effect of animal ecology on disease spread is fairly well understood (Altizer et al., 2011), little is known about how variation in host ecology affects pathogen evolution in terms of niche width. Future theoretical work establishing a stronger link between host ecology and pathogen evolution would greatly improve our ability to predict which species are likely to carry generalist pathogens.

ACKNOWLEDGEMENTS
I would like to thank Andy Fenton, Mark Viney and Stephen Cornell for their helpful comments on the model and manuscript.
FIGURE 5 The impact of pathogen mutation probability (m_p) and virulence (θ) on variation in the proportion of the pathogen populations that were specialists (α = 1) after 100 time points. Panels (a-d) correspond to a host population carrying capacity (K) of 125, 250, 500 and 1,000, respectively. Host mutation probability (m_h) remained constant at 0.01.

CONFLICT OF INTEREST
The author has no conflict of interest to declare.

AUTHOR CONTRIBUTIONS
All work related to this manuscript was carried out by AMF.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/jeb.13882.

DATA AVAILABILITY STATEMENT
Code for the model is available at the following link: http://doi.org/10.5281/zenodo.4605612.