Can natural selection favour altruism between species?

Abstract

Darwin suggested that the discovery of altruism between species would annihilate his theory of natural selection. However, it has not been formally shown whether between-species altruism can evolve by natural selection, or why this could never happen. Here, we develop a spatial population genetic model of two interacting species, showing that indiscriminate between species helping can be favoured by natural selection. We then ask if this helping behaviour constitutes altruism between species, using a linear-regression analysis to separate the total action of natural selection into its direct and indirect (kin selected) components. We show that our model can be interpreted in two ways, as either altruism within species, or altruism between species. This ambiguity arises depending on whether or not we treat genes in the other species as predictors of an individual's fitness, which is equivalent to treating these individuals as agents (actors or recipients). Our formal analysis, which focuses upon evolutionary dynamics rather than agents and their agendas, cannot resolve which is the better approach. Nonetheless, because a within-species altruism interpretation is always possible, our analysis supports Darwin's suggestion that natural selection does not favour traits that provide benefits exclusively to individuals of other species.

Introduction

‘If it could be proved that any part of the structure of any one species had been formed for the exclusive good of another species, it would annihilate my theory, for such could not have been produced through natural selection’.

Darwin (1859, p. 201)

Darwin's (1859) theory of natural selection explains the process and purpose of organismal adaptation. Specifically, those heritable characters that are associated with higher individual reproductive success will tend to accumulate in biological populations under the action of natural selection. Hence, Darwin argued, individual organisms will appear increasingly well designed to maximize their reproductive success. Darwin (1859) also recognized that natural selection can work indirectly, through the reproductive success of family members, so as to favour characters that promote the reproductive success of an individual's close kin. Hamilton's (1963, 1964, 1970) theory of inclusive fitness expanded on this principle, showing that natural selection can favour the evolution of altruistic behaviour that reduces the actor's reproductive success provided that sufficient benefits accrue to the actor's kin.

Darwin (1859) suggested that natural selection would never favour altruism between individuals of different species. This appears to be borne out by empirical observation: although cooperative interactions between different species (mutualisms) are widespread in the natural world, these typically involve mechanisms that ensure return benefits accrue either to the actor or to her close kin (Foster & Wenseleers, 2006; Leigh, 2010; Bourke, 2011). For example, plants that form symbioses with mycorrhizae provide more carbohydrates to mutualistic partners that supply more nutrients, giving the mycorrhizae an incentive to cooperate (Kiers et al., 2011). This mutualism may involve mutually beneficial helping if sufficient return benefits accrue to the helpful mycorrhiza. Alternatively, it may involve altruistic helping, favoured owing to return benefits that accrue to the mycorrhiza's close kin forming symbioses with the same root. Hence, although altruism may occur in the context of mutualisms, it appears that this altruism is occurring within rather than between species.

From a theoretical perspective, although some authors have argued that natural selection cannot favour altruism between species (Foster et al., 2006; Bourke, 2011), others have argued that it can (Frank, 1994; Fletcher & Zwick, 2006; Fletcher & Doebeli, 2009). Hamilton's (1963, 1964, 1970) theory of inclusive fitness highlights that it is not kinship (i.e. genealogical relationship) per se that is needed in order for altruism to be favoured, but rather that the actor and recipient are genetically similar (i.e. genetic relatedness). Frank (1994) suggested that genetic relatedness could arise between species due to the action of selection in viscous populations in a way that could favour the evolution of altruism between species (see also Gardner et al., 2007). However, a formal analysis of when such genetic associations arise, when they will favour indiscriminate helping between species, and whether this helping fits the criteria for altruism between species remains to be undertaken.

Here, we first develop an infinite stepping stone population genetic model to provide a concrete illustration of whether and how indiscriminate helping can evolve between species. Previous theory has shown that: (i) population viscosity alone can favour the evolution of indiscriminate helping within a single species because it leads to a positive genetic relatedness between interacting individuals (Hamilton, 1964; Ohtsuki & Nowak, 2006; Ohtsuki et al., 2006; Grafen, 2007; Lehmann et al., 2007a; Taylor et al., 2007), (ii) the evolution of reciprocal helping between species can be facilitated by population structuring (Doebeli & Knowlton, 1998), and (iii) the evolution of indiscriminate helping between species can be favoured by transmission mechanisms that systematically force pairs of helpers together across generations (Yamamura et al., 2004; Gardner et al., 2007; Fletcher & Doebeli, 2009). Our aim here is to extend this previous theory, by examining whether population viscosity alone can favour indiscriminate helping between species.

We then examine whether such helping can be classified as between-species altruism using an inclusive fitness analysis. This requires dissecting the analytical conditions calculated for natural selection to favour indiscriminate between species helping in our model into separate inclusive fitness costs and benefits. In their most general sense, the costs and benefits of an inclusive fitness analysis are defined as least-squares regressions of fitness against genetic predictors (Queller, 1992; Gardner et al., 2011). We explore the consequences of allowing or disallowing the genes of other species to feature in this regression analysis, both in general and also using our stepping stone model as a concrete illustration. Our aim here is to determine both whether indiscriminate helping between species can qualify as altruism between species and whether it must be considered altruism between species.

Indiscriminate helping between species?

In this section, we ask whether indiscriminate helping between species can be favoured by natural selection despite fecundity costs to the helper. We develop an infinite stepping stone population model, derive analytical conditions for helping to be favoured in this model, and then check the robustness of these results using individual-based numerical simulations of finite populations.

Model

We consider two identical asexual haploid species – A and B – in a one-dimensional stepping stone model (Kimura & Weiss, 1964) with infinitely many consecutively numbered patches, each containing one individual of each species. Individuals vary only at a locus controlling social behaviour and may carry either an allele for helping (H) or carry a null allele and are nonhelpers (N; Ohtsuki & Nowak, 2006; Taylor et al., 2007; Taylor, 2010; Grafen & Archetti, 2008). The fecundity of an individual in patch i is given by Fi = 1 – cx + by, where x is her own helping genotype (x = 1 if H, x = 0 if N) and y is the helping genotype of her social partner (y = 1 if H, y = 0 if N; as illustrated in Fig. 1). Thus, 0 < c < 1 is the marginal fecundity cost of cooperation and b > 0 is the marginal fecundity benefit of cooperation.
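The fecundity function can be written out directly. The following is a minimal sketch in Python; the function name and the parameter values b = 8 and c = 0.5 are illustrative assumptions, not values taken from the model analysis.

```python
def fecundity(x: int, y: int, b: float, c: float) -> float:
    """Fecundity F = 1 - c*x + b*y of an individual.

    x is her own genotype (1 = helper H, 0 = nonhelper N) and
    y is the genotype of her heterospecific patch mate.
    """
    if x not in (0, 1) or y not in (0, 1):
        raise ValueError("genotypes must be 0 (N) or 1 (H)")
    return 1.0 - c * x + b * y


# Illustrative (assumed) parameter values.
b, c = 8.0, 0.5

# Fecundity for each combination of own genotype x and partner genotype y.
table = {(x, y): fecundity(x, y, b, c) for x in (0, 1) for y in (0, 1)}
```

Under these assumed values, a helper paired with a nonhelper has the lowest fecundity and a nonhelper paired with a helper has the highest, which is why helping is costly unless helpers of the two species tend to share patches.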

Figure 1.

Population structure. This figure shows a section of five patches in the population. The fecundity rate of individuals is affected by the individual of the other species with which they share a patch and they compete with individuals that reside two patches away. Shaded patches are inhabited by helpers, whereas white patches are inhabited by nonhelpers.

In every generation, we assume that all individuals die and that most are replaced by their clonal offspring, resulting in no genetic change within the patch. However, a small fraction of individuals are chosen at random to die without reproducing in this way, in which case their two conspecific neighbours in adjacent patches compete to fill the vacant breeding spot with one of their own offspring. If the fecundity of the neighbour in patch i−1 is Fi−1 and the fecundity of the neighbour in patch i+1 is Fi+1, then the probability that a vacant spot in patch i is filled with an offspring of the neighbour in patch i−1 is Fi−1/(Fi−1 + Fi+1) and the probability that it is filled with an offspring of the neighbour in patch i+1 is Fi+1/(Fi−1+Fi+1).
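The replacement rule for a vacant breeding spot can be sketched as follows. This is an illustrative Python fragment rather than the simulation code used for the figures; the function names are assumptions, and the handling of the case where both neighbours have zero fecundity is not specified in the text, so the sketch simply raises an error.

```python
import random


def fill_probabilities(f_left: float, f_right: float):
    """Probabilities that a vacancy in patch i is filled by the neighbour
    in patch i-1 (left) or in patch i+1 (right), given their fecundities."""
    total = f_left + f_right
    if total <= 0:
        # Not specified in the model description; treated as an error here.
        raise ValueError("at least one neighbour must have positive fecundity")
    return f_left / total, f_right / total


def fill_vacancy(geno_left: int, geno_right: int,
                 f_left: float, f_right: float,
                 rng: random.Random) -> int:
    """Return the genotype of the clonal offspring that takes the vacancy."""
    p_left, _ = fill_probabilities(f_left, f_right)
    return geno_left if rng.random() < p_left else geno_right


# Example: a helper neighbour with fecundity 8.5 competes against a
# nonhelper neighbour with fecundity 1.0 (assumed illustrative values).
p_left, p_right = fill_probabilities(8.5, 1.0)
```

Because the two probabilities depend only on the ratio of the neighbours' fecundities, a helper whose patch mate also helps is far more likely to claim the vacancy than a lone nonhelper.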

Evolution of helping

We consider a resident population of nonhelpers into which we introduce helpers of both species at random and at low frequency. Most helpers will leave no descendants in the long term, owing to them never meeting helpers of the other species, and hence, being outcompeted by their nonhelping conspecific neighbours. However, there is a nonzero probability that any helper of species A will eventually meet a helper of species B. If this happens, then there may be a nonzero probability that these helpers will give rise to an expanding chain of patches that contain a helper of each species, leading the local frequency of helpers to increase when:

display math(1)

(see Appendix for derivation).

More realistically, we should take into account the effects of interactions between chains of patches that contain a helper of each species, and also the effect of mutation, which is the ultimate source of genetic variation. We assume that nonhelpers transform into helpers, and vice versa, at a low rate in each generation. The evolutionary dynamics are consequently complicated by the fact that nonhelpers may appear within expanding chains of patches containing a helper of each species, whether due to the junction of two pre-existing chains or fresh mutational input. Thus, it is not sufficient that chains of helpers tend to increase in length (i.e. inequality (1)), but also these chains must expand faster than the subpopulations of nonhelpers that appear within them. This gives rise to a more stringent condition for natural selection to favour helping:

display math(2)

We obtain this result irrespective of the relative rates of mutation in each direction (see Appendix for derivation), although the derivation requires low absolute rates. These analytical results are readily confirmed by numerical simulation with higher mutation rates, also revealing the robustness of the results to relaxation of the assumption of infinite population size (Figs 2 and 3, see Appendix for details). Natural selection favours helping when the benefit is greater than approximately 7.13 for a cost near 0 and when the benefit is greater than approximately 6.39 for a cost near 1. The required level of benefit changes almost linearly with cost. We notice that the cost has a relatively small effect on whether or not natural selection favours helping.
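As a rough guide only, the two reported endpoints can be joined by a straight line. The sketch below is a linear interpolation between the stated thresholds (about 7.13 for a cost near 0 and about 6.39 for a cost near 1); it is not inequality (2) itself, and the function name is an assumption made for illustration.

```python
# Approximate benefit thresholds reported in the text.
B_AT_C0 = 7.13  # threshold for a cost near 0
B_AT_C1 = 6.39  # threshold for a cost near 1


def approx_benefit_threshold(c: float) -> float:
    """Linear interpolation of the benefit threshold as a function of cost c.

    This only approximates the analytical condition, relying on the
    observation that the threshold changes almost linearly with cost.
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("cost c must lie between 0 and 1")
    return B_AT_C0 + (B_AT_C1 - B_AT_C0) * c


midpoint = approx_benefit_threshold(0.5)  # roughly 6.76
```

The slope of this line is about −0.74 per unit cost, small relative to the threshold itself, which illustrates the weak dependence on cost.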

Figure 2.

Condition for the expected local frequency of helpers to increase under natural selection. The line indicates the analytically derived condition for the expected local frequency of helpers to increase under natural selection (inequality (1) is satisfied above the line). Black dots indicate parameter values where helpers are significantly fitter than a neutral allele, white dots where they are significantly less fit and grey dots where there is no significant difference in fitness at a 95% confidence level (see Appendix for details).

Figure 3.

Condition for natural selection to favour helpers. The line indicates the analytically derived condition for natural selection to favour helpers (inequality (2) is satisfied above the line). Dots indicate parameter values tested by simulation and darker dots indicate greater evolutionary success of helpers (see Appendix for details).

Paradoxically, we find that between species helping is promoted when the fecundity cost of helping is higher, as higher values of c make conditions (1) and (2) less stringent. This is because a larger cost is associated with stronger selection, and selection is responsible for generating a statistical association between species, such that helpers of one species are more likely to be associated with helpers of the other species (see Appendix; Gardner et al., 2007). Natural selection can even favour helping when the fecundity cost is 1, meaning that helpers who do not share patches with other helpers cannot successfully place offspring into adjacent patches. This maximizes the association between helpers in the two species.

Altruism between species?

We have determined that indiscriminate helping between species can be selectively favoured in the face of fecundity costs. But can this helping be considered as altruism between species? To address this problem, we need to calculate the cost and benefit terms of Hamilton's rule. In empirical studies, fecundity and survival effects are often used as readily measured proxies for these costs and benefits. However, from a theoretical perspective, the benefit and cost terms of Hamilton's rule are not just the fecundity and/or survival effects (Rousset, 2004; Gardner et al., 2011). Most generally, the benefit and cost terms of Hamilton's rule are defined as marginal fitness effects, which are computed by means of least-squares regression of fitness against genetic predictors (Queller, 1992; Gardner et al., 2007, 2011).

Here, we briefly review: (i) how least-squares regression methodology can be used to formally separate individual fitness into its direct versus indirect (kin selected) effects (Queller, 1992; Frank, 1997a,b, 1998; Gardner et al., 2007, 2011), (ii) how these fitness effects are used to classify social behaviours as altruistic, selfish, mutually beneficial or spiteful (Hamilton, 1964; West et al., 2007), and (iii) how these fitness effects are weighted by coefficients of relatedness to yield Hamilton's rule of kin selection (Hamilton, 1963, 1964, 1970; Queller, 1992; Gardner et al., 2011). We then: (iv) describe an ambiguity that arises in the application of these methods to our evolutionary model of helping between species, and (v) show that this ambiguity has a bearing upon whether or not such helping is classified as altruism between species.

Direct fitness versus indirect fitness

An individual's fitness w is her expected lifetime number of offspring that survive to breed in the next generation. Fitness depends not only on an individual's own genotype but also on the genotypes of her social partners. We may calculate the separate fitness effects by fitting an equation of the following form to population data by the method of least squares:

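$$w = E(w) + \beta_{w,x \mid x_1,\ldots,x_n}\,(x - E(x)) + \sum_{j=1}^{n} \beta_{w,x_j \mid x,\,x_{k \neq j}}\,(x_j - E(x_j)) + \varepsilon \qquad (3)$$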

where the predictor x is the focal individual's genic value for helping, and the predictors x1, …, xn are the genic values of the individual's n social partners, that is, those individuals whose genotypes mediate the focal individual's fitness (e.g. Gardner et al., 2011). The partial regression coefficient $\beta_{w,x \mid x_1,\ldots,x_n}$ describes the effect of the individual's own genic value on her fitness, holding fixed the genic values of her n social partners, and defines the direct fitness effect −C. The partial regression coefficient $\beta_{w,x_j \mid x,\,x_{k \neq j}}$ describes the effect of the individual's jth social partner's genic value on her fitness, holding fixed the genic value of the focal individual and the genic values of her n−1 other social partners, and defines an indirect fitness effect Bj. We note that any partition of fitness that includes the focal individual's genic value for helping allows us to recover the total fitness effect of helping. The effects of genic values that mediate the focal individual's fitness but are not used in the partition will be redistributed into the fitness effects of the other predictors included in the analysis.
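To make the regression partition concrete, the following Python sketch fits fitness against a focal genic value and a single social partner's genic value (n = 1) by least squares. The toy population, the fitness function used to generate it and the function names are all hypothetical and chosen only for illustration; they are not derived from the stepping stone model.

```python
from statistics import fmean


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    return fmean([(ai - ma) * (bi - mb) for ai, bi in zip(a, b)])


def partial_effects(w, x, x1):
    """Least-squares partial regression coefficients of fitness w on the
    focal genic value x and one partner's genic value x1."""
    vx, v1, cxx1 = cov(x, x), cov(x1, x1), cov(x, x1)
    cwx, cw1 = cov(w, x), cov(w, x1)
    det = vx * v1 - cxx1 ** 2
    if det == 0:
        raise ValueError("predictors are collinear or do not vary")
    beta_x = (cwx * v1 - cw1 * cxx1) / det
    beta_x1 = (cw1 * vx - cwx * cxx1) / det
    return beta_x, beta_x1


# Hypothetical population of (focal, partner) genic values in which
# helpers tend to be paired with helpers.
pairs = [(1, 1)] * 4 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 4
x = [p[0] for p in pairs]
x1 = [p[1] for p in pairs]
# Assumed toy fitness function: own helping costs 0.2, partner's helping adds 0.5.
w = [1.0 - 0.2 * xi + 0.5 * yi for xi, yi in pairs]

beta_x, beta_x1 = partial_effects(w, x, x1)
C, B = -beta_x, beta_x1  # direct effect is -C, indirect effect is B
```

In this toy case the regression recovers C = 0.2 and B = 0.5 because fitness was generated as an exact linear function of the two predictors; with real data the residual term would absorb any nonlinearity.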

Classification of social behaviours

The signs of the direct and indirect fitness effects yielded by the above regression analysis are used to formally classify social behaviours. According to Hamilton's (1964) two-by-two matrix (Fig. 4), those behaviours involving B > 0 and C > 0 are ‘altruistic’, those involving B > 0 and C < 0 are ‘mutually beneficial’, those involving B < 0 and C > 0 are ‘spiteful’ and those involving B < 0 and C < 0 are ‘selfish’ (see West et al., 2007 for a review of the history of this terminology). Importantly, these fitness costs and benefits derived from the statistical model must not be confused with the fecundity cost and benefit c and b of the evolutionary model (Rousset & Ronce, 2004; Grafen, 2007; Lehmann et al., 2007a,b; West et al., 2007; Gardner et al., 2011).
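The two-by-two classification can be expressed as a small lookup on the signs of B and C. This is a sketch only; the function name is an assumption, and the handling of effects that are exactly zero, which the matrix does not cover, is likewise an assumption.

```python
def classify_behaviour(B: float, C: float) -> str:
    """Classify a social behaviour from the fitness benefit B to recipients
    and the fitness cost C to the actor (the direct effect is -C)."""
    if B > 0 and C > 0:
        return "altruistic"
    if B > 0 and C < 0:
        return "mutually beneficial"
    if B < 0 and C > 0:
        return "spiteful"
    if B < 0 and C < 0:
        return "selfish"
    # The two-by-two matrix does not cover effects that are exactly zero.
    return "unclassified"


# Example with assumed values: a positive cost and a positive benefit.
label = classify_behaviour(B=0.5, C=0.2)
```

Note that the arguments are the regression-derived fitness effects B and C, not the fecundity parameters b and c of the evolutionary model.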

Figure 4.

Hamilton's classification of social behaviours. A classification of social behaviours based on their effect on the reproductive fitness of actors and recipients.

Hamilton's rule

We can weigh the direct and indirect fitness effects (−C and Bj) yielded by the above regression analysis by appropriate coefficients of genetic relatedness (rj) to give a condition for natural selection to favour an increase in the trait of interest. This is Hamilton's (1963, 1964, 1970) rule: −C + ∑j Bj rj > 0. This can easily be seen to emerge from application of the least-squares regression model of individual fitness to Price's (1970) equation of natural selection. Price's equation states that the change in average genic value is given by:

$$\Delta E(x) = \mathrm{cov}(w, x) \qquad (4)$$

noting that E(w) = 1, since the population is of fixed size across generations. Substituting the expression for fitness in eqn (3) into eqn (4) yields:

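$$\Delta E(x) = \beta_{w,x \mid x_1,\ldots,x_n}\,\mathrm{cov}(x,x) + \sum_{j} \beta_{w,x_j \mid x,\,x_{k \neq j}}\,\mathrm{cov}(x_j,x) = \Big(\beta_{w,x \mid x_1,\ldots,x_n} + \sum_{j} \beta_{w,x_j \mid x,\,x_{k \neq j}}\,\beta_{x_j,x}\Big)\,\mathrm{var}(x) \qquad (5)$$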

noting that cov(x,x) = var(x), cov(xj,x)/cov(x,x) = βxj,x, cov(ε,x) = 0 and cov(ε,xj) = 0 for all j. Replacing the complicated regression terms with the –C and Bj symbols defined above (Queller, 1992), and noting that rj = βxj,x is the regression form of genetic relatedness between a focal individual and her jth social partner (Orlove & Wood, 1978), the condition for the genetical trait to be favoured by natural selection ΔE(x) > 0 is Hamilton's (1963, 1964, 1970) rule: −C + ∑j Bj rj > 0.
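Hamilton's rule and the associated change in average genic value can be evaluated with a few lines of Python. The function names and the numerical values (C = 0.2, a single partner with B = 0.5, relatedness 0.6 or 0.3, and var(x) = 0.25) are illustrative assumptions.

```python
def hamilton_sum(C, B, r):
    """Return -C + sum_j B_j * r_j for fitness cost C, indirect fitness
    effects B (one per social partner) and relatedness coefficients r."""
    if len(B) != len(r):
        raise ValueError("B and r must have one entry per social partner")
    return -C + sum(Bj * rj for Bj, rj in zip(B, r))


def favoured(C, B, r) -> bool:
    """Hamilton's rule: the trait is favoured when -C + sum_j B_j r_j > 0."""
    return hamilton_sum(C, B, r) > 0


def delta_mean_genic_value(C, B, r, var_x):
    """Change in average genic value, (-C + sum_j B_j r_j) * var(x)."""
    return hamilton_sum(C, B, r) * var_x


# Assumed illustrative values.
high_r = favoured(0.2, [0.5], [0.6])   # -0.2 + 0.30 > 0
low_r = favoured(0.2, [0.5], [0.3])    # -0.2 + 0.15 < 0
change = delta_mean_genic_value(0.2, [0.5], [0.6], 0.25)
```

Since var(x) is never negative, the sign of the change in average genic value is set entirely by the Hamilton's rule sum, which is why the rule serves as the condition for selection to favour the trait.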

Which predictors to use?

Here, we have used purely genetic predictors of fitness (Queller, 1992; Gardner et al., 2011). Frank (1997a,b, 1998, 2013a,b) has clarified that any set of predictors – including, for example, phenotypes – can be used. However, we are following Fisher's (1918, 1930) genetical paradigm that partitions the action of natural selection into purely genetic effects, as this reduces ambiguity over the definition of direct versus indirect fitness effects and consequent classification of social behaviours. For example, in a scenario involving reciprocated cooperation among nonrelatives, the direct fitness effect of a gene for cooperation can be positive, because it is associated with greater levels of cooperation among one's social partners, independently of the genes that they carry. Thus, cooperation, in the context of reciprocity, is a mutually beneficial behaviour (West et al., 2007). But, if cooperation phenotypes had been used as explicit predictors of individual fitness, then because the partial effect of the individual's own phenotype is negative and the partial effect of the phenotype of a social partner is positive, the reciprocated cooperation would be incorrectly diagnosed as altruistic.

Nevertheless, even restricting ourselves to purely genetical predictors of fitness, an ambiguity arises as to which set of genes we should use in our regression analysis. Specifically, should we only consider those genes belonging to social partners of the individual's own species, or should we also consider those genes belonging to heterospecific social partners? Below, we show that the genes of conspecific and heterospecific social partners both mediate the focal individual's fitness. We then investigate the consequences of taking alternative approaches to resolving the ambiguity over the use of statistical predictors of fitness.

Causal relationship between genes and fitness

Genes in both species mediate the focal individual's fitness (Fig. 5a). First, her fitness is mediated by her own gene at the locus for helping as, all else being equal, she has fewer offspring if she helps more. Second, her fitness is mediated by the gene at the locus for helping carried by her heterospecific patch mate as, all else being equal, she has more offspring if her patch mate helps more. Third, her fitness is mediated by the genes carried by the conspecific individuals residing two patches away on either side, because she competes with these individuals to leave offspring whenever the patches immediately adjacent to her own become vacant. All else being equal, she has more offspring if these conspecific individuals help more. Fourth, her fitness is mediated by the genes carried by the heterospecific individuals residing two patches away on either side, because their help enhances the fitness of her competitors. All else being equal, she has fewer offspring if these heterospecific individuals help more.
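The signs of these four effects can be checked numerically. The sketch below uses a simplified index of the focal individual's competitive success, namely the sum of her probabilities of filling a vacancy in each of the two patches adjacent to her own; this index, the function names and the parameter values b = 8 and c = 0.5 are assumptions made for illustration, and the rate at which vacancies arise is ignored.

```python
def fecundity(x, y, b, c):
    """Fecundity F = 1 - c*x + b*y (own genotype x, patch mate genotype y)."""
    return 1.0 - c * x + b * y


def competitive_success(x, y, x_left, y_left, x_right, y_right, b, c):
    """Sum of the focal individual's probabilities of filling vacancies in
    her two adjacent patches. Her competitors are the conspecifics two
    patches away (x_left, x_right), whose heterospecific patch mates are
    y_left and y_right."""
    f_focal = fecundity(x, y, b, c)
    f_left = fecundity(x_left, y_left, b, c)
    f_right = fecundity(x_right, y_right, b, c)
    return f_focal / (f_focal + f_left) + f_focal / (f_focal + f_right)


b, c = 8.0, 0.5  # assumed illustrative values

# Baseline: all six individuals are nonhelpers.
baseline = competitive_success(0, 0, 0, 0, 0, 0, b, c)
own_helps = competitive_success(1, 0, 0, 0, 0, 0, b, c)        # first effect
mate_helps = competitive_success(0, 1, 0, 0, 0, 0, b, c)       # second effect
rival_helps = competitive_success(0, 0, 1, 0, 0, 0, b, c)      # third effect
rival_mate_helps = competitive_success(0, 0, 0, 1, 0, 0, b, c) # fourth effect
```

Switching one genotype at a time from nonhelper to helper lowers the index for the focal individual's own gene and for the heterospecific two patches away, and raises it for her patch mate and for the conspecific two patches away, matching the four signs described above.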

Figure 5.

Genetic mediators of fitness. (a) The genetic composition of past generations of both species causally impacts upon the genetic composition of the present generation of both species, and the genetic composition of the present generation of both species causally impacts upon the focal individual's fitness, w. A larger number of individuals affect the fitness of a focal individual in each previous generation. (b) A path diagram illustrating the statistical model of fitness that uses genetic predictors from the focal individual's species only. The causal impact of genes in the other species is subsumed into the causal impact of genes in the focal individual's species from previous generations. A larger number of conspecifics affect the fitness of a focal individual in each previous generation. (c) A path diagram illustrating the statistical model of fitness that uses genetic predictors from the focal individual's species and also from the other species. The causal impact of genes in past generations of the individual's own species is subsumed into the causal impact of genes in the present generation of the focal individual's own species and in the present generation of the other species. Genetic correlations are shown as dashed lines.

This suggests that six genes mediate the individual's fitness: three belonging to her own species and three belonging to the other species. However, the causality can be traced further back in time, to other genes. The local genetic composition of the other species owes, in part, to the local genetic composition of the individual's own species in the previous generation. It also owes, in part, to the genetic composition of the individual's own species in each generation prior to that. In fact, the presence of any helper in the other species after a sufficiently long time owes entirely to the action of the individual's conspecific helpers in previous generations, as without these natural selection would eliminate helpers in the other species. This flow of causation is illustrated in Fig. 5a.

Conspecific genetic predictors only

If we consider that only genes from the individual's own species may be used as predictors of her fitness, then the effects of heterospecific genes are subsumed into the effects of conspecific genes from past generations. The resulting path diagram is illustrated in Fig. 5b. In the context of this statistical model, the partial effect of increasing the focal individual's genetic value for helping (that is, keeping all other predictors constant) is to decrease her own fitness. Hence, the direct fitness effect of helping is negative (−C < 0) and helping is altruistic. However, the indirect fitness effects of helping are all within species according to this statistical model, and so although the trait is altruistic, it is altruism within species. The inclusive fitness interpretation of this view is that, by helping, the actor suffers a direct fitness loss, but enjoys an indirect fitness benefit by increasing the local abundance of heterospecific helpers whose help will improve the reproductive success of future generations of the actor's conspecific kin. In this statistical model, the help provided by heterospecific helpers to a focal individual is interpreted as caused by previous generations of conspecifics. It cannot be interpreted as caused by the focal individual, as she cannot impact the help provided by heterospecifics in her lifetime.

Conspecific and heterospecific genetic predictors

Alternatively, if we consider that genes from both conspecifics and heterospecifics may be used as predictors, then we need only to employ the six genes that determine a focal individual's fitness described above, as no other genes have an impact upon the individual's fitness except through those six. The resulting path diagram is illustrated in Fig. 5c. Again, the partial effect of increasing the focal individual's genetic value for helping is to decrease her own fitness, and hence, the direct fitness effect of helping is negative (−C < 0) and helping is altruistic. Whenever helping is favoured, we have −C + ∑K Bk rk + ∑L Bl rl > 0, where K is the set of all conspecific genes that mediate fitness, and L is the set of all heterospecific genes that mediate fitness, and in particular −C + ∑K Bk rk < 0 (see Appendix for details). That is, the selective benefit of helping is dependent on the help that is directed to heterospecifics, and this diagnoses the helping trait as true altruism between species. The inclusive fitness interpretation here is that the coefficient of relatedness describes how the reproductive success of a social partner correlates with the transmission of copies of the actor's genes to future generations, and so – because the continued success of helper heterospecifics is important to the reproductive success of future generations of the actor's conspecific kin – these heterospecific social partners can be considered ‘relatives’ and consequently afforded positive valuation by the actor.

Discussion

We have developed an infinite stepping stone population model to explore the evolution of indiscriminate helping between species. We have found that natural selection may build genetic associations between individuals in different species, such that helpers of one species tend to interact socially with helpers of another species, giving an indirect fitness benefit for helping. Moreover, we have found that the classification of this helping behaviour depends on the modeller's decision as to which set of genetic predictors are used in a statistical model of fitness. If genetic predictors are restricted to those genes of the individual's own species, then the helping behaviour represents within-species altruism. However, if genetic predictors are allowed to include genes from both the individual's own species and also the other species, then the helping behaviour is diagnosed as between-species altruism.

Indiscriminate helping between species

We have shown that natural selection can favour indiscriminate helping, even when the trait can only benefit members of another species and carries a fecundity cost to the actor. Discriminate helping, involving mechanisms such as partner choice or partner-fidelity feedback, readily evolves between species, owing to return benefits for the actor and/or her conspecific relatives (Doebeli & Knowlton, 1998; West et al., 2002; Foster & Wenseleers, 2006). However, the evolution of indiscriminate helping between species has been more difficult to address. Frank's (1994) analysis suggests that indiscriminate helping between species can be favoured in principle, but his model was not fully dynamical, and so the robustness of this result is unclear. Fully explicit dynamical models that have considered indiscriminate between species helping are problematic in that they systematically force pairs of helpers together across generations through the transmission scheme (Yamamura et al., 2004; Gardner et al., 2007). In contrast, in the present model, we allow individuals to reproduce and disperse independently. A statistical association between helpers across species boundaries builds up purely by population viscosity and the action of natural selection.

We have focused upon a simple, infinite stepping stone model (Kimura & Weiss, 1964), for the purpose of illustration. Investigating the impact of more complex population structure on the evolution of helping between species represents an interesting avenue for future research. Perhaps most work on social evolution in genetically structured populations has focused upon the infinite island model (Wright, 1931). But the island model does not present any means for pairs of helpers from different species to retain associations whilst spreading into new territory. This is because every disperser moves to a new patch at random, independently of the destinations of the other individuals dispersing away from her natal patch. However, there is further scope for studying the evolution of helping between species on lattices, which have explicitly spatial structure in more than one dimension (Taylor, 1992).

Altruism between species

When is helping between species truly altruistic? Our analysis suggests that this classification issue hinges upon the set of fitness predictors that are employed in a regression analysis. Different sets of predictors lead to different partitions of fitness effects, including different estimates of the direct versus indirect components of an individual's fitness, and hence, differences in classification of social behaviour as altruistic, selfish, mutually beneficial or spiteful. We have focused on genetical predictors, because using phenotypes leads to ambiguity and misinterpretation. Moreover, the phenotype is not the inherited strategy upon which natural selection acts. For example, in the context of reciprocity between nonrelatives, if fitness is partitioned into the effects of own versus social partner's level of cooperation, then cooperation can appear altruistic (Fletcher & Doebeli, 2009), but if it is partitioned into the effects of own versus social partner's genes, then it appears mutually beneficial (West et al., 2007). The genetical approach is preferable, because it highlights that the rationale for cooperating in this scenario is to elicit cooperation from one's social partners, in a purely self-interested manner.

However, we have also shown that even the strictly genetical approach is beset by ambiguity over which genes are to be included as explicit predictors of fitness. In particular, do we only include genes belonging to the focal individual's own species, or do we also include genes belonging to other species? We have found that the evolution of between species helping can be fully accounted for using either approach. If we use only conspecific genes as predictors, then we find that helping between species constitutes within-species altruism. That is, the focal actor helps cooperators of the other species in order to improve the social environment for future generations of her own kin. In contrast, if we use both conspecific and heterospecific genes as predictors, then we find that helping between species constitutes between-species altruism. That is, the focal actor aids helpers of the other species because their reproductive success – like that of conspecific relations – is associated with an increase in the population frequency of the actor's genes.

Actors and recipients

The decision as to which genetic predictors are to be used in the regression model of fitness amounts to deciding which individuals we are considering as the actors and the recipients of the helping behaviour (Fig. 5). Simply having an impact upon a social partner's reproductive success does not necessarily make an individual an actor, if they might alternatively be considered a mere instrument that is used by a different individual – the true actor – in order to bring about the fitness effect. And simply having one's reproductive success mediated by a social partner does not necessarily make an individual a recipient, if they might alternatively be considered a mere instrument that is used by the actor to bring about a fitness effect for a different individual – the true recipient. This notion of agency is already implicit in any discussion of altruism: an intentional term, the use of which in scientific discourse is justified on the basis of a mathematical relationship (isomorphism) between the dynamics of natural selection and an individual-as-maximizing-agent analogy (Grafen, 2002, 2006).

If we use only conspecific genes as predictors then we must consider only conspecific individuals in the roles of actor and recipient. That is, those heterospecific individuals who mediate a focal recipient's reproductive success must be considered mere instruments, and the causality underlying their behaviour (i.e. why they carry out the helping behaviour) must be traced back to previous generations of the focal individual's conspecific kin (the true actors). Similarly, the heterospecific individuals whose reproductive success is mediated by a focal actor must be considered mere instruments, having only instrumental value in ensuring a better social environment for future generations of the actor's conspecific kin (the true recipients). In contrast, if we use heterospecific as well as conspecific genes as predictors of fitness, then we must consider both conspecific and heterospecific individuals in the roles of actor and recipient. Note that few mutualisms admit the latter interpretation – it requires special circumstances, such as those considered in our mathematical model, where genetic correlations arise between species. Most mutualisms appear to function through phenotypic correlations, such as cooperator association, partner-fidelity feedback or partner choice (Foster & Wenseleers, 2006). Also note that we cannot use only heterospecific genes as predictors because, unless these fully screen-off the effects of the individual's own gene, the sum of partial effects will not generally be equal to the overall least-squares linear regression of the individual's fitness against her own genetic value, which determines the direction of natural selection.

Are these different interpretations equally valid? A possible justification for the conspecifics-only approach is to note that the dynamics of natural selection is framed in terms of within-species genetical change, so that it makes sense to also restrict corresponding notions of optimization and agency to conspecifics. We also suggest that whilst there may be appreciable genetic relatedness across species with regard to helping genes, this might not be the case across the rest of the genome. In contrast, co-ancestry of conspecifics leads to an approximately equal relatedness across the entire genome, allowing for the evolutionary elaboration of complex adaptations.

On the other hand, a possible justification for the conspecifics and heterospecifics approach is that real-world organisms do not cease to manifest the appearance of agency and intention when we are considering the evolution of traits in other species. Consequently, it makes sense to regard individuals in all species as having agency at all times. Our formal analysis cannot address this issue, as it is framed only in terms of the dynamics of gene frequency change and not in terms of optimization theory, which is the proper framework within which to develop notions of agency and intentionality (Grafen, 2002, 2006). Hence, we leave this puzzle as an open problem for future exploration.

Conclusion

To conclude, was Darwin correct to rule out the adaptive evolution of behaviours that provide benefits only for individuals of other species? We suggest that he was. Natural selection can favour the evolution of indiscriminate helping between species and, in certain circumstances such helping may justifiably be interpreted as altruism between species. However, the alternative interpretation that such helping behaviour represents mere within-species altruism is available, as restricting the set of predictors to conspecifics gives a full account of the fitness effects of the trait. Thus, benefits to individuals of other species would never provide an exclusive explanation for any behaviour that has evolved by natural selection.

Acknowledgments

We thank Kevin Foster, Alan Grafen, David Queller and three anonymous reviewers for helpful comments and discussion, and the BBSRC, ERC, Royal Society and Balliol College for funding.

Appendix

Evolution of Helping: condition for the expected local frequency of helpers to increase over time under natural selection

Here, we derive the condition on b and c for natural selection to favour a local population of helpers in both species. Helpers can gain offspring irrespective of whether they share patches with other helpers. Therefore, as long as the population contains helpers at a high enough frequency that there are a finite number of patches between a pair of helpers in different species, stochastic effects ensure that a local distribution of helpers consisting of a set of at least two, but potentially many more, adjacent patches that contain two helpers, one of each species, may form. At the ends of this set, there may also be sequences of patches that contain one helper and one nonhelper, where all of the helpers in one of these sequences are from the same species. This local distribution is illustrated in Fig. 1.

Other local distributions of helpers may initially be present but when no two adjacent patches contain a helper of each species, natural selection eventually either eliminates helpers or gives rise to two adjacent patches that contain a helper of each species. Therefore, only helpers in a local distribution as described in the previous paragraph (and illustrated in Fig. 1) can be present in the population in the long term.

If a patch contains a helper of each species and is surrounded by two patches also containing a helper of each species, then selection in that patch does not affect the total number of helpers in the population, because any deceased helper is replaced by another helper. Therefore, we can limit our attention to the edges of the distribution of helpers. The two edges are symmetric, so we consider only one of them (patches i−1, i and i+1 in Fig. 1).

The probability that a helper replaces a nonhelper and vice versa depends on the number of patches that contain a helper of one species and a nonhelper of the other (see Fig. S1 for details). We write the number of patches that contain only a single helper as s. For now, we assume that this number (s) can only change by one at a time through the replacement of an individual by the offspring of her neighbour. We relax this assumption in the next section and find that it does not affect our results. We can write the matrix of the relative rates of change in s.

display math(6)

We can enter fecundity cost c for helpers and benefit b for individuals that share patches with helpers to calculate each of the values in matrix (6). The rates of increase and decrease in s are equal for all s ≥ 2.

display math(7)

Hence, from the theory of Markov chains, there is a limiting distribution of the values of s. We calculate the long-term equilibrium frequency with which each value of s occurs in this distribution, p_s.

display math(8)
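As an illustration of how such a limiting distribution can be obtained, the following is a minimal sketch for a generic birth–death chain, in which s changes by one at a time. The rate values, the truncation at a maximum value of s and the function name `stationary_birth_death` are hypothetical and chosen only for illustration; they are not the entries of matrix (6). The sketch uses detailed balance, under which the frequency of state s + 1 equals that of state s multiplied by the ratio of the upward to the downward rate.

```python
def stationary_birth_death(up, down):
    """Stationary distribution of a birth-death chain on states 0..n.

    up[s]   : rate of moving from state s to s + 1   (s = 0..n-1)
    down[s] : rate of moving from state s + 1 to s   (s = 0..n-1)
    """
    weights = [1.0]
    for u, d in zip(up, down):
        # Detailed balance: p[s + 1] * down[s] = p[s] * up[s]
        weights.append(weights[-1] * u / d)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates (assumption): a distinct rate at the boundary,
# then constant upward and downward rates for larger s.
n = 30
up = [0.5] + [0.2] * (n - 1)
down = [0.4] * n
p = stationary_birth_death(up, down)
```

Because the upward rate is lower than the downward rate for larger s in this example, the frequencies decline geometrically and the truncation at n has a negligible effect.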

Each a_s can be decomposed into a_s^+, the rate at which s changes by the replacement of a nonhelper by a helper, and a_s^−, the rate at which s changes by the replacement of a helper by a nonhelper. The average probability that a helper replaces a nonhelper is greater than the average probability that a nonhelper replaces a helper if

display math(9)

The values of each a_s^+ and a_s^− can be derived directly from the model

display math(10)

Substituting these into inequality (9) gives

display math(11)

Simplifying, we obtain inequality (1) of the main text. If inequality (1) is satisfied, helpers are expected to increase in number over time, provided that the population is in the local distribution described at the beginning of this section and illustrated in Figs 1 and S1: at least two patches containing a helper of each species, surrounded on each side by a sequence of patches that contain a helper of one of the species but not of the other.

When inequality (1) is satisfied, the probability that a helper replaces a nonhelper is greater than vice versa in each species. Therefore, the probability that the patches that initially contain a helper of each species continue to do so forever is nonzero as the number of patches with a helper of each species can be described as a random walk bounded at zero where the probability of increase is greater than the probability of decrease. Hence, when inequality (1) is satisfied the expected number of helpers increases without limit over time. If inequality (1) is not satisfied, helpers are eventually eliminated.
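The random-walk argument above can be illustrated with the standard gambler's-ruin result: for a walk with steps of ±1 in which an increase is more likely than a decrease, the probability of never reaching zero from a starting position k is 1 − (q/p)^k, where p and q are the probabilities of an increase and a decrease. The sketch below is a generic illustration only. The step probabilities, the treatment of zero as absorbing, the finite ceiling used to stop each simulated walk and the function names are all assumptions, not quantities taken from the model.

```python
import random


def survival_probability(p_up, p_down, start):
    """Probability that a +/-1 random walk started at `start` never hits 0."""
    if p_up <= p_down:
        return 0.0  # without upward drift the walk reaches 0 with certainty
    return 1.0 - (p_down / p_up) ** start


def simulate_survival(p_up, p_down, start, trials=5000, ceiling=60, seed=1):
    """Monte Carlo estimate; reaching `ceiling` is treated as survival."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        pos = start
        while 0 < pos < ceiling:
            # Condition on a move occurring, so only the ratio of rates matters.
            pos += 1 if rng.random() < p_up / (p_up + p_down) else -1
        survived += pos >= ceiling
    return survived / trials


# Hypothetical step probabilities (assumption), starting from two patches.
exact = survival_probability(0.6, 0.4, 2)
estimate = simulate_survival(0.6, 0.4, 2)
```

The survival probability is positive whenever an increase is more likely than a decrease, which is the property the argument relies on.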

Evolution of Helping: condition for the global frequency of helpers to increase under natural selection

In this section, we first analyse the effect of fresh mutational input at low frequency in the population (a). Then, we consider the impact of multiple instances of deceased individuals being replaced by neighbours in the same generation within a local distribution of helpers (b). Finally, we study the effect of local distributions of helpers meeting other helpers in the population (c).

(a) Further mutational input in the population

Nonhelpers randomly mutate to become helpers, and vice versa. We assume that rates of mutation in both directions are sufficiently low that, after one mutation, we can determine the effect of natural selection on helping prior to the occurrence of another mutation. We can ignore nonhelpers that mutate to become helpers: they are absorbed by larger distributions of helpers or eliminated by natural selection. Our simulations confirm that these assumptions do not require exceedingly low mutation rates (following section).

However, a local distribution of helpers can be disrupted by a mutation that arises in its midst (illustrated in Fig. S2). When this happens, there are two sequences of adjacent patches with a helper in each species joined by a single patch with a helper of one species and a nonhelper of the other. The nonhelper and her descendants will always share a patch with a helper of the other species, unless her descendants reach the end of the sequence of patches with a helper in both species and join the global population of nonhelpers. Therefore, as long as the subpopulation of nonhelpers formed by the initial mutation avoids stochastic loss, they replace helpers from within a local distribution (illustrated in Fig. S2). Mutations are sufficiently rare that we assume local distributions grow beyond a sequence of three adjacent patches with a helper of each species before a mutation occurs. This ensures that mutations must occur at least two patches away from one end of the sequence. A sequence of at least two patches with a helper of each species must remain. Therefore, we can compare the expected rate at which new helpers are added at the far end of that sequence to the expected rate at which they are lost at the end where helpers are replaced by the subpopulation of mutant nonhelpers.

The rate at which helpers are lost is the rate at which nonhelpers replace helpers subtracted from the rate at which helpers replace nonhelpers when each receives the benefit of sharing a patch with a helper of the other species

display math(12)

The LHS of inequality (11) gives the expected rate at which helpers at the far end of each sequence increase in number. This is equal to twice the rate at which helpers in one species increase in number. We subtract the value in eqn (12) from half the LHS of inequality (11) to find whether the rate at which helpers increase at the edge is greater than that at which they are lost in competition with the mutant subpopulation. This recovers inequality (2) from the main text.

(b) Multiple replacement by neighbours in the same generation within a local distribution of helpers

The only instance in which multiple replacement by neighbours matters is when both the last individual carrying one allele and the first carrying the other are replaced by genetically different individuals. The probability that this occurs in a single generation is therefore the probability that both are independently selected and replaced by neighbours. The effect of this double replacement is to create a subpopulation of nonhelpers at the end of a sequence of helpers in one species, the effect of which we have already studied. We therefore assume that the rate at which individuals are selected and replaced by neighbours is low enough that double replacements are rare, so that the effect of natural selection can be measured between any two of them. Therefore, natural selection still favours helping when inequality (2) is satisfied.

(c) Local distributions of helpers encounter others

The evolutionary process is only affected when there are fewer than two patches containing a nonhelper of each species separating the two local distributions. If there is only one patch containing a nonhelper of each species, then there must be helpers of the same species in the two neighbouring patches. If the nonhelper separating the two nearest helpers dies, it is certainly replaced by a helper. This gives rise to a continuous sequence of patches that contain a helper in one species. Therefore, the effect of two local distributions of helpers joining is the same as a nonhelping mutant arising in a local distribution of helpers (illustrated in Fig. S3). We have already shown in section (a) that when a subpopulation of nonhelpers arises within a local distribution of helpers, natural selection favours helping when inequality (2) is satisfied.

Simulations

(a) The effect of natural selection on helping in large populations

We run a numerical simulation of our model, relaxing the assumptions of infinite time and population size. We consider a ring of 10^6 patches. We initialize the population by randomly assigning each individual the helping allele with probability 0.02 and the nonhelping allele with probability 0.98. We run the simulation for a sequence of 2 × 10^10 replacements where a clone of the previous inhabitant does not replace a deceased individual. For each of these replacements, we select a random individual in the population. The probability that a deceased individual is replaced by a mutant offspring, an offspring of the other genetic type, is 2 × 10^−5. If there is no mutation, the two conspecific neighbours compete to replace the deceased individual as detailed in our model. We perform five replicates for each set of parameter values displayed in Fig. 3. We consider the last 2 × 10^9 replacements for each replicate and count the number of times that helpers in a species have increased in number by more than 100. We assign a darker colour to the dot at parameter values where the number of helpers has increased more often, from white when helpers have never increased by 100 or more to black when this has happened each time.
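The structure of such a simulation can be sketched as follows, on a much smaller ring than the one described above. This is a minimal illustration under stated assumptions, not the code used for Fig. 3. In particular, the fecundity function (baseline 1, minus c for a helper, plus b when the heterospecific in the same patch is a helper), the rule that a mutant offspring carries the opposite allele to the deceased individual, and all function and parameter names are assumptions. The sketch also counts every death as a replacement, rather than only those in which the genotype could change.

```python
import random


def fecundity(own, partner, b, c):
    # Assumed form: baseline 1, cost c paid by helpers, benefit b received
    # when the heterospecific sharing the patch is a helper.
    return 1.0 - c * own + b * partner


def simulate(n_patches, n_replacements, b, c, mu, p_helper, seed=0):
    rng = random.Random(seed)
    # pop[sp][i] is 1 for a helper and 0 for a nonhelper of species sp in patch i.
    pop = [[1 if rng.random() < p_helper else 0 for _ in range(n_patches)]
           for _ in range(2)]
    for _ in range(n_replacements):
        sp = rng.randrange(2)          # species of the deceased individual
        i = rng.randrange(n_patches)   # patch of the deceased individual
        if rng.random() < mu:
            pop[sp][i] = 1 - pop[sp][i]  # assumed: mutant has the other allele
            continue
        # The two conspecific neighbours on the ring compete in proportion
        # to their fecundity.
        left, right = (i - 1) % n_patches, (i + 1) % n_patches
        f_left = fecundity(pop[sp][left], pop[1 - sp][left], b, c)
        f_right = fecundity(pop[sp][right], pop[1 - sp][right], b, c)
        winner = left if rng.random() < f_left / (f_left + f_right) else right
        pop[sp][i] = pop[sp][winner]
    return pop


# Small illustrative run with hypothetical values of b and c.
final = simulate(200, 20000, b=0.5, c=0.1, mu=2e-5, p_helper=0.02, seed=1)
helpers_per_species = [sum(species) for species in final]
```

Tracking `helpers_per_species` over successive blocks of replacements would give the kind of count of increases that is used to colour each parameter combination.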

We notice that at low cost, helpers often increase in frequency when our analytical results show that natural selection does not favour them. However, we can see (inequality (11) and eqn (12)) that cost is a factor in the rate at which helpers are gained and lost by natural selection. This means that when cost is low, natural selection is very weak, and so we would expect the number of helpers to occasionally increase over this timeframe (grey dots, but not black) even when natural selection does not favour helpers.

(b) Conditions for the expected local frequency of helpers to increase

We use a cycle of 250 patches, a sequence of 2.5 × 10^6 replacements, and assume that no mutations occur. Here, the population initially consists entirely of nonhelpers except for a single patch that contains a helper of each species. We perform 2000 replicates at each of the parameter values shown in Fig. 2. Given an initial frequency of 1/250 in each species, we would expect a neutral allele to fix 16 times in the simulations. If there are significantly more than 16 fixations (95% confidence level), we colour the parameter values in Fig. 2 black. If there are significantly fewer than 16 fixations, we colour the parameter values white. If the result is not significant, we colour the parameter values grey.
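The neutral expectation and the colouring rule can be illustrated with a short calculation. The sketch assumes that the count of 16 is the total across both species (2 species × 2000 replicates × 1/250), that fixation events can be treated as independent binomial trials, and that significance is assessed with an exact two-sided binomial test at 2.5% in each tail. The test actually used is not stated above, so these choices and the function names are assumptions.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def classify(observed, n_trials, p_fix, alpha=0.05):
    """Return the colour assigned to a parameter combination."""
    lower = sum(binom_pmf(j, n_trials, p_fix) for j in range(observed + 1))
    upper = 1.0 - sum(binom_pmf(j, n_trials, p_fix) for j in range(observed))
    if upper < alpha / 2:
        return "black"   # significantly more fixations than neutral
    if lower < alpha / 2:
        return "white"   # significantly fewer fixations than neutral
    return "grey"        # not significantly different from neutral


n_trials = 2 * 2000      # assumed: both species counted in each replicate
p_fix = 1 / 250          # neutral fixation probability = initial frequency
expected = n_trials * p_fix
```

Under these assumptions `expected` is 16, and a parameter combination is coloured according to which tail, if either, the observed count of fixations falls in.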

Fitness effects and relatedness when both conspecifics and heterospecifics are predictors

The fitness of a focal individual is determined by her own genotype, that of her two direct competitors, and the genotypes of the individuals that share patches with each of these three. This yields 32 possible genotype combinations that determine fitness. We calculate the relative frequency of each of these combinations by considering a local distribution of helpers where there are at least five patches with a helper of each species. We use the probability distribution of s, the number of patches with a helper of one species and a nonhelper of the other, derived above, to calculate the relative frequency of each genotype combination.

We calculate the fitness coefficients using least-squares regression. To that end, we write the sum of the squared differences between the actual and predicted fitness given the genotypes present, weighted by the frequency of each of the 32 genotypic combinations

display math(13)

where q_o is the frequency with which each combination o occurs, x_0(o) is the genic value of the focal individual, x_k(o) are the genic values of conspecific predictors, x_l(o) are those of heterospecific predictors, and w_o is the actual fitness of a focal individual in that genotypic combination.
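For orientation, a frequency-weighted sum of squares of the kind described here generally takes the following form. This is a sketch consistent with the verbal description and is not a reproduction of expression (13): the intercept α, the symbol S and the sign attached to C (following the usual convention that C is a cost) are assumptions, while C, B_k and B_l denote the regression coefficients for the focal individual, conspecific predictors and heterospecific predictors.

```latex
S = \sum_{o} q_o \left[ w_o - \left( \alpha - C\,x_0(o)
      + \sum_{k} B_k\,x_k(o) + \sum_{l} B_l\,x_l(o) \right) \right]^2
```

The least-squares coefficients are the values of α, C, B_k and B_l that minimize S.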

We solve for the least-squares regression coefficients C, B_k and B_l using expression (13). The relatedness terms, r_k and r_l, are the regressions of the genic value of a focal individual against that of its predictors. These are readily calculated using the q_o values. The solutions are too cumbersome to reproduce here (they are given in Data S1), but the inclusive fitness effect of helping is

display math(14)

We find that expression (14) is positive if and only if inequality (1) is satisfied. We also find that

display math(15)

In order for natural selection to favour helping, a local distribution of helpers must be expected to increase in number. Hence, the fitness effects in heterospecifics, B_l r_l, are essential for natural selection to favour helping when heterospecifics are used as predictors of fitness.
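The regression procedure used in this section can be illustrated with a toy example containing one focal gene and one social partner's gene. The sketch below computes frequency-weighted partial regression coefficients and checks that the partial effects, each weighted by the regression of the corresponding predictor on the focal individual's genic value, sum to the total regression of fitness on the focal individual's own genic value. The four genotype combinations, their frequencies, the fitness values and all function names are hypothetical, and the direction in which relatedness is computed is an assumption.

```python
def wmean(q, v):
    return sum(qi * vi for qi, vi in zip(q, v)) / sum(q)


def wcov(q, u, v):
    mu, mv = wmean(q, u), wmean(q, v)
    return sum(qi * (ui - mu) * (vi - mv) for qi, ui, vi in zip(q, u, v)) / sum(q)


def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def partial_coefficients(q, predictors, w):
    """Weighted least-squares slopes (an intercept is fitted implicitly)."""
    A = [[wcov(q, xi, xj) for xj in predictors] for xi in predictors]
    y = [wcov(q, xi, w) for xi in predictors]
    return solve(A, y)


# Hypothetical data: combinations (x0, x1) = (0,0), (0,1), (1,0), (1,1).
q = [0.4, 0.1, 0.1, 0.4]                       # frequencies of the combinations
x0 = [0, 0, 1, 1]                              # focal individual's genic value
x1 = [0, 1, 0, 1]                              # social partner's genic value
w = [1.0 - 0.1 * a + 0.3 * b for a, b in zip(x0, x1)]

beta = partial_coefficients(q, [x0, x1], w)    # partial effects of x0 and x1
r1 = wcov(q, x1, x0) / wcov(q, x0, x0)         # regression of x1 on x0
total = wcov(q, w, x0) / wcov(q, x0, x0)       # fitness regressed on x0 alone
partitioned = beta[0] + beta[1] * r1
```

In this example the total effect of the focal gene is positive even though its partial (direct) effect is negative, because the partner's gene is positively associated with the focal gene. Adding or removing predictors changes how the same total is partitioned, but not the total itself.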