## Introduction

The species abundance distribution (SAD) is the frequency distribution of the abundances of the species in a community. In other words, the SAD expresses how many species are rare and how many are abundant. Ecologists have often used simple mathematical expressions to describe the SADs in nature (review in McGill et al. 2007). Many have sought simple mechanisms to explain these simple distributions. However, Pueyo et al. (2007) showed that such simple patterns can also be the result of extremely complex dynamics.

Biodiversity is a result of biological evolution. Natural selection is a powerful mechanism for exploring complex fitness landscapes, which comprise many more potential genotypes than there are particles in the Universe (Wright 1932). The landscapes themselves are continuously modified by coevolution (Kauffman and Johnsen 1991) and by environmental changes (Wright 1932). The abundance of a species is largely a result of this complex process, to the extent that niche size and population fluctuations depend on the biology and interactions of the species. Therefore, it is unlikely that a simple, one-size-fits-all mechanistic model can explain the frequency distribution of the abundances of all the species in a community. Because irreducible (incompressible) complexity represents randomness (see, e.g., Downey and Hirschfeldt 2010), it is plausible that the set of abundances of different species is largely a random set, with few limits other than those imposed by the laws of physics. The irreducible complexity due to the specificities of each species was named “idiosyncrasy” by Pueyo et al. (2007), and is the basis of the idiosyncratic theory of biodiversity.

Once aware of the reasons why the abundances of species could well be “random” to a large extent, we have to express this hypothesis in mathematical terms. To this end, Pueyo et al. (2007) borrowed a tool from statistical physics known as the maximum entropy formalism (MaxEnt), due to Jaynes (1957, 1968, 1978). In informal terms, MaxEnt is a method to find the statistical distribution that is “as random as possible” under some given constraints. One of the most evident constraints is that the total number of individuals cannot be infinite in a finite world. A simple way to introduce this constraint is by setting a limit to the mean number of individuals per species. The result found by Pueyo et al. (2007) from this single constraint was the log-series distribution, which is one of the main classical SADs that ecologists have found useful to describe empirical data since it was first introduced by Fisher et al. (1943).
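
As a rough sketch of how a single constraint on the mean can lead to a distribution of this kind, consider a continuous abundance *y* and suppose that “as random as possible” is measured by the entropy relative to a reference density *m*(*y*) proportional to 1/*y*. Both the continuous treatment and this particular reference density are assumptions made here for illustration only; they are not specified in the text above, and the symbols *m*, μ and ȳ are introduced just for this sketch.

```latex
% Maximize the relative entropy under normalization and a fixed mean abundance.
% Assumption: reference density m(y) \propto 1/y (illustrative, not from the text).
\begin{align*}
  \text{maximize}\quad
    & S[p] = -\int p(y)\,\ln\frac{p(y)}{m(y)}\,dy,
      \qquad m(y) \propto \frac{1}{y}, \\
  \text{subject to}\quad
    & \int p(y)\,dy = 1,
      \qquad \int y\,p(y)\,dy = \bar{y}. \\[4pt]
  % Lagrange multipliers \mu (normalization) and \lambda (mean):
  \text{stationarity:}\quad
    & -\ln\frac{p(y)}{m(y)} - 1 - \mu - \lambda y = 0 \\
  \Longrightarrow\quad
    & p(y) = m(y)\,e^{-1-\mu}\,e^{-\lambda y} = k\,y^{-1}\,e^{-\lambda y}.
\end{align*}
```

Here *k* absorbs the normalization constant and λ is fixed by the mean. In the continuous version the result is normalizable only above some minimum abundance, which is one reason a discrete treatment can be preferable.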

Following the notation in Frank (2011), let us use the symbol *y* for abundance and *p* for probability density, in a continuous approximation. (Pueyo et al. 2007 used discrete abundances.) The log-series reads:

$$p_y = k\,y^{-1}\,e^{-\lambda y} \qquad (1)$$

where *k* and λ are constants. Small deviations from the log-series are to be expected for several reasons (see section “Deviations from MaxEnt predictions” below). Pueyo et al. (2007) showed that, by taking such small deviations into account, all classical SADs can be derived straightforwardly. In particular, in the limit of very small deviations we obtained a gamma distribution,

$$p_y = k\,y^{\alpha-1}\,e^{-\lambda y} \qquad (2)$$

which generalizes equation (1) because the additional parameter α can be slightly different from zero.
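
The relation between the two distributions can be checked numerically. The following is a minimal sketch, not taken from any of the cited papers. It assumes discrete abundances *n* = 1, 2, ..., a gamma-like form proportional to *n*^(α−1) e^(−λ*n*) (the exponent convention is an assumption), and arbitrary illustrative values of λ and α. The function names are hypothetical.

```python
import math

def sad_pmf(alpha, lam, n_max=5000):
    """Normalized p(n) proportional to n**(alpha - 1) * exp(-lam * n), n = 1..n_max.

    Assumed discrete, truncated analogue of a gamma-like SAD; alpha = 0
    should reproduce the log-series shape.
    """
    weights = [n ** (alpha - 1.0) * math.exp(-lam * n) for n in range(1, n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def fisher_logseries_pmf(lam, n_max=5000):
    """Closed-form log-series p(n) = -x**n / (n * ln(1 - x)), with x = exp(-lam)."""
    x = math.exp(-lam)
    norm = -1.0 / math.log(1.0 - x)
    return [norm * x ** n / n for n in range(1, n_max + 1)]

lam = 0.01            # illustrative value, not from the source
p0 = sad_pmf(0.0, lam)          # alpha = 0
p_small = sad_pmf(0.05, lam)    # alpha slightly different from zero
fisher = fisher_logseries_pmf(lam)

# Largest pointwise difference between alpha = 0 and the closed-form log-series
max_diff_fisher = max(abs(a - b) for a, b in zip(p0, fisher))
# Largest pointwise difference between alpha = 0 and alpha = 0.05
max_diff_alpha = max(abs(a - b) for a, b in zip(p0, p_small))

print("alpha = 0 vs closed-form log-series:", max_diff_fisher)
print("alpha = 0 vs alpha = 0.05:", max_diff_alpha)
```

With α = 0 the truncated sum matches the closed-form log-series up to a negligible truncation error, while a small nonzero α changes the probabilities only modestly, mostly at the lowest abundances.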

Surprisingly, around the time of the publication of the paper by Pueyo et al. (2007) and afterwards, several other papers appeared that also applied MaxEnt and also obtained the log-series or some very similar distribution (Banavar and Maritan 2007; Dewar and Porté 2008; Harte et al. 2008; Bowler and Kelly 2010; Frank 2011). These papers might seem redundant (except for some results other than the SAD that are obtained in some of them), but they are not, because there are subtle but important differences in the ways they apply MaxEnt. These differences matter for two reasons. First, the primary aim of these works is to find an explanation for the observed patterns, and the explanation will be wrong if MaxEnt is applied incorrectly, even if the same final result is claimed in all cases. Second, if the same methodologies are ever used without previous knowledge of the results to be expected, their predictions are unlikely to be correct unless the methodologies are correct.

Such subtle differences among the methods in different papers are a puzzle for anyone attempting to apply MaxEnt in an ecological context. Further advance will be difficult unless they are carefully compared. The present paper is a contribution to this task. In fact, the approaches by Banavar and Maritan (2007) and by Bowler and Kelly (2010) are very similar to the approach in Pueyo (2006, Appendix B), which was already discussed in Pueyo et al. (2007, Appendix A). Among the remaining papers, I gave priority to Frank (2011) because he included an appendix with a critique of Pueyo et al. (2007). He went further and also stated that his own way of applying MaxEnt is better than that of Edwin T. Jaynes, who first developed this method in the context of statistical physics. The aim of the present paper is to reply to Frank's criticisms by showing that there are fundamental reasons to prefer the original approach over his version of MaxEnt, and also that he does not give an accurate description of the contents of our paper.