The island species–area relationship: biology and statistics


  • Kostas A. Triantis,

    Corresponding author
    1. Biodiversity Research Group, School of Geography and the Environment, University of Oxford, South Parks Road, Oxford OX1 3QY, UK
    2. Azorean Biodiversity Group, Departamento de Ciências Agrárias – CITAA, Universidade dos Açores, Angra do Heroísmo, Pico da Urze, 9700-042, Terceira, Açores, Portugal
    3. Department of Ecology and Taxonomy, Faculty of Biology, National and Kapodistrian University, Athens GR-15784, Greece
    • All authors contributed equally to this work.

  • François Guilhaumon,

    1. Azorean Biodiversity Group, Departamento de Ciências Agrárias – CITAA, Universidade dos Açores, Angra do Heroísmo, Pico da Urze, 9700-042, Terceira, Açores, Portugal
    2. UMR CNRS-UM2-IFREMER-IRD 5119 ECOSYM, Université Montpellier 2 cc 093, 34 095 Montpellier Cedex 5, France
    3. ‘Rui Nabeiro’ Biodiversity Chair CIBIO – Universidade de Évora, Casa Cordovil, Rua Dr. Joaquim Henrique da Fonseca, 7000-890 Évora, Portugal

  • Robert J. Whittaker

    1. Biodiversity Research Group, School of Geography and the Environment, University of Oxford, South Parks Road, Oxford OX1 3QY, UK
    2. Center for Macroecology, Evolution and Climate, Department of Biology, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark

Correspondence: Kostas A. Triantis, Azorean Biodiversity Group, Departamento de Ciências Agrárias – CITAA, Universidade dos Açores, Angra do Heroísmo, Pico da Urze, 9700-042, Terceira, Açores, Portugal.


Aim  We conducted the most extensive quantitative analysis yet undertaken of the form taken by the island species–area relationship (ISAR), among 20 models, to determine: (1) the best-fit model, (2) the best-fit model family, (3) the best-fit ISAR shape (and presence of an asymptote), (4) system properties that may explain ISAR form, and (5) parameter values and interpretation of the logarithmic implementation of the power model.

Location  World-wide.

Methods  We amassed 601 data sets from terrestrial islands and employed an information-theoretic framework to test for the best-fit ISAR model, family, and shape, and for the presence/absence of an asymptote. Two main criteria were applied: generality (the proportion of cases for which the model provided an adequate fit) and efficiency (the overall probability of a model, when adequate, being the best at explaining ISARs; evaluated using the mean overall AICc weight). Multivariate analyses were used to explore the potential of island system properties to explain trends in ISAR form, and to describe variation in the parameters of the logarithmic power model.

Results  Adequate fits were obtained for 465 data sets. The simpler models performed best, with the power model ranked first. Similar results were obtained at model family level. The ISAR form is most commonly convex upwards, without an asymptote. Island system traits had low descriptive power in relation to variation in ISAR form. However, the z and c parameters of the logarithmic power model show significant pattern in relation to island system type and taxon.

Main conclusions  Over most scales of space, ISARs are best represented by the power model and other simple models. More complex, sigmoid models may be applicable when the spatial range exceeds three orders of magnitude. With respect to the log power model, z-values are indicative of the process(es) establishing species richness and composition patterns, while c-values are indicative of the realized carrying capacity of the system per unit area. Variation in ISAR form is biologically meaningful, but the signal is noisy, as multiple processes constrain the ecological space available within island systems and the relative importance of these processes varies with the spatial scale of the system.


The small size of the island, together with its vast distance from either the eastern or western continent, did not admit of a great variety of animals.

(G. Forster, 1777, Book I, Chapter VIII, p. 156)

Islands only produce a greater or less number of species, as their circumference is more or less extensive.

(J.R. Forster, 1778, Chapter V, p. 169)

In general, as sampling area increases so too does the number of species recorded. Quantification of this pattern dates back to at least the mid-19th century (Watson, 1835, 1859) and now encompasses thousands of studies of a wide variety of taxa and scales (e.g. Connor & McCoy, 1979; Rosenzweig, 1995; Lomolino & Weiser, 2001; Bell et al., 2005; Drakare et al., 2006). Indeed, the species–area relationship (SAR) is widely regarded as one of ecology’s few laws (Schoener, 1976; Dodds, 2009).

The shape taken by SARs can be approximated by many functions (Tjørve, 2009; Williams et al., 2009). The most commonly invoked models, the power model (Arrhenius, 1920, 1921) and the exponential model (Gleason, 1922), were also the first to be proposed. The power model remains the most frequently preferred model, both for fitting curves to species–area data and as a basis for the development of explanatory theories of species diversity (Williams, 1943; Preston, 1962; MacArthur & Wilson, 1967; McGuiness, 1984; Holt et al., 1999; Rosenzweig, 1995; Hubbell, 2001; Lomolino, 2001; Azovsky, 2002; Martin & Goldenfeld, 2006; Triantis et al., 2008a, 2010; Dengler, 2009; Harte et al., 2009; Tjørve, 2009; Williams et al., 2009; O’Dwyer & Green, 2010; Santos et al., 2010; He & Hubbell, 2011; Kisel et al., 2011; Rosindell & Phillimore, 2011; Sólymos & Lele, 2011).

According to Rosenzweig (1995) the ‘species–area pattern’ comprises four different SARs, as the processes determining species richness are scale dependent (Williams, 1943; Preston, 1960; Schmida & Wilson, 1985; Rosenzweig, 1998; Whittaker, 2000; Crawley & Harral, 2001; White et al., 2010). Intriguingly, Rosenzweig’s scale framework includes two distinctive classes of data structures. First, two of his species–area curves, i.e. ‘point scale’ and intraprovincial, are sampling or species-accumulation curves, plotting the accumulation of new species as the sampling area increases (cf. type 1, 2 and 3 curves sensu Scheiner, 2003). The second class of data structure is exemplified by his archipelagic and interprovincial curves, which are tallies of the richness of each of a set of islands from a single archipelago (or other biogeographical region). We refer to this latter, classic type of SAR as island species–area relationships (ISARs; cf. type IV curves sensu Scheiner, 2003). These simple distinctions in data form remain a rich source of debate and potential confusion, with different authors favouring alternative classifications and nomenclature of relationship types (Lomolino, 2000; Scheiner, 2003, 2004, 2009; Gray et al., 2004a,b; Williamson et al., 2001, 2002; Tjørve, 2003, 2009; Whittaker & Fernández-Palacios, 2007; Dengler, 2009; Williams et al., 2009; Smith, 2010; Scheiner et al., 2011).

Within sampling or species-accumulation curves, the data structure determines a monotonically increasing function: as area increases, species number can only increase, or remain constant, with each increment of area. Within ISARs, each data point is tallied independently of every other and so the relationship can potentially be positive, negative, humped, neutral, or can be more complex, depending on other controlling variables. However, in general, ISARs when fitted statistically describe the tendency for species numbers to increase with island area. Such analyses can be undertaken for any system of isolates (e.g. lakes, mountain tops, forest fragments) in which the data points are tallied independently; but to constrain our analyses to systems of a common type, we consider only true islands in this article.

As indicated in the quotations above, the existence and generality of ISARs has long been discussed within island biogeography (e.g. Darlington, 1957; MacArthur & Wilson, 1967). Yet we lack consensus concerning the importance of individual mechanisms contributing to the pattern, or the exact shape of the ISAR, across different spatial scales, environmental conditions and taxa (Whittaker & Fernández-Palacios, 2007).

Conventionally, simple data transformations have been employed to produce linear ISAR fits, as this makes the relationships tractable for further analysis. By contrast, Lomolino (2000) and Tjørve (2003), among others, have argued that we should examine the fit of theoretical (mechanistic) models, including those of more complex form. Lomolino (2000, 2001) theorized that untransformed ISARs should exhibit a sigmoidal form (see also Tjørve, 2009), with (1) a phase of little or no increase in species richness across very small island areas (the small-island effect), followed by (2) a rapid rise in richness, with (3) a subsequent flattening towards an asymptote as the number of species approaches the richness of the mainland species pool, while (4) in situ speciation may contribute significantly to elevate ISAR slope on large, remote islands (Fig. 1a). These suggestions have proven controversial, with considerable dispute concerning the existence of sigmoidal ISARs, the detection of the small-island effect (e.g. Lomolino & Weiser, 2001; Tjørve & Tjørve, 2011; Triantis & Sfenthourakis, 2011) and the proposition of an upper asymptote (e.g. Williamson et al., 2001, 2002; Lomolino, 2002) (see discussion in Whittaker & Fernández-Palacios, 2007).

Figure 1.

 (a) The hypothetical sigmoidal form of the species–area relationship as suggested by Lomolino (2001). The main features are the potential small-island effect on the left-hand side, the presence of an asymptote as species richness reaches that of the mainland species pool, and a secondary phase of increase in species richness (dashed line) corresponding to islands large enough to allow in situ speciation (redrawn from Lomolino, 2001). (b) The general form of the island species–area relationship based on the results of our analysis. The dominant shape of the relationship between species richness and area is convex without an asymptote. A sigmoid shape, but without an asymptote, may be observed when the range of area considered is large.

Thirty-three years after Connor & McCoy’s (1979) seminal review of the species–area relationship (SAR), we take advantage of statistical, theoretical and empirical developments to provide a general quantitative analysis of the form taken by ISARs across island types, geographical contexts and major taxa. To this end we amassed 601 data sets, and employed an information-theoretic framework to compare 20 species–area functions (Burnham & Anderson, 2002; Stiles & Scheiner, 2007; Guilhaumon et al., 2008, 2010). We pose five fundamental questions derived from the literature cited above. (1) Is there an overall best-fit ISAR model? (2) Is there a best-fit family of ISAR models? (3) Is there a best-fit ISAR shape and does it include an asymptote? (4) Can we infer biological processes responsible for variations in ISAR form by reference to system properties? (5) Can the z and c parameters of the logarithmic implementation of the power model be interpreted biologically and ecologically?

Materials and methods

The nature of the data sets

Alongside searches using general-purpose search engines, we used two main abstracting/indexing systems – ISI Web of Knowledge and Scopus – with a wide range of search strings. The compilation of data sets lasted 3 years and was completed in April 2010. More than 800 journal papers, books, doctoral theses, online databases, reports and unpublished resources were screened. Each possible source was checked to ensure the following conditions applied.

  • 1 The data sets each pertained to an area of land surrounded by water, i.e. true geographical islands.
  • 2 The source provided a full list of species per island, or at least the number of species present on each island. We aimed to restrict analysis to tallies of native species only, but in a few cases non-natives were included (below).
  • 3 Descriptions of data sets were sufficient to permit basic evaluation of data adequacy so that data points/sets known to contain significant biases could be eliminated. Of course, perfect resolution of sampling and taxonomic uncertainty is unobtainable in general surveys of island biotas: all such data sets contain a certain level of error (e.g. Whittaker et al., 2000).
  • 4 Each data set contained at least four islands; by setting the threshold so low, we permitted the inclusion of the Greater Antilles, a well-studied island group from which ISARs have been reported in the past (e.g. Losos & Schluter, 2000). Basing studies of a relationship on such a small number of data points can raise a number of issues related to our ability to attribute form to the relationship (e.g. Whittaker, 2010), and so in subsequent analyses we assessed the effect of increasing this threshold on the results.
  • 5 The data set extracted from a source should not be essentially the same as previously captured from another source. We did include some cases where, for example, data were available from adjacent island groups and also were collated as a regional data set, but these cases were flagged as overlapping for further analysis.

Our compilation cannot be viewed as an unbiased selection of the island systems of the world, as some taxa (e.g. higher plants, birds) and some archipelagos (e.g. the Canaries) are better studied than others, but we consider the compilation to be a comprehensive representation of the available island species–area data.

We retrieved 601 data sets meeting the above criteria from 312 separate sources (see Appendices S1 & S2 in Supporting Information). Each data set included, for each island, the taxonomic group, the number of species and island size (in km²). The area of the islands was extracted from the respective papers if available. Otherwise, the UNEP Islands Directory was used, along with other resources. In all cases, area measurements refer to planar area, thus ignoring topographic complexity (e.g. Triantis et al., 2008b). Measures of latitude were based on the mean value of the southernmost and northernmost island and were restricted to studies with a latitudinal extent of < 7.5° (95% of the cases used in the analyses). The vast majority of the data sets were derived from a single archipelago, or biogeographical sub-region, but in 21 cases the constituent islands were scattered across a large part of the globe (Appendix S2).

The island systems were divided into: (1) systems within inland water-bodies (55 cases, 9%), i.e. lakes, rivers, reservoirs, hereafter inland systems; (2) oceanic (191 cases, 32%), i.e. islands of volcanic origin, formed over oceanic plates, and never connected to continental land masses (Whittaker & Fernández-Palacios, 2007); and (3) continental-shelf islands (355 cases, 59%), including continental-shelf islands and ancient continental fragments. The few cases of island groups with a mixed continental and oceanic origin, e.g. Japan and the Philippines, were included in the continental-shelf category. We also grouped studies under three taxonomic/life form headings (major taxon): (1) invertebrates (231 cases, 38.5%), (2) vertebrates (219 cases, 36.5%), and (3) plants (148 cases, 25%). There were also two lichen data sets and one fungal data set: these were excluded from subset analyses. The invertebrate data sets were spread across a larger array of taxonomic subsets (e.g. beetles, ants, isopods, snails) than was the case for vertebrates, for which 179 out of the 219 cases provided data for either birds or mammals.

The descriptors compiled for each island group/archipelago were thus as follows: (1) Latitude, (2) Island Type, i.e. oceanic, continental shelf and inland, (3) Major Taxon, i.e. invertebrates, vertebrates and plants, (4) number of islands (No. of Islands), (5) total area (AreaTOT), (6) mean area (AreaMEAN), (7) maximum island area (AreaMAX), (8) minimum island area (AreaMIN), (9) the ratio of maximum to minimum island area (AreaSCALE; i.e. AreaMAX/AreaMIN), (10) the range in island area within the island group (AreaRANGE; i.e. AreaMAX − AreaMIN), (11) maximum number of species for an island (SMAX), (12) minimum number of species for an island (SMIN), (13) the ratio between the minimum and the maximum number of species (SSCALE), (14) the range in species number within the island group (SRANGE), and (15) the variation of the number of species within the island group, estimated as the variance of species richness (SVAR). These variables encompass two key aspects of scale: first measures of the grain and second of the range in grain of the data sets (cf. Whittaker et al., 2001; Drakare et al., 2006). The grain is represented by AreaTOT, AreaMEAN, AreaMAX and AreaMIN, while the range in grain of the data sets is represented by AreaRANGE and AreaSCALE. All the continuous descriptor variables, apart from Latitude, were log10-transformed to avoid the influence of extreme values and increase normality of residuals in subsequent analyses.
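The area- and richness-based descriptors can be derived directly from the per-island tallies. The following is a minimal Python sketch, not the authors' code. The function names, the use of the sample variance for SVAR, and the direction of the SSCALE ratio (maximum/minimum, by analogy with AreaSCALE) are assumptions made for illustration, and the island values are hypothetical.

```python
import math
from statistics import mean, variance


def system_descriptors(areas_km2, richness):
    """Raw area- and richness-based descriptors for one island system."""
    a_max, a_min = max(areas_km2), min(areas_km2)
    s_max, s_min = max(richness), min(richness)
    return {
        "No. of Islands": len(areas_km2),
        "AreaTOT": sum(areas_km2),
        "AreaMEAN": mean(areas_km2),
        "AreaMAX": a_max,
        "AreaMIN": a_min,
        "AreaSCALE": a_max / a_min,
        "AreaRANGE": a_max - a_min,
        "SMAX": s_max,
        "SMIN": s_min,
        "SSCALE": s_max / s_min,      # assumed direction; fails if s_min == 0
        "SRANGE": s_max - s_min,
        "SVAR": variance(richness),   # sample variance (assumption)
    }


def log10_transform(descriptors):
    """log10-transform each descriptor; every value must be > 0."""
    return {name: math.log10(value) for name, value in descriptors.items()}


areas = [1.0, 10.0, 100.0, 1000.0]   # hypothetical island areas, km^2
species = [5, 12, 30, 70]            # hypothetical species richness
raw = system_descriptors(areas, species)
logged = log10_transform(raw)
```

In this sketch the islands span three orders of magnitude in area, so the log10-transformed AreaSCALE is approximately 3. Systems containing an island with zero recorded species would need separate handling before the log transformation.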


Numerous functions have been proposed for modelling SARs, varying in complexity from two to four parameters. They vary in the general form they produce, theoretical background, origin and justification (for reviews see Flather, 1996; Tjørve, 2009; Williams et al., 2009). Several of these functions were first collated in a paper by Flather (1996) focused on species-accumulation curves rather than ISARs. Recently, Williams et al. (2009) demonstrated that, after resolving problems of mathematical similarity and synonymy, there are 16 different functions that can be classified into nine general families, i.e. a general form of which a number of formulas are slight variants (see Appendix S1 in Williams et al., 2009). Independently, Tjørve (2009, Appendix 1 therein) has recently extended his earlier review (Tjørve, 2003), incorporating more functions.

Here we used a set of 19 functions based on the reviews of Tjørve (2009) and Williams et al. (2009). Following the latter we have not considered the cumulative extreme-value function (EVF) as it requires an estimation of the total number of species present within each study system, which was lacking in many cases. This model is equivalent to Weibull-3 (Table 1) when the asymptote is estimated (Williams et al., 2009). We applied the logistic model of Archibald (1949) in its original form (Table 1, No. 12) and not the version requiring the total number of species present within each study system. To these 19 functions we have added one presented by Rosenzweig (1995), which we term the power Rosenzweig function as it is a modified form of the power model. See Table 1 for details of the 20 functions used.

Table 1.   The functions used in our analyses, their analytical formula, the general family they belong to, their shape, and the presence/absence of an asymptote. S is species richness, A is area, and c, d, f, z are fitted parameters. For the functions with an asymptote, the asymptote’s value is given by the parameter(s) in the last column of the table, i.e. d, c/f, and z/d for the different functions. Note that the shape for the Extended Power 1 can be either convex or sigmoid depending on the fitted parameters. For the source references for each function see Flather (1996), Tjørve (2009) and Williams et al. (2009).
No. | Function name | Code | Family | Number of parameters | Formula | Shape type | Asymptote
3 | Power Rosenzweig | power_R | Pow(B) | 3 | S = f + cA^z | Convex | No
4 | Extended Power 1 | epm1 | Pow(B) | 3 | S = cA^(zA^−d) | Both | No
5 | Extended Power 2 | epm2 | Pow(B) | 3 | S = cA^(z − (d/A)) | Sigmoid | No
6 | Persistence Function 1 | P1 | Pow(B) | 3 | S = cA^z exp(−dA) | Convex | No
7 | Persistence Function 2 | P2 | Pow(B) | 3 | S = cA^z exp(−d/A) | Sigmoid | No
8 | Exponential | expo | Expo(C) | 2 | S = c + z log A | Convex | No
9 | Kobayashi Logarithmic | koba | Expo(C) | 2 | S = c log(1 + A/z) | Convex | No
10 | Monod | monod | Logis(D) | 2 | S = d/(1 + cA^−1) | Convex | Yes (d)
11 | Morgan–Mercer–Flodin | mmf | Logis(D) | 3 | S = d/(1 + cA^−z) | Sigmoid | Yes (d)
12 | Logistic | heleg | Logis(D) | 3 | S = c/(f + A^−z) | Sigmoid | Yes (c/f)
13 | Negative Exponential | negexpo | Weib(E) | 2 | S = d[1 − exp(−zA)] | Convex | Yes (d)
14 | Chapman–Richards | chapman | Weib(E) | 3 | S = d[1 − exp(−zA)]^c | Sigmoid | Yes (d)
15 | Weibull-3 | weibull3 | Weib(E) | 3 | S = d[1 − exp(−cA^z)] | Sigmoid | Yes (d)
16 | Weibull-4 | weibull4 | Weib(E) | 4 | S = d[1 − exp(−cA^z)]^f | Sigmoid | Yes (d)
17 | Asymptotic | asymp | Asym(F) | 3 | S = d − cz^A | Convex | Yes (d)
18 | Rational | ratio | Rat(G) | 3 | S = (c + zA)/(1 + dA) | Convex | Yes (z/d)
19 | Gompertz | gompertz | Gom(H) | 3 | S = d exp[−exp(−z(A − c))] | Sigmoid | Yes (d)
20 | Beta-P | betap | Beta(I) | 4 | S = d[1 − (1 + (A/c)^z)^−f] | Sigmoid | Yes (d)


All analyses were run using an updated version of the ‘mmSAR’ package (Guilhaumon et al., 2010) for the R statistical and programming environment (R Development Core Team, 2011).

Model fitting and comparisons

The linear model was fitted using simple linear regressions, but all other ISAR models were fitted in arithmetic space employing nonlinear regressions by minimizing the residual sum of squares (RSS) using the unconstrained Nelder–Mead optimization algorithm (Dennis & Schnabel, 1983). Assuming normality of the observations, this approach produces optimal maximum likelihood estimates of model parameters (Rao, 1973). Regressions were further evaluated by statistical examination of normality and homoscedasticity of residuals. A model was considered as not providing an adequate fit: (1) if the optimization algorithm did not converge, and/or (2) if the Shapiro normality test on the residuals, or Pearson’s product–moment correlation coefficient between the residuals and area, was significant at the 5% level. To avoid numerical problems (e.g. local minima) during the fitting process, we paid particular attention to the starting values that were used to run the optimization algorithm. We first obtained initial values for those parameters that were directly interpretable (e.g. an asymptote) by taking corresponding values in the data sets (e.g. the observed maximum of species richness in the case of an asymptote) and calculated initial values for the remaining parameters using the standard procedures of Ratkowsky (1983, 1990). To enhance the reliability of the parameter estimations, we also ran the optimization algorithm using 1000 combinations of starting values randomly chosen in the parameter space relevant to each model. Among the resulting 1001 fits, we retained the one that minimized the RSS.
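The multi-start fitting strategy can be illustrated with a small, self-contained Python sketch. It is not the mmSAR implementation: the compact Nelder–Mead routine, the function names, the first guess, and the sampling bounds for the random starting values are all assumptions, the data are hypothetical, and only the power function is fitted. The sketch runs the optimizer from one data-derived first guess plus a set of random starts and keeps the fit with the lowest RSS.

```python
import random


def power_model(area, c, z):
    return c * area ** z


def rss(params, model, areas, richness):
    """Residual sum of squares in arithmetic space; inf on numerical failure."""
    try:
        return sum((s - model(a, *params)) ** 2 for a, s in zip(areas, richness))
    except (OverflowError, ZeroDivisionError, ValueError):
        return float("inf")


def nelder_mead(f, x0, step=0.1, tol=1e-12, max_iter=5000):
    """Basic unconstrained Nelder-Mead simplex search (illustrative version)."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step * (abs(vertex[i]) or 1.0)
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        f_best, f_second, f_worst = f(best), f(simplex[-2]), f(worst)
        if abs(f_worst - f_best) < tol:
            break
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflected = f(reflected)
        if f_reflected < f_best:
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f_reflected else reflected
        elif f_reflected < f_second:
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f_worst:
                simplex[-1] = contracted
            else:  # shrink every vertex towards the best one
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)] for v in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0], f(simplex[0])


def fit_multistart(model, areas, richness, first_guess, bounds, n_starts=1000, seed=0):
    """Fit from first_guess plus n_starts random starts; keep the lowest RSS."""
    rng = random.Random(seed)

    def objective(params):
        return rss(params, model, areas, richness)

    starts = [list(first_guess)]
    starts += [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_starts)]
    fits = [nelder_mead(objective, start) for start in starts]
    return min(fits, key=lambda fit: fit[1])


areas = [1.0, 10.0, 100.0, 1000.0, 10000.0]   # hypothetical, km^2
richness = [5, 9, 16, 27, 52]                 # hypothetical species counts
best_params, best_rss = fit_multistart(
    power_model, areas, richness,
    first_guess=(min(richness), 0.25),                # assumed first guess
    bounds=[(0.1, 2 * max(richness)), (0.0, 1.0)],    # assumed sampling ranges
    n_starts=50,                                      # 1000 in the text
)
```

The number of random starts is reduced to 50 here only to keep the example quick; the adequacy checks on the residuals described above are not included.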

We discriminated between the different models in an information-theoretic framework designed for the evaluation of multiple working hypotheses (Burnham & Anderson, 2002). This is achieved through the estimation, for each model, of its probability of being the best at explaining the data. Basically, we compared the fit of the ISAR models using the small-sample corrected Akaike’s information criterion (AICc), a modification of the AIC (Akaike, 1973) that contains a bias correction term for small sample size, and which is preferred when the number of free parameters, p, exceeds n/40 (Burnham & Anderson, 2002). The model with the lowest AICc value is considered to fit the data best. We used Akaike weights derived from the AICc (wAICc) to evaluate each model’s probability of being the best at explaining the data. For each data set, we obtained a model selection profile (i.e. the vector of each model’s wAICc) and an adequate fit profile (binary vectors, i.e. {0;1}, describing models that provided an adequate fit) and we used these profiles to evaluate: (1) the 20 different species–area functions, (2) the nine families, (3) the three basic shapes of the models considered, i.e. linear, convex and sigmoid (i.e. a shape with an inflection point), and (4) the relative probabilities of the presence/absence of an asymptote within the range of the data.
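As a hedged illustration of the model-comparison step, the sketch below computes AICc from the RSS of a least-squares fit and converts a set of AICc values into Akaike weights. The least-squares form of AIC and the convention of counting the residual variance as an extra estimated parameter (K = p + 1) are assumptions, as are the function names; the AICc values in the usage lines are hypothetical.

```python
import math


def aicc(rss, n, p):
    """Small-sample corrected AIC for a least-squares fit.

    rss: residual sum of squares, n: number of islands,
    p: number of fitted model parameters.
    """
    k = p + 1  # assumption: the residual variance counts as a parameter
    if n - k - 1 <= 0:
        raise ValueError("too few data points to compute AICc")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Relative probability of each model being the best of the set."""
    best = min(aicc_values)
    relative = [math.exp(-0.5 * (value - best)) for value in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


weights = akaike_weights([10.0, 12.0, 20.0])   # hypothetical AICc values
```

Under the K = p + 1 convention, a four-parameter model requires at least seven data points for the correction term to be defined, which is consistent with the seven-island minimum reported in the Results.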

In a separate analysis we also fitted and evaluated the logarithmic form of the power model.

Best-fit model of the ISAR

To compare the 20 models we employed two main criteria. First, we calculated the proportion of cases for which the model provided an adequate fit (termed here the generality criterion); and second, we calculated the mean AICc weight (mean wAICc) across all data sets for which the model provided an adequate fit. In essence, the mean wAICc index thus measures the overall probability of a model being the best at explaining ISARs, independently of the ability of the model to provide adequate fits. For ease of reference we term this the efficiency criterion.
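The two criteria can be sketched as follows, assuming each data set is represented by a dictionary that maps every adequately fitting model to its wAICc (models without an adequate fit are simply absent). This data layout, the function names and the example weights are hypothetical.

```python
def generality(profiles, model):
    """Proportion of data sets for which the model gave an adequate fit."""
    return sum(model in profile for profile in profiles) / len(profiles)


def efficiency(profiles, model):
    """Mean wAICc over the data sets where the model gave an adequate fit."""
    weights = [profile[model] for profile in profiles if model in profile]
    return sum(weights) / len(weights) if weights else 0.0


# Hypothetical model selection profiles for four data sets.
profiles = [
    {"power": 0.6, "linear": 0.4},
    {"power": 0.9, "expo": 0.1},
    {"linear": 1.0},
    {"power": 0.3, "linear": 0.5, "expo": 0.2},
]
power_generality = generality(profiles, "power")
power_efficiency = efficiency(profiles, "power")
```

Here the power function is adequate for three of the four hypothetical data sets, and its efficiency is the mean of its three weights.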

To generate an overall ranking reflecting both generality and efficiency of the models we standardized the generality and efficiency values using the formula [(value of the criterion – mean value)/standard deviation] and then summed the two values to determine a synthetic generality/efficiency index, on the basis of which an overall final ranking of models was provided. Finally, although not a strict statistical criterion (Burnham & Anderson, 2002), we also counted the cases where each model provided the single-best adequate fit (i.e. lowest AICc value). In an additional analysis designed to test the sensitivity of our results to the inclusion of data sets with small numbers of islands, we have sequentially removed data sets with between 7 and 19 islands and have calculated Kendall’s tau coefficients in pairwise fashion to test for differences in the ranking of the models based on their generality and efficiency.
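A minimal sketch of the synthetic generality/efficiency index follows. Each criterion is standardized across models with the formula given above, the two standardized values are summed, and models are ranked by the sum. The use of the sample standard deviation, the function names and the input values are assumptions for illustration only.

```python
from statistics import mean, stdev


def standardize(values):
    """(value - mean) / standard deviation, computed across models."""
    numbers = list(values.values())
    centre, spread = mean(numbers), stdev(numbers)  # sample SD (assumption)
    return {model: (v - centre) / spread for model, v in values.items()}


def synthetic_index(generality, efficiency):
    """Sum of the standardized generality and efficiency of each model."""
    g, e = standardize(generality), standardize(efficiency)
    return {model: g[model] + e[model] for model in generality}


def rank_models(index):
    """Models ordered from highest to lowest synthetic index."""
    return sorted(index, key=index.get, reverse=True)


# Hypothetical criterion values for three models.
generality = {"power": 0.84, "linear": 0.80, "betap": 0.47}
efficiency = {"power": 0.20, "linear": 0.12, "betap": 0.02}
index = synthetic_index(generality, efficiency)
ranking = rank_models(index)
```

Because each criterion is centred on its mean, the index values sum to approximately zero across models, so only their ordering and spacing are informative.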

Best-fit family of models of the ISAR

We compared the different families by: (1) counting the cases for which at least a single model within the family provided an adequate fit, i.e. their generality; and (2) by summing the wAICc for all the models within each family and then averaging this sum across all data sets for which at least one model of the family provided an adequate fit (i.e. their efficiency). To address the possible influence on the outcome of the number of models per family, we have applied two additional comparisons: (3) standardizing the overall ranking of the families by the number of models in each family; and (4) using only the overall most efficient model in each family.
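Comparisons (1) and (2) at the family level can be sketched in the same spirit. The family labels follow Table 1, but the profile layout (a dictionary of wAICc values for the adequately fitting models of each data set), the function name and the weights are hypothetical.

```python
def family_scores(profiles, family_of):
    """Generality and efficiency of each model family.

    profiles: one dict per data set, mapping adequate models to wAICc.
    family_of: dict mapping model code to family label.
    """
    scores = {}
    for family in sorted(set(family_of.values())):
        summed = []
        for profile in profiles:
            members = [w for m, w in profile.items() if family_of[m] == family]
            if members:  # at least one model of the family was adequate
                summed.append(sum(members))
        scores[family] = {
            "generality": len(summed) / len(profiles),
            "efficiency": sum(summed) / len(summed) if summed else 0.0,
        }
    return scores


family_of = {"power": "Pow", "power_R": "Pow", "expo": "Expo", "koba": "Expo"}
profiles = [                      # hypothetical wAICc profiles
    {"power": 0.5, "power_R": 0.1, "expo": 0.4},
    {"koba": 1.0},
    {"power": 0.7, "koba": 0.3},
]
scores = family_scores(profiles, family_of)
```

Summing weights within a family favours families with many members, which is the motivation for the additional comparisons (3) and (4).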

Best-fit shape of the ISAR

Although the 20 functions are denoted as having specific shapes (Table 1; Tjørve, 2009), the observed shape that these functions take after fitting (i.e. estimation of parameters) can vary according to the character of the data sets themselves. For example, the power model, which is designated as a convex model, can exhibit a linear fit when z = 1. Similarly, sigmoid models can sometimes exhibit a convex shape (e.g. when the inflection point lies outside the empirical range of island areas; Tjørve, 2009) and sometimes a linear form, given particular combinations of parameter values. We therefore devised a sequential algorithm to discriminate between linear, convex and sigmoid fitted (i.e. observed) shapes for each combination of data set and model.

First, the algorithm compared the shape of the model with a straight line joining the fitted values for the minimum and maximum area of the data set. For 100 equally spaced increments between the minimum and maximum area of the data set, if all the differences between the number of species calculated from the straight line projection and the fitted model were < 0.001 species, then the curvature of the shape was considered insignificant and the fit was assigned as linear (sensitivity to the 0.001 threshold was assessed by comparison with a larger value; see Appendix S3.1). Second, if the fit was not linear we discriminated between convex and sigmoid shapes by studying the second derivative (with respect to area) of the model functions (parameterized with the same parameter values as the original function) to detect the presence of an inflection point. If the second derivative of the model changed sign, the fit exhibited an inflection point within the observed area range and the shape was assigned as sigmoid (Appendix S3.1). Additionally, we compared the different ISAR shapes following the general categorization of the models proposed by Tjørve (2009; see Table 1 herein), ignoring the fitted shape.
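The sequential shape test can be sketched as below. This is an illustrative approximation rather than the procedure of Appendix S3.1: the second derivative is replaced by second differences on the same 100-step grid, and the function name, the numerical tolerance and the example parameter values are assumptions.

```python
def classify_shape(model, a_min, a_max, n_steps=100, threshold=0.001):
    """Classify a fitted curve as 'linear', 'convex' or 'sigmoid'.

    model: function of area with its fitted parameters already fixed.
    """
    h = (a_max - a_min) / n_steps
    grid = [a_min + i * h for i in range(n_steps + 1)]
    fitted = [model(a) for a in grid]

    # Step 1: compare with the straight line joining the two end points.
    slope = (fitted[-1] - fitted[0]) / (a_max - a_min)
    chord = [fitted[0] + slope * (a - a_min) for a in grid]
    if all(abs(f - c) < threshold for f, c in zip(fitted, chord)):
        return "linear"

    # Step 2: look for a sign change in the curvature, using second
    # differences as a stand-in for the analytical second derivative.
    second = [fitted[i - 1] - 2 * fitted[i] + fitted[i + 1] for i in range(1, n_steps)]
    scale = max(abs(d) for d in second)
    signs = {d > 0 for d in second if abs(d) > 1e-9 * scale}
    return "sigmoid" if len(signs) == 2 else "convex"


# Hypothetical fitted curves evaluated over 1 to 1000 km^2.
shape_linear = classify_shape(lambda a: 2.0 + 0.5 * a, 1.0, 1000.0)
shape_power = classify_shape(lambda a: 5.0 * a ** 0.25, 1.0, 1000.0)
shape_mmf = classify_shape(lambda a: 100.0 / (1.0 + 1000.0 * a ** -2.0), 1.0, 1000.0)
```

With these parameter values the Morgan–Mercer–Flodin curve has its inflection point inside the area range, so it is labelled sigmoid, whereas the same function with an inflection point outside the range would be labelled convex. A finite-difference check can miss an inflection point that lies between the first or last pair of grid points, which is one reason to prefer the analytical derivative where it is available.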

Asymptotic versus non-asymptotic models

We used two methods to classify models as asymptotic versus non-asymptotic. First, we used the fitted parameters, i.e. we classified the model as asymptotic only if the estimated value for the asymptote was within the range of the data. In a second more liberal analysis, we again used the general categorization provided by Tjørve (2009; provided also in our Table 1), e.g. considering the logistic model to be asymptotic regardless of whether the estimated value of the asymptote fell within the limits of the particular data set.

Geography, area and richness-based correlates of ISARs

We used constrained analysis of principal coordinates (CAP) (Anderson & Willis, 2003; Oksanen et al., 2007) to investigate the relationship between the taxonomy, geography, area and richness-based descriptors of data sets and ISAR form across all data sets. CAP is an ordination method similar to redundancy analysis, but it allows non-Euclidean distances, such as Jaccard or Bray–Curtis, to be used for the calculation of dissimilarities (Oksanen et al., 2007).

CAP analysis was used to examine: (1) the variation in model selection profiles explained by the various descriptors in turn for best model, best family and best shape; and (2) to rank the predictors with respect to the strength of their effect on the variability in model selection and adequate fit profiles (i.e. vectors of wAICc for each data set and vectors describing models that provided an adequate fit). We used Bray–Curtis dissimilarities to characterize pairwise dissimilarities between the selection profiles of the data sets and Jaccard distance to characterize pairwise dissimilarities between adequate fit profiles, resulting in six separate CAP analyses.
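For readers unfamiliar with the two dissimilarity measures, the sketch below computes them for a pair of data sets using their standard textbook definitions. It does not reproduce the vegan implementation, and the function names and profile values are hypothetical.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two vectors of wAICc values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


def jaccard_distance(x, y):
    """Jaccard distance between two binary adequate-fit vectors."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return 1.0 - both / either if either else 0.0


# Hypothetical profiles for two data sets and three or four models.
selection_a, selection_b = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
adequate_a, adequate_b = [1, 1, 0, 1], [1, 0, 0, 1]
d_selection = bray_curtis(selection_a, selection_b)
d_adequate = jaccard_distance(adequate_a, adequate_b)
```

Bray–Curtis suits the continuous weight profiles, while Jaccard suits the binary adequate-fit profiles because it ignores models that failed for both data sets.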

To determine whether possession of an asymptote could be explained by data set characteristics we used multiple logistic regression (generalized linear model, GLM, with higher probability for an asymptote scored one, otherwise scored zero, with a binomial error term and a logit link).

For both CAP and GLM analyses we selected an initial set of eight explanatory variables after investigating for multicollinearity using Pearson correlations. The selected system property variables were: Latitude, Island Type, Major Taxon, No. of Islands, AreaMAX, AreaSCALE, SMAX and SSCALE (Appendix S3.3). The variables were ordered in the models according to their independent contribution (greatest to least) to the total variation in the response variable. We eliminated non-significant terms using a backwards selection procedure, to derive a minimal adequate model. We used the commands ‘capscale’ and ‘anova’ of the ‘vegan’ package (Oksanen et al., 2007) and the commands ‘glm’, ‘summary’ and ‘anova’ from the R statistical and programming environment to parameterize, select the models and perform analyses of deviance.

Logarithmic form of the power model

As the logarithmic form of the power function (Arrhenius, 1920, 1921; log10-transformed values of species and area) (1) is the most frequently applied form for fitting ISARs, (2) remains one of the few functions for which biological significance has been assigned to model parameters, and (3) has a proposed, if debated, theoretical basis (e.g. Preston, 1962; Connor & McCoy, 1979; Rosenzweig, 1995; Martin & Goldenfeld, 2006), we also report separate analyses using the logarithmic power model to allow comparison with preceding literature.

For these analyses we followed a multiple regression approach to investigate the factors related to the shape of the ISAR (above).
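The logarithmic implementation of the power model is a straight-line regression of log10 S on log10 A, with slope z and intercept log10 c. A minimal ordinary least-squares sketch is given below; the function name and the data are hypothetical, and islands with zero recorded species would have to be handled separately because their logarithm is undefined.

```python
import math


def fit_log_power(areas, richness):
    """Estimate z and c from log10(S) = log10(c) + z * log10(A) by OLS."""
    x = [math.log10(a) for a in areas]
    y = [math.log10(s) for s in richness]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    z = sxy / sxx
    log_c = mean_y - z * mean_x
    return z, 10 ** log_c


areas = [1.0, 10.0, 100.0, 1000.0]             # hypothetical, km^2
richness = [5.0 * a ** 0.25 for a in areas]    # exact power-law richness
z_hat, c_hat = fit_log_power(areas, richness)
```

Because the example richness values follow an exact power law, the regression recovers z ≈ 0.25 and c ≈ 5. With real data the estimates generally differ from those of the nonlinear fit in arithmetic space, since the two approaches minimize different residuals.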


Model fits and the ‘best’ ISAR model

In 551 cases of the 601 data sets compiled, at least one function provided an adequate fit as determined by the use of the optimization algorithm, the Shapiro normality test and/or the Pearson product–moment correlation coefficient. However, the AICc could not be calculated for those data sets with fewer than seven islands, so our subsequent analyses were based on 465 data sets, of which 75% have a total land area of < 10,000 km² and 79% span less than four orders of magnitude in area. Each major taxon is well represented, as are continental-shelf and oceanic island systems, while there are relatively few inland data sets (Table 2).

Table 2.    (a) The distribution of data sets across major taxon and island type categories for the 465 data sets for which adequate fits were obtained and which were used for subsequent analyses of best model, best family of model, ISAR shape and presence of an asymptote. (b) The main categories of the taxonomic and island types included in the 449 data sets that produced significant slopes in the additional analyses carried out for the logarithmic form of the power model (logSpecies−logArea). Note that these analyses are not equivalent mathematically to the nonlinear implementation of the power model reported above, and that the analyses reported in this part of the table are entirely separate from those reported in Tables 2a, 3 and 4.
Major taxon | No. of cases | Continental-shelf | Oceanic | Inland

In 44 cases (9% of the 465 analysed), the data set was the sum of two or more other data sets, arising either through summing distinct but related groups of islands, or by combining different taxa for a particular set of islands. Although there is a level of interdependency in these cases, sensitivity analyses showed that their inclusion did not affect the results (not shown).

Considering the single ‘best’ model per data set, as judged by the lowest AICc value, four models accounted for 73% of cases; in declining order of performance – the power, linear, Kobayashi and exponential models (Fig. 2a). The generality criterion provided relatively small variability of values, i.e. poor discrimination between the 20 models evaluated, with proportions between 0.467 and 0.839 (mean value of 0.725 ± SD 0.122) of adequate fits among the 465 data sets, with half of the models having virtually identical success rates (Fig. 2b, Table 3). However, according to the efficiency criterion, which is more discriminatory, four models account for more than 50% of the overall probabilities of being the best at fitting ISARs; in declining order they were the power, linear, Kobayashi and exponential models (Fig. 2c, Table 3).
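The single-best and efficiency criteria both rest on AICc values and the Akaike weights derived from them. The sketch below shows the standard least-squares formulation of these quantities; it is an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, and k is assumed to count the fitted parameters plus the error variance. Under that assumption, a four-parameter model has k = 5 and the correction term is defined only for seven or more islands.

```python
import math

def aicc_from_rss(rss, n, k):
    """Small-sample corrected AIC for a least-squares fit.

    rss: residual sum of squares; n: number of islands;
    k: number of estimated parameters, here assumed to include
    the error variance.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(aicc_values):
    """Convert AICc values for competing models into weights that sum to 1."""
    best = min(aicc_values)
    rel_likelihood = [math.exp(-0.5 * (v - best)) for v in aicc_values]
    total = sum(rel_likelihood)
    return [r / total for r in rel_likelihood]

# Two hypothetical models whose AICc values differ by 2 units
weights = akaike_weights([10.0, 12.0])
```

Averaging a model's weight over the data sets for which it fitted adequately gives a quantity of the kind reported as efficiency (mean wAICc).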

Figure 2.

 Comparison of the performance of the 20 island species–area relationship (ISAR) models across 465 data sets: (a) the proportion of data sets for which each model provided the lowest small-sample corrected Akaike information criterion (AICc) value, i.e. single-best model; (b) generality, i.e. the proportion of the data sets for which each model provided an adequate fit; and (c) efficiency, i.e. the average AICc weight (wAICc) for the cases for which the model in question provided an adequate fit. See Table 1 for details of the models. NB Screening out data sets with < 20 data points results in a pronounced decline in the performance of the linear model, but otherwise the relative performance of the models remains practically the same (see text and Appendix S3).

Table 3.   Model performance based on analyses of the 465 data sets for which small-sample corrected Akaike information criterion (AICc) values could be calculated. Model: models as detailed in Table 1. Generality: the proportion of the data sets for which each model provided an adequate fit out of the 465 cases. Efficiency: the mean AICc weight (mean wAICc) for the cases for which the model in question provided an adequate fit. Rankings for the two criteria are presented in brackets (when two or more models had the same criterion value, they were assigned the highest rank, e.g. weibull3 and mmf for the generality criterion).
Model | Generality | Efficiency | Overall value | Rank
Power* | 0.798 [9] | 0.207 [1] | 2.996 | 1
Koba | 0.798 [9] | 0.154 [3] | 2.081 | 2
expo | 0.755 [12] | 0.143 [4] | 1.533 | 3
linear | 0.628 [14] | 0.170 [2] | 0.956 | 4
P2 | 0.839 [1] | 0.057 [8] | 0.723 | 5
monod | 0.731 [13] | 0.106 [5] | 0.698 | 6
epm2 | 0.815 [7] | 0.050 [9] | 0.405 | 7
weibull3 | 0.834 [2] | 0.041 [11] | 0.404 | 8
mmf | 0.834 [2] | 0.040 [14] | 0.391 | 9
heleg | 0.830 [4] | 0.040 [12] | 0.360 | 10
asymp | 0.794 [11] | 0.043 [10] | 0.115 | 11
ratio | 0.802 [8] | 0.033 [16] | 0.010 | 12
weibull4 | 0.830 [4] | 0.010 [19] | −0.158 | 13
betap | 0.830 [4] | 0.009 [20] | −0.173 | 14
negexpo | 0.546 [18] | 0.099 [6] | −0.943 | 15
P1 | 0.606 [16] | 0.059 [7] | −1.149 | 16
power_R | 0.600 [17] | 0.030 [17] | −1.709 | 17
chapman | 0.615 [15] | 0.012 [18] | −1.883 | 18
gompertz | 0.544 [19] | 0.040 [13] | −1.986 | 19
epm1 | 0.467 [20] | 0.037 [15] | −2.671 | 20

Overall value: the sum of the standardized values of generality and efficiency; the sum of the overall values for all the models equals zero. Rank: model ranking based on the overall value index.

*Note that as per Table 1, the results reported herein are for the non-linear implementation of the power model.

The correlation between our generality and efficiency indices is low and statistically non-significant (Appendix S3.4). Hence, the overall ranking of the models (Table 3), combining standardized values of both generality and efficiency values (see Materials and Methods), synthesizes two distinctive aspects of model performance. To assess the robustness of our results we also re-ran the evaluation using the uncorrected AIC and the Bayesian information criterion (BIC). The overall rankings of the models based on the AICc were highly correlated (tau > 0.705, P < 0.05) with those obtained using AIC and BIC rankings (Appendix S3.4). Similarly, we found the overall rankings to be robust to the sequential removal of data sets with between seven and 19 islands, although notably the performance of the linear model declines rapidly as data sets with seven, eight and nine islands are eliminated (Appendix S3.4 & Table S12). In each case these sensitivity analyses indicate that the results of the overall model-ranking index are robust to the choice of a model selection criterion and to the inclusion of systems with comparatively small numbers of islands (the decline of the linear model in the rankings notwithstanding). The CAP analyses showed significant effects for some system traits, yet, in combination, system traits explained < 11% variability in both model selection and adequate fits profiles (Appendix S3.3).
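The overall value index of Table 3 can be approximately reproduced from the generality and efficiency columns alone. The sketch below standardizes each column (subtracting the mean and dividing by the standard deviation) and sums the two scores per model. The use of the sample standard deviation is an assumption, chosen because it matches the reported SD of 0.122 for generality. Results differ from the published values in the third decimal place because the inputs are rounded.

```python
from statistics import mean, stdev

# (generality, efficiency) per model, copied from Table 3
TABLE3 = {
    "Power": (0.798, 0.207), "Koba": (0.798, 0.154), "expo": (0.755, 0.143),
    "linear": (0.628, 0.170), "P2": (0.839, 0.057), "monod": (0.731, 0.106),
    "epm2": (0.815, 0.050), "weibull3": (0.834, 0.041), "mmf": (0.834, 0.040),
    "heleg": (0.830, 0.040), "asymp": (0.794, 0.043), "ratio": (0.802, 0.033),
    "weibull4": (0.830, 0.010), "betap": (0.830, 0.009),
    "negexpo": (0.546, 0.099), "P1": (0.606, 0.059), "power_R": (0.600, 0.030),
    "chapman": (0.615, 0.012), "gompertz": (0.544, 0.040),
    "epm1": (0.467, 0.037),
}

def standardize(values):
    """Return z-scores using the sample standard deviation (an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def overall_values(table):
    """Sum of standardized generality and standardized efficiency per model."""
    names = list(table)
    gen_z = standardize([table[name][0] for name in names])
    eff_z = standardize([table[name][1] for name in names])
    return {name: g + e for name, g, e in zip(names, gen_z, eff_z)}

overall = overall_values(TABLE3)
# Models ordered from highest to lowest overall value
ranking = sorted(overall, key=overall.get, reverse=True)
```

Because each column is centred on its own mean, the overall values necessarily sum to zero, as stated in the note to Table 3.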

Best family of ISAR

The power family [Pow(B)] was ranked first based on the generality and efficiency criteria and was thus first in the overall ranking. It was followed by the exponential family [Expo(C)], which was also ranked second according to the efficiency criterion. The Logis(D) family was ranked third and fourth by the generality and efficiency criteria, respectively, and was third in the overall ranking (Table 4a).

Table 4.   Island species–area relationship (ISAR) model family (a), shape (b) and asymptote (c) performance based on analyses of the 465 data sets for which AICc values could be calculated. Column headings as given for Table 3: for groupings see Table 1. Ranking in brackets.
Group | No. of models | Generality | Efficiency | Overall value | Rank
(a) Family
Pow(B) | 6 | 0.959 [1] | 0.338 [1] | 2.990 | 1
Expo(C) | 2 | 0.858 [4] | 0.269 [2] | 1.635 | 2
Logis(D) | 3 | 0.890 [3] | 0.162 [4] | 0.939 | 3
Weib(E) | 4 | 0.901 [2] | 0.115 [5] | 0.614 | 4
Asym(F) | 1 | 0.793 [7] | 0.043 [6] | −0.821 | 5
Beta(I) | 1 | 0.830 [5] | 0.009 [9] | −0.842 | 6
Rat(G) | 1 | 0.802 [6] | 0.033 [8] | −0.845 | 7
Lin(A) | 1 | 0.628 [8] | 0.170 [3] | −0.955 | 8
Gom(H) | 1 | 0.544 [9] | 0.040 [7] | −2.715 | 9
(b) Shape
Convex | | 0.989 [1] | 0.792 [1] | 2.170 | 1
Sigmoid | | 0.826 [2] | 0.114 [3] | −0.719 | 2
Linear | | 0.630 [3] | 0.191 [2] | −1.452 | 3
(c) Asymptote
Non-asymptotic | | 1.000 [1] | 0.825 [1] | 1.414 | 1
Asymptotic | | 0.804 [2] | 0.218 [2] | −1.414 | 2

If the number of models included in each family is taken into account then the overall final ranking is significantly and highly correlated with that shown in Table 4a (Appendix S3.5). Additionally, the results of further analyses using only the overall most precise model in each family are consistent with Table 4a, with the families Pow(B), Expo(C) and Logis(D) always being the top three families. The CAP analyses for families showed significant effects for some system traits, yet in combination they explained < 10% variability in both model selection and adequate fits profiles (Appendix S3.3).

Best shape of ISAR

Based on the algorithm used to detect linearity, convexity and inflection point(s), convex models had the highest generality and efficiency values (Table 4b). The results remained identical when a higher threshold value (0.01 instead of 0.001) was used to detect linearity, and almost identical when basing the assessment on the shape of the single best model for each data set (Appendix S3.1). Were we to follow instead the general shape assignment of Tjørve (2009), as presented in Table 1 herein, the results would remain largely similar, with convex models having the highest generality and efficiency values (Appendix S3.1). Although the sigmoid shape appears almost as often as the convex shape, its efficiency values are generally much lower (see Table 4b), while often the estimated inflection point occurs outside the range of observed areas, and thus the fitted shape is convex in form and not sigmoid.

The CAP analysis of the wAICc values and the adequate fit profiles for model shape showed significant effects for some system trait variables, yet in combination the system traits explained < 13% variation in both analyses (Appendix S3.3). While the amount of variance explained in the CAP analyses is low, there are significant differences in the range of island area (i.e. the mean values of AreaSCALE) encompassed by each system between the three shape forms, with data sets of linear form having the lowest, and data sets of sigmoid form the largest, range in island areas (Fig. 3).

Figure 3.

 The distribution of the values of AreaSCALE [i.e. log(AreaMAX/AreaMIN)] for the three ISAR shape categories, using the shape that summed the highest AICc weight (wAICc) for each data set. There are significant progressive increases of the mean AreaSCALE value from linear to convex and finally to sigmoid shape (2.185, 2.997 and 4.065, respectively; Kruskal–Wallis rank sum statistic, n = 465: 32.900, P < 0.001). Note that the values of AreaSCALE within which a sigmoid shape totalled the highest wAICc (18 cases) ranged from 2.243 to 6.153, with a mean value of 4.065 ± SD 0.21. The results remain the same if instead of the shape that summed the highest wAICc for each data set, the observed shape of the best-fitting model for each data set is considered (see Appendix S3). Furthermore, there is no differentiation of the best shape according to the total area of the island systems considered (see Appendix S3.1), indicating that the pattern is robust regardless of the total area considered, i.e. small or large island groups. Squares represent the mean value, boxes bracket the standard error of the mean (± SE) and whiskers represent 95% confidence intervals of means (± 1.96 SE).
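The comparison summarized in this caption involves two calculations: AreaSCALE for each data set and a Kruskal–Wallis rank sum statistic across the shape categories. The sketch below illustrates both using only the Python standard library. It assumes a base-10 logarithm for AreaSCALE, uses hypothetical function names and input values, and is not the authors' code.

```python
import math
from collections import Counter

def area_scale(areas):
    """AreaSCALE = log10(AreaMAX / AreaMIN) for one data set (base 10 assumed)."""
    return math.log10(max(areas) / min(areas))

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties.

    groups: list of lists of numeric values, one list per category.
    Undefined (division by zero) if every pooled value is identical.
    """
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Assign each distinct value the mean of the ranks it occupies
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n_total ** 3 - n_total)
    return h / correction

# Hypothetical island areas (km2) spanning four orders of magnitude
scale = area_scale([0.5, 50.0, 5000.0])
# Hypothetical, fully separated groups of AreaSCALE values
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The P-value would then be obtained by comparing H with a chi-squared distribution with (number of groups − 1) degrees of freedom, a step omitted here.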

Asymptotic versus non-asymptotic ISAR form

According to the method we used to detect the presence of an asymptote within the range of the empirical data, the non-asymptotic models had the highest generality and efficiency values (Table 4c). If the shape of the best model is considered on a case by case basis, then an asymptote is detected in 62 cases (13%), with no asymptote in 403 (87%) cases. No island traits provided significant differentiation of the presence/absence of an asymptote in a logistic regression analysis. Non-asymptotic models remained predominant when classifying shape using the general classification of Tjørve (2009) (see Appendix S3.2).

The log–log implementation of the power model

The log–log implementation of the power model resulted in a significant ISAR in 449 cases of the 601 original data sets (Table 2b). Of these 449 data sets, 84% are of island groups of < 50,000 km² and 73% have an AreaMAX/AreaMIN ratio of < 10,000 (i.e. they span less than four orders of magnitude in island area), while the number of islands ranges from four to 213. The ratio in richness values (SSCALE) is < 100 in 59% of cases. The R² for the 449 significant ISARs ranged from 0.065 to 0.993, with a mean value of 0.640 ± SD 0.204. In the multiple regression minimal adequate model explaining variation in the R² values, only No. of Islands and SMAX were included (R² = 0.49, F = 70.92, P < 0.01), indicating a general tendency for R² values to decrease with number of islands and to increase with maximum number of species.

Previous syntheses based on the log–log power model have suggested that ISAR z-values typically fall within a range of around 0.2–0.4 (MacArthur & Wilson, 1967; Connor & McCoy, 1979; Rosenzweig, 1995), although Williamson (1988) reported exceptions to this generalization, ranging from 0.05 to 1.132. Our analyses produced a mean of z = 0.321 ± SD 0.164, and 51% of z-values fell between 0.2 and 0.4, while only 25% of values exceeded 0.4 and the full range was from 0.064 to 1.312. Simple regressions showed that no single explanatory variable had a coefficient of determination as high as 0.10, but the minimal adequate model included AreaSCALE, SSCALE, Island Type, SVAR, No. of Islands and SMAX and explained 69% of the overall variation (F = 156.1, P < 0.01). The values of logc ranged from −2.197 (c-value: 0.006) to 2.982 (960.157) with a mean of 0.907 ± SD 0.788. The minimal adequate regression model included AreaMAX, AreaSCALE, SMAX, SSCALE, No. of Islands, SVAR and Major Taxon and explained 84% (F = 276.400, P < 0.001) of the variation in logc.

There is a progressive increase in the mean z-value from inland systems to continental-shelf and then to oceanic archipelagos; but the difference is only significant between the oceanic islands and the other two categories (Fig. 4a). Logc values show a progressive decrease from inland to continental-shelf and then oceanic archipelagos, with each category significantly different from the next (Fig. 4b). The z-values progressively increase from vertebrate to invertebrate to plant data sets, but only the difference between vertebrates and plants is significant (Fig. 4c). Furthermore, z-values appear to vary in relation to the range of island areas encompassed (Fig. 5). For data sets spanning just two orders of magnitude the mean value of z is significantly higher than for data sets spanning more orders of magnitude of island area (Fig. 5). The logc values increase progressively from vertebrates to invertebrates and finally to plants, with each category being statistically different (Fig. 4d).

Figure 4.

 Comparisons of z and logc values for the main taxonomic groups and island types, for the logarithmic form of the power function. (a) Comparison of z-values across the three main island types. The value for oceanic islands is higher than the two other categories (Kruskal–Wallis rank sum statistic, n = 449: 16.133, P = 0.0003). (b) The logc values by contrast show a progressive decrease from inland to continental-shelf and then oceanic archipelagos (Kruskal–Wallis rank sum statistic, n = 449: 32.130, P < 0.0001), with each category significantly different from the next. (c) The comparison of z-values for the main taxonomic groupings shows that plant and invertebrate data sets have higher z-values than vertebrates but only the difference between plants and vertebrates is significant (Kruskal–Wallis rank sum statistic, n = 447: 14.104, P = 0.0009). (d) The logc values increase progressively from vertebrates to invertebrates and finally to plants, with each category being statistically different from each other (Kruskal–Wallis rank sum statistic, n = 447: 150.262, P < 0.0001). Squares represent the mean value, boxes bracket the standard error of the mean (± SE) and whiskers represent 95% confidence intervals of means (± 1.96 SE).

Figure 5.

 Comparison of the z-values for the main orders of magnitude of AreaSCALE included in the present study. For data sets spanning just two orders of magnitude the mean value of z is 0.438 ± SD 0.216, significantly higher than for all other categories, which exhibit z-values close to 0.3 or even lower (Kruskal–Wallis rank sum statistic, n = 439: 47.828, P < 0.0001). Note that the categories 10⁰–10¹ and 10⁷–10⁸ were not considered due to their small sample size: four and six cases, respectively. However, if category 10⁰–10¹ is merged with 10¹–10² and category 10⁷–10⁸ with 10⁶–10⁷, the results remain identical (see Appendix S3.5). The logarithm of orders of scale magnitude is presented.


Is there an overall best-fit ISAR model?

Our analyses of 601 data sets and 20 mathematical formulas demonstrate that there is no universal best-fit island species–area relationship model and, in many cases, there will be no clear best model for a specific data set (Connor & McCoy, 1979; Guilhaumon et al., 2008). Tjørve (2009) noted ‘the choice of model will, therefore, depend ultimately on the specific purposes of the exercise’. We concur but also note the importance of the choice of the overall analytical strategy, which may also vary depending on the purpose of the study.

Levins (1966) has suggested that no single mathematical model in ecology can meet all the requirements of realism, generality and efficiency, and so some trade-off of these properties is inevitably involved. With the notable exception of just a few models (e.g. Archibald, 1949; Preston, 1962; May, 1975; He & Legendre, 2002; Martin & Goldenfeld, 2006) ‘realism’ cannot be rigorously assessed for most of the functions considered herein, as an appropriate theoretical, mechanistic background remains lacking. Thus, we have focused our evaluation on generality and efficiency. By these criteria we have a clear final ranking, with the power model performing best (Table 3). This is consistent with much previous work cited herein but nonetheless represents the first time that the power function’s suitability for describing ISARs has been tested systematically against the many alternative functions currently recognized. The power function is followed in our ranking by other simple functions: the Kobayashi, exponential and linear models.

In general, we may anticipate that increasing the number of parameters will increase model flexibility, and thus the ability to fit data sets spanning a greater range of variation in island area (He & Legendre, 1996; Lomolino, 2000; Tjørve, 2009). Our results indicate that the more complex models are the most general (Fig. 2b) – but by this criterion only slightly out-perform simpler models – and that support for sigmoid models is at its greatest amongst those data sets with high values of AreaSCALE (Fig. 3). However, the simpler models out-performed the more complex ones in terms of model efficiency and in our overall ranking.

Most of the more complex models were originally introduced into the species–area literature in analyses of the species accumulation curve (Flather, 1996): a very different type of construct. Our results indicate that in the absence of a specific theoretical justification for doing otherwise, the start point in ISAR analyses should be to consider four competing simple models, i.e. the power, Kobayashi, exponential and linear models, especially if the spatial range of the values included is less than four orders of magnitude of island area.

Is there a best-fit family of ISAR models?

Judged at the level of model ‘families’, our findings were broadly consistent with the individual-model-based analyses in that the three best-performing families were the power [Pow(B)], exponential [Expo(C)] and logistic [Logis(D)] families (Table 4a). The best-performing model within the power family is the power model itself, which has just two parameters; the exponential family contains only simple functions; and the best model within the logistic family is the two-parameter monod model. These family-level analyses therefore again affirm the preference for simple over complex models.

The linear family, represented by a single model, has a high score for efficiency, but it comes low on the family ranking because it has low generality (Tables 3 & 4, and see Fig. S6 in Appendix S3). This paradox is explained in large part by the decreasing performance of the linear model as the number of islands in a data set increases (Table S12 in Appendix S3). Indeed, a linear ISAR appears most characteristic of data sets with a low number of islands, spanning low ranges of area, such that the linear ISAR slips from fourth ranked when considering 465 data sets containing seven or more islands, to 15th ranked when restricting the analyses to the 340 data sets containing 10 or more islands (Fig. 3; and see Table S12 in Appendix S3).

Is there a best-fit ISAR shape and does it include an asymptote?

There has been considerable debate as to the shape taken by ISARs, whether they exhibit convex or sigmoid forms and whether ISARs reach an asymptote or not (e.g. Connor & McCoy, 1979; Lomolino, 2000, 2002; Williamson et al., 2001; Dengler, 2009; Scheiner, 2009; Tjørve, 2009). Resolving these debates is important to understanding the mechanisms controlling the species richness of isolates.

Our analyses show only weak support for sigmoid ISARs. While we found that at least one of the fitted models exhibited a sigmoid shape in 371 cases out of the 465 data sets, the efficiency values were very low (i.e. mean wAICc for sigmoid models = 0.11 compared with 0.79 for convex models: Table 4b). Moreover, if we were to attribute a sigmoid shape only when this was the form of the overall best model, then a sigmoid shape is observed in just 26 cases (5.5%). The most common shape observed was convex upwards, a shape typically produced by the simpler models, including the models proposed earliest (Table 1; Arrhenius, 1920; Gleason, 1922).

Williams (1943) may have been the first to note that the slope of the species–area relationship changes with geographical scale (see also Preston, 1960). This notion was codified in Rosenzweig’s (1995) scale-structured model of species–area relationships, in which he proposed four different biogeographical scales of relationship, from point to interprovincial. Our analyses support the proposition that different functions or shapes of ISAR may exhibit scale-dependency (cf. He & Legendre, 1996; Whittaker, 2000; Whittaker & Fernández-Palacios, 2007). The ‘best’ shape is often linear for data sets of very few islands, spanning a small range of area values, and progressively as we move towards coarser scales, convex shapes and finally more complex sigmoid forms become more frequent (cf. Connor & McCoy, 1979; He & Legendre, 1996; Whittaker & Fernández-Palacios, 2007; Figs 1a & 3, Table S12 in Appendix S3).

Contributing factors to this tendency might in theory include the occurrence of a small-island effect across the smallest islands and the effect of in situ cladogenesis producing steep slopes across remote islands of the largest areas (Fig. 1). Thus, a sigmoid curve can sometimes be present, especially when more than three or four orders of spatial magnitude are included. However, sigmoid models performed poorly overall (Table 4) and were found to be ‘best’ in just 5.5% of cases, while few fitted sigmoid models bear much resemblance to the idealized depiction shown in Fig. 1a. Hence, overall, we may conclude that the majority of ISARs are best described as having a convex (upwards) shape.

Next we turn to the issue of the asymptote. Lomolino (2000, 2002) argued that isolated faunas are ultimately derived from a limited pool of species and therefore that the ISAR should level off, asymptotically approaching that maximum value of richness. This was challenged by Williamson et al. (2001), who argued that because both species number and area are finite, the mathematical function describing the ISAR must be limited at both ends and thus there is no theoretical case for the relationship to reach an upper asymptote. Our analyses provide only limited support for asymptotic ISARs. Although 374 data sets can be fitted with a model exhibiting an asymptote, the efficiency values are typically very low compared with those of adequately fitting non-asymptotic models (0.218 vs. 0.825). Moreover, if the shape of the best model is considered in each case, then an asymptote is detected in only 62 cases (13%). Interestingly, an asymptote was detected in combination with a sigmoid ISAR form in just 10 cases (2%; Appendix S3.2), while the other 52 cases were in combination with convex shapes. Based on the very wide sample of published data sets analysed herein we are thus unable to affirm the proposition that when sampled over a full array of island areas, the overall form of the ISAR should be sigmoidal, with an upper asymptote: if such patterns exist they have rarely been sampled.

In conclusion, the convex shape is the most common form, even when a large spatial window is involved. Thus, convex models, without an asymptote, should generally be preferred for fitting ISARs, while consideration may be given to fitting sigmoid models when the spatial range is around, or exceeds, three orders of magnitude (Figs 1b & 3).

Can we infer biological processes responsible for variations in ISAR form by reference to system properties?

The dynamic relationships between immigration, speciation and extinction, and how their rates vary in time and space are fundamental to an understanding of ISARs, as recognized in the equilibrium theory of island biogeography (MacArthur & Wilson, 1967). However, numerous biological mechanisms and theories have been proposed to explain features of ISARs (Whittaker & Fernández-Palacios, 2007, pp. 87–88; and see Schmida & Wilson, 1985; Rosenzweig, 1995; Turner & Tjørve, 2005). The different mechanisms are not mutually exclusive and may operate individually or in combination (Connor & McCoy, 1979; Kohn & Walsh, 1994; Rosenzweig, 1995; Ricklefs & Lovette, 1999; Triantis et al., 2003).

We currently lack a consensus concerning how individual factors and mechanisms contribute to ISAR form across different spatial and temporal scales, environmental conditions and taxa. For instance, the role of isolation is generally regarded as integral to the understanding of ISAR form, as evident in systematic variation in z-values of the power model between our three broad island categories (Fig. 4, and see e.g. MacArthur & Wilson, 1967; Rosenzweig, 1995; Triantis et al., 2008a), but isolation frequently does not have an important role in richness variation within a single archipelago. Similarly, at larger scales, differences in the rate of energy capture across a set of islands and – at even coarser scales – evolutionary history/independence are expected to play significant roles in shaping species richness and modifying ISAR form (Fig. 6). At finer scales, by contrast, mechanisms such as habitat diversity, random placement, and area-based incidence functions more frequently feature in interpretations of variation in island species richness (see Whittaker & Fernández-Palacios, 2007).

Figure 6.

 Schematic interpretation of how causal mechanisms may vary as system scale increases. (a) Factors that influence species richness across scales of space. Darker areas indicate scales of greater influence (see also Schmida & Wilson, 1985; Turner & Tjørve, 2005). The effects of energy input and evolutionary history are expected to influence species richness across the whole spectrum of spatial scale, but at the larger scales their contribution is anticipated to become dominant. The habitat diversity effect, although present across scales, is expected to be reduced at larger scales. At finer scales, mechanisms such as random placement and incidence functions are expected to be of greater importance. (b) The relationship between ecological space and area, and the inferred possible effect of each of the factors on the relationship. The spatial scale and the possible extent of the influence of each factor on the relationship between ecological space and area (length of the arrows) are theoretical approximations, used as a working hypothesis.

The explained variation in species richness can be increased by the inclusion into models of variables other than area, representing for example, habitat diversity, energy flow, system age or isolation (Kalmar & Currie, 2006; Whittaker et al., 2008). However, our general ability to fit statistically significant models for the ISAR is consistent with the notion that area is the best general proxy for the available ecological space (sensu Gillespie, 2007) provided by an island (Fig. 6). Our results (especially those arising from the CAP analysis) show that while the ISAR form for a particular data set cannot be predicted a priori based on the characteristics of the data set itself, nonetheless a significant if small amount of the variation in ISAR form can be attributed to specific system properties. In particular, there is a degree of scale and system dependence in the relationship between species richness and area (cf. Rosenzweig, 1995; Whittaker, 2000; Whittaker et al., 2001).

In essence then, we retain our focus on island area, and are able to recognize emergent patterns in ISAR form because, despite some independent variation in other factors, area, to a large degree, captures multiple correlated variables that together determine the available ecological space (Gillespie, 2007). In using the term ecological space, we are giving expression to the idea that there is some variation in the capacity for richness that is not fully captured by area alone. Ecological space thus encompasses the combination of abiotic environmental conditions (including area, elevational range, and climatic capacity for productivity) and biotic conditions (including the historically determined species pool and the prevailing propagule rain) that constrain actual levels of island diversity (e.g. Rosenzweig, 1995; Whittaker et al., 2001, 2008; Whittaker & Fernández-Palacios, 2007; Losos & Ricklefs, 2009; Rabosky, 2009; Ricklefs, 2009).

Figure 6 provides a schematic interpretation of how our findings might relate to the above ideas and mechanisms. A strong correlation between species richness and area should be considered as an indication of area effectively capturing the overall characteristics establishing ecological space and thus species richness in the region, and not a priori a direct or an indirect effect of area. In such cases area will most probably be highly correlated with other variable(s) establishing species richness in the system. On the other hand, a low correlation of species richness with area indicates that area is decoupled from the (other) major variables that determine the occupied ecological space (e.g. Wright, 1983; Triantis et al., 2003).

Can the parameters of the logarithmic implementation of the power model be interpreted biologically and ecologically?

The abiding interest in the power model owes much to Preston’s (1962) derivation of the canonical value of z = 0.262, based on the assumption of a lognormal distribution of abundance and the subsequent biological interpretation of variation in both the z and c parameters of the power model. Despite much attention to these parameters, their biological significance has been questioned, notably by Connor & McCoy (1979, p. 815), who concluded that ‘… we are sceptical that any biological significance can be attached to these parameters and recommend that they be viewed simply as fitted constants devoid of specific biological meanings’ (see also Williamson, 1988; contrast with Sugihara, 1980).

The z-values describe the rate of accumulation of species with the increase of area in the logarithmic space. In general, higher values correspond to more isolation (Fig. 4; cf. MacArthur & Wilson, 1967). However, z-values are not merely responsive to geographical distance but may vary as a function of an array of other system properties and specific biological processes. For example, in an inland water system of really small islands, with a high degree of nestedness and close to the mainland, for which theory would predict a low value of z (e.g. Rosenzweig, 1995), the rate of species increase from the smallest to the largest island can sometimes be extremely high, as shown by Nilsson & Nilsson’s (1978) study of strictly terrestrial plants for which = 0.72. Given such variation from the general trends reported herein, it is clearly necessary to exercise caution in offering biological interpretation of parameter values. However, neither is it an entirely stochastic pattern, as differing rates of extinction, immigration and speciation combine to produce significant emergent trends (cf. Wilson, 1969; Rosenzweig, 1995; Triantis et al., 2008a; Whittaker et al., 2008; Kisel et al., 2011). In an island group and for a specific taxon with hardly any limitations of dispersal, the increase of species with area will be low and certainly, on average, lower than in a system, for the same taxon, where most of the species originate from in situ speciation. The species overlap will tend to be higher in the first case and thus the z-values lower. As Triantis et al. (2008a) have shown, if only the single-island endemic species are considered, a high z-value can be anticipated: higher than 0.6 and often close to unity [as postulated for Rosenzweig’s (1995) interprovincial SAR]. Hence, we suggest that a general pattern exists relating the z-value of the ISAR with the dominant processes of species addition. 
As we move from speciation-dominated systems (usually oceanic islands) to immigration–extinction dynamics (e.g. continental-shelf islands) and then to low-dispersal limitation systems (e.g. inland islands, which are typically close to their potential species pool), we will in general observe lower values of z. This generalization is supported by the values for the different island types (oceanic, continental-shelf and inland islands) considered here (Fig. 4a), and broadly supports earlier syntheses by e.g. Preston (1962) and MacArthur & Wilson (1967).
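As an illustration of how z and log c are obtained in the logarithmic implementation of the power model, the following minimal sketch fits log S = log c + z log A by ordinary least squares. The function name, the use of base-10 logarithms and the synthetic data are assumptions made for this example, not the procedure used in the analyses reported here.

```python
import numpy as np


def fit_log_power_model(areas, richness):
    """Fit log10(S) = log10(c) + z * log10(A) by ordinary least squares.

    Returns (z, log_c). Islands with zero species cannot be
    log-transformed and must be handled before calling this function.
    """
    log_a = np.log10(np.asarray(areas, dtype=float))
    log_s = np.log10(np.asarray(richness, dtype=float))
    # polyfit with degree 1 returns the slope first, then the intercept.
    z, log_c = np.polyfit(log_a, log_s, 1)
    return z, log_c


# Hypothetical archipelago generated exactly from S = 10 * A**0.25,
# so the fit should recover z = 0.25 and log10(c) = 1.
areas = np.array([0.1, 1.0, 10.0, 100.0, 1000.0])
richness = 10.0 * areas ** 0.25
z, log_c = fit_log_power_model(areas, richness)
print(f"z = {z:.3f}, log c = {log_c:.3f}")
```

With real data the points scatter around the fitted line, and comparing the estimates of z among archipelagos of different types is what underlies the kind of pattern described above.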

As depicted in Fig. 5, the mean value of z is significantly higher for data sets spanning just two orders of magnitude of AreaSCALE than for all other data sets. The inclusion of more orders of magnitude of island area leads to a progressive reduction of z-values. This could be an explanation for the general tendency for reported z-values to cluster around 0.2–0.4. When more than two orders of magnitude are included in the system under study, there is a high probability that several processes are involved in establishing species richness, each operating over a specific part of the overall range of spatial scale considered; thus z-values cluster around those observed for the archipelagic scale, i.e. z of 0.2–0.4 (Rosenzweig, 1995), which is the intermediate state between high and low species overlap, with each of the processes responsible for adding species to islands playing a role.
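The number of orders of magnitude of island area spanned by a data set can be computed as in the short sketch below. It assumes that the area scale is measured as the base-10 logarithm of the ratio of the largest to the smallest island area; the function name and example areas are hypothetical.

```python
import math


def area_scale(areas):
    """Orders of magnitude spanned by island areas: log10(max / min).

    Assumes all areas are positive and expressed in the same units.
    """
    values = [float(a) for a in areas]
    if not values or min(values) <= 0:
        raise ValueError("areas must be a non-empty collection of positive numbers")
    return math.log10(max(values) / min(values))


# Hypothetical archipelago whose islands range from 0.5 to 50 area units,
# i.e. a 100-fold range of area.
example_scale = area_scale([0.5, 5.0, 50.0])
print(example_scale)
```

Because the measure is a ratio, it does not depend on the units of area, provided that all islands in a data set use the same units.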

In terms of the c-values, ‘the politely ignored’ parameter of the species–area relationship (Gould, 1979), MacArthur & Wilson (1967) limited themselves to broad generalizations, suggesting that fitted values will depend on the population density and the innate species diversity of the taxon, the environmental carrying capacity, and the isolation of the system. In our analyses, we detected two main patterns. First, the log c values decreased progressively from inland to continental-shelf to oceanic systems (Fig. 4). This is predicted by island theory because with increasing distance from the possible species pool, dispersal is expected to be reduced and thus fewer and fewer ‘sink species’ (sensu Rosenzweig, 1995) will be able to sustain presence through supplementary immigration (the rescue effect sensu Brown & Kodric-Brown, 1977). Thus, in the most isolated islands, which are usually oceanic islands, we would expect fewer species than for the continental-shelf and inland archipelagos (MacArthur & Wilson, 1967). Additionally, a speciation-dominated island would be expected to have fewer species than a dispersal-dominated island of equal size because speciation needs larger areas to produce the same species numbers as dispersal (e.g. Heaney, 2000; Kisel & Barraclough, 2010). Second, log c values generally increase from vertebrates to invertebrates and finally to plants. The lower values for vertebrates are consistent with the general expectation that they require more space to sustain viable species populations than plants or invertebrates. We might have anticipated that invertebrates would have the highest log c values, given the small body size of many species and the density at which invertebrates can often occur; the intermediate mean value obtained may reflect the taxonomic and trophic variation captured within the invertebrate data sets in the analysis.
The high log c values of plants indicate that, being autotrophs, they are on average able to pack in more species per unit area in small islands than do animals, and may also reflect superior mechanisms (e.g. dormancy) for persisting in patchy or ephemeral populations.

In terms of geographical context, some studies using nested designs have reported a relationship between z-values and latitude (e.g. contrast Lyons & Willig, 2002; Qian et al., 2007). However, in our analyses of ISARs we failed to find a latitudinal effect, either in respect of z or log c values (cf. Connor & McCoy, 1979). Hence, climatic effects with regard to these parameters are either weakly reflected by latitude or are intertwined with other system variables so as to prevent their emergence (cf. Kalmar & Currie, 2006).


Overall, our analyses confirm that the shape of the ISAR is scale dependent. We conclude that over most scales of space, island species–area relationships are best represented by simple models. While the form of the ISAR varies considerably between study systems, some part of this variation can reasonably be related to the array of previously identified mechanisms and processes that constrain the ecological space available within an island system (Fig. 6) and the geographical context within which the archipelago is located. Thirty-three years after Connor & McCoy’s (1979) landmark analysis, we offer a different and cautiously positive answer to the question of whether biological significance can be assigned to the parameters of the best general ISAR model: the power model. Z-values are indicative of the process(es) establishing species richness and composition patterns, while c-values are indicative of the realized carrying capacity of the system per unit area. Notable general trends are that c-values vary with system type and major taxon, and that z-values increase as we switch from considering systems with high species overlap to systems with low species overlap (cf. Rosenzweig, 1995).


We thank M. Panitsa, E. Eliadou, M. Mylonas, K. Vardinoyannis, C. Parent and S-J. Lee for making available to us unpublished data and updated species lists. We thank D. Mouillot, F. Rigal, E. Heegaard, D. Charpentier, A.M. Whittaker and A. Oikonomou for assistance in statistical and technical issues, and Y. Kisel and L. McInnes for making available to us their unpublished manuscript. For critical comment and/or discussion we thank S. Scheiner, E. Tjørve, S. Sfenthourakis, C. Rahbek, R. Grenyer, J.T. Kerr, K.C. Burns, M.V. Lomolino and an anonymous referee. K.A.T. was supported in this work by a Marie Curie Intra-European Fellowship Program (project ‘SPAR’, 041095) held in the School of Geography and the Environment, University of Oxford and by a Fundação para a Ciência e a Tecnologia (FCT) Fellowship (SFRH/BPD/44306/2008).


Kostas Triantis was recently elected as an assistant professor in the Biology Department of the University of Athens. He initiated the present work when holding a Marie Curie Fellowship at the University of Oxford, and has since been involved in a long-running project investigating habitat fragmentation effects within the Azorean archipelago.

François Guilhaumon is interested in applying theoretical and methodological advances in macroecology and functional ecology to conservation biology in both terrestrial and marine systems.

Robert J. Whittaker has a long-term fascination with the biogeography of islands and also works on diversity theory, scale effects, and conservation biogeography.

Author contributions: K.A.T. had the original idea and collected the data; F.G. designed and performed analyses of data, and K.A.T. and R.J.W. contributed to the design of analyses; K.A.T. and R.J.W. wrote the paper with the substantial contribution of F.G.

Chief Editor: Jonathan P. Sadler

Editor: K.C. Burns