### Abstract


**Aim** The species–area relationship is a ubiquitous pattern. Previous methods describing the relationship have done little to elucidate mechanisms producing the pattern. Hanski & Gyllenberg (*Science*, 1997, **275**, 397) have shown that a model of metapopulation dynamics yields predictable species–area relationships. We elaborate on the biological interpretation of this mechanistic model and test the prediction that communities of species with a higher risk of extinction caused by environmental stochasticity should have lower species–area slopes than communities experiencing less impact of environmental stochasticity.

**Methods** We develop the mainland–island version of the metapopulation model and show that the slope of the species–area relationship resulting from this model is related to the ratio of population growth rate to variability in population growth of individual species. We fitted the metapopulation model to five data sets and compared its fit with those of the power function model and Williams's (*Ecology*, 1995, **76**, 2607) extreme value function model. To test that communities consisting of species with a high risk of extinction should have lower slopes, we used the observation that small-bodied species of vertebrates are more susceptible to environmental stochasticity than large-bodied species. The data sets were divided into small- and large-bodied species and the model was fitted to both.

**Results and main conclusions** The metapopulation model showed a good fit for all five data sets, and was comparable with the fits of the extreme value function and power function models. The slope of the metapopulation model of the species–area relationship was greater for larger- than for smaller-bodied species for each of the five data sets. The slope of the metapopulation model of the species–area relationship has a clear biological interpretation, and allows for interpretation that is rooted in ecology, rather than *ad hoc* explanation.

### Introduction


The relationship between the number of species inhabiting a region and its area has been the focus of considerable research for over 100 years (Rosenzweig, 1995). That the number of species increases with sample area, the species–area relationship, is one of the few fundamental ‘laws’ in ecology (Lawton, 1999). Explanations of the species–area relationship have focused on several non-exclusive themes. Species–area relationships may be a consequence of patterns of species abundance (Preston, 1962; May, 1975; Hubbell, 2001) or a combination of abundance and geographical range (Leitner & Rosenzweig, 1997). The number of different habitats and thus the variety of species that can be supported generally increases with area (Williams, 1964). Larger areas may passively ‘sample’ a larger fraction of the species pool (Connor & McCoy, 1979). Larger areas generally support larger populations of individual species, which lowers their risk of extinction (Simberloff, 1976). Finally, the number of species inhabiting a given area may be a dynamic relationship between extinction and colonization, where extinction decreases with increasing area and colonization decreases with increasing isolation (MacArthur & Wilson, 1967; Hanski & Gyllenberg, 1997) or a combination of speciation, colonization and extinction (Durrett & Levin, 1996; Hubbell, 2001).

Despite the voluminous effort expended examining species–area relationships, most work has been essentially descriptive. Several statistical models have been used to describe the species–area relationship (reviewed in Connor & McCoy, 1979). Because it generally fits the data well, the power function model, *S*=*cA*^{z}, and its linear equivalent, log *S*=log *c* + *z* log *A*, have been used extensively to describe the species–area relationship. Here, *S* is the number of species, *A* is site or island area, and *c* and *z* are two parameters. However, the power function model has several well-known, undesirable properties. First, it is unbounded. As area increases the number of species increases without limit, which is contrary to both empirical evidence (Connor & McCoy, 1979; Williamson *et al*., 2001) and theory (MacArthur & Wilson, 1967; Hanski & Gyllenberg, 1997; Hubbell, 2001). Secondly, both *S* and log *S* often violate the assumptions of normality and homoscedasticity for regression analysis (Williams, 1995). Vincent & Haworth (1983) and Williams (1995) have addressed these statistical problems by assuming more realistic error distributions.
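The unboundedness of the power function is easy to see numerically. The following minimal Python sketch evaluates the power function and its log-linear form, and finds the area at which predicted richness would reach the size of a finite species pool. The function names are hypothetical, and the parameter values (*c* = 1.99, *z* = 0.21, pool of 69 species) are borrowed from the Sea of Cortez rows of Tables 1 and 2 purely for illustration.

```python
import math


def power_richness(c, z, area):
    """Species number predicted by the power function S = c * A**z."""
    return c * area ** z


def log_linear_richness(c, z, area):
    """Same prediction via the linear form log S = log c + z log A."""
    return math.exp(math.log(c) + z * math.log(area))


def area_reaching_pool(c, z, pool_size):
    """Area at which the power function predicts S equal to the pool size."""
    return (pool_size / c) ** (1.0 / z)


c, z, pool = 1.99, 0.21, 69  # illustrative values only (see the text above)
a_star = area_reaching_pool(c, z, pool)

# Beyond a_star the power function predicts more species than the pool holds.
print(power_richness(c, z, 10 * a_star) > pool)
```

Any model whose response is a fraction of the species pool, such as a logit model, cannot show this behaviour, because the predicted fraction stays below one for every finite area.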

More fundamentally, all previous models have been either descriptive or statistical ‘null-models’ testing random species or individual placement, rather than being based on imputed biological processes. The shortcoming of the descriptive approach is revealed in the difficulty, and often heated debate, surrounding the biological interpretation of model parameters, particularly the slope *z* of the power function model (Abbott, 1983). This unfortunate circumstance largely results from the fact that methods used to describe the species–area relationship provide little insight regarding the mechanisms proposed to create the pattern. Hanski & Gyllenberg (1997) have shown that the single-species analogue of the equilibrium theory of island biogeography, metapopulation dynamics, also yields predictable species–area relationships. Here, we elaborate on the biological interpretation of the model and test a specific prediction, namely that communities of species with a higher risk of extinction caused by environmental stochasticity should have lower species–area slopes than communities experiencing less impact from environmental stochasticity.

### The model


We start with the mainland-island version of Levins's (1969) metapopulation model, which is closely related in spirit to the community-level island model of MacArthur & Wilson (1967). Let *p*_{j} denote the stationary probability that island *j* is occupied, which is given by:

$$p_j = \frac{C_j}{C_j + \mu_j} \qquad (1)$$

where *C*_{j} sets the colonization rate and *μ*_{j} sets the extinction rate. Following Hanski (1994, 1999), we make the following structural assumptions about the colonization and extinction rates,

where *c* is a parameter, *w*_{i} is the constant density of species *i* (Hanski & Gyllenberg, 1997), *α*_{i} is a parameter describing the migration capacity of species *i*, *x*_{i} scales extinction risk with area, and *d*_{j} and *A*_{j} are the distance from the mainland and the area of island *j*. Substituting these assumptions into equation 1, we obtain the following expression for the incidence of species *i* on island *j*,

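$$p_{ij} = \frac{1}{1 + \exp(-K_{ij})} \qquad (2)$$

where *K*_{ij} is the logit-transformed incidence:

$$K_{ij} = \log\frac{p_{ij}}{1 - p_{ij}} = \log\frac{C_{ij}}{\mu_{ij}} \qquad (3)$$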

We now define the species richness *S*_{j} of island *j* as the expected fraction of species present on island *j*, that is,

$$S_j = \frac{1}{R}\sum_{i=1}^{R} p_{ij} \qquad (4)$$

where *R* is the number of species in the pool. Taking equation 2 into account one obtains

$$S_j = E\left[\frac{1}{1 + \exp(-K_{ij})}\right] \qquad (5)$$

where *E* denotes expectation over all species.

Next, we approximate the logit-transform of *S*_{j} by interchanging the order of taking expectations and the logit-transform. Using equations (3) and (5) one obtains

- (6)

where the overbar denotes mean value and

- (7)

A full analysis of the model will be presented elsewhere (Gyllenberg & Hanski, in preparation). Here we consider

$$\log\frac{S_j}{1 - S_j} = a + x \log A_j - \alpha d_j \qquad (8)$$

as an alternative, mechanistic species-area model.
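As a worked sketch of how the pieces fit together, suppose the colonization rate declines exponentially with distance and the extinction rate declines as a power of population size *w*_{i}*A*_{j}. These two functional forms, the sign convention of the isolation term and the composition of the intercept *a* are assumptions made here for illustration; they use only the symbols defined above and should not be read as the exact published expressions.

```latex
\begin{align*}
% Assumed structural forms (illustrative only)
C_{ij} &= c\, e^{-\alpha_i d_j}, \qquad \mu_{ij} = (w_i A_j)^{-x_i} \\
% Logit of the stationary incidence p_{ij} = C_{ij} / (C_{ij} + \mu_{ij})
K_{ij} &= \log\frac{C_{ij}}{\mu_{ij}}
        = \log c + x_i \log w_i + x_i \log A_j - \alpha_i d_j \\
% Average over species, interchanging expectation and logit
\log\frac{S_j}{1 - S_j} &\approx \overline{K}_j
        = a + \bar{x}\, \log A_j - \bar{\alpha}\, d_j,
        \qquad a = \log c + \overline{x \log w}
\end{align*}
```

Under these assumptions the community-level model is linear in log *A*_{j} and *d*_{j} on the logit scale, with the species-averaged parameters playing the roles of the area and isolation coefficients.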

The approximation (6) involves an error of the order

$$\tfrac{1}{2}\,\sigma^2 f''(S_j) \qquad (9)$$

where σ^{2} is the variance of the incidence over the species and

$$f''(S_j) = \frac{2S_j - 1}{S_j^2\,(1 - S_j)^2} \qquad (10)$$

is the second derivative of the logit-transform *f*(*S*)=log[*S*/(1−*S*)] evaluated at *S*_{j}. The approximation is accurate close to *S*_{j}=0.5, but inadequate close to *S*_{j}=0 or 1.
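The behaviour of the approximation can be checked numerically. The minimal Python sketch below compares the logit of the mean incidence with the mean of the logit incidences for one island, and evaluates the second derivative of the logit-transform. The function names and the species values are hypothetical; they are chosen only to contrast an island with *S*_{j} = 0.5 against one with *S*_{j} close to 1.

```python
import math


def expit(k):
    """Inverse logit: incidence p from the logit-scale value K."""
    return 1.0 / (1.0 + math.exp(-k))


def logit(p):
    """Logit-transform log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def interchange_error(k_values):
    """logit(mean incidence) minus mean logit incidence for one island."""
    incidences = [expit(k) for k in k_values]
    s_j = sum(incidences) / len(incidences)   # expected fraction present
    mean_k = sum(k_values) / len(k_values)    # mean of K over species
    return logit(s_j) - mean_k


def logit_second_derivative(s):
    """Second derivative of log(s / (1 - s)) with respect to s."""
    return (2.0 * s - 1.0) / (s ** 2 * (1.0 - s) ** 2)


# Hypothetical logit incidences for three species on two islands.
centred = [-1.0, 0.0, 1.0]  # mean incidence 0.5
shifted = [2.0, 3.0, 4.0]   # mean incidence close to 1

print(interchange_error(centred), interchange_error(shifted))
print(logit_second_derivative(0.5), logit_second_derivative(0.95))
```

The error vanishes for the centred island and grows for the shifted one, in line with a second derivative that is zero at 0.5 and increases rapidly towards 0 and 1.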

Some comments about the model are in order. The model is based on the standard Levins-type assumptions about single-species metapopulation dynamics, combined with structural assumptions about the effects of island area and isolation on extinction and colonization. The single-species model is, in fact, Hanski's (1994) incidence function model for mainland–island metapopulations (see Hanski, 1992, 1993, 1999). Conveniently, this model can be raised to the community level by observing that the expected value of the incidences of all the species on a particular island equals the expected species number divided by the size of the species pool. We thus arrive at equation 8, which defines an alternative species–area model with two differences to the familiar power function equation. First, the quantity on the left-hand side is the logit-transformed species number divided by the number of species in the species pool, instead of the log-transformed species number. Secondly, the model explicitly includes the effect of isolation. We emphasize that the coefficient of the log *A*_{j} term in equation 8 is not the same as the slope parameter *z* of the power function equation. Unlike *z*, the parameter *x* has a biological interpretation. The parameter *x* of the incidence function model for individual species has been shown to relate the expected lifetime of a population to population size (Hanski, 1992; Cook & Hanski, 1995). Furthermore, Hanski (1998) has shown in the context of a simple extinction model that *x*_{i}=2*r*_{i}/*v*_{i}, where *r*_{i} is the intrinsic rate of population increase and *v*_{i} is its variance for species *i*. Thus, *x* is expected to decrease with increasing environmental stochasticity as indicated by higher variance in the population growth rate.
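The relation *x*_{i}=2*r*_{i}/*v*_{i} can be illustrated with a few lines of Python. This is a sketch only: the function name and the two example species are hypothetical and serve to show that, for a fixed growth rate, a larger variance gives a smaller *x*.

```python
def extinction_scaling(r, v):
    """Return x = 2r / v for intrinsic growth rate r and its variance v."""
    if v <= 0:
        raise ValueError("variance must be positive")
    return 2.0 * r / v


# Hypothetical species with equal growth rate but different variability.
steady = extinction_scaling(r=0.5, v=1.0)
variable = extinction_scaling(r=0.5, v=4.0)

# The more variable species has the smaller x.
print(steady, variable, steady > variable)
```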

### Data sets


Five data sets were chosen for analysis (Table 1). These data sets contain information necessary for the present analysis, specifically island areas and isolations, species incidences, and mainland species pools. The incidence functions and the parameter *x* for individual species have been determined for these data sets (Hanski, 1992; Cook & Hanski, 1995) providing a useful comparison between patterns at the levels of communities and individual species.

Table 1. The number of species in the species pool, the number of islands in the system and the results for the fit of the metapopulation species–area model. Deviances shown in bold contribute significantly to the model as judged by a reduction in *C*_{p}.

| Data set | No. of species | No. of islands | Deviance (area) | Deviance (isolation) | Best-fit model *F* | Best-fit model *P* |
|---|---|---|---|---|---|---|
| Great Basin birds (Behle, 1978) | 28 | 14 | **6.85** | 0.23 | *F*_{1,12}=7.02 | 0.021 |
| New Zealand birds (Diamond, 1984) | 22 | 19 | **25.00** | 0.20 | *F*_{1,17}=27.58 | < 0.001 |
| Torres Strait birds (Draffan *et al*., 1983) | 93 | 23 | **51.76** | **2.81** | *F*_{2,20}=51.72 | < 0.001 |
| Sea of Cortez birds (Cody, 1983) | 69 | 22 | **79.38** | 0.47 | *F*_{1,20}=85.67 | < 0.001 |
| Lake Sysmä mammals (Hanski & Kuitunen, 1986) | 7 | 17 | **13.95** | 0.45 | *F*_{1,15}=14.13 | 0.002 |

The data were subject to the same restrictions as in Cook & Hanski (1995), except that species were not restricted to those occurring on at least four islands. Thus, our data sets contain a larger number of species. As in Cook & Hanski (1995), waterfowl and raptors were not considered in the four avian data sets. The mainland species pool for the Torres Strait data (Draffan *et al*., 1983) was assumed to consist of those non-migratory species occurring in the Cape York region of Australia. For this data set, species distributions were taken from Slater *et al*. (1988). Mainland species pools were described by the authors for the other data sets (Behle, 1978; Cody, 1983; Hanski & Kuitunen, 1986). For the New Zealand data set, the North, South, and Stewart Islands were considered to be the ‘mainland’ (Diamond, 1984).

### Analyses


Equation (8) was fitted to the data in a stepwise fashion to determine the effects of area and isolation on species number. Island area was entered into the model first, and any improvement in the model caused by isolation was assessed using Mallows's *C*_{p} statistic (Insightful, 2001). The area-only metapopulation model was compared with the nonlinear power function model and with Williams's extreme value function (EVF), which assumes random placement of individuals to generate the species–area relationship (Williams, 1995). All data sets showed underdispersion relative to the binomial expectation of the EVF and metapopulation models. Therefore, when testing for significance, the deviance difference was weighted by the underdispersion and compared with an *F*-distribution (Crawley, 1993).
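The fitting procedure can be sketched in plain Python. The example below fits the area-only model as a binomial regression with a logit link by iteratively reweighted least squares, then scales the deviance explained by area by an estimated dispersion to obtain an *F* statistic. Several choices are assumptions of this sketch rather than details given in the text: natural logarithms of area, dispersion estimated as residual deviance divided by residual degrees of freedom, the function names, and the island data, which are invented.

```python
import math


def expit(v):
    return 1.0 / (1.0 + math.exp(-v))


def fit_area_only(areas, counts, pool, n_iter=25):
    """Fit logit(S / R) = a + x * log(A) by iteratively reweighted least squares."""
    log_area = [math.log(area) for area in areas]
    a, x = 0.0, 0.0
    for _ in range(n_iter):
        sw = swx = swxx = swz = swxz = 0.0
        for la, y in zip(log_area, counts):
            eta = a + x * la
            mu = expit(eta)
            w = pool * mu * (1.0 - mu)                     # binomial weight
            z = eta + (y / pool - mu) / (mu * (1.0 - mu))  # working response
            sw += w
            swx += w * la
            swxx += w * la * la
            swz += w * z
            swxz += w * la * z
        det = sw * swxx - swx ** 2
        a = (swxx * swz - swx * swxz) / det  # solve the 2 x 2 normal equations
        x = (sw * swxz - swx * swz) / det
    return a, x


def binomial_deviance(counts, probs, pool):
    """Binomial deviance of observed counts against fitted probabilities."""
    dev = 0.0
    for y, p in zip(counts, probs):
        if y > 0:
            dev += y * math.log(y / (pool * p))
        if y < pool:
            dev += (pool - y) * math.log((pool - y) / (pool * (1.0 - p)))
    return 2.0 * dev


def area_f_statistic(areas, counts, pool):
    """Return a, x, the dispersion-scaled F for area, and residual d.f."""
    a, x = fit_area_only(areas, counts, pool)
    fitted = [expit(a + x * math.log(area)) for area in areas]
    p_null = sum(counts) / (pool * len(counts))
    null_dev = binomial_deviance(counts, [p_null] * len(counts), pool)
    resid_dev = binomial_deviance(counts, fitted, pool)
    resid_df = len(counts) - 2
    dispersion = resid_dev / resid_df        # assumed dispersion estimate
    return a, x, (null_dev - resid_dev) / dispersion, resid_df


# Invented islands: areas, species counts, and a pool of 100 species.
areas = [1.0, 10.0, 100.0, 1000.0, 10000.0]
counts = [10, 33, 55, 84, 92]
a_hat, x_hat, f_stat, df = area_f_statistic(areas, counts, pool=100)
print(a_hat, x_hat, f_stat, df)
```

A dispersion below one, as reported for all data sets here, inflates the *F* statistic relative to an unscaled deviance test, so the scaling matters for the significance levels quoted.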

To test the hypothesis that communities consisting of species with a high risk of extinction caused by environmental stochasticity should have smaller values of the parameter *x*, we used the observation that, due to higher energetic demands, small-bodied species of vertebrates are more susceptible to environmental stochasticity than large-bodied species (Schmidt-Nielsen, 1979; Root, 1988; Hanski, 1989). Extending the results of Cook & Hanski (1995), who found a positive relationship between the parameter *x* and body size for individual species, our hypothesis specifically examines how well *x* ‘scales up’ to describe the susceptibility of entire communities to environmental stochasticity. The species pools were divided into large and small species based on median body size. We then fitted the full metapopulation model (including the effect of isolation) for both species pools. All data showed underdispersion, which was accounted for when deriving confidence intervals for parameter estimates and significance tests.
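The division of a species pool at the median body size can be sketched as follows. The species names and body masses are hypothetical, and the rule that species exactly at the median join the small group is an assumption of this sketch, since the text does not say how ties were handled.

```python
from statistics import median


def split_by_median_size(body_sizes):
    """Split species into small and large groups at the median body size.

    Species exactly at the median are placed in the small group; that tie
    rule is an assumption of this sketch.
    """
    cut = median(body_sizes.values())
    small = sorted(name for name, size in body_sizes.items() if size <= cut)
    large = sorted(name for name, size in body_sizes.items() if size > cut)
    return small, large


# Hypothetical species and body masses (g); not from the analysed data sets.
species_pool = {"sp_a": 9.0, "sp_b": 21.0, "sp_c": 48.0, "sp_d": 310.0}
small, large = split_by_median_size(species_pool)
print(small, large)
```

Each group then forms its own species pool, with its own pool size, when the model is fitted separately to small and large species.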

### Results


The metapopulation model showed a good fit to all data sets (Tables 1 & 2). However, including the effect of isolation only improved the model fit for the Torres Strait bird data. In general, the metapopulation model including only the area effect, the EVF, and the power function model showed comparable fits to the data (Table 2, Fig. 1). As noted above, the parameter *x* in the metapopulation model and *z* in the power function are not the same. Nonetheless, there was a positive correlation (*r*=0.64) between *x* and *z*. There was also a strong correlation between *x* and *a*, the slope of the EVF (*r*=0.94). As predicted, the value of *x* was greater for larger-bodied than for smaller-bodied species for each of the five data sets (Table 3, Fig. 2). Similarly, the slope of the power function model was greater for large-bodied than for small-bodied species. This result indicates that the slope of the power function model may to some extent reflect the strength of environmental stochasticity in the dynamics of the species forming the community. The difference in slopes between large- and small-bodied species was greater, and significant in two cases, for the metapopulation model compared with the power function model.

Table 2. A comparison between the power function model, the area-only metapopulation species–area model, and the extreme value function model (EVF). Parameter estimates are shown ± the standard error and *R*^{2} was used to assess fit. The parameter *b* is the intercept and *a* is the slope for the EVF (Williams, 1995).

| Data set | Power function *c* | Power function *z* | Power function *R*^{2} | Metapopulation model *a* | Metapopulation model *x* | Metapopulation model *R*^{2} | EVF *b* | EVF *a* | EVF *R*^{2} |
|---|---|---|---|---|---|---|---|---|---|
| Great Basin birds | 2.04 ± 1.36 | 0.19 ± 0.07 | 0.41 | −3.45 ± 1.26 | 0.34 ± 0.13 | 0.38 | −2.99 ± 0.93 | 0.26 ± 0.09 | 0.39 |
| New Zealand birds | 3.04 ± 0.71 | 0.18 ± 0.03 | 0.65 | −2.96 ± 0.61 | 0.45 ± 0.09 | 0.62 | −2.44 ± 0.39 | 0.31 ± 0.06 | 0.64 |
| Torres Strait birds | 4.74 ± 1.06 | 0.19 ± 0.03 | 0.70 | 3.13 ± 0.24 | 0.25 ± 0.04 | 0.66 | 3.11 ± 0.23 | 0.23 ± 0.04 | 0.66 |
| Sea of Cortez birds | 1.99 ± 0.45 | 0.21 ± 0.02 | 0.81 | 3.82 ± 0.25 | 0.27 ± 0.03 | 0.80 | 3.72 ± 0.23 | 0.24 ± 0.03 | 0.80 |
| Lake Sysmä mammals | 1.84 ± 0.18 | 0.35 ± 0.08 | 0.55 | 1.00 ± 0.13 | 0.51 ± 0.14 | 0.43 | 1.17 ± 0.11 | 0.42 ± 0.11 | 0.43 |
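The correlations quoted in the Results can be checked against the slope estimates in Table 2. The minimal Python sketch below assumes they are ordinary Pearson correlations across the five data sets; the text does not name the coefficient, and the function name is hypothetical.

```python
import math


def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Slope estimates from Table 2, in the row order of the table.
x_meta = [0.34, 0.45, 0.25, 0.27, 0.51]   # metapopulation model x
z_power = [0.19, 0.18, 0.19, 0.21, 0.35]  # power function z
a_evf = [0.26, 0.31, 0.23, 0.24, 0.42]    # EVF slope a

r_xz = pearson_r(x_meta, z_power)
r_xa = pearson_r(x_meta, a_evf)
print(round(r_xz, 2), round(r_xa, 2))
```

Under that assumption the two values round to 0.64 and 0.94, matching the correlations reported in the text.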

Table 3. Parameter estimates (±SE) and a test of differences in the slopes *x* and *z* for large- and small-bodied species from the metapopulation and power function species–area models.

| Metapopulation model | Large-bodied *a* | Large-bodied *x* | Large-bodied *α* | Small-bodied *a* | Small-bodied *x* | Small-bodied *α* | *t* | *P* |
|---|---|---|---|---|---|---|---|---|
| Great Basin birds | −4.62 ± 1.53 | 0.46 ± 0.16 | −0.003 ± 0.004 | −2.52 ± 1.32 | 0.26 ± 0.14 | −0.001 ± 0.003 | 0.94 | 0.36 |
| New Zealand birds | −5.37 ± 1.09 | 0.81 ± 0.16 | 0.008 ± 0.013 | −1.73 ± 0.65 | 0.23 ± 0.09 | 0.003 ± 0.010 | 3.16 | <0.01 |
| Torres Strait birds | −4.41 ± 0.49 | 0.36 ± 0.05 | 0.022 ± 0.013 | −3.11 ± 0.43 | 0.20 ± 0.05 | 0.015 ± 0.012 | 2.26 | 0.03 |
| Sea of Cortez birds | −4.53 ± 0.42 | 0.31 ± 0.04 | 0.014 ± 0.007 | −3.49 ± 0.41 | 0.25 ± 0.04 | −0.004 ± 0.008 | 1.06 | 0.30 |
| Lake Sysmä mammals | −0.44 ± 1.01 | 0.96 ± 0.49 | 0.954 ± 2.598 | −0.91 ± 0.59 | 0.42 ± 0.23 | −1.443 ± 1.565 | 1.00 | 0.33 |

| Power function model | Large-bodied *c* | Large-bodied *z* | Small-bodied *c* | Small-bodied *z* | *t* | *P* |
|---|---|---|---|---|---|---|
| Great Basin birds | 0.46 ± 0.38 | 0.25 ± 0.08 | 1.95 ± 1.22 | 0.13 ± 0.06 | 1.20 | 0.24 |
| New Zealand birds | 1.03 ± 0.30 | 0.23 ± 0.04 | 2.40 ± 0.62 | 0.12 ± 0.04 | 1.96 | 0.06 |
| Torres Strait birds | 1.66 ± 0.47 | 0.23 ± 0.04 | 3.30 ± 0.77 | 0.14 ± 0.03 | 1.79 | 0.08 |
| Sea of Cortez birds | 0.61 ± 0.22 | 0.24 ± 0.04 | 1.40 ± 0.37 | 0.19 ± 0.03 | 1.00 | 0.32 |
| Lake Sysmä mammals | 0.85 ± 0.17 | 0.45 ± 0.15 | 0.97 ± 0.17 | 0.27 ± 0.15 | 0.85 | 0.40 |
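The test of slopes in Table 3 can be sketched in Python. The text does not state the formula, so this example assumes the common form in which the difference between two independent estimates is divided by the square root of the sum of their squared standard errors. The function name is hypothetical; the inputs are the *x* estimates and standard errors for the metapopulation model in Table 3.

```python
import math


def slope_difference_t(est_large, se_large, est_small, se_small):
    """t statistic for the difference between two independent slope estimates."""
    return (est_large - est_small) / math.sqrt(se_large ** 2 + se_small ** 2)


# (x large, SE large, x small, SE small) from Table 3, metapopulation model.
table3_x = {
    "Great Basin birds": (0.46, 0.16, 0.26, 0.14),
    "New Zealand birds": (0.81, 0.16, 0.23, 0.09),
    "Torres Strait birds": (0.36, 0.05, 0.20, 0.05),
    "Sea of Cortez birds": (0.31, 0.04, 0.25, 0.04),
    "Lake Sysmä mammals": (0.96, 0.49, 0.42, 0.23),
}

t_values = {name: slope_difference_t(*row) for name, row in table3_x.items()}
for name, t in t_values.items():
    print(name, round(t, 2))
```

With the rounded table entries this form gives values that agree with the tabulated *t* for the metapopulation model to two decimals, which supports the assumption without confirming it. For the power function rows the agreement is close but not exact, as would be expected if the published statistics were computed from unrounded estimates.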

### Discussion


The metapopulation model, the extreme value function model and the power function model all showed similar fits to the data sets. It is not surprising that a model based on random placement, the EVF, gives results similar to those of the metapopulation model. Both random placement and the metapopulation model assume that the number of individuals at a site is positively related to site area. Barring any density–area relationship (but see Matter, 2000), random placement assumes that the number of individuals at a site is a linear function of area. The metapopulation model also assumes that the number of individuals increases with area, but importantly, further assumes an inverse, nonlinear relationship between the risk of extinction and the number of individuals at a site. In any case, both models positively relate the incidence of a species to site area. The statistical models representing metapopulation dynamics and random placement are also quite similar. The models including only an effect of area differ only in their ‘link’ function, which is a complementary log–log function for the extreme value function and a logit function for the metapopulation model.
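The similarity of the two link functions can be seen by comparing their inverses on the same linear predictor. The sketch below assumes that the log–log link of the EVF is the complementary log–log link, log[−log(1 − *p*)], which is the link usually associated with extreme value regression; the function names and the grid of predictor values are hypothetical.

```python
import math


def inverse_logit(eta):
    """Incidence implied by a logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def inverse_cloglog(eta):
    """Incidence implied by a complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


# Same linear predictor, two links: close at low incidence, diverging higher up.
for eta in (-5.0, -3.0, -1.0, 0.0, 1.0):
    print(eta, round(inverse_logit(eta), 4), round(inverse_cloglog(eta), 4))
```

Both curves increase monotonically with the linear predictor and nearly coincide when incidence is low, so the two models are expected to fit similarly whenever most islands hold a small fraction of the pool.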

Lack of an effect of isolation for four of five data sets calls into question either the role of isolation in affecting colonization or the importance of colonization from the mainland for these data sets. Our model makes the simple assumption that all colonization occurs from the mainland and thus distance from an island to the mainland sets the level of isolation. Potentially, colonization occurs among islands. Thus, isolation may be a more complicated function of interisland distances and distance to the mainland. Incorporating these effects into a model for single species is not difficult (e.g. Hanski, 1994), but raising such a model to the community level is more complicated (Hanski & Gyllenberg, 1997).

A potential criticism of our interpretation of the parameter *x* is that small islands may not be large enough to support large species. However, we found no difference between large and small species in the mean size of the smallest inhabited island for any of the data sets (Mann–Whitney *U*, *P* > 0.10).

Because of the impracticality of conducting experiments over large spatial scales, ecologists will always attempt to infer processes from patterns and observational data. The interpretation of parameters derived from such data has sparked a great deal of debate in community ecology and biogeography. At the core of these debates is how we should infer processes from patterns. In the case of the species–area relationship, we have shown that a simple mechanistic model provides a framework for inferring mechanisms from patterns. Specifically, we have shown that the parameter *x* of the metapopulation model of the species–area relationship has a clear biological interpretation, in contrast to the slope of the power function equation. We suggest that the present model makes a contribution to the study of the species–area relationship that is rooted in ecology, rather than *ad hoc* interpretation.

### Biosketches


**Stephen F. Matter** is a research assistant professor of Biology at the University of Cincinnati, Cincinnati, OH, USA. His research interests are in spatial population and community dynamics.

**Ilkka Hanski** is a professor of Ecology at the University of Helsinki, Finland, and is the Director of the Metapopulation Research group in the Department of Ecology and Systematics at the same university. His primary research interest is metapopulation biology, but he has worked on many topics in population and community ecology and conservation biology.

**Mats Gyllenberg** is a professor of Mathematics at the University of Turku, Finland, and is the Leader of the Biomathematics Research Group in the Department of Mathematics at the same university. His research interests are in dynamical systems, classification of binary vectors, and mathematical applications to population dynamics, ecology, and taxonomy.