Ranked species occupancy curves reveal common patterns among diverse metacommunities


David G. Jenkins, Department of Biology, University of Central Florida, Orlando, FL 32816-2369, USA. E-mail: dgjenkin@mail.ucf.edu


Aim  Community ecologists often compare assemblages. Alternatively, one may compare species distributions among assemblages for macroecological comparisons of species niche traits and dispersal abilities, which are consistent with metacommunity theory and a regional community concept. The aim of this meta-analysis is to use regressions of ranked species occupancy curves (RSOCs) among diverse metacommunities and to consider the common patterns observed.

Location  Diverse data sets from four continents are analysed.

Methods  Six regression models were translated from traditional occupancy frequency distributions (OFDs) and are distributed among four equation families. Each regression model was fitted to each of 24 data sets and compared using the Akaike information criterion. The analysed data sets encompass a wide range of spatial scales (5 cm–50 km grain, 2–7000 km extent), study scales (11–590 species, 6–5114 sites) and taxa. Observed RSOC regressions were tested for differences related to scale and taxa.

Results  Three RSOC models within two equation families (exponential and sigmoidal) are required to describe the very different data sets. This result is generally consistent with OFD research, but unlike OFD-based expectations the simple RSOC patterns are not related to spatial scale or other factors. Species occupancy in diverse metacommunities is efficiently summarized with RSOCs, and multi-model inference reliably distinguishes among alternative RSOCs.

Main conclusions  RSOCs are simple to generate and analyse and clearly identified surprisingly similar patterns among very different metacommunities. Species-specific hypotheses (e.g. niche-based factors and dispersal abilities) that depend on spatial scale may not translate to diverse metacommunities that sample regional communities. A novel set of three metacommunity succession and disturbance hypotheses potentially explain RSOC patterns and should be tested in subsequent research. RSOCs are an operational approach to the regional community concept and should be useful in macroecology and metacommunity ecology.

Introduction


Local community concepts have long dominated community ecology (Lawton, 1999; Ricklefs, 2008). Consequently, many ecologists have analysed within and among assemblages using familiar statistics (e.g. relative abundance, similarity indices, alpha and beta diversity; Magurran, 2004; Dornelas et al., 2009). Analyses within or among assemblages in a species-by-sites matrix (where rows = species and columns = sites) are called Q-mode analyses (Gotelli & Graves, 1996) and correspond to a figurative, vertical perspective of community ecology in which species ‘pile up’ in a site (Ricklefs, 2008). An unfortunate side-effect of a vertical, assemblage-centric focus is that the influence of regional processes (i.e. processes occurring at scales beyond the sites) on local assemblages has remained less clear, though community ecology is increasing its spatial and temporal scales (e.g. Ricklefs & Schluter, 1993; Holyoak et al., 2005; Ricklefs, 2008). As community ecology expands its spatial and temporal scales to approach those of biogeography, additional approaches are needed.

Ricklefs (2008) proposed a regional community concept that emphasizes populations distributed across ecological and geographical gradients. One approach to the regional community concept is to use an R-mode analysis, which compares species (rows in a species-by-sites matrix; Gotelli & Graves, 1996) distributed among localities and is orthogonal to a Q-mode analysis. In addition to the regional community concept, the R-mode perspective is consistent with studies of comparative ecology (e.g. Westoby et al., 1996), macroecology (Brown, 1995), adaptive evolution (e.g. Martins, 2000) and niche (Chase & Leibold, 2003).

Species distribution data are recorded as presence and absence among multiple sites in either R- or Q-mode analyses. To consider both presence and absence is to analyse incidence (Rita & Ranta, 1993). Analyses that include absence require that absence is reliably demonstrated, which may be problematic for some purposes (Brotons et al., 2004; Elith et al., 2006; Lobo et al., 2010). To focus on presence alone is to analyse occupancy or occurrence, analogous to ecological analyses of relative abundance (e.g. Dornelas et al., 2009) in that absent species have no abundance. The term ‘occupancy’ is often applied to local- or regional-scale analyses of species distributions (Gotelli & Simberloff, 1987; Tokeshi, 1992; Collins & Glenn, 1997; McGeoch & Gaston, 2002) whereas ‘occurrence’ is often applied to range-scale analyses (Scott et al., 2002). I use the term occupancy here.

Just as Q-mode analyses may be hierarchical (e.g. the Jaccard index analyses detail while species richness summarizes), R-mode analyses may also be viewed as detailed or summary. The details of R-mode pattern can be analysed by species co-occurrence (Gotelli & McCabe, 2002) and nestedness (Rodriguez-Gironés & Santamaria, 2006). To date, occupancy frequency distributions (hereafter OFDs) most closely approach a summary R-mode analysis. Multiple OFD shapes are possible (eleven by Tokeshi, 1992; four by Collins & Glenn, 1997; eight by McGeoch & Gaston, 2002) but OFDs have problems. As in all frequency distributions, data are sorted into categorical bins, and the choice of bin widths may alter interpretations (Gray et al., 2006). An OFD is not information rich (sensu Tufte, 2001); if published in the absence of a species-by-sites matrix an OFD does not enable subsequent analyses of inter-specific differences, detailed change through time (e.g. invasive or endangered species, response to disturbance), meta-analyses (e.g. this work) or phylogenetic community analyses that require species identity (e.g. Webb et al., 2002). Most importantly, OFDs can be difficult to analyse (McGeoch & Gaston, 2002). A method that yields results that are more readily analysed and more informative will help to fully analyse species occupancy data for regional community structure, spatial or temporal change, or the potential effects of phylogeny.

Ranked species occupancy curves

Here I apply a multi-model inference approach to regressions of empirical ranked species occupancy curves (RSOCs) as a new, efficient summary R-mode analysis that relates directly to OFDs (Fig. 1). The ranking of occupancy itself is not new; Willis (1922) ranked species by area, and ranked occupancy is a component of some modern diversity analyses (e.g. Gardezi & Gonzalez, 2008; Heino, 2008; Muneepeerakul et al., 2008). However, ranked occupancy has not been quantitatively examined for general patterns among diverse empirical systems. Moreover, ranked occupancy has not been related to OFDs and their associated hypotheses, let alone diverse other concepts.

Figure 1.

Occupancy frequency distributions (OFDs) translate to ranked species occupancy curves (RSOCs) (also see Table 1). (a) OFDs are histograms (shown here as lines for consistency with part b below), where species are binned into categories of the percentage of sites occupied. OFDs depicted here mimic those of Collins & Glenn (1997) plus a random OFD (McGeoch & Gaston, 2002), in which species were distributed among bins (mean = 10, SD = 2) to obtain multiple modes (see Table 1). The solid black line = bimodal-core dominant; long-dashed line = unimodal-central mode; short-dashed line = unimodal-satellite dominant; dotted line = unimodal-core dominant; grey line = random (based on the terminology of McGeoch & Gaston, 2002). (b) RSOCs are based on the same data as the OFDs in (a) but rank the species rather than assign them to histogram bins. Line patterns in (b) match those in (a).

RSOCs translate directly from previously described OFDs (Fig. 1, Table 1), in much the same way that ranked species-abundance distributions are related to frequency distributions of abundance (e.g. MacArthur, 1957; Hubbell, 2001; McGill et al., 2007). For example, an asymmetric bimodal OFD (solid line in Fig. 1a) translates to an asymmetric sigmoidal RSOC (solid line in Fig. 1b) because the large OFD mode of low-occupancy (i.e. satellite) species comprises species in the long RSOC tail, and the weaker OFD mode of high-occupancy (core) species comprises species in the upper RSOC shoulder. Likewise, a random OFD pattern (i.e. no clear modality; grey line in Fig. 1a) translates to a linear RSOC (grey line in Fig. 1b).
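
The translation between the two summaries can be sketched in a few lines. The following is a minimal illustration, not the procedure used to draw Fig. 1: the relative occupancy values are hypothetical, and the function names `ofd` and `rsoc` and the choice of ten equal-width bins are assumptions made for the example.

```python
def ofd(rel_occ, n_bins=10):
    """Occupancy frequency distribution: number of species per occupancy bin."""
    counts = [0] * n_bins
    for p in rel_occ:
        # Bin i covers [i/n_bins, (i+1)/n_bins); p == 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def rsoc(rel_occ):
    """Ranked species occupancy curve: (rank, relative occupancy) pairs."""
    ranked = sorted(rel_occ, reverse=True)
    return list(enumerate(ranked, start=1))


# Hypothetical metacommunity: a few core species and many satellite species.
rel_occ = [0.95, 0.90, 0.85, 0.40, 0.15, 0.10, 0.10, 0.05, 0.05, 0.05]

ofd_counts = ofd(rel_occ)    # histogram view: depends on the bin width
rsoc_points = rsoc(rel_occ)  # ranked view: keeps every species' value
```

The histogram collapses species into bins, whereas the ranked curve retains one point per species, which is why the same data can be re-analysed from an RSOC but not from a published OFD.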

Table 1.  Ranked species occupancy curves (RSOCs), hierarchically listed by model family and equation. Each potential RSOC equation corresponds to a previously described occupancy frequency distribution (OFD), listed here by three major papers on OFDs.

Each RSOC can be fitted to one of four standard model families (exponential, normal, sigmoidal, linear) that correspond to described OFDs (Table 1, Fig. 1). Finally, specific regression models within or among model families can be compared using multi-model inference (Anderson et al., 2000). I emphasize comparisons of model families here, though sigmoidal models are analysed as either symmetric or asymmetric to match OFDs (Table 1, Fig. 1).

Multi-model inference of RSOC regressions permits definitive tests of alternative OFD-based hypotheses. For example, OFDs at scales less than those of species geographic ranges are expected to be right-skewed (i.e. many species are infrequently observed), whereas bimodal OFDs are expected at the scale of species geographic ranges (McGeoch & Gaston, 2002; Hui & McGeoch, 2007a; Heino, 2008). I tested for the artefactual effects of spatial scale (spatial grain and extent; Wiens, 1989) and study scale (number of species, number of sites; Jenkins, 2006) on RSOCs. In addition to artefactual scale effects (Wilson, 2008), species-specific biological mechanisms (e.g. metapopulation processes, niche-related factors, dispersal ability) have also been invoked to explain OFDs (e.g. Hanski, 1982; Brown, 1984; Nee et al., 1991; Tokeshi, 1992; Collins & Glenn, 1997; van Rensburg et al., 2000; McGeoch & Gaston, 2002; Hui & McGeoch, 2007a,b). However, to invoke a biological mechanism for an OFD or RSOC is to assume that a species-specific effect (e.g. rapid dispersal) is consistent across all species in the metacommunity (i.e. multiple communities linked by dispersal; Leibold et al., 2004). I treat a metacommunity here as a sample of a regional community, and I consider the assumption that species of a metacommunity are similar for a trait (e.g. dispersal) to be at odds with an analysis that compares diverse species with diverse dispersal potentials and life histories.

Methods


I collected 24 data sets from the peer-reviewed scientific literature and publicly available web sites (North American Breeding Bird Survey and Konza Prairie, KS, USA) to represent a variety of taxa, spatial scales and study scales (Table 2). Three data sets were split and then analysed to account for inherent differences in data. The data of Snodgrass et al. (2000) consisted of markedly different taxa (fish and amphibians) with different distributions in the study system (Table 2). Likewise, endemic plants were analysed separately from alien plants in the serpentine soils data set of Harrison (1999), and data for primary forests were analysed separately from data for secondary (i.e. former agriculture) forests in the data set of Vellend (2004). Of the 25 OFD studies reviewed by McGeoch & Gaston (2002), only four provided complete occupancy data in a form that could be extracted per species and thus enable comparison with McGeoch & Gaston's review (Table 2) – a problem that confirmed the need for a more information-rich (Tufte, 2001) approach.

Table 2.  Analysed data sets, sorted in increasing order of number of species. Spatial grain is the size of the sample (e.g. plot width) and spatial extent is the greatest distance between sample sites (Wiens, 1989). Grain and extent were estimated from information in cited papers or personal contact with study authors.
No.   Study system   Spatial grain (km)   Spatial extent (km)   No. of species   No. of sites   Data source
  • * Also discussed in McGeoch & Gaston (2002). No. 1 was identified as a bimodal, core-dominated OFD, no. 6 as a bimodal OFD, no. 10 as either unimodal or lognormal and no. 13 as unimodal. As in Table 1 (above), these should correspond to: no. 1 sigmoidal, no. 6 sigmoidal, no. 10 either exponential or lognormal, no. 13 exponential, respectively.

  • The most inclusive 1988 data were analysed here.

  • Table 1 data were analysed here.

  • § Konza Prairie LTER Data Set PVC02, David C. Hartnett, principal investigator. Data analysed were for transects A–D in 2007, all watersheds, all soil types, all treatments. Data obtained 23 July 2008 from http://www.konza.ksu.edu/datasets/knzdsdetail.aspx?currMenu=0&datasetcode=PVC02

  • The North American Breeding Bird Survey's Cfifty1 to Cfifty10 data sets for the year 2000 were analysed. Data represent all of Canada and the USA except Hawaii. Data obtained 2 May 2008 from ftp://ftpext.usgs.gov/pub/er/md/laurel/BBS/DataFiles/

 1Bumblebees in montane meadows0.15501112 Durrer & Schmid-Hempel (1995) *
 2Fishes in depressional wetlands0.130126 Snodgrass et al. (2000)
 3Monogenean parasites of Labeo coubie0.000215001335 Guégan & Hugueny (1994)
 4Small mammals in forest fragments0.015161535 Nupp & Swihart (2000)
 5Fishes in beaver-influenced streams0.1121623 Schlosser & Kallemeyn (2000)
 6Helminth parasites of Sorex araneus0.0001217114 Haukisalmi & Henttonen (1993) *
 7Cladocera in new ponds0.0000552002025 Louette & De Meester (2005)
 8Plants in urban fragments0.052722423 Bastin & Thomas (1999)
 9Amphibians in depressional wetlands0.1302522 Snodgrass et al. (2000)
10Bumblebees in northern Spain0.12202827 Obeso (1992) *
11Endemic plants in serpentine landscape0.015452924 Harrison (1999)
12Alien plants in serpentine landscape0.015453324 Harrison (1999)
13Helminth parasites of Hydromys chrysogaster0.000415504534 Smales & Cribb (1997) *
14Herbaceous plants in secondary forest5.627.54910 Vellend (2004)
15Ants in coastal southern California0.02434940 Suarez et al. (1998)
16Cumulative zooplankton in new ponds0.010.1126112 Jenkins & Buikema (1998)
17Macroinvertebrates in alpine ponds0.011.66225 Oertli et al. (2008)
18Plants in US remnant prairies0.000527006463 Diamond & Smeins (1988)
19Herbaceous plants in primary forests5.627.57017 Vellend (2004)
20Pteridophytes in Amazonian rainforest0.16008428 Tuomisto & Poulsen (1996)
21Trichoptera in Danish streams0.05435109157 Wiberg-Larsen et al. (2000)
22Mammals in Europe5045001192183 Heikinheimo et al. (2007)
23Plants in Konza Prairie0.003551651040 Hartnett PVC02 (2007) §
24Breeding birds in USA and Canada39.471415905114Breeding Bird Survey (2000)

Each data set (species listed in rows and sites listed in columns) was handled as described below. Steps 1–4 can be followed for any given data set, while step 5 compares data sets and evaluates the effects of other variables.

  1. Count the number of study sites in which each species was observed (occupancy). Calculate relative occupancy per species as (occupancy/total number of sites in a data set). This step ensures equitable comparisons among data sets; absolute occupancy may be suitable for analyses within any one data set. All analyses described below are based on relative occupancy.
  2. Sort species by relative occupancy values, in decreasing order. The species observed in the most sites is ranked first and the species observed in fewest sites is ranked last. The plot of species relative occupancy as a function of rank is an RSOC.
  3. Compute nonlinear and linear regressions of RSOCs (see Table 1 for equations). Note that the primary goal for regressions here was to determine which model family (i.e. exponential, sigmoidal, normal or linear) fits best. For example, a concave exponential equation may be expressed in different forms that vary slightly in fit, but this finer point is secondary to first discovering that a concave exponential is better than another family of models. Nonlinear regressions used the Levenberg–Marquardt algorithm (< 999 iterations) and both linear and nonlinear regressions were computed by ordinary least squares (OLS) in SPSS v.16. Because OLS regressions assume normality, homogeneity of variance and independent error terms, residuals were also evaluated graphically (Quinn & Keough, 2002).
  4. Compute the Akaike information criterion (AIC) scores and weights (wi; Anderson et al., 2000) for each regression equation on each data set. The regression equation with the greatest wi represents the model that best retained information, adjusted for the number of parameters in the equation (range = 2–4 for equations in Table 1).
  5. Evaluate curve families and models for differences in spatial scale (i.e. grain, extent; Wiens, 1989), study scale (number of species, number of sites; Jenkins, 2006), taxonomic groups and dispersal modes. Grain was estimated for each metacommunity as the linear distance of a sample unit (e.g. transect length, plot diameter), and extent was estimated as the greatest distance among sample sites, based on information in the analysed papers or other sources. Curve families and models were compared for differences in log10(grain), log10(extent), log10(number of species) and log10(number of sites) by analysis of variance (ANOVA); log-transformed data met parametric assumptions. Two of the 24 data sets were not clearly identified as fitting one model, so these two data sets were omitted from ANOVAs for clarity. The distributions of curve families and models among coarse taxonomic groups (plants, invertebrates, vertebrates) and dispersal modes (active or passive) were tested by χ2 analyses. Statistical tests were computed with SPSS v.18.
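
Steps 1 and 2 above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the analyses (which were run in SPSS): the presence/absence matrix, the species labels and the function name `ranked_occupancy` are all hypothetical.

```python
def ranked_occupancy(matrix, species):
    """Build an RSOC from a species-by-sites presence/absence matrix.

    matrix  -- list of rows, one per species; entries are 1 (present) or 0 (absent)
    species -- list of species labels, in the same order as the rows
    Returns a list of (rank, species, relative occupancy) tuples.
    """
    n_sites = len(matrix[0])
    # Step 1: occupancy = number of occupied sites; divide by the number of sites.
    rel = [(sp, sum(row) / n_sites) for sp, row in zip(species, matrix)]
    # Step 2: sort in decreasing order of relative occupancy.
    # sorted() is stable, so tied species keep their input order (an arbitrary choice).
    ranked = sorted(rel, key=lambda pair: pair[1], reverse=True)
    return [(rank, sp, occ) for rank, (sp, occ) in enumerate(ranked, start=1)]


# Hypothetical data: four species (rows) observed across five sites (columns).
species = ["sp_a", "sp_b", "sp_c", "sp_d"]
matrix = [
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
]

curve = ranked_occupancy(matrix, species)
for rank, sp, occ in curve:
    print(rank, sp, occ)
```

Because species labels travel with the ranked values, the output can be reused for species-level comparisons, which a binned OFD does not allow.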


Results

Though the 24 data sets were quite diverse (Table 2), only exponential and sigmoidal RSOC models were observed (Table 3). Linear and lognormal models never represented data more effectively than exponential or sigmoidal models. Twenty-two of the 24 analysed data sets were clearly classified as either exponential (10 RSOCs) or sigmoidal (12 RSOCs); two of the 24 data sets were identified as being either an exponential or a sigmoidal RSOC (Table 3). Thus, about half of the data sets (sigmoidal RSOCs) were consistent with a bimodal OFD largely composed of core and satellite species, but about half of the data sets were consistent with a unimodal-satellite OFD, largely composed of satellite species. Two of the four data sets also analysed by McGeoch & Gaston (2002) did not correspond to their designated corresponding OFD shape (Table 3).

Table 3.  Results summary for species occupancy distribution analyses. Numbered data sets are those listed in Table 2. All possible models (see Table 1) were analysed for each data set, but only models identified as retaining the most information by Akaike information criterion (AIC) are listed here. R2 is the coefficient of determination from nonlinear regression models, AICc is the corrected AIC coefficient, and wi is the Akaike weight of that model (Anderson et al., 2000). Values of wi indicate the probability that model i is the best model among those tested.
Model, No.   Study system                                      R2      AICc       wi
  • * This data set is listed in both exponential concave and symmetric sigmoidal based on its AIC wi values.

  • † This data set is listed in both exponential concave and asymmetric sigmoidal based on its AIC wi values.

Exponential concave RSOCs (exponential concave function)
 3   Monogenean parasites of Labeo coubie*             0.908   −59.9      0.373
 7   Cladocera in new ponds                            0.991   −137.2     0.935
 9   Amphibians in depressional wetlands               0.976   −166.5     0.904
10   Bumblebees in northern Spain                      0.983   −196.7     0.885
11   Endemic plants in serpentine landscape            0.965   −172.6     0.705
13   Helminth parasites of Hydromys chrysogaster†      0.964   −392.5     0.492
17   Macroinvertebrates in alpine ponds                0.975   −431.6     1.000
19   Herbaceous plants in primary forests              0.988   −478.7     1.000
20   Pteridophytes in Amazonian rainforest             0.990   −689.2     0.994
21   Trichoptera in Danish streams                     0.979   −805.3     0.999
23   Plants in Konza Prairie                           0.977   −1292.2    1.000
24   Breeding birds in USA and Canada                  0.988   −5508.3    1.000
Asymmetric sigmoidal RSOCs (cumulative Weibull function)
 1   Bumblebees in montane meadows                     0.970   −50.6      0.954
 2   Fishes in depressional wetlands                   0.976   −64.5      0.964
 4   Small mammals in forest fragments                 0.985   −84.4      0.990
 8   Plants in urban fragments                         0.949   −135.1     0.921
12   Alien plants in serpentine landscape              0.965   −204.9     0.969
13   Helminth parasites of Hydromys chrysogaster†      0.959   −392.5     0.492
15   Ants in coastal southern California               0.974   −323.5     1.000
18   Plants in US remnant prairies                     0.973   −436.5     1.000
Symmetric sigmoidal RSOCs (logistic function)
 3   Monogenean parasites of Labeo coubie*             0.907   −59.8      0.347
 5   Fishes in beaver-influenced streams               0.934   −72.0      0.859
 6   Helminth parasites of Sorex araneus               0.980   −108.7     0.895
14   Herbaceous plants in secondary forest             0.989   −328.3     0.992
16   Cumulative zooplankton in new ponds               0.990   −401.6     1.000
22   Mammals in Europe                                 0.99    −925.3     1.000
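
The AICc values and Akaike weights reported in Table 3 follow the general approach of Anderson et al. (2000). The sketch below shows one common least-squares formulation, AIC = n ln(RSS/n) + 2K with the small-sample correction 2K(K + 1)/(n − K − 1); whether this exact form, and this way of counting K, matches the SPSS-based workflow is an assumption. The residual sums of squares, sample size and parameter counts are hypothetical, and the function names are illustrative only.

```python
import math


def aicc_least_squares(rss, n, k):
    """AICc for a least-squares fit.

    rss -- residual sum of squares of the fitted model
    n   -- number of data points (here, species in the RSOC)
    k   -- number of estimated parameters (the counting convention is an assumption)
    """
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Convert AICc scores to Akaike weights that sum to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical fits of three candidate models to one RSOC of 30 species:
# model name -> (residual sum of squares, number of parameters)
n_species = 30
candidates = {"exponential": (0.020, 3), "logistic": (0.035, 4), "linear": (0.150, 3)}

scores = {name: aicc_least_squares(rss, n_species, k)
          for name, (rss, k) in candidates.items()}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))

# Two models with identical AICc receive identical weights.
tied = akaike_weights([-392.5, -392.5])
```

The tied case is consistent with data set 13 in Table 3, where the exponential concave and asymmetric sigmoidal models share the same AICc and the same weight; in Table 3 each weight is slightly below 0.5 because the weights were computed across all candidate models.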

Within the exponential family of models, only concave RSOCs were observed; all 10 exponential data sets had a satellite mode. Sigmoidal models were divided into asymmetric (eight data sets) and symmetric distributions (six data sets; Table 3). Symmetry was strictly defined because it is required for the logistic equation, whereas asymmetry was typically apparent for data that were fitted best by the cumulative Weibull model because those data included numerous satellite species (which made a long tail in the distribution).

RSOCs were not significantly different for spatial scale and did not clearly differ for study scale. Data sets with exponential and sigmoidal RSOCs were not significantly different for spatial grain (F1,20 = 0.03; P = 0.85) or spatial extent (F1,20 = 0.78; P = 0.39). Likewise, data sets in the three observed models were not significantly different for spatial grain (F2,19 = 0.37; P = 0.70) or spatial extent (F2,19 = 0.91; P = 0.42). Data sets with exponential RSOCs had marginally more species (F1,20 = 3.9; P = 0.06) than data sets with sigmoidal RSOCs, but this effect was due to the species-rich Breeding Bird Survey data set (the difference disappeared when this one data set was removed from the analysis). In addition, the three models were not significantly different for the number of species (F2,19 = 2.0; P = 0.16). The number of study sites was not significantly different between model families (F1,20 = 0.39; P = 0.53) or among the three models (F2,19 = 0.29; P = 0.74).
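
The degrees of freedom reported above follow from the design: 22 unambiguous data sets in two model families give F1,20, and the same data sets in three models give F2,19. A minimal sketch of the one-way ANOVA F statistic is given below. The log10-transformed values are hypothetical placeholders, not the values from Table 2, and the function name `one_way_anova` is illustrative; the P-value, which requires the F distribution, is omitted.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups -- list of lists of observations, one list per group
    """
    all_vals = [v for g in groups for v in g]
    n = len(all_vals)
    grand_mean = sum(all_vals) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log10(spatial extent) values for 10 exponential and 12 sigmoidal data sets.
exponential = [0.3, 1.5, 2.3, 1.7, 0.2, 2.8, 1.6, 2.6, 3.4, 3.9]
sigmoidal = [1.7, 1.5, 1.2, 1.4, 0.3, 1.6, 1.1, 3.4, 1.3, 0.9, 2.8, 3.7]

f_stat, df_b, df_w = one_way_anova([exponential, sigmoidal])

# Small check with values that can be verified by hand: F = 1.5 with 1 and 4 d.f.
f_demo, df_b_demo, df_w_demo = one_way_anova([[1, 2, 3], [2, 3, 4]])
```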

RSOCs were also not sorted by taxon or dispersal mode. Plants, invertebrates and vertebrates were randomly distributed among model families (P = 0.82, χ2) and models (P = 0.94, χ2), as were active and passive dispersers (model families P = 0.90; equations P = 0.30, both χ2).

Two data sets were split as an example of comparisons that are possible with RSOCs. Endemic and alien species of serpentine soils (Harrison, 1999) had different RSOC curves (Table 3, Fig. 2). Endemic species were best described by an exponential curve while alien species were best described by an asymmetric sigmoidal curve with a long tail. Endemic and alien species were comparable in species richness (29 and 33, respectively) but alien species were more likely to be limited in distribution than endemic species in this system and at the time samples were collected.

Figure 2.

Ranked species occupancy curves (RSOCs) for data of Harrison (1999), where solid circles represent endemic plants and open circles alien plants. Endemic plants were best described by an exponential RSOC, while alien plants were best described by an asymmetric sigmoidal RSOC (also see Table 3). Species codes are the first three letters of the genus plus first three letters of the species name (see Tables 1 & 2 in Harrison, 1999, for full names).

Primary and secondary forest herbaceous plants (Vellend, 2004) also differed. Primary forest vegetation was best described by an exponential curve while secondary forest vegetation was best described by a symmetrical sigmoidal curve (Table 3, Fig. 3). Herbaceous vegetation in secondary forests had fewer total species and the RSOC declined more rapidly than in primary forest because multiple species were more narrowly distributed. In addition, many species changed position in their occupancy rank and relative occupancy values (Fig. 3). Mean change in relative occupancy was –0.14 (i.e. on average, forest herbs occupied 14% fewer sites in secondary forests than in primary forests), and 23 species may have been extirpated in secondary forests as a result of land-use practices (Fig. 3).
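
Species-level comparisons of this kind are straightforward once relative occupancy is stored per species. The sketch below is an illustration only: the species codes and occupancy values are hypothetical, and the mean change is averaged over all species present in the first data set, which is an assumption because the averaging rule is not stated in the text.

```python
def occupancy_change(before, after):
    """Compare relative occupancy per species between two data sets.

    before, after -- dicts mapping species code to relative occupancy;
                     a species missing from `after` is treated as occupancy 0.
    Returns (per-species change, mean change, species lost, species gained).
    """
    change = {sp: after.get(sp, 0.0) - occ for sp, occ in before.items()}
    lost = sorted(sp for sp in before if after.get(sp, 0.0) == 0.0)
    gained = sorted(sp for sp, occ in after.items() if sp not in before and occ > 0)
    mean_change = sum(change.values()) / len(change)
    return change, mean_change, lost, gained


# Hypothetical relative occupancies in two sets of sites.
primary = {"sp1": 0.8, "sp2": 0.5, "sp3": 0.25}
secondary = {"sp1": 0.5, "sp3": 0.25, "sp4": 0.125}

change, mean_change, lost, gained = occupancy_change(primary, secondary)
```

Here `lost` corresponds to species marked as absent from the second data set and `gained` to apparent colonists, in the manner of the letters a and c in Fig. 3.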

Figure 3.

Ranked species occupancy curves (RSOCs) for data of Vellend (2004); solid circles represent primary forest herbaceous vegetation and were best described by an exponential RSOC, open circles are secondary forest herbaceous vegetation and were best described by a symmetric sigmoidal RSOC (also see Table 3). Solid grey arrows track selected species that increased in rank and/or relative occupancy between primary and secondary forests, whereas dotted grey arrows track selected species that decreased in rank and/or relative occupancy. Mean change in relative occupancy was –0.14 between primary and secondary forests. Species present in primary forest but absent in secondary forests are indicated with a letter a, and two species that apparently colonized secondary forests are indicated with a letter c. Species codes are first three letters of the genus plus the first three letters of the species name (see Vellend, 2004, Appendix C for full names).

Discussion


Only two general shapes of RSOCs described 24 very different metacommunities. However, the consistent patterns were not related to spatial scale as expected from hypotheses based on species-specific processes and that derive from OFDs. Processes operating at metacommunity scales (Table 4) may be required to explain the common patterns observed among diverse metacommunities; a set of three hypotheses is proposed below. More practically, analysing RSOCs by regression and multi-model inference is simple, adds to the existing method for analysing OFDs (Tokeshi, 1992) and readily permits a more information-rich (Tufte, 2001) outcome (e.g. Fig. 3).

Table 4.  Hypothesized ecological mechanisms for occupancy frequency distributions (OFDs), which correspond to ranked species occupancy curves (RSOCs). Species-specific hypotheses have already been published, and some have been the subject of much research. Metacommunity hypotheses are novel to this work.

The RSOC regression approach used here accurately described 22 of 24 (92%) occupancy data sets, and the modest ambiguity observed for the remaining two data sets greatly simplified OFD-based possibilities. The results obtained here are consistent with the observation that two OFD shapes are most common (McGeoch & Gaston, 2002). Beyond that simplicity, two sigmoidal RSOC forms were detected, which helps delineate differences among data sets.

Spatial scale is expected to be important for OFD shape (McGeoch & Gaston, 2002; Hui & McGeoch, 2007a). Translated to RSOCs, larger spatial scales should obtain exponential RSOCs and smaller spatial scales should obtain sigmoidal RSOCs. The data sets analysed here varied widely in spatial scale and study scale (Table 2) but exponential and sigmoidal RSOCs were not significantly different for grain or extent (Wiens, 1989). Spatial scale depends in part on organismal dispersal scales (Collins & Glenn, 1997), which must differ widely in a diverse regional community (sensu Ricklefs, 2008) and are often not well quantified. Thus, heterogeneity in regional community dispersal scales may nullify scale effects expected by OFD theory and based on species-specific mechanisms. In addition, investigators already scale their studies to their study system; the spatial grain of migratory bird occupancy data necessarily exceeds that of data for intestinal parasites. Regardless, the fact that spatial scale and study scale did not differ among RSOCs indicates that RSOC analyses apply broadly to diverse study systems.

Three RSOC models in two model families were observed, but linear, convex and lognormal models were not observed here. A linear model is consistent with uniform or random OFDs. Given the diversity of data sets analysed here, linear RSOCs may be rarely observed, and then may be transient, such as early in the colonization history of similar habitats. A convex model is consistent with a metacommunity that approaches a single assemblage dominated by multiple strong dispersers (Collins & Glenn, 1997; Kneitel & Miller, 2003). The absence of convex models among diverse data sets analysed here suggests that dispersal limitation and/or species sorting processes (Leibold et al., 2004) are prevalent in multiple metacommunities. Convex RSOCs may also occur if distributions are actually sigmoidal RSOCs with truncated tails: that this was not the case indicates that the analysed data sets were sufficiently sampled to detect rare species in RSOC tails. Conversely, a convex RSOC observed in the future would indicate that the sampling regime be evaluated for its potential inability to detect rare species. Finally, the absence of lognormal distributions here indicates that no metacommunity was composed of species roughly similar in occupancy. Thus RSOCs would appear to reflect diverse historical and deterministic processes (Ricklefs & Schluter, 1993), operating at multiple spatial and temporal scales on multiple species. This outcome does not seem consistent with neutral theory (Hubbell, 2001), though neutral theory does not explicitly address occupancy patterns.

Two data sets (Harrison, 1999, and Vellend, 2004) were dissected for details. The difference between endemic and alien plant RSOCs in serpentine soils (Harrison, 1999) is consistent with an ongoing alien invasion, especially because endemics did not resist invasion (Harrison et al., 2006). In the Vellend (2004) data set, logging and agriculture apparently extirpated multiple populations of herbaceous forest species and fragmented the remaining populations in subsequent secondary forests. This result suggests long-lasting effects on seed banks and/or growth conditions in secondary forests for herbaceous plants.

That only exponential and sigmoidal (asymmetric or symmetric) RSOCs were observed among diverse data sets begs explanation. Spatial scale has been hypothesized to drive OFD patterns (van Rensburg et al., 2000; McGeoch & Gaston, 2002; Hui & McGeoch, 2007a; Heino, 2008) but was not a significant factor in RSOC analyses here. All ecological hypotheses on OFDs to date have focused on species-specific mechanisms (e.g. niche-based factors and dispersal abilities). However, an OFD or RSOC is an aggregate, emerging property of multiple species-specific mechanisms. Alternatively, hypotheses based on processes operating at the regional scales analysed by OFDs and RSOCs may be required to explain metacommunity patterns. Below I briefly outline a set of three metacommunity-scale hypotheses that may explain common patterns among diverse metacommunities. To be clear, observations reported above are not also used to ‘test’ the subsequent hypotheses in order to avoid post-hoc retrofitting and circular reasoning. Instead, hypotheses will best be tested with additional analyses of systems with known successional and disturbance regimes. Finally, the hypotheses proposed here represent alternative, conditional models that can serve as the basis for highly rigorous inference in subsequent tests (McGill et al., 2006).

Metacommunity hypotheses for RSOC patterns

Three hypotheses represent major stages in a successional sequence or disturbance gradient (Fig. 4) in which RSOCs shift from exponential (E1) to sigmoidal (S) and back to exponential (E2). Hereafter the entire series is referred to as ESE, which itself is an overall hypothesis. The ESE hypotheses build on rich histories of research on succession and disturbance (e.g. Pickett & White, 1986; Glenn-Lewin et al., 1992; Mackey & Currie, 2001; Svensson et al., 2009). Succession and disturbance are interdependent in time, but succession is a temporal process, while persistent disturbance gradients can also be observed in space. Thus, the ESE series may be considered a successional sequence or a spatial disturbance gradient for a metacommunity. Disturbance in the ESE gradient is related to the intermediate disturbance hypothesis (Connell, 1978) which has been much-discussed but partially supported (Mackey & Currie, 2001), in part because ‘disturbance’ can have various meanings (Svensson et al., 2009). Disturbance is defined here as an exogenous change in conditions which alters the occupancy of at least some species in a metacommunity, where spatial extent is standardized to the spatial extent of the metacommunity and so frequency (disturbance events per unit time) and severity (Σ|change in occupancy|) of the disturbance are important. The hypotheses below should apply most easily to synchronous succession and consistent disturbance across a metacommunity – heterogeneous succession and disturbance should promote the chance of E1 and S RSOCs and reduce the chance of E2 RSOCs.

Figure 4. Hypotheses to explain exponential and sigmoidal RSOCs among diverse metacommunities. In metacommunity succession and disturbance regimes, the exponential RSOC (E1) shifts to a sigmoidal RSOC (S) and then back to a second exponential RSOC (E2). Within the sigmoidal RSOC lies a transition from an asymmetric sigmoidal RSOC (dashed line) toward the symmetric RSOC (solid line, if attained), and then back again to an asymmetric sigmoidal RSOC. See text for further explanation.

An RSOC pattern alone cannot diagnose the mix of regional and local processes in a metacommunity because: (1) pattern does not prove process (Cale et al., 1989) and (2) the ESE transition involves different mechanisms to obtain an exponential RSOC before (E1) and after (E2) a sigmoidal (S) RSOC in metacommunity succession. Teasing apart the processes requires detailed analyses of information beyond the summary RSOC pattern. For example, a species in the RSOC tail may be dispersal limited (E1) or it may be a fugitive species (Horn & MacArthur, 1972) with distributions driven by competitive interactions (E2).

Hypothesis E1: recruitment limitation early in succession or strong/frequent disturbance causes an exponential RSOC

A metacommunity may be recruitment limited if spatial isolation of habitats exceeds organismal dispersal distances (Collins & Glenn, 1997) or if the metacommunity is early in succession. Recruitment limitation may also occur with large-scale disturbance, including the cumulative effects of multiple smaller-scale events (e.g. human land use; McKinney, 2002). Severe and/or frequent disturbance that is roughly consistent across the metacommunity should extirpate some species from some habitats and inhibit their recolonization of other habitats. Other species may resist or be resilient to the frequent and/or intense disturbance across multiple habitats. The result is an exponential RSOC (Fig. 4), where resistant/resilient species are most prevalent. If all species potentially access all habitats of the metacommunity, then any one assemblage represents a random sample of the disturbance regime and the metacommunity species pool. In that case, the relative occupancy of a species represents the probability that the species will be sampled in any one assemblage. A local assemblage ‘sample’ from a metacommunity's exponential concave RSOC will therefore obtain relatively low species richness because most species are regionally rare.

Hypothesis S: a mixture of local niche limitation and regional recruitment limitation at intermediate successional time and/or disturbance causes a sigmoidal RSOC

In this case, recruitment rates are sufficient through time and space to permit more species to colonize more habitats. This hypothesis assumes an intermediate successional interval and thus intermediate disturbance frequency, and assumes that metacommunity processes (e.g. species sorting; Leibold et al., 2004) have become more important. The metacommunity thus shifts from regional recruitment limitation (E1) through a mixture of local niche-based processes and neutral processes that may typify some metacommunities (Gravel et al., 2006) toward local niche-based processes as primary drivers of occupancy patterns (i.e. a quorum effect; Jenkins & Buikema, 1998; Jenkins, 2006). An asymmetric (long tail) sigmoidal RSOC is hypothesized here as transitional between exponential and symmetric sigmoidal RSOCs (Fig. 4) and can range from nearly exponential to nearly symmetric to represent a broad band of transitional RSOC shapes. A symmetric sigmoidal RSOC (Fig. 4) represents an upper limit to an occupancy distribution for a metacommunity, given that an exponential convex RSOC would indicate a single assemblage rather than a metacommunity (Kneitel & Miller, 2003). An asymmetric sigmoidal RSOC may be obtained if keystone predation is consistently mitigated to a narrow sphere of influence (Brose et al., 2005) in trophic networks across the metacommunity. Metacommunities with predator-mediated competitive coexistence or strong habitat heterogeneity should obtain a symmetric sigmoidal RSOC, given that these factors enhance the probability that more species can occupy more metacommunity habitats. A local sample from a sigmoidal RSOC will obtain greater species richness than one from an exponential RSOC because more species are regionally common enough to be ‘sampled’ by the local habitat.
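The local-sampling argument can be made concrete with a short sketch. If relative occupancy is read as the probability that a species is present in any one assemblage, the expected local richness is the sum of relative occupancies across species. The curve forms, parameter values and function names below are illustrative assumptions only; they are not the regression models fitted in this study.

```python
import math


def exponential_rsoc(n_species, decay=0.15):
    """Relative occupancy declining exponentially with rank (assumed form)."""
    return [math.exp(-decay * rank) for rank in range(1, n_species + 1)]


def sigmoidal_rsoc(n_species, slope=0.3):
    """Symmetric sigmoidal (logistic) decline centred on the middle rank."""
    midpoint = (n_species + 1) / 2
    return [1.0 / (1.0 + math.exp(slope * (rank - midpoint)))
            for rank in range(1, n_species + 1)]


def expected_local_richness(relative_occupancy):
    """Expected species count at one site: sum of per-species probabilities."""
    return sum(relative_occupancy)


n = 40  # hypothetical size of the metacommunity species list
richness_exponential = expected_local_richness(exponential_rsoc(n))
richness_sigmoidal = expected_local_richness(sigmoidal_rsoc(n))
```

With these illustrative parameters the sigmoidal curve gives the larger expected richness, because its upper ranks stay near full occupancy before declining, whereas most species on the exponential curve are regionally rare.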

Hypothesis E2: strong niche-based interactions late in succession and/or with weak/infrequent disturbance cause an exponential RSOC

Hypothesis E2 relies on niche-based metacommunity theory, especially species-sorting processes such as competition and predation (Chase & Leibold, 2003; Leibold et al., 2004). It should apply to metacommunities with little or no trophic cascade driven by specialist predators; otherwise, predator-mediated coexistence among competitors throughout the metacommunity should contribute to a sigmoidal RSOC. Hypothesis E2 is also expected with strong, wide-ranging generalist predators and prey species that either resist (high occupancy) or are vulnerable (low occupancy) to predation. As for hypothesis E1, a local sample from an E2 RSOC again obtains low species richness, but importantly, accumulated knowledge of species listed in the RSOC can diagnose E1 versus E2. For example, slow-dispersing, competitively dominant species that are ranked highly in an exponential RSOC would be consistent with hypothesis E2 but not hypothesis E1. If habitats are generally similar throughout the metacommunity, then more highly ranked competitors (Keddy & Shipley, 1989) should occupy more sites. Otherwise, habitat heterogeneity should disrupt a correlation between competitive rank and occupancy as well as the transition from a symmetric sigmoidal RSOC to asymmetric sigmoidal and then exponential RSOCs (Fig. 4).


Regressions of RSOCs improve on OFDs by retaining more detail while providing more definitive analyses of occupancy patterns. Thus, RSOCs should be useful in ecology and biogeography, including palaeontological and microbiological studies that often rely on occupancy data (e.g. Noguez et al., 2005; Foote et al., 2007). Outcomes of RSOC regressions reveal surprising commonality among very different metacommunities, but OFD-based expectations that spatial scale causes such commonality were not supported. Species-specific hypotheses already proposed to explain OFD patterns may not apply to metacommunities composed of diverse species; instead, metacommunity-scale hypotheses may be needed to explain commonality among metacommunities in occupancy distributions. Three hypotheses based on metacommunity succession and disturbance are proposed. Going forward, tests of these hypotheses will require metacommunities with documented successional and disturbance regimes, which should be readily available.
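As an illustration of how alternative RSOC shapes might be compared, the sketch below fits one exponential and one sigmoidal curve to a hypothetical RSOC and compares them with a least-squares form of the Akaike information criterion. Everything here is an assumption made for illustration: the two functional forms, the coarse grid search (standing in for a proper nonlinear regression routine), the Gaussian-error form of AIC and the example occupancies. It is not the set of six models or the fitting procedure used in this study.

```python
import math


def fit_scale(y, shape):
    """Closed-form least-squares amplitude a for y ~ a * shape; returns (a, RSS)."""
    a = sum(yi * si for yi, si in zip(y, shape)) / sum(si * si for si in shape)
    rss = sum((yi - a * si) ** 2 for yi, si in zip(y, shape))
    return a, rss


def aic(rss, n, n_params):
    """Least-squares AIC, n*ln(RSS/n) + 2k; k adds one for the error variance."""
    return n * math.log(rss / n) + 2 * (n_params + 1)


def aic_exponential(ranks, y, decays):
    """Grid search over decay rate b in y = a * exp(-b * rank)."""
    best_rss = min(fit_scale(y, [math.exp(-b * r) for r in ranks])[1]
                   for b in decays)
    return aic(best_rss, len(y), n_params=2)


def aic_sigmoidal(ranks, y, slopes, midpoints):
    """Grid search over b and c in y = a / (1 + exp(b * (rank - c)))."""
    best_rss = min(
        fit_scale(y, [1.0 / (1.0 + math.exp(b * (r - c))) for r in ranks])[1]
        for b in slopes for c in midpoints
    )
    return aic(best_rss, len(y), n_params=3)


# Hypothetical relative occupancies for 12 ranked species (sigmoidal in shape).
occupancy = [0.98, 0.95, 0.93, 0.88, 0.80, 0.65,
             0.45, 0.28, 0.15, 0.08, 0.05, 0.02]
ranks = list(range(1, len(occupancy) + 1))

aic_exp = aic_exponential(ranks, occupancy, [i / 100 for i in range(1, 101)])
aic_sig = aic_sigmoidal(ranks, occupancy,
                        [i / 20 for i in range(1, 61)],
                        [i / 4 for i in range(4, 49)])
best_model = "sigmoidal" if aic_sig < aic_exp else "exponential"
```

The lower AIC identifies the better-supported curve after penalizing the extra parameter of the sigmoidal form; for these example data the sigmoidal curve is selected.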

Regressions of RSOCs should be useful for multiple approaches in ecology and biogeography. For example, RSOC regressions provide a summary analysis to accompany detailed R-mode analyses of occupancy patterns based on co-occurrence (Gotelli & McCabe, 2002) and nestedness (Rodriguez-Gironés & Santamaria, 2006). RSOC regressions are also consistent with macroecology (Brown, 1995), metacommunity ecology (Holyoak et al., 2005) and a regional community concept (Ricklefs, 2008). Ricklefs (2008) discussed a regional community concept in which listed species occupy an average length v of an ecological gradient V. The probability v/V that a species occupies a site is the same as the relative occupancy analysed here.
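Relative occupancy and its ranking are simple to compute from incidence data, as the following minimal sketch shows. It assumes a presence/absence record for each species across a common set of sites; the function name, data layout and example values are hypothetical.

```python
def ranked_occupancy(presence):
    """Return (species, relative occupancy) pairs, most widespread first.

    `presence` maps species name -> list of 0/1 values, one per site
    (an assumed layout for a sites-by-species incidence table).
    """
    curve = []
    for species, sites in presence.items():
        relative = sum(sites) / len(sites)   # fraction of sites occupied
        curve.append((species, relative))
    # Sort by decreasing occupancy; ties are broken by species name.
    curve.sort(key=lambda pair: (-pair[1], pair[0]))
    return curve


# Hypothetical incidence data: four species surveyed at five sites.
presence = {
    "sp_a": [1, 1, 1, 1, 1],
    "sp_b": [1, 0, 1, 0, 0],
    "sp_c": [1, 1, 1, 0, 1],
    "sp_d": [0, 0, 0, 1, 0],
}
rsoc = ranked_occupancy(presence)
```

Plotting the second element of each pair against its rank gives the RSOC; each value is the proportion of sites occupied, which is the quantity equated with v/V above.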


This work was improved by comments from Kim Medley, Pedro Quintana-Ascencio, Steve DeClerck, Joost Vanoverbeke, Brian McGill, David Currie and anonymous referees.


Dave Jenkins is interested in too many topics for his own good, but one theme is the analysis of pattern to indicate the relative roles of regional and local processes in ecology.

Editor: Brian McGill