betapart: an R package for the study of beta diversity


  • Andrés Baselga,

    Corresponding author
    1. Departamento de Zoología, Facultad de Biología, Universidad de Santiago de Compostela, c/Lope Gómez de Marzoa s/n, 15782 Santiago de Compostela, Spain
  • C. David L. Orme

    1. Division of Biology, Department of Life Sciences, Silwood Park, Ascot, Berkshire SL5 7PY, UK

Correspondence author. E-mail:


1. Beta diversity, that is, the variation in species composition among sites, can be the result of species replacement between sites (turnover) and species loss from site to site (nestedness).

2. We present betapart, an R package for computing total dissimilarity as Sørensen or Jaccard indices, as well as their respective turnover and nestedness components.

3. betapart allows the assessment of spatial patterns of beta diversity using multiple-site dissimilarity measures accounting for compositional heterogeneity across several sites or pairwise measures providing distance matrices accounting for the multivariate structure of dissimilarity.

4. betapart also allows computing patterns of temporal difference in assemblage composition, and its turnover and nestedness components.

5. Several example analyses are shown, using the data included in the package, to illustrate the relevance of separating the turnover and nestedness components of beta diversity to infer different mechanisms behind biodiversity patterns.


The term ‘beta diversity’ is applied in a broad sense to any measure of variation in species composition (Anderson et al. 2011). In the narrowest sense, it is the simple ratio between gamma and alpha diversities (Jost 2007; Tuomisto 2010; Jurasinski & Koch 2011), which only differs from 1 when local sites differ in species composition. A wide range of broader measures exists (see Anderson et al. 2011), including measures of differentiation and proportional diversity (Jurasinski, Retzer & Beierkuhnlein 2009; Jurasinski & Koch 2011), but all broadly aim at providing a measure of the difference between the assemblages present at each site, taking into account the identities of all species. This last characteristic makes beta diversity studies complementary to analyses of the variation in species richness, which ignore species identity. Therefore, compared to species richness, the analysis of beta diversity allows testing of different hypotheses about the processes driving species distributions and biodiversity.

The concept of ‘change in species composition’ or the question ‘how different are two species assemblages’ may seem straightforward, but, as argued elsewhere (Baselga 2007, 2010, 2012; Baselga, Jiménez-Valverde & Niccolini 2007), there are two potential ways in which two species assemblages can be ‘different’. One is species replacement (i.e. turnover), which consists of the substitution of species in one site by different species in the other site. The second way is species loss (or gain), which implies the elimination (or addition) of species in only one of the sites, and leads to the poorest assemblage being a strict subset of the richest one (a pattern called nestedness). Therefore, the selection of the dissimilarity measure used to quantify the differences between assemblages can be crucial, because different dissimilarity indices account for the two phenomena in different ways. For example, the strict sense definition of beta diversity (the ratio of gamma and alpha diversity: Whittaker 1960; Tuomisto 2010) yields a measure that accounts for turnover and nestedness as being equivalent, as both turnover and nested patterns make alpha diversity lower than gamma diversity. The same applies to the widely used Jaccard and Sørensen indices, which are monotonic transformations of gamma/alpha (Jost 2007; Chao, Chiu & Hsieh 2012). In contrast, the Simpson index of dissimilarity (Simpson 1943; Lennon et al. 2001) accounts only for turnover (species replacement), and building on this, Baselga (2010, 2012) proposed a method for partitioning total dissimilarity (i.e. Sørensen and Jaccard indices, both monotonic transformations of beta diversity) into two separate components accounting for the dissimilarity derived solely from turnover and the dissimilarity derived from nestedness.
The two decompositions for a single pair of cells are shown below for the Sørensen (eqn 1) and Jaccard (eqn 2) indices, where a is the number of shared species between two cells, b the number of species unique to the poorest site and c the number of species unique to the richest site.

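$$\beta_{sor} = \beta_{sim} + \beta_{sne} \equiv \frac{b + c}{2a + b + c} = \frac{b}{b + a} + \left( \frac{c - b}{2a + b + c} \right) \left( \frac{a}{b + a} \right)$$ (eqn 1)

$$\beta_{jac} = \beta_{jtu} + \beta_{jne} \equiv \frac{b + c}{a + b + c} = \frac{2b}{2b + a} + \left( \frac{c - b}{a + b + c} \right) \left( \frac{a}{2b + a} \right)$$ (eqn 2)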

where βsor is Sørensen dissimilarity, βsim is Simpson dissimilarity (= turnover component of Sørensen dissimilarity), βsne is the nestedness component of Sørensen dissimilarity, βjac is Jaccard dissimilarity, βjtu is the turnover component of Jaccard dissimilarity, and βjne is the nestedness component of Jaccard dissimilarity. Pairwise dissimilarity between all pairs of sites can be used to investigate spatial patterning of turnover and nestedness-resultant dissimilarity. In addition, multiple-site measures of compositional dissimilarity across a set of sites can be calculated by substituting the multiple-site analogues of the values a, b and c into the two equations (see Baselga 2010, 2012). Note that these are the sums across all pairs of sites for the b and c analogues, but not for the shared species a, whose multiple-site analogue is $\sum_i S_i - S_T$, where $S_i$ is the number of species in site i, and $S_T$ is the number of species in the total pool of sites. This component makes the indices real multiple-site measures and not averaged pairwise dissimilarities. We use capital letters to differentiate these multiple-site measures from pairwise measures (Baselga 2012): βSOR = βSIM + βSNE and βJAC = βJTU + βJNE.
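As a worked illustration of the partition described above, the following base R sketch computes the pairwise components for two sites and the multiple-site Sørensen components for a set of sites. It assumes the decompositions take the form given in Baselga (2010, 2012); the helper names `partition_pair` and `partition_multi` are hypothetical and are not functions of the betapart package.

```r
# Hypothetical helpers (not betapart functions) illustrating the partition.
# s1, s2: presence (1) / absence (0) vectors over the same species.
partition_pair <- function(s1, s2) {
  a      <- sum(s1 == 1 & s2 == 1)   # species shared by both sites
  only_1 <- sum(s1 == 1 & s2 == 0)   # species found only in site 1
  only_2 <- sum(s1 == 0 & s2 == 1)   # species found only in site 2
  b  <- min(only_1, only_2)          # unique species of the poorest site
  cc <- max(only_1, only_2)          # unique species of the richest site
  c(sor = (b + cc) / (2 * a + b + cc),
    sim = b / (b + a),
    sne = ((cc - b) / (2 * a + b + cc)) * (a / (b + a)),
    jac = (b + cc) / (a + b + cc),
    jtu = 2 * b / (2 * b + a),
    jne = ((cc - b) / (a + b + cc)) * (a / (2 * b + a)))
}

# x: sites (rows) by species (columns) presence-absence matrix.
partition_multi <- function(x) {
  x <- as.matrix(x)
  pairs <- combn(nrow(x), 2)         # every pair of sites, one per column
  b_sum <- 0
  c_sum <- 0
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    only_i <- sum(x[i, ] == 1 & x[j, ] == 0)
    only_j <- sum(x[i, ] == 0 & x[j, ] == 1)
    b_sum <- b_sum + min(only_i, only_j)   # multiple-site analogue of b
    c_sum <- c_sum + max(only_i, only_j)   # multiple-site analogue of c
  }
  # Multiple-site analogue of a: sum of site richness minus total richness
  a_multi <- sum(rowSums(x)) - sum(colSums(x) > 0)
  sor <- (b_sum + c_sum) / (2 * a_multi + b_sum + c_sum)
  sim <- b_sum / (a_multi + b_sum)
  c(beta.SIM = sim, beta.SNE = sor - sim, beta.SOR = sor)
}

site1 <- c(1, 1, 1, 1, 0, 0, 0, 0)
site2 <- c(1, 1, 0, 0, 1, 1, 1, 1)
pair  <- partition_pair(site1, site2)
multi <- partition_multi(rbind(site1, site2))
nested <- partition_multi(rbind(c(1, 1, 1, 1), c(1, 1, 0, 0), c(1, 0, 0, 0)))
```

For these two sites a = 2, b = 2 and c = 4, so βsor = 0.6 splits into βsim = 0.5 and βsne = 0.1, and the multiple-site values for a two-site set coincide with the pairwise ones. In the perfectly nested three-site set there is no turnover, so all of the dissimilarity is attributed to nestedness. The sketch does not guard against empty sites, for which the ratios are undefined.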

The betapart package: partitioning beta diversity

This application note introduces an R package, betapart, to compute these dissimilarity measures. The package provides two basic analytical functions (beta.multi and beta.pair), which calculate the multiple-site and pairwise partitions of beta diversity. It also provides a function (beta.sample), which uses randomly selected sites to generate a distribution of multiple-site dissimilarity measures for a given number of sites. Finally, the function beta.temp takes two presence–absence matrices of the same sites and species at two different time steps and computes pairwise turnover components over time within each site. In all functions, total dissimilarity can be computed as Sørensen or Jaccard dissimilarity, as both are monotonic transformations of strict sense beta diversity (Jost 2007; Tuomisto 2010).

The betapart package is written entirely in the scientific computing language R (R Development Core Team, 2011) and can be installed from the Comprehensive R Archive Network (CRAN). The raw data accepted by all functions in the package is a matrix (x) coding the presence (1) or absence (0) of m species (columns) in n sites (rows). The functions provided by the package are as follows:

  • betapart.core(x) computes the basic quantities needed for computing the multiple-site beta diversity measures and pairwise dissimilarity matrices from the presence–absence matrix (x), including pairwise matrices of shared and non-shared species between sites. As these matrices are used in several functions and can be time-consuming to calculate for large matrices, precalculating them using the betapart.core function can markedly improve the speed of subsequent analyses. The function returns a new object of class ‘betapart’ that contains these quantities and can be used as the input (x) to all remaining functions.
  • beta.multi(x, index.family) computes the total dissimilarity across all n sites, along with the turnover and nestedness components of that dissimilarity. The input x may be a presence–absence matrix or a betapart object. The index.family argument selects whether the Sørensen or Jaccard index is used as a measure of total dissimilarity (βSOR or βJAC) and the respective components of turnover (βSIM or βJTU) and nestedness (βSNE or βJNE). The function returns three values, which are the total multi-site dissimilarity across the sites, and its turnover and nestedness components.
  • beta.pair(x, index.family) computes the same three dissimilarity metrics as the previous function and can again be set, using the index.family argument, to use the Sørensen or Jaccard index of total dissimilarity. Rather than returning three single values as in the previous function, beta.pair returns three matrices containing the pairwise between-site values of each component of beta diversity. The dissimilarity matrices yielded by beta.pair are objects of class dist and can be submitted to further analyses (for example, Mantel tests, non-metric multidimensional scaling or cluster analysis) using other R packages such as vegan (Oksanen et al. 2011) or cluster (Maechler et al. 2005).
  • beta.sample(x, index.family, sites, samples) resamples the three multiple-site dissimilarities for a subset of sites of the original data frame. The number of sites in the subset can be specified along with the number of random samples used to calculate the distribution of dissimilarity measures. The function returns a data frame containing the individual sampled measures along with vectors of the means and standard deviations across samples of each measure.
  • beta.temp(x, y, index.family) computes dissimilarity values between matched sites from two data sets (x, y) describing the presence and absence of species across the same set of sites at two separate times. Again, the index family may be set to use the Sørensen or Jaccard index of total dissimilarity. The function returns a data frame of the three values for the temporal dissimilarity within each site.
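The input format described above (sites as rows, species as columns, entries 1 or 0) can be assembled from a table of occurrence records. The sketch below uses made-up records and hypothetical object names (`occ`, `pa`); the calls to betapart are guarded so that the matrix-building step runs without the package, and the abbreviations "sor" and "jac" are assumed to be accepted by the index.family argument, following the usage in this article.

```r
# Hypothetical occurrence records: one row per observation of a species at a site.
occ <- data.frame(
  site    = c("A", "A", "A", "B", "B", "C", "C", "C"),
  species = c("sp1", "sp2", "sp3", "sp1", "sp4", "sp2", "sp3", "sp2"),
  stringsAsFactors = FALSE
)

# Cross-tabulate sites by species, then collapse counts to presence (1) / absence (0)
# so that repeated records of the same species at a site are not counted twice.
counts <- table(occ$site, occ$species)
pa <- matrix(as.integer(counts > 0),
             nrow = nrow(counts),
             dimnames = list(rownames(counts), colnames(counts)))

# Optional: pass the matrix to betapart if it is installed.
if (requireNamespace("betapart", quietly = TRUE)) {
  core  <- betapart::betapart.core(pa)                      # precompute once
  multi <- betapart::beta.multi(core, index.family = "sor") # three multiple-site values
  pairs <- betapart::beta.pair(core, index.family = "jac")  # three dist objects
}
```

Building the betapart object once and passing it to the other functions mirrors the workflow the function descriptions recommend for large matrices.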

Although some of these computations can also be conducted in previously available R packages, that is, vegan (Oksanen et al. 2011) and simba (Jurasinski & Retzer 2011), betapart provides a unified framework for the partitioning of total dissimilarity into turnover and nestedness components. The package betapart implements a highly efficient workflow (Fig. 1) to analyse such beta diversity patterns, and the function betapart.core provides an underlying data structure (the ‘betapart’ object type) that permits efficient implementation for use with large matrices and for performing resampling.
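The benefit of precomputing shared and non-shared species can be sketched in base R. This is only an illustration of the "compute once, reuse" idea under stated assumptions, not the package's actual implementation; `core_quantities` and `pair_from_core` are hypothetical names.

```r
# Compute the pairwise building blocks once from a sites x species 0/1 matrix.
core_quantities <- function(x) {
  x <- as.matrix(x)
  shared <- x %*% t(x)                    # [i, j] = species shared by sites i and j
  not_shared <- diag(shared) - shared     # [i, j] = species in site i absent from site j
  list(shared = shared,
       min_not_shared = pmin(not_shared, t(not_shared)),
       max_not_shared = pmax(not_shared, t(not_shared)))
}

# Reuse the stored matrices for either index family without revisiting the raw data.
pair_from_core <- function(core, family = c("sorensen", "jaccard")) {
  family <- match.arg(family)
  a  <- core$shared
  b  <- core$min_not_shared
  cc <- core$max_not_shared
  if (family == "sorensen") {
    total    <- (b + cc) / (2 * a + b + cc)
    turnover <- b / (a + b)
  } else {
    total    <- (b + cc) / (a + b + cc)
    turnover <- 2 * b / (a + 2 * b)
  }
  list(turnover   = as.dist(turnover),
       nestedness = as.dist(total - turnover),
       total      = as.dist(total))
}

x <- rbind(A = c(1, 1, 1, 0),
           B = c(1, 0, 0, 1),
           C = c(0, 1, 1, 0))
core <- core_quantities(x)          # the expensive step, done once
sor  <- pair_from_core(core, "sorensen")
jac  <- pair_from_core(core, "jaccard")
```

The matrix product is the costly step for large data sets, so storing its result and deriving both index families from it avoids repeating that work.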

Figure 1.

 Workflow of the betapart package. Functions are represented by arrows, and all functions appear twice as they can be applied to the raw data or ‘betapart’ objects. Arrow tone informs whether applying the function to a ‘betapart’ object implies saving computational effort (light grey) compared to applying the same function to a raw data table.

Example analyses

The package contains two data sets containing the presence and absence of 634 species of longhorn beetles (Cerambycidae) in southern (ceram.s, 15 countries) and northern (ceram.n, 19 countries) European countries (see Danilevsky 2007; Baselga 2008 for details). The code below shows how the betapart package can be used to compare the multi-site and pairwise beta diversity components arising from turnover and nestedness, using the default choice of the Sørensen index.

  • library(betapart)

  • data(ceram.s)

  • data(ceram.n)

  • # get betapart objects

  • ceram.s.core <- betapart.core(ceram.s)

  • ceram.n.core <- betapart.core(ceram.n)

  • # multiple site measures

  • ceram.s.multi <- beta.multi(ceram.s.core)

  • ceram.n.multi <- beta.multi(ceram.n.core)

  • # sampling across equal sites

  • ceram.s.samp <- beta.sample(ceram.s.core, sites=10, samples=100)

  • ceram.n.samp <- beta.sample(ceram.n.core, sites=10, samples=100)

  • # plotting the distributions of components

  • dist.s <- ceram.s.samp$sampled.values

  • dist.n <- ceram.n.samp$sampled.values

  • plot(density(dist.s$beta.SOR), xlim=c(0, 0.8), ylim=c(0, 19), xlab='Beta diversity', main='', lwd=3)

  • lines(density(dist.s$beta.SNE), lty=1, lwd=2)

  • lines(density(dist.s$beta.SIM), lty=2, lwd=2)

  • lines(density(dist.n$beta.SOR), col='grey60', lwd=3)

  • lines(density(dist.n$beta.SNE), col='grey60', lty=1, lwd=2)

  • lines(density(dist.n$beta.SIM), col='grey60', lty=2, lwd=2)

  • # pairwise for south

  • pair.s <- beta.pair(ceram.s)

  • # plotting clusters

  • dist.s <- ceram.s.samp$sampled.values

  • dist.n <- ceram.n.samp$sampled.values

  • plot(hclust(pair.s$beta.sim, method="average"), hang=-1, main='', sub='', xlab='')

  • title(xlab=expression(beta[sim]), line=0.3)

  • plot(hclust(pair.s$beta.sne, method="average"), hang=-1, main='', sub='', xlab='')

  • title(xlab=expression(beta[sne]), line=0.3)
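The resampling step used above to compare data sets with different numbers of sites can be sketched in base R as follows. This is a minimal sketch on simulated data, assuming the multiple-site Sørensen formulas of Baselga (2010); `multi_sor` and `resample_multi` are hypothetical helpers, not betapart functions, although the returned element names imitate those used in the listing.

```r
set.seed(1)
# Hypothetical presence-absence data: 15 sites (rows) by 40 species (columns).
x <- matrix(rbinom(15 * 40, 1, 0.4), nrow = 15)

# Multiple-site turnover and total Sørensen dissimilarity for a set of sites.
multi_sor <- function(m) {
  shared <- m %*% t(m)
  ns <- diag(shared) - shared                   # species in site i absent from site j
  up <- upper.tri(shared)                       # count each pair of sites once
  b  <- sum(pmin(ns, t(ns))[up])
  cc <- sum(pmax(ns, t(ns))[up])
  a  <- sum(diag(shared)) - sum(colSums(m) > 0) # sum of richness minus total richness
  c(beta.SIM = b / (a + b), beta.SOR = (b + cc) / (2 * a + b + cc))
}

# Draw `samples` random subsets of `sites` rows and recompute the measures each time.
resample_multi <- function(m, sites, samples) {
  vals <- t(replicate(samples,
                      multi_sor(m[sample(nrow(m), sites), , drop = FALSE])))
  vals <- cbind(vals, beta.SNE = vals[, "beta.SOR"] - vals[, "beta.SIM"])
  list(sampled.values = as.data.frame(vals),
       mean.values    = colMeans(vals),
       sd.values      = apply(vals, 2, sd))
}

res <- resample_multi(x, sites = 10, samples = 100)
```

Fixing the number of sites in every subset is what makes the resulting distributions comparable between regions that were sampled with different numbers of sites.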

These analyses provide the multiple-site dissimilarities across all sites and the estimated distribution of those values, controlling for the number of sites (Fig. 2a). Note that despite the similar values of total dissimilarity (βSOR) in southern and northern Europe, turnover (βSIM) is much higher in the south and nestedness-resultant dissimilarity (βSNE) is much higher in the north. The pairwise analysis allows the identification of patterns of dissimilarity between southern European countries derived from turnover and nestedness-resultant dissimilarity (Fig. 2b,c). In addition, the betapart package contains data on the presence and absence of 569 bird species across the United States. Raw data were taken from the North American Breeding Bird Survey and have been aggregated to the state level for two time periods (1980–1985 and 2000–2005). These data are used below to explore the temporal differences in species composition of states and the relative contributions of nestedness and turnover (Fig. 2d).

Figure 2.

 Use of the betapart package. (a) The partition of βSOR (solid line) into βSIM (dashed line) and βSNE (dotted line) for southern (black lines) and northern (grey lines) European countries, using 100 samples of 10 sites from each data set. (b, c) Clustering using average linkage of the βsim and βsne components of species dissimilarity between southern European countries. (d) Comparison of the square root transformed βsim and βsne components of βsor between 1980 and 2000 for US birds by state, with points labelled by state abbreviation.

  • data(bbsData)

  • bbs.t <- beta.temp(bbs1980, bbs2000, "sor")

  • # plotting root transformed components

  • with(bbs.t, plot(sqrt(beta.sim) ~ sqrt(beta.sne), type='n', ylab=expression(sqrt(beta[sim])), xlab=expression(sqrt(beta[sne]))))

  • with(bbs.t, text(y= sqrt(beta.sim), x=sqrt(beta.sne), labels=rownames(bbs1980)))
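The temporal comparison above treats each site at the two times as a pair of assemblages. A base R sketch of that idea is given below, assuming the Sørensen partition of Baselga (2010) applied row by row; `temporal_partition` is a hypothetical helper and the small matrices are invented for illustration.

```r
# x, y: presence-absence matrices with the same sites (rows) and species (columns),
# recorded at an earlier and a later time respectively.
temporal_partition <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(identical(dim(x), dim(y)))
  a      <- rowSums(x == 1 & y == 1)   # species present at both times
  lost   <- rowSums(x == 1 & y == 0)   # species present only at the first time
  gained <- rowSums(x == 0 & y == 1)   # species present only at the second time
  b  <- pmin(lost, gained)
  cc <- pmax(lost, gained)
  sor <- (b + cc) / (2 * a + b + cc)
  sim <- b / (a + b)
  data.frame(beta.sim = sim, beta.sne = sor - sim, beta.sor = sor,
             row.names = rownames(x))
}

t1 <- rbind(s1 = c(1, 1, 1, 1, 0, 0),
            s2 = c(1, 1, 0, 0, 0, 0))
t2 <- rbind(s1 = c(1, 1, 0, 0, 1, 1),
            s2 = c(1, 1, 1, 1, 0, 0))
res <- temporal_partition(t1, t2)
```

Site s1 loses two species and gains two others, so its change is pure turnover (βsim = βsor = 0.5), whereas site s2 only gains species, so its change is attributed entirely to nestedness (βsim = 0).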

Citation of betapart

Researchers using betapart in a published paper should cite this article. Users can also cite the betapart package directly. Citation information can be obtained by typing:

  • citation("betapart")


AB is funded by the Spanish Ministry of Science and Innovation (grant CGL2009-10111), and DO is funded by a fellowship from the RCUK.