#### Calculation of diversity

We start with the definition of Simpson’s diversity index, often called the Gini–Simpson index (Simpson 1949; Pielou 1969):

$$H = 1 - \sum_{k=1}^{S} p_{k}^{2} \qquad \text{(eqn 1)}$$

where *p*_{k} is the proportion of species *k* in an infinite population, and *S* is the total number of species present. When applying the index to vegetation data, it is common practice to use the proportional cover or biomass of each species rather than the proportion of individuals; and hence an adjustment to prevent bias in a finite sample of individuals (Simpson 1949) is not applied here.

The Gini–Simpson diversity index, ranging from 0 to 1, represents the probability that two individuals chosen at random from the sample are from different species or, in the case of data expressed as proportional cover, the probability that two randomly chosen points within the sample area are occupied by different species. As with other frequently used indices of diversity it is a function of both the number of species present (species richness) and the frequency distribution of these species within the sample (species evenness). The average index of α-diversity across a set of *n* samples is:

$$H_{\alpha} = \frac{1}{n}\sum_{i=1}^{n}\left(1 - \sum_{k=1}^{S} p_{ik}^{2}\right) \qquad \text{(eqn 2)}$$

The diversity index of a pooled set of samples (γ-diversity) is

$$H_{\gamma} = 1 - \sum_{k=1}^{S}\left(\frac{1}{n}\sum_{i=1}^{n} p_{ik}\right)^{2} \qquad \text{(eqn 3)}$$
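The α- and γ-diversity calculations can be illustrated with a minimal Python sketch. It assumes the standard Gini–Simpson form 1 − Σ *p*_{k}², that each sample's proportions sum to 1, and that the pooled proportions are the unweighted means across samples; the function names and the toy data are hypothetical.

```python
def gini_simpson(p):
    """Gini-Simpson index, 1 - sum(p_k^2), for one list of species proportions."""
    return 1.0 - sum(x * x for x in p)


def mean_alpha(samples):
    """Average within-sample index across all samples (alpha-diversity)."""
    return sum(gini_simpson(s) for s in samples) / len(samples)


def pooled_gamma(samples):
    """Index of the pooled samples (gamma-diversity).

    Assumption: samples are weighted equally, so the pooled proportion of
    each species is the mean of its per-sample proportions.
    """
    n = len(samples)
    n_species = len(samples[0])
    pooled = [sum(s[k] for s in samples) / n for k in range(n_species)]
    return gini_simpson(pooled)


# Toy data: three samples (rows) and two species (columns).
samples = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
print(mean_alpha(samples), pooled_gamma(samples))
```

In this toy case the pooled index exceeds the mean within-sample index because two of the samples are monospecific while the pooled set is perfectly even.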

Here we define a new index of diversity *H*(*h*) which is a function of separation (‘lag’) distance *h*. For vegetation cover data this can be easily interpreted as the probability that a pair of points within two samples a distance *h* apart are occupied by different species. For a pair of samples *i* and *j*, this is defined as:

$$H_{ij} = 1 - \sum_{k=1}^{S} p_{ik}\,p_{jk} \qquad \text{(eqn 4)}$$

where *p*_{ik} is the proportional cover of species *k* in sample *i* and *p*_{jk} the proportional cover of species *k* in sample *j*. For a finite set of samples, the mean value of *H* at a separation distance *h* is

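$$H(h) = \frac{1}{N(h)}\sum_{(i,j)\ \text{at lag}\ h}\left(1 - \sum_{k=1}^{S} p_{ik}\,p_{jk}\right) \qquad \text{(eqn 5)}$$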

where *N*(*h*) is the number of pairs of samples that are a separation distance *h* apart. When samples are equally spaced, for example along a transect or grid-based sampling system, *H*(*h*) is calculated at intervals of *h* corresponding to the sample spacing. Where sample spacing varies, for example in a randomized sampling strategy, *H*(*h*) is calculated at a series of lag distances *h* with *N*(*h*) the set of pairs of samples that are within a given tolerance of *h*. Plotting *H*(*h*) against *h* gives a plot analogous to a (semi)variogram in geostatistics (Fig. 1). Following Jost (2006), the ‘true’ diversity, in units commensurate with the ‘numbers equivalent’ of the diversity index, can be calculated as:

$$D(h) = \frac{1}{1 - H(h)} \qquad \text{(eqn 6)}$$
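As an illustration, the paired-sample index and its numbers equivalent can be computed for an equally spaced transect as follows. This is a sketch under stated assumptions: the pair index is taken as 1 − Σ *p*_{ik}*p*_{jk}, the numbers equivalent as 1/(1 − *H*), lags are expressed in units of the sample spacing, and the function names and toy transect are hypothetical.

```python
def pair_index(p_i, p_j):
    """Probability that one point in sample i and one in sample j differ in species."""
    return 1.0 - sum(a * b for a, b in zip(p_i, p_j))


def paired_diversity(samples, lag):
    """Mean H, and its numbers equivalent D, over all pairs `lag` positions apart.

    Assumes equally spaced samples, so lag 0 pairs every sample with itself.
    """
    pairs = [(i, i + lag) for i in range(len(samples) - lag)]
    h_mean = sum(pair_index(samples[i], samples[j]) for i, j in pairs) / len(pairs)
    # D is undefined when the paired samples share no species at all (H = 1).
    d = 1.0 / (1.0 - h_mean) if h_mean < 1.0 else float("inf")
    return h_mean, d


# Toy transect: two patches, each dominated by a different species.
transect = [[0.8, 0.2], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]]
for lag in range(3):
    print(lag, paired_diversity(transect, lag))
```

For irregularly spaced samples, the pair list would instead contain every pair whose separation falls within a chosen tolerance of *h*.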

Key features of these plots can be identified:

When *h* = 0, *i* = *j* and *N*(*h*) = *n*; eqn 5 simplifies to eqn 2. The values of *D*(*h*) and *H*(*h*) when *h* is zero are therefore equal to *D*_{α} and *H*_{α}, the mean within-sample or α-diversity (conceptually equivalent to the variogram ‘nugget’, or non-spatial variance; Isaaks & Srivastava 1989). This value is a function of the size and shape of the sampling unit and hence grain size (i.e. quadrat size).

*D*(*h*) will usually tend to increase with *h* due to spatial segregation of species – as the distance between paired samples increases, their species composition becomes less similar. At a separation distance *h* where the species composition of plots is truly independent, i.e. there is no spatial autocorrelation at this length scale, the probability of resampling a species from paired samples is equal to the probability of resampling from an infinitely large pooled set of samples; *H*(*h*) becomes an unbiased estimate of the pooled-sample index *H*_{γ} (eqn 3) and *D*(*h*) an estimate of γ-diversity.

In a domain where spatial patterns in species distribution consist solely of aggregated pseudo-random patches (and not, for example, regular patterns or more complex spatial pattern), the value of *h* at which *D*(*h*) reaches the sill (analogous to the ‘range’ of a variogram) represents the characteristic length scale above which no further pattern in β-diversity is recognized.
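The two limiting cases described above can be written out explicitly. This derivation assumes that the paired-sample index has the form *H*_{ij} = 1 − Σ *p*_{ik}*p*_{jk}; the symbol $\bar{p}_{k}$ for the mean proportion of species *k* across samples is introduced here for illustration only. At zero lag every sample is paired with itself:

$$
H_{ii} = 1 - \sum_{k=1}^{S} p_{ik}\,p_{ik} = 1 - \sum_{k=1}^{S} p_{ik}^{2},
\qquad
H(0) = \frac{1}{n}\sum_{i=1}^{n}\left(1 - \sum_{k=1}^{S} p_{ik}^{2}\right)
$$

If instead *i* and *j* are drawn independently of one another from the *n* samples, the expected pair index is

$$
\mathrm{E}\left[H_{ij}\right] = 1 - \sum_{k=1}^{S}\left(\frac{1}{n}\sum_{i=1}^{n} p_{ik}\right)\left(\frac{1}{n}\sum_{j=1}^{n} p_{jk}\right) = 1 - \sum_{k=1}^{S}\bar{p}_{k}^{\,2}
$$

which is the index of the pooled samples.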

We assessed the significance of spatial patterns in the plots using a Monte-Carlo bootstrap method. For each lag distance *h*, *N*(*h*) pairs of samples were selected at random from the entire data set, and values of *H* and *D* were calculated for these sets of paired samples. As these pairs of samples were selected randomly, without reference to the distance between samples, the calculated value of *H*, the average probability that two points within the samples are occupied by different species, is an unbiased estimate of *H*_{γ}, the probability that any two points in the domain are occupied by different species. These random-pair values of *H* and *D* therefore provide estimates of γ-diversity. This procedure was repeated 1000 times for each value of *h* and the 2.5% and 97.5% quantiles of this distribution were plotted on the diversity–distance graphs. Where the calculated values of *H*(*h*) and *D*(*h*) fall outside these limits, the paired-sample diversity differs significantly (*P* < 0.05) from γ-diversity. At these lag distances spatial structure exists and contributes towards β-diversity.
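The randomization procedure can be sketched in Python as below. Several details are assumptions rather than the method as implemented in this study: the two members of each pair are drawn independently and with replacement (so a sample may be paired with itself), the quantiles are taken as simple order statistics, and the function names, seed and toy data are hypothetical.

```python
import random


def pair_index(p_i, p_j):
    """Probability that one point in sample i and one in sample j differ in species."""
    return 1.0 - sum(a * b for a, b in zip(p_i, p_j))


def random_pair_envelope(samples, n_pairs, n_reps=1000, seed=0):
    """2.5% and 97.5% quantiles of mean H over randomly chosen sample pairs.

    `n_pairs` plays the role of N(h): the number of pairs available at the
    lag being tested. Pairs are drawn without reference to distance.
    """
    rng = random.Random(seed)
    n = len(samples)
    means = []
    for _ in range(n_reps):
        total = 0.0
        for _ in range(n_pairs):
            i = rng.randrange(n)
            j = rng.randrange(n)  # assumption: independent draw, may equal i
            total += pair_index(samples[i], samples[j])
        means.append(total / n_pairs)
    means.sort()
    lower = means[int(0.025 * (n_reps - 1))]
    upper = means[int(0.975 * (n_reps - 1))]
    return lower, upper


# Toy data: four samples in two contrasting patches.
samples = [[0.8, 0.2], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]]
lower, upper = random_pair_envelope(samples, n_pairs=20)
print(lower, upper)
```

An observed *H*(*h*) lying outside the interval (`lower`, `upper`) would then be flagged as differing from the random-pair expectation at that lag. The same limits can be converted to *D* with 1/(1 − *H*).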

The method can be generalized to incorporate distances between paired samples along further spatial or environmental axes. In the case of raised bogs, niche differentiation between hummock and hollow species occurs along a gradient in the vertical position of samples (i.e. the height of the surface above the water table; Soro, Sundberg & Rydin 1999; Økland, Rydgren & Økland 2008). In this study we also plot paired-sample diversity as a function of both the lag distance between samples, *h*, and the measured difference in height above the water table between samples, *z*; hence *h* represents the separation between paired samples in horizontal space, and *z* the separation in vertical space (the hydrological gradient):

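$$H(h,z) = \frac{1}{N(h,z)}\sum_{(i,j)\ \text{at lag}\ h\ \text{and height difference}\ z}\left(1 - \sum_{k=1}^{S} p_{ik}\,p_{jk}\right) \qquad \text{(eqn 7)}$$

$$D(h,z) = \frac{1}{1 - H(h,z)} \qquad \text{(eqn 8)}$$

where *N*(*h,z*) is the number of pairs of samples separated by a distance *h* and a height difference *z*.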

In this case *H*(*h,z*) may be plotted against both *h* and *z* to examine the effects of spatial and environmental components of diversity separately (Fig. 2).
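A two-dimensional version of the calculation can be sketched by grouping sample pairs by both horizontal lag and height difference. The bin width for *z* (`z_bin_cm`), the convention of counting each unordered pair once, and all names and toy values are assumptions made for illustration.

```python
from collections import defaultdict


def pair_index(p_i, p_j):
    """Probability that one point in sample i and one in sample j differ in species."""
    return 1.0 - sum(a * b for a, b in zip(p_i, p_j))


def diversity_surface(samples, heights, spacing_cm=5.0, z_bin_cm=2.0):
    """Mean H, numbers equivalent D and pair count N for each (h, z) class.

    `heights` holds the surface height above the water table for each sample.
    Keys are (h, lower edge of the z bin), both in cm.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(samples)
    for i in range(n):
        for j in range(i, n):  # each unordered pair once, self-pairs included
            h = (j - i) * spacing_cm
            z_bin = int(abs(heights[i] - heights[j]) // z_bin_cm)
            key = (h, z_bin * z_bin_cm)
            sums[key] += pair_index(samples[i], samples[j])
            counts[key] += 1
    surface = {}
    for key, total in sums.items():
        h_mean = total / counts[key]
        d = 1.0 / (1.0 - h_mean) if h_mean < 1.0 else float("inf")
        surface[key] = (h_mean, d, counts[key])
    return surface


# Toy transect: two hummock samples followed by two hollow samples.
samples = [[0.8, 0.2], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]]
heights = [10.0, 9.0, 3.0, 2.0]  # cm above the water table
for key, value in sorted(diversity_surface(samples, heights).items()):
    print(key, value)
```

In the toy data, pairs at the same horizontal lag fall into different classes depending on whether they straddle the hummock–hollow boundary, which is the separation of spatial and environmental components that the surface plot is intended to show.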

#### Fitting surfaces

As with (semi)variograms, empirically derived models may be fitted to plots of *H*(*h*,*z*) and *D*(*h*,*z*). In this study we estimate *H*(*h,z*) as the linear sum of log-transformed variables *h* and *z* (in cm) with an interaction term:

$$H(h,z) = a + b\,\log(h + 1) + c\,\log(z + 1) + d\,\log(h + 1)\,\log(z + 1) \qquad \text{(eqn 9)}$$

Values of *a*, *b*, *c* and *d* were fitted to each transect using the GLM procedure in R v2.2.1, with *H*(*h,z*) as the dependent variable and log(z + 1), log(h + 1), and the interaction term as fixed factors. All terms and combinations of terms were tested for significance at *P* < 0.05, and final model selection between significant models was on the basis of the minimum Akaike Information Criterion (AIC).
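For illustration only, a surface of this kind can be fitted by ordinary least squares, which coincides with a Gaussian GLM with identity link. The source does not state the error family or link, so that equivalence is an assumption, as are the assignment of *b* to the log(*h* + 1) term and *c* to the log(*z* + 1) term, the AIC expression (Gaussian, up to an additive constant), and the synthetic data. The analysis in this study was run in R; Python is used here only as a sketch.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def design_row(h, z):
    """Predictors: intercept, log(h + 1), log(z + 1) and their product."""
    lh, lz = math.log(h + 1.0), math.log(z + 1.0)
    return [1.0, lh, lz, lh * lz]


def fit_surface(h_vals, z_vals, H_vals):
    """Least-squares estimates of (a, b, c, d) and the residual sum of squares."""
    rows = [design_row(h, z) for h, z in zip(h_vals, z_vals)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y for r, y in zip(rows, H_vals)) for p in range(k)]
    coef = solve(xtx, xty)
    rss = sum((y - sum(c * x for c, x in zip(coef, r))) ** 2
              for r, y in zip(rows, H_vals))
    return coef, rss


def gaussian_aic(rss, n, n_coef):
    """AIC up to an additive constant; the residual variance counts as one parameter."""
    return n * math.log(rss / n) + 2 * (n_coef + 1)


# Synthetic surface with a small deterministic perturbation (not real data).
true_coef = [0.30, 0.05, 0.08, -0.01]
h_vals, z_vals, H_vals = [], [], []
for i, h in enumerate([0, 5, 10, 20, 40]):
    for j, z in enumerate([0, 2, 5, 10]):
        noise = 0.005 if (i + j) % 2 == 0 else -0.005
        h_vals.append(h)
        z_vals.append(z)
        H_vals.append(sum(c * x for c, x in zip(true_coef, design_row(h, z))) + noise)

coef, rss = fit_surface(h_vals, z_vals, H_vals)
print([round(c, 3) for c in coef], gaussian_aic(rss, len(H_vals), len(coef)))
```

Candidate models with fewer terms could be compared by refitting with the corresponding columns dropped from `design_row` and choosing the smallest AIC.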

#### Field Site

Wedholme Flow (54.86° N, 3.23° W) is an ombrotrophic lowland raised bog of 780 ha situated on the Solway Plain, Cumbria, UK. Much of the site has been affected by historical peat cutting and recent commercial peat extraction, which ceased in 2002. Since this date, site management has included blocking ditches and other groundworks in order to maintain a stable high water table for the recovery and maintenance of peat-forming vegetation. This study focuses on a relatively intact peat dome of the north bog. Two plots with contrasting hydrological conditions were selected for this study; the first, the central bog dome transect, was located near the centre of the intact bog peat dome, where hummock–hollow topography is clearly defined. The second, the bog margin transect, has less clearly defined hummock–hollow topography and was located *c.* 30 m from, and roughly parallel to, the eastern edge of the bog dome where a sharp transition between raised bog vegetation and a modified lag, dominated by *Betula pubescens* carr woodland, occurs.

##### Vegetation survey

At each site, all species of vascular plant and bryophyte were identified and their fractional cover estimated in 200 adjacent 5 × 5 cm quadrats along a 10-m transect running north to south. Quadrats were defined in a horizontal plane and cover was estimated with the ‘any-part’ system (Williamson, 2003). Nomenclature follows Stace (1999) and Hill (1992). The bog dome transect showed marked fine-scale hummock–hollow topography typical of an intact raised bog, with a low canopy of the dwarf shrubs *Calluna vulgaris*, *Erica tetralix* and *Andromeda polifolia* and the peat moss species *Sphagnum capillifolium* and *S. magellanicum* dominant on hummocks and *S. tenellum* and *S. papillosum* on lawns. The bog margin transect showed less marked hummock–hollow topography and a mixture of *S. papillosum, S. magellanicum* and *S. tenellum* under a low canopy of *E. tetralix* and *C. vulgaris*.

##### Measurement of surface depth to water table

The laser-scanning method used to determine the bog surface height above the water table has been described in detail elsewhere (Anderson, Bennie & Wetherelt 2010a) and is summarized briefly here.

At the mid-point of each transect a hydrological dipwell with a barometric pressure sensor was installed in March 2008. At each site, the depth to water table was calculated relative to the dipwell cap, whose horizontal and vertical location was determined from a differential global positioning system (DGPS) survey. This allowed water table depth to be expressed relative to other surveyed points in the vicinity. In order to accurately measure the height above the water table along the transect, fine-scale microtopographic data describing the peatland surface structure were collected using a close-range laser scanner (HDS3000; Leica Geosystems, San Ramon, CA, USA). The scanner tripod was elevated above the peatland surface on the flat-bed trailer of a tracked vehicle, with an instrument height of *c.* 3.5 m above the surface. At each test site scans were taken from three different viewpoints, towards a 10-m diameter region of interest centred on the dipwell (Fig. 3).

During the laser scanning data capture, the tops of the dipwell caps were used as reference points to enable registration (linking up of scans) for each site. Three additional reference markers were positioned at the perimeter of the scan region. Their positions were measured using DGPS and proprietary software (Cyclone 5.4, Leica Geosystems) was used to register the three scans using the known position of the markers and the top of the dipwell as reference targets. This produced a combined point cloud referenced to the Ordnance Survey GB National Grid in the horizontal plane and metres above sea level in the vertical plane. Points within the 10-m diameter region of interest were selected and exported as ASCII text files for analysis.

A digital surface model (DSM) of the region of interest was created by selecting the minimum height of the points within each cell of a 5-cm grid, then smoothing the surface by taking first the minimum value and then the mean value within a five-pixel window. The resultant DSM of the bog surface was used to derive the relative height above the dipwell water table of pairs of quadrats in this analysis. It should be noted that this value represents the height of the surface above the water table at the central dipwell – the actual height of the capitulum surface above the water table at a point may differ from these values due to fine-scale differences in capillary action associated with hummocks and hollows.
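The gridding and smoothing steps can be sketched as follows. The treatment of empty cells and of cells at the grid edge (both handled by using whatever values fall inside the window), the square shape of the window, and all function names and toy values are assumptions; the source specifies only the 5-cm cell size, the five-pixel window and the minimum-then-mean order.

```python
def grid_minimum(points, cell, n_rows, n_cols):
    """Lowest z of the points falling in each grid cell; None where a cell is empty.

    `points` is a list of (x, y, z) tuples in the same length unit as `cell`.
    """
    grid = [[None] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        r, c = int(y // cell), int(x // cell)
        if 0 <= r < n_rows and 0 <= c < n_cols:
            if grid[r][c] is None or z < grid[r][c]:
                grid[r][c] = z
    return grid


def window_filter(grid, size, reducer):
    """Apply `reducer` to the non-empty values in a size-by-size window around each cell."""
    half = size // 2
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            vals = [grid[rr][cc]
                    for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                    for cc in range(max(0, c - half), min(n_cols, c + half + 1))
                    if grid[rr][cc] is not None]
            if vals:
                out[r][c] = reducer(vals)
    return out


def mean(vals):
    return sum(vals) / len(vals)


def surface_model(points, cell=5.0, n_rows=200, n_cols=200, window=5):
    """Minimum-height grid, then a minimum filter, then a mean filter."""
    dsm = grid_minimum(points, cell, n_rows, n_cols)
    dsm = window_filter(dsm, window, min)
    return window_filter(dsm, window, mean)


# Toy point cloud on a 3 x 3 grid of 5-cm cells (coordinates and heights in cm).
cell_heights = [[5, 6, 7], [6, 9, 8], [7, 8, 4]]
points = [(c * 5.0 + 2.5, r * 5.0 + 2.5, float(cell_heights[r][c]))
          for r in range(3) for c in range(3)]
points.append((7.5, 7.5, 12.0))  # a higher return in the centre cell, ignored by the minimum

# A 3-pixel window is used here only because the toy grid is so small.
dsm = surface_model(points, cell=5.0, n_rows=3, n_cols=3, window=3)
print(dsm[1][1])
```

The height difference *z* between two quadrats would then be the absolute difference between their DSM values.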