Quantifying spatial phylogenetic structures of fully stem-mapped plant communities

Authors

  • Guochun Shen,

    Corresponding author
    1. State Key Lab of Biological Control and School of Life Sciences, Guangdong Key Lab of Plant Resources, SYSU-Alberta Joint Lab for Biodiversity Conservation, Sun Yat-sen University, Guangzhou, Guangdong, China
    2. Department of Ecological Modelling, UFZ Helmholtz Centre for Environmental Research-UFZ, Leipzig, Germany
    3. Tiantong National Field Observation Station for Forest Ecosystem, East China Normal University, Shanghai, China
  • Thorsten Wiegand,

    1. Department of Ecological Modelling, UFZ Helmholtz Centre for Environmental Research-UFZ, Leipzig, Germany
  • Xiangcheng Mi,

    1. State Key Laboratory of Vegetation and Environmental Change, Institute of Botany, The Chinese Academy of Sciences, Beijing, China
  • Fangliang He

    1. State Key Lab of Biological Control and School of Life Sciences, Guangdong Key Lab of Plant Resources, SYSU-Alberta Joint Lab for Biodiversity Conservation, Sun Yat-sen University, Guangzhou, Guangdong, China
    2. Department of Renewable Resources, University of Alberta, Edmonton, AB, Canada

Summary

  1. Analysis of the phylogenetic similarity of co-occurring species at different spatial scales is increasingly used for decoding community assembly rules. Here, we integrated the analysis of phylobetadiversity and marked point pattern analysis to yield a new metric, the phylogenetic mark correlation function, kd(r), to quantify spatial phylogenetic structure of fully stem-mapped communities.
  2. kd(r) is defined as the expected phylogenetic distance of two heterospecifics separated by spatial distance r, normalized with the expected phylogenetic distance of two heterospecifics taken randomly from a study area. It measures spatial phylogenetic turnover relative to spatial species turnover and is closely related to the spatially explicit Simpson index.
  3. We used simulated fully stem-mapped plant communities with known spatial phylogenetic structures to assess type I and II errors of the phylogenetic mark correlation function kd(r) under a null model of random phylogenetic spatial structure, and to test the ability of the kd(r) to detect scale-dependent signals of phylogenetic spatial structure. We also compared the performance of the kd(r) with two existing measures of phylobetadiversity that have been previously used to analyse fully stem-mapped plots. Finally, we explored the spatial phylogenetic structure of a 24-ha fully stem-mapped subtropical forest in China.
  4. Simulation tests showed that the new metric yielded correct type I and type II errors and accurately detected the spatial scales at which various processes (e.g. habitat filtering and competition) were invoked to generate spatial phylogenetic structures. The power of the kd(r) was not affected by a phylogenetic signal in species abundance and different topologies of the phylogenetic tree.
  5. Replacing phylogenetic distance by functional distance allows for application of the kd(r) to estimate spatial correlations in functional community structure. Thus, the kd(r) allows trait and phylogenetic structure to be analysed in the same framework. The phylogenetic mark correlation function is a powerful and accurate tool for revealing scale-dependent phylogenetic/functional footprints in community assemblages and allows ecologists to keep up with the increasingly available data of fully stem-mapped plots, functional traits and community phylogenies.

Introduction

One of the persistent challenges in community ecology is to explain how species coexist in communities, particularly in species-rich communities such as tropical forests (Chesson 2000). Numerous ecological and evolutionary processes have been identified to play roles in species coexistence and assembly of communities (Wright 2002), but their relative importance is not well understood. Analysis of the phylogenetic similarity of co-occurring species has increasingly been used to aid in this task (Webb et al. 2002; Kembel & Hubbell 2006; Swenson et al. 2006; Kraft, Valencia & Ackerly 2008; Paine et al. 2012; Swenson 2013). For example, a relationship between local species co-occurrence and their phylogenetic relatedness may point to the operation of habitat filtering and/or competitive exclusion (Webb et al. 2002), and the phylogenetic relatedness of neighbours may be an important predictor of density-dependent mortality (Metz, Sousa & Valencia 2010; Paine et al. 2012).

An important result emerging from studies of phylogenetic alpha diversity is that different processes such as competition and environmental filtering are expected to imprint signals at different spatial scales (Goldberg 1987; Webb et al. 2002; Johnson & Stinchcombe 2007; Swenson et al. 2012). For example, phylogenetic evenness (i.e. co-occurring species are phylogenetically more distantly related than expected by chance) is often observed at small scales and clustering (i.e. co-occurring species are phylogenetically more closely related than expected) at larger scales (Webb et al. 2002; Cavender-Bares, Keen & Miles 2006). This suggests that quantification of scale-dependent phylogenetic structure should be helpful in determining the relative importance of biotic and abiotic filters in governing community assembly if the niche conservatism assumption holds (Losos 2008).

Especially suitable for spatially explicit analyses of local-scale phylogenetic spatial structure are spatially referenced data, such as fully stem-mapped forest plots from the Center for Tropical Forest Science in which every tree with a diameter at breast height ≥ 1 cm has been identified, mapped and measured in an area of 16–50 ha (Condit 1998). Such data permit the detailed measurement of the spatial arrangement of individuals with respect to their ecological similarity and hence the analysis of correlation between spatial and phylogenetic distances of individuals, independently of the overall phylogenetic community structure. This provides a means for hierarchical analyses of phylogenetic structure by decoupling analysis of the overall phylogenetic community structure of a plot (e.g. Kraft et al. 2007) from that of the smaller-scale spatial phylogenetic structures. For example, individuals of ecologically similar or dissimilar species may tend to be located close to each other (small-scale clustering or evenness, respectively).

Previous phylogenetic and functional analyses of community assemblages have generally considered alpha diversity and spatial scale (Swenson et al. 2012), but Graham & Fine (2008) and Swenson et al. (2012) proposed phylogenetic beta diversity (phylobetadiversity) as a complementary approach for assessing scale effects. For example, low phylobetadiversity and high species beta diversity at small spatial scales are expected if competitive exclusion limits local coexistence of phylogenetically closely related species in a similar habitat (Graham & Fine 2008). This suggests that metrics that relate phylobetadiversity across space to species beta diversity should be especially suitable for revealing scale effects in phylogenetic spatial structure. However, phylobetadiversity has rarely been used in the analysis of fully stem-mapped plots (but see Swenson et al. 2012).

To capitalize on the full power of phylogenetic analyses in community ecology when fully stem-mapped data are available, we propose to integrate the analysis of phylobetadiversity (Hardy & Senterre 2007; Graham & Fine 2008; Swenson et al. 2012) with marked point pattern analysis (Illian et al. 2008). Our method utilizes point pattern data where the spatial location and the species identity of every individual in the plot are known and where a phylogenetic distance measure between all pairs of species is available.

The core of our method is a metric based on mark correlation functions (Stoyan 1984; Schlather 2001; Illian et al. 2008). The new metric, the phylogenetic mark correlation function kd(r), builds on previous theory of species beta diversity (Shimatani 2001; Chave & Leigh 2002) and phylobetadiversity (Hardy & Senterre 2007; Graham & Fine 2008; Swenson et al. 2012). It is defined as the expected phylogenetic distance of two heterospecifics separated by spatial distance r, normalized with the expected phylogenetic distance cd of two heterospecifics taken randomly from the plot. The kd(r) measures spatial phylogenetic turnover relative to spatial species turnover and is especially powerful for revealing phylogenetic patterns across spatial scale. An especially attractive feature of the phylogenetic mark correlation function is that it is able to measure the correlation between spatial and phylogenetic distances of individuals independent of the overall phylogenetic community structure. It can therefore complement alpha diversity measures of phylogenetic community structure (e.g. NRI and NTI; Webb et al. 2002; Kraft et al. 2007) by quantifying smaller-scale spatial phylogenetic structures.

The objectives of this study are to (i) present and exemplify the phylogenetic mark correlation function for analysing phylogenetic spatial structure in fully stem-mapped plots, (ii) assess the performance and accuracy of the method over a range of variations in spatial scale, phylogenetic topology and phylogenetic signal in abundance, and (iii) compare the performance of the kd(r) metric with that of other metrics used to analyse phylogenetic spatial structure in fully stem-mapped plots. We also show explicitly that the power of our metric is independent of a phylogenetic signal in species abundance and the topology of the underlying phylogeny.

We simulated fully stem-mapped plant communities with and without signals of scale-dependent spatial phylogenetic structure that resemble typical situations expected in real census plot data and determined the type I and II error rates of the phylogenetic mark correlation function. As a ‘proof of concept’ example, we applied the phylogenetic mark correlation function to data from the 24-ha fully stem-mapped Gutianshan subtropical forest census plot in China. Finally, we compared the performance of the new metric with two existing abundance-weighted, quadrat-based metrics, the mean nearest phylogenetic dissimilarity and the mean pairwise phylogenetic dissimilarity (Swenson 2011).

Materials and methods

Simulation of fully stem-mapped communities

All simulated fully stem-mapped communities comprised approximately 12 000 individuals of 10 species that were distributed within a 300 × 300 m plot (Fig. 1). A phylogeny for each simulated community was generated by randomly clustering the tips according to Paradis (2012). Branch lengths of the phylogenetic tree in each simulated community were sampled from a gamma distribution. To quantify the sensitivity of the different metrics of phylobetadiversity to the topology of the phylogenetic tree, we varied the topology of the phylogeny among simulated communities by changing the shape and rate parameters of the gamma distribution (see detailed settings of the parameters and examples of simulated phylogenies in Appendix S1). Phylogenetic relatedness between species was represented by a matrix of pairwise distances between the pairs of tips from the simulated phylogenetic tree using its branch lengths. Species abundances in each simulated community were generated by Brownian motion along a given phylogeny. The phylogenetic signal in the simulated species abundance was quantified by the K-statistic (Blomberg, Garland & Ives 2003) (see Appendix S1). We generated six types of fully stem-mapped communities based on four assembly rules (Table 1). Simulation of communities with higher species richness or more complex assembly rules was limited by extensive computational time.

Table 1. Known spatial phylogenetic structure of the simulated fully stem-mapped communities that were based on four different assembly rules (see Materials and methods section for detail). For each assembly rule, we simulated 999 communities without spatial phylogenetic signal (scenarios c1–c4). Additionally, spatial phylogenetic signals were included in communities based on the habitat association and the competition assembly rules (scenarios c5 and c6). The expected phylogenetic patterns are random (r) in scenarios c1 to c4, phylogenetic clustering (+) in scenario c5, and in scenario c6, it is evenness (−) at distances < 5 m and random (r) at distances > 10 m. Note that the topology of the phylogenies and the phylogenetic signal in species abundance varied among all simulated communities (see Materials and methods section for detail)
                         | Random placement | Independent clustering | Habitat association | Competition
No phylogenetic signal   | c1; (r)          | c2; (r)                | c3; (r)             | c4; (r)
With phylogenetic signal |                  |                        | c5; (+)             | c6; <5 m (−); >10 m (r)
Figure 1.

Illustration of spatial patterns of the simulated communities for different assembly rules within 300 × 300 m plots. For clarity, we show only the spatial distributions of two of the ten species in the simulated community. Points with the same colour represent conspecifics. (a) Community generated by random placement, (b) community generated by independent cluster point processes that mimic dispersal limitation, (c) community generated by habitat association where the coloured strips represent trough and peak variation in the environmental variable v(x) that determines the niches of the species and (d) community with intra- and interspecific competition among individuals separated by less than 5 m.

(1) Random placement. The spatial pattern of the community is completely random. We simulated these communities by independently superimposing the patterns of individual species generated by homogeneous Poisson processes (Wiegand & Moloney 2004) (Fig. 1a). The random placement communities act as a reference for communities that contain no spatial structure (i.e. no clustering, no co-occurrence or habitat association) and no phylogenetic signal (scenario c1 in Table 1).
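The random placement rule can be illustrated with a minimal Python sketch. It assumes that the abundance of each species is fixed in advance, so each homogeneous Poisson pattern is replaced by its fixed-count equivalent (independent uniform coordinates). The function name, the species labels and the abundances are hypothetical and chosen only for illustration.

```python
import random


def simulate_random_placement(abundances, side=300.0, seed=None):
    """Superimpose independent uniform point patterns, one per species.

    abundances: dict mapping a species label to its number of individuals.
    Returns a list of (x, y, species) tuples inside a side x side plot.
    """
    rng = random.Random(seed)
    points = []
    for species, n in abundances.items():
        for _ in range(n):
            # Each individual is placed independently of all others.
            points.append((rng.uniform(0.0, side), rng.uniform(0.0, side), species))
    return points


# Hypothetical two-species community in a 300 x 300 m plot.
community = simulate_random_placement({"A": 1500, "B": 900}, seed=1)
```

Because the coordinates of every individual are drawn independently, such a community carries neither intraspecific clustering nor interspecific association.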

(2) Independent clustering. The spatial structure of the simulated communities was characterized by intraspecific clustering, but different species were independently placed. We generated these communities by independently superimposing the patterns of individual species generated by a homogeneous Thomas process (Wiegand, Huth & Martínez 2009) with parameters μ = 8 (i.e. an average of eight points per cluster) and σ = 5 m (i.e. 95% of all points were located closer than distance 2σ = 10 m from the cluster centre) (Fig. 1b). This assembly rule corresponds to situations where the spatial pattern of the community is only determined by dispersal limitation, but not by habitat association or interspecific interactions (scenario c2 in Table 1). As a consequence, these communities contain no spatial phylogenetic structure.
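A Thomas process for a single species can be sketched as follows. This is an illustration under stated assumptions, not the simulation code used in the study: the number of cluster centres is fixed rather than Poisson distributed, offspring that fall outside the plot are wrapped around its edges, and the helper names are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_thomas(n_parents, mu=8.0, sigma=5.0, side=300.0, seed=None):
    """Return (x, y) offspring positions of a Thomas cluster process.

    Each of n_parents cluster centres is placed uniformly at random and
    receives a Poisson(mu) number of offspring, each displaced from the
    centre by independent Gaussian offsets with standard deviation sigma.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_parents):
        cx, cy = rng.uniform(0.0, side), rng.uniform(0.0, side)
        for _ in range(poisson_draw(rng, mu)):
            # Assumed edge treatment: wrap displaced points back into the plot.
            x = (cx + rng.gauss(0.0, sigma)) % side
            y = (cy + rng.gauss(0.0, sigma)) % side
            points.append((x, y))
    return points


one_species = simulate_thomas(n_parents=150, seed=2)
```

Running the function once per species with independent seeds and pooling the results gives a community with intraspecific clustering but independently placed species.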

(3) Habitat association. In this case, the spatial structure of the simulated communities is only driven by habitat association. For simplicity, we assumed that the heterogeneous environment was characterized by one variable v(x) generated by a sine function along the x axis with a period of 32π m ≈ 100 m (Fig. 1c). Next, we assigned each species s an optimal niche value ns that was drawn from a uniform distribution between −1 and 1. The intensity function λs(x, y) of species s was given by λs(x, y) = λ (1 + sin(−nsπ π*x/100)), where λ = Ns/(300 × 300) is the density of species s in the study plot and Ns the abundance of species s. The distribution of each species s was generated by a heterogeneous Poisson process based on λs(x, y) (Wiegand & Moloney 2004). Thus, two species with Δns = 0 (Δns is the difference between the optimal niche values of the two species) have identical niches, and with Δns = 1 or −1, they have the most dissimilar niches. Because the optimal niche value of each species was not related to the phylogeny of the simulated community, the expected phylogenetic spatial structure was random at all spatial scales (scenario c3 in Table 1).
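A heterogeneous Poisson pattern of this kind can be generated by thinning, sketched below in Python. The relative intensity in `sine_habitat` is an assumed stand-in: a sine wave along x with a period of 32π m and a phase shift of nsπ, which is not necessarily the exact expression used in the study. The abundance is fixed in advance, and all function names are hypothetical.

```python
import math
import random


def thin_to_intensity(n_target, rel_intensity, max_rel, side=300.0, seed=None):
    """Rejection sampling of points whose density is proportional to rel_intensity.

    A uniform candidate (x, y) is kept with probability
    rel_intensity(x, y) / max_rel, until n_target points are accepted.
    """
    rng = random.Random(seed)
    points = []
    while len(points) < n_target:
        x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
        if rng.random() * max_rel < rel_intensity(x, y):
            points.append((x, y))
    return points


def sine_habitat(niche, period=32.0 * math.pi):
    """ASSUMED relative intensity: 1 + sin(2*pi*x/period - niche*pi)."""
    def rel(x, y):
        return 1.0 + math.sin(2.0 * math.pi * x / period - niche * math.pi)
    return rel


# One species with optimal niche value 0; the maximum of 1 + sin(.) is 2.
pts = thin_to_intensity(1000, sine_habitat(0.0), max_rel=2.0, seed=3)
```

Under this assumed form, two species whose niche values differ by 1 have intensity waves in antiphase, which matches the statement that Δns = 1 or −1 gives the most dissimilar niches.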

To generate stem-mapped communities with phylogenetic spatial structure caused by habitat association, we required the niche difference Δns between two species to be highly correlated (R2adj > 0·95) with their phylogenetic relatedness. Because the period of the habitat is 32π m, we expect in this case significant phylogenetic clustering within 25 m (scenario c5 in Table 1).

(4) Competition. Communities driven by competition were based on intraspecific and interspecific interactions and were simulated by a multitype Strauss point process described in Geyer & Møller (1994). Direct competition was limited to plants that were located closer than 5 m. We assigned each pair a, b of species an index of ecological similarity sim(a, b) that was randomly drawn from a uniform distribution between 0 and 1. Thus, phylogenetic spatial structure of these communities was expected to be random at all spatial scales (scenario c4 in Table 1).

To generate communities with phylogenetic spatial structure driven by competition, we positively correlated (R2adj > 0·95) the strength of competition between two species in the multitype Strauss point process with their phylogenetic relatedness. Thus, more similar species tended to locally exclude each other, and as a consequence, phylogenetic evenness was expected to occur for plants located at distances below 5 m (i.e. the range of direct competition) (scenario c6 in Table 1). However, we may frequently find cases where three plants are arranged linearly as ABC and the distance between A and C is just outside the range of competition (say 6–10 m). Because AB and BC will show large phylogenetic distances (due to competition), the pair AC may show a small phylogenetic distance. Therefore, we may find a tendency towards phylogenetic clustering just outside the range of competition, which will smoothly disappear with increasing spatial distance (say at twice the range of competition) (scenario c6 in Table 1).

An example of a real community

We applied the phylogenetic mark correlation function to data of the 24-ha Gutianshan (GTS) subtropical forest plot, China (see Legendre et al. 2009 for the description of this plot). We used the data of 17 707 living, large individuals (DBH > 10 cm), belonging to 107 species, of the first census. Phylogenies of the tree community in the GTS plot were constructed using three DNA sequence regions (rbcL, matK and trnH-psbA) in Gang et al. (2012). The phylogeny used in our analysis is shown in Fig. S2 in the Appendix S1.

Definition of the phylogenetic mark correlation function

Mark correlation functions allow testing if marks (e.g. size of a tree) of a point pattern (e.g. trees) are spatially correlated, conditional on the spatial locations of the points (Illian et al. 2008). A mark correlation function kt(r) yields the expectation of a test function t(mi, mj) involving the marks mi and mj of two points i and j, taken over all pairs of points that are distance r apart and is normalized with the expectation of the test function t(mi, mj) taken over all pairs of points regardless of their distances.

The phylogenetic mark correlation function kd(r) evaluates only heterospecific pairs of individuals and uses the phylogenetic distance d(a, b) (or a functional distance) between two species a and b as test function. It therefore yields the expected phylogenetic distance of two heterospecific individuals separated by spatial distance r and is normalized with the expected phylogenetic distance cd of two heterospecific individuals taken randomly from the plot. More formally, the kd(r) can be estimated as:

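\[
k_d(r) = \frac{1}{c_d}\,
\frac{\sum_{i=1}^{N}\sum_{j \neq i} I(sp_i \neq sp_j)\, d(sp_i, sp_j)\, \kappa_h(\lVert x_i - x_j \rVert - r)}
     {\sum_{i=1}^{N}\sum_{j \neq i} I(sp_i \neq sp_j)\, \kappa_h(\lVert x_i - x_j \rVert - r)}
\qquad \text{(eqn 1)}
\]

where the indicator function I(spi ≠ spj) yields one if the individuals i and j belong to different species (i.e. spi ≠ spj) and zero otherwise, d(spi, spj) is the phylogenetic (or functional) distance between individuals i and j of species spi and spj, respectively, the kernel function κh(‖xi − xj‖ − r) yields one if the individuals i and j, located at xi and xj, are separated by a spatial distance within r ± h/2 (h is the bandwidth of the kernel function κ) and zero otherwise, and N is the total number of individuals in the surveyed area. The normalization constant cd represents the overall phylogenetic community structure and is estimated as:

\[
c_d = \frac{\sum_{a=1}^{S}\sum_{b \neq a} N_a N_b\, d(a, b)}
           {\sum_{a=1}^{S}\sum_{b \neq a} N_a N_b}
\]

where S is the total number of species in the plot, and Ns is the abundance of species s (here s = a, b).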
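The estimator described above can be sketched in plain Python. This is a didactic, quadratic-time illustration and not the C implementation used in the study; it applies no edge correction, uses the box kernel of bandwidth h, and assumes that the community contains at least two species. The function names and the toy data are hypothetical.

```python
import math


def normalisation_constant(abundance, dist):
    """Expected distance c_d of two heterospecifics drawn at random from the plot."""
    num = 0.0
    den = 0.0
    for a in abundance:
        for b in abundance:
            if a != b:
                weight = abundance[a] * abundance[b]
                num += weight * dist[a][b]
                den += weight
    return num / den


def phylo_mark_correlation(points, dist, r, h):
    """Estimate k_d(r) from (x, y, species) tuples and a nested distance dict."""
    abundance = {}
    for _, _, sp in points:
        abundance[sp] = abundance.get(sp, 0) + 1
    c_d = normalisation_constant(abundance, dist)

    total, pairs = 0.0, 0
    for i in range(len(points)):
        xi, yi, si = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, sj = points[j]
            if si == sj:
                continue  # only heterospecific pairs are evaluated
            # Box kernel: keep pairs separated by r plus or minus h/2.
            if abs(math.hypot(xi - xj, yi - yj) - r) <= h / 2.0:
                total += dist[si][sj]
                pairs += 1
    if pairs == 0:
        return float("nan")  # no heterospecific pair in this distance class
    return (total / pairs) / c_d


# Toy data: three species, four individuals.
dist = {"A": {"A": 0, "B": 2, "C": 4},
        "B": {"A": 2, "B": 0, "C": 6},
        "C": {"A": 4, "B": 6, "C": 0}}
points = [(0, 0, "A"), (1, 0, "B"), (0, 1, "C"), (5, 5, "A")]
k = phylo_mark_correlation(points, dist, r=1.0, h=0.5)
```

In the toy data, the two heterospecific pairs at distance 1 have a mean phylogenetic distance of 3, whereas cd is 3.6, so the estimate falls below one: neighbours are more closely related than a random heterospecific pair.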

Link to indices of beta diversity

Equation 1 can be rewritten to represent the kd(r) as the (normalized) ratio of two quantities representing phylobetadiversity βphy(r) and species beta diversity βS(r) (see Appendix S2 of Supporting Information):

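\[
k_d(r) = \frac{1}{c_d}\,\frac{\beta_{phy}(r)}{\beta_S(r)}
       = \frac{\beta_{phy}(r) / \beta^{*}_{phy}}{\beta_S(r) / \beta^{*}_S}
\qquad \text{(eqn 2)}
\]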

where βS(r) is the spatially explicit Simpson index that measures species betadiversity (Shimatani 2001; Chave & Leigh 2002), β*S is the classical Simpson index (Simpson 1949), the βphy(r) is a phylogenetic extension of βS(r) which measures phylobetadiversity, the β*phy is the phylogenetic analogue of β*S, and cd=β*phy/β*S.

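First, the spatially explicit Simpson index βS(r) is a mark correlation function with test function I(spi ≠ spj) that can be estimated as:

\[
\beta_S(r) = \frac{\sum_{i=1}^{N}\sum_{j \neq i} I(sp_i \neq sp_j)\, \kappa_h(\lVert x_i - x_j \rVert - r)}
                  {\sum_{i=1}^{N}\sum_{j \neq i} \kappa_h(\lVert x_i - x_j \rVert - r)}
\qquad \text{(eqn 3)}
\]

where the summation goes over all pairs of individuals i and j (with i ≠ j) and κh is the kernel function of eqn 1. Thus, βS(r) yields the probability that two arbitrarily chosen individuals a distance r apart are heterospecifics and quantifies species turnover.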

Secondly, we generalized the spatially explicit Simpson index βS(r) to yield an index βphy(r) of spatial phylobetadiversity (Graham & Fine 2008). This can be done by replacing the binary test function I(spi ≠ spj) in βS(r) (eqn 3) with a continuous measure d(spi, spj) of phylogenetic (or functional) distance between two species spi and spj (Hardy & Senterre 2007). The ‘spatially explicit phylogenetic Simpson index’ βphy(r) is therefore the mean phylogenetic distance of all pairs of individuals a distance r apart and quantifies phylobetadiversity (Graham & Fine 2008). The β*phy is the phylogenetic analogue to β*S and yields the mean pairwise phylogenetic distances (MPD) of all individuals in the plot (i.e. the index DP of Hardy & Senterre 2007).

Because βS(r) and βphy(r) quantify spatial species turnover and spatial phylogenetic turnover, respectively, the phylogenetic mark correlation function (eqn 1) can be interpreted as a measure of spatial phylogenetic turnover relative to the spatial species turnover (eqn 2). The constant cd = β*phy/β*S in eqns 1 and 2 yields the MPD of all heterospecific individuals in the plot and normalizes the phylogenetic mark correlation function kd(r) to a value of one if there is no spatial phylogenetic signal in the data (i.e. phylogenetic turnover is perfectly correlated with species turnover; Fig. S3 in Appendix S2). However, if individuals separated in space by distance r are more closely related than expected by species turnover, we have kd(r) < 1 (i.e. phylogenetic clustering), and if they are more distantly related than expected, we have kd(r) > 1 (i.e. phylogenetic evenness).
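The link between the two forms of kd(r) can be made explicit in a few lines. The sketch below assumes that the phylogenetic distance between conspecifics is zero, d(a, a) = 0, and uses w_ij(r) as shorthand (a notation introduced here only for brevity) for the kernel weight of the pair i, j at distance r.

```latex
\begin{align}
\beta_S(r)     &= \frac{\sum_{i \neq j} I(sp_i \neq sp_j)\, w_{ij}(r)}{\sum_{i \neq j} w_{ij}(r)},
\qquad
\beta_{phy}(r)  = \frac{\sum_{i \neq j} d(sp_i, sp_j)\, w_{ij}(r)}{\sum_{i \neq j} w_{ij}(r)} \\[4pt]
\frac{\beta_{phy}(r)}{\beta_S(r)}
               &= \frac{\sum_{i \neq j} d(sp_i, sp_j)\, w_{ij}(r)}{\sum_{i \neq j} I(sp_i \neq sp_j)\, w_{ij}(r)}
                = \frac{\sum_{i \neq j} I(sp_i \neq sp_j)\, d(sp_i, sp_j)\, w_{ij}(r)}{\sum_{i \neq j} I(sp_i \neq sp_j)\, w_{ij}(r)} \\[4pt]
k_d(r)         &= \frac{1}{c_d}\,\frac{\beta_{phy}(r)}{\beta_S(r)},
\qquad c_d = \frac{\beta^{*}_{phy}}{\beta^{*}_S}
\end{align}
```

The second equality in the middle line holds because conspecific pairs contribute d = 0 to the numerator, so multiplying by the indicator changes nothing. The common denominator of βphy(r) and βS(r) cancels, which leaves the mean phylogenetic distance of heterospecific pairs at distance r.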

Null models of random phylogenetic spatial structure

The selection of the null model is dependent on the objective of the study and the summary statistic used. In our study, the primary objective is to detect small-scale spatial phylogenetic structures such as phylogenetic evenness or clustering, generated by phylogenetically related processes. Thus, we need to test the null hypothesis of a community without phylogenetic spatial structure. This can be achieved by contrasting the observed data to null communities generated by the species shuffle null model that randomly shuffles the species labels over the species present in the community phylogeny (Fine & Kembel 2011; Swenson et al. 2012). Because this randomization of the phylogenetic distance matrix d(a, b) breaks down the actual phylogenetic relationships among species (while keeping the matrix elements intact), the resulting kd(r) functions are representative of random spatial phylogenetic structure, conforming to the null hypothesis to be tested (Hardy & Senterre 2007). Note that this null model constrains all potentially confounding properties of the data and is particularly powerful for studies of beta diversity, because it fixes all observed spatial patterns except spatial phylogenetic structure (Hardy & Senterre 2007; Swenson et al. 2012).
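The species shuffle can be sketched as a relabelling of the distance matrix. The example assumes that the phylogenetic distances are stored as a nested dictionary keyed by species label; the function name and the toy matrix are hypothetical.

```python
import random


def shuffle_species_labels(dist, rng):
    """Return a copy of the distance matrix with species labels permuted.

    The set of matrix elements is unchanged; only the assignment of
    species to the tips of the phylogeny is randomized.
    """
    species = sorted(dist)
    permuted = species[:]
    rng.shuffle(permuted)
    relabel = dict(zip(species, permuted))
    return {a: {b: dist[relabel[a]][relabel[b]] for b in species} for a in species}


dist = {"A": {"A": 0, "B": 2, "C": 4},
        "B": {"A": 2, "B": 0, "C": 6},
        "C": {"A": 4, "B": 6, "C": 0}}
null_dist = shuffle_species_labels(dist, random.Random(11))
```

Because only the distance matrix is permuted, the locations and the species identities of all individuals stay fixed, so every spatial pattern of the observed community is retained in each null realization.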

Significance test

We generated for each simulated community 999 Monte Carlo realizations of the species shuffle null model. To test departure of the observed kd(r) from that of the null model, we used the 25th largest and smallest values of kd(r)null as approximate 95% simulation envelopes of kd(r) at spatial distance r. If the observed kd(r) was outside of the simulation envelopes, the community showed a significant departure from the null model at spatial distance r. If a community contains no phylogenetic spatial structure, the expectation yields kd(r) = 1 and we expect kd(r) < 1 for phylogenetic clustering and kd(r) > 1 for phylogenetic evenness.
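The envelope test at a single spatial distance r can be sketched as follows. The function names are hypothetical, and the evenly spaced null values in the example merely stand in for 999 simulated kd(r) values.

```python
def simulation_envelope(null_values, rank=25):
    """Return the rank-th smallest and rank-th largest null values."""
    ordered = sorted(null_values)
    return ordered[rank - 1], ordered[-rank]


def classify(observed, null_values, rank=25):
    """Label the observed k_d(r) relative to the simulation envelope."""
    lower, upper = simulation_envelope(null_values, rank)
    if observed > upper:
        return "evenness"    # more distantly related than under the null model
    if observed < lower:
        return "clustering"  # more closely related than under the null model
    return "random"


# Stand-in for 999 null realizations at one distance r.
null_values = [i / 1000 for i in range(1, 1000)]
lower, upper = simulation_envelope(null_values)
```

With 999 null realizations, the observed value and the null values together form 1000 values, so using the 25th smallest and 25th largest null values leaves 2 × 25/1000 = 5% outside the envelope.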

Calculations of type I and type II errors

For each of the six types of simulated communities (Table 1), we generated 999 replicates and calculated the type I and type II error rates for the phylogenetic mark correlation function at different spatial scales. Type I and II errors were calculated as the percentage of cases (out of the 999 realizations) where a metric rejected a true hypothesis (according to Table 1) or accepted a false hypothesis, respectively. For communities simulated with the competition assembly rule, type II and I errors were calculated before 5 m and after 10 m, respectively (the expected errors between 5 and 10 m cannot be calculated accurately because this is a transition zone according to our competition setting).
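As a small illustration of this bookkeeping, the sketch below assumes that each replicate has been reduced to a Boolean flag stating whether the null model was rejected at a given distance r. The function names and the example counts are hypothetical and are not results of the study.

```python
def type_i_error(rejected):
    """Share of replicates WITHOUT a signal in which the null model was rejected."""
    return sum(rejected) / len(rejected)


def type_ii_error(rejected):
    """Share of replicates WITH a signal in which the null model was not rejected."""
    return 1.0 - sum(rejected) / len(rejected)


# Hypothetical flags for 999 replicates at one spatial distance.
no_signal = [True] * 50 + [False] * 949
with_signal = [True] * 990 + [False] * 9
alpha = type_i_error(no_signal)
beta = type_ii_error(with_signal)
```

Here a type II error is counted whenever the null model is not rejected; whether a rejection in the wrong direction should also count as an error is a design choice that this sketch leaves open.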

Comparison with other metrics

We compared the performance of the phylogenetic mark correlation function with two metrics of phylobetadiversity that have been applied to fully stem-mapped plots, the abundance-weighted mean nearest phylogenetic dissimilarity, D'nn(r), and the abundance-weighted mean pairwise phylogenetic dissimilarity, D'pw(r). Both are estimated between local communities in two small quadrats that are distance r apart (see detailed explanation of D'nn and D'pw in Appendix S3 and Swenson (2011) and Swenson et al. (2012)). Each simulated community was divided into 10 000 non-overlapping 3 × 3 m quadrats, but 1 × 1 m quadrats were used to evaluate the small-scale performance of the conventional metrics for the communities in scenario c6 (Table 1). Significance of phylogenetic spatial structure was also tested by the species shuffle null model.

Implementation

The algorithm for calculating the phylogenetic mark correlation function and the significance test under the species shuffle null model were implemented in C and embedded into the R environment (see https://github.com/guochunshen/sce) and in the software Programita (Wiegand & Moloney 2004). The software can be requested from the first two authors.

Results

Type I and type II errors

The phylogenetic mark correlation function kd(r), when teamed with the species shuffle null model, performed well under the different communities without phylogenetic spatial structure. For all assembly rules, it yielded the expected type I error (around 0·05) at all spatial scales (blue lines in Fig. 2). In comparison, D'nn and D'pw yielded on average relatively large type I errors (Fig. 2).

Figure 2.

Type I errors estimated for the phylogenetic mark correlation function kd(r) (blue lines), the abundance-weighted mean nearest phylogenetic dissimilarity, D'nn(r) (green lines), and the abundance-weighted mean pairwise phylogenetic dissimilarity, D'pw(r) (red lines) under the species shuffle null model across spatial scale. To assess type I error, we used simulated communities in scenarios c1–c4 in Table 1. A suitable metric should detect a significant effect in only approximately 5% of all cases (i.e. a 5% error; dashed black horizontal line).

The phylogenetic mark correlation function kd(r) yielded type II error rates below 5% (Fig. 3a) for communities where habitat associations were correlated with phylogenetic relatedness (i.e. closely related species tended to occur in similar habitats; scenario c5 in Table 1). The D'nn and D'pw performed well at small scales, but showed larger type II error rates at scales above 20 m (Fig. 3a). The kd(r) correctly revealed the signal of competition at scales of 1–5 m (Fig. 3b), but D'nn and D'pw had high type II error rates within 5 m (Fig. 3b). The kd(r) correctly identified the absence of competition at distances larger than twice the 5 m competition range, but D'nn and D'pw produced higher type I errors at distances above 10 m (Fig. 3c).

Figure 3.

Type II errors as a function of spatial scale estimated for the phylogenetic mark correlation function kd(r) (blue lines), the D'nn(r) (green lines) and the D'pw(r) (red lines) under the species shuffle null model. To assess type II error, we used simulated communities where phylogenetic relatedness was correlated with the niche (panel a) or the pairwise competition strength (panels b and c). A suitable metric should detect departures from the null model in most cases (e.g. approximately 95% of cases or 0·05 type II error). Because competition reached only up to 5 m, we calculated in scenario c6 type II error for distances < 5 m (panel b), but type I error at distances > 10 m (panel c).

Impact of a phylogenetic signal in species abundance and the topology of the underlying phylogeny on type I and type II errors

The strength of phylogenetic signal in species abundance (see Blomberg's K-statistic and its P-value in Appendix S1) and the topology of the phylogeny (see the skewness and kurtosis of the phylogenetic distances among tips in the simulated community in Appendix S1) did not influence the type I and type II error rates of the phylogenetic mark correlation function across spatial scales (Figs 2–4). However, the performances of D'nn and D'pw were influenced by the topology (skewness and kurtosis) of the phylogeny (Fig. 4). The error rates of D'nn and D'pw increased substantially with increasing strength of the phylogenetic signal in species abundance (Fig. 4).

Figure 4.

Means (points) and standard deviations (vertical bars) of type I and II error rates of the three phylobetadiversity metrics under different groups of phylogenetic signal in species abundance, kurtosis and skewness of the phylogeny of the simulated communities. The strength of the phylogenetic signal in species abundance was quantified by Blomberg's K-statistic.

Phylogenetic mark correlation function for the simulated and real communities

To illustrate the behaviour and interpretation of the phylogenetic mark correlation function, we show in Fig. 5 the results for simulated fully stem-mapped communities and the GTS plot. The fully stem-mapped example community assembled by independent clustering showed strong spatial structure as expected by dispersal limitation (Fig. 1b), but no spatial phylogenetic structure was expected. As expected, the phylogenetic mark correlation function kd(r) was completely located within the simulation envelope of the species shuffle null model (Fig. 5a).

Figure 5.

Shapes of the phylogenetic mark correlation function (red lines) in simulated communities assembled by independent clustering without phylogenetic signal (panel a; c2 in Table 1), habitat association with a phylogenetic signal (panel b; c5 in Table 1, sine-wave environment with a period of 10π along the x axis), competition with a phylogenetic signal (panel c; c6 in Table 1, competition occurred within 5 m distance) and the data from the GTS forest plot (panel d). Spatial distances in bottom panels were log-transformed. The grey area represents the 95% simulation envelopes under the species shuffle null model. Phylogenetic evenness or clustering is indicated where the phylogenetic mark correlation function falls above or below the simulation envelopes, respectively. Vertical dashed lines are the reference lines at distances of 5nπ (n = 1, 2, …) (panel b) and of 5 and 10 m (panel c).

The habitat-driven community showed both spatial and phylogenetic structure because phylogenetic distance and habitat preference of species were correlated (Fig. 1c). The kd(r) recovered the spatial scales of the phylogenetic structure imprinted by the periodic habitat association (Fig. 5b; the period of the sine-like habitat was 10π in this example). It revealed that individuals at spatial distances 10nπ (n = 1, 2, …) were phylogenetically more similar than expected (they showed maximal niche overlap, Fig. 5b) and more dissimilar than expected at spatial distances (1 + 2n)5π (n = 0, 1, …) (they showed minimal niche overlap, Fig. 5b).

In the community driven by intraspecific and interspecific competition (Fig. 1d), phylogenetic distances between species were negatively correlated with the strength of competition between the two species (e.g. Metz, Sousa & Valencia 2010); thus, nearby individuals had a larger phylogenetic distance than expected. Indeed, Fig. 5c shows that the phylogenetic mark correlation function yielded values above the simulation envelope (i.e. phylogenetic evenness) at spatial distances within the competition range (<5 m) and values within the simulation envelope for individuals located more than 10 m apart. For individuals located in the transition zone between 5 and 10 m, however, the phylogenetic distance was smaller than expected. This is the spatial correlation effect mentioned previously. The results of the phylogenetic mark correlation function were highly consistent among communities created by the same assembly rules (Fig. S4 in Appendix S4).

Finally, the phylogenetic mark correlation function of the real fully stem-mapped GTS tree community revealed that two heterospecific individuals (DBH > 10 cm) located at spatial distances <90 m were on average less related than expected under the species shuffle null model (Fig. 5d).

Discussion

Analysis of the ecological and evolutionary similarity of co-occurring species has increasingly been used to determine the processes underlying the diversity and assembly of communities (Swenson 2013). Here, we integrated previous work on phylobetadiversity (Hardy & Senterre 2007; Graham & Fine 2008; Swenson et al. 2012) with marked point pattern analysis (Schlather 2001; Illian et al. 2008) to yield a framework that is especially adapted to data sets of fully stem-mapped plots that include the exact position of all individuals of a community (e.g. Condit 1998).

The phylogenetic mark correlation function kd(r) that we present measures phylogenetic turnover at spatial distance r relative to the corresponding species turnover. The kd(r), together with the species shuffle null model, performed consistently well in a wide range of situations. Because the species shuffle null model randomizes only the phylogenetic spatial structure while conditioning on all spatial structures (Swenson et al. 2012), our analyses factored out the potentially confounding effects of spatial structures (e.g. caused by dispersal limitation) that were independent of phylogenetic relatedness (i.e. scenarios c3 and c4 in Table 1). Importantly, the kd(r) precisely detected the effect of processes that generated scale-dependent phylogenetic structure (i.e. scenarios c5 and c6 in Table 1). Additionally, our method is explicitly designed to allow trait and phylogenetic structure to be analysed in the same, and hence directly comparable, framework (see Baraloto et al. 2012 for a common framework for community-wide phylogenetic and functional structure). The distance matrix that defines the distances between species in the kd(r) can be based equally on phylogenetic or functional distance. These features make the metric kd(r) a powerful tool for revealing scale-dependent phylogenetic or functional spatial structures in fully stem-mapped plant communities and should substantially enhance our ability to infer ecological processes that are expected to imprint phylogenetic or functional signals at different spatial scales (Webb et al. 2002; Graham & Fine 2008; Swenson et al. 2012).
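The verbal definition of kd(r) and the species shuffle null model can be made concrete with a minimal Python sketch. This is an illustration under stated assumptions, not the implementation used in the paper: the function names, the ring half-width `bandwidth`, the absence of any edge correction and the use of simple 2.5% and 97.5% percentiles for the 95% envelope are all choices made here for brevity. The species shuffle is implemented as a random permutation of species over the tips of the phylogeny, which leaves all stem positions and species labels untouched.

```python
import numpy as np

def kd(xy, species, phylo_dist, r, bandwidth):
    """Naive estimate of kd(r) at one spatial distance r (no edge correction).

    xy         : (n, 2) array of stem coordinates
    species    : (n,) integer array, index of each stem's species in phylo_dist
    phylo_dist : (S, S) symmetric matrix of phylogenetic or functional distances
    bandwidth  : half-width of the distance ring around r (an assumption here)
    """
    diff = xy[:, None, :] - xy[None, :, :]
    spatial = np.sqrt((diff ** 2).sum(axis=-1))             # pairwise spatial distances
    marks = phylo_dist[species[:, None], species[None, :]]  # pairwise phylogenetic distances
    hetero = species[:, None] != species[None, :]           # heterospecific pairs only
    near_r = hetero & (np.abs(spatial - r) <= bandwidth)    # pairs separated by about r
    if not near_r.any():
        return np.nan
    c_d = marks[hetero].mean()                              # normalization constant cd
    return marks[near_r].mean() / c_d

def species_shuffle_envelope(xy, species, phylo_dist, r, bandwidth,
                             n_sim=199, seed=0):
    """Percentile envelope of kd(r) under a species shuffle null model.

    Species are permuted over the tips of the phylogeny, so every spatial
    structure in the stem map is kept and only the link between species
    identity and phylogenetic position is randomized.
    """
    rng = np.random.default_rng(seed)
    n_species = phylo_dist.shape[0]
    sims = np.empty(n_sim)
    for k in range(n_sim):
        perm = rng.permutation(n_species)
        shuffled = phylo_dist[np.ix_(perm, perm)]
        sims[k] = kd(xy, species, shuffled, r, bandwidth)
    return np.percentile(sims, [2.5, 97.5])
```

In this sketch, an observed kd(r) above the upper envelope bound would be read as phylogenetic evenness at distance r and a value below the lower bound as phylogenetic clustering. Passing a functional distance matrix instead of `phylo_dist` gives the trait-based analogue without any other change.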

An interesting feature of the kd(r) is that it allows for an analysis of the correlation between spatial and phylogenetic distances of individuals, independent of the overall phylogenetic community structure. This has two consequences. First, this separation of scales between non-spatial effects on the plot scale and local spatial effects allows for a specific and unbiased assessment of phylogenetic signals that directly influence the locations of individuals. Secondly, because the kd(r) is independent of the overall phylogenetic community structure, its power is not affected by a phylogenetic signal in species abundance or by the topology of the phylogeny, both of which confounded most of the conventional phylogenetic metrics (Kraft et al. 2007; Hardy 2008). Our results in Figs 2-4 confirmed this expectation from the conditional property of kd(r). The mathematical reason for this is that the kd(r) is normalized with the term cd (i.e. the mean pairwise phylogenetic distance of all heterospecific individuals in the plot, in eq. 1), which controls the influence of the overall phylogenetic community structure. As a consequence, the kd(r) is exclusively focused on the correlation between spatial and phylogenetic distances of individuals. In contrast, the two measures of phylobetadiversity D'nn and D'pw mix up phylogenetic patterns caused by the spatial arrangement of individuals and by the overall phylogenetic community structure. As a consequence, a phylogenetic signal in species abundance and the topology of the phylogeny influence the power of D'nn and D'pw under the species shuffle null model.

Because a principal motivation in the development of the phylogenetic mark correlation function is its application to species-rich communities, future work should test the performance of the method for species-rich communities. Systematic variation in local diversity (e.g. due to habitat structuring) may potentially affect the statistical properties of our methods. However, restricting the analysis to a single habitat (e.g. Kembel & Hubbell 2006) can remove much of this effect (T. Wiegand unpublished analysis). The highly consistent performance of the phylogenetic mark correlation function for the data sets presented here suggests that it may perform equally well for more complicated communities.

Finally, although the phylogenetic mark correlation function has an outstanding ability to detect spatial phylogenetic structures, caution is needed in interpreting these patterns because many processes can generate similar spatial phylogenetic structure in ecological communities (Losos 2008). Once spatial structures are correctly detected, more complex rules for simulating null communities in concert with field experiments are required to make further inference on the underlying processes.

Conclusions

The proposed new method has broad ramifications. It provides a general framework for developing other metrics for analysis of the correlation between the spatial distance of individuals and their phylogenetic or functional distance. For example, we can define a cumulative mark correlation function using all pairs of individuals with spatial distances < r instead of spatial distances ≈ r. This cumulative metric quantifies the expected phylogenetic distance of two heterospecifics separated by spatial distances smaller than r, normalized with the expected phylogenetic distance cd of two heterospecifics taken randomly from the plot. Additionally, a wide array of analyses is possible by selecting subsets of the full community for the pairs of individuals analysed. First, we can conduct species-centred analyses where the first individual of the pair is taken from a given focal species and the second individual is selected from all other species in the community. This analysis supplements the community-wide analysis and reveals which species drive the observed community-wide spatial phylogenetic structures and whether all species show the same or opposing patterns. Secondly, we may analyse phylogenetic spatial structures between different life stages such as juveniles and adult trees in the community by restricting the focal tree to adult trees and the second tree to juvenile trees. Finally, we can find out whether dead trees (taken as the first individual of the pair) and surviving trees (taken as the second individual of the pair) show spatial phylogenetic structure; for example, dead trees may be surrounded by more phylogenetically similar species than expected (Metz, Sousa & Valencia 2010). In this case, the appropriate null model is to conduct ‘random labelling’ (Wiegand & Moloney 2004) where the mark ‘dead’ is randomly re-allocated over all surviving and dead individuals in the data set analysed.
Adopting the framework of phylogenetically marked point patterns will thus allow ecologists to keep up with the increasingly available data of fully stem-mapped plots, species traits and community phylogenies to empower the inference on processes that shape community assemblages.
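The cumulative variant and the subset analyses outlined in these Conclusions could be sketched in Python as follows. The sketch is hypothetical throughout: the function names and the boolean masks `first` and `second` are introduced only for illustration, no edge correction is applied, and in the subset case the normalization is assumed to be the mean phylogenetic distance over all heterospecific first-second pairs, which the text above does not specify. The random labelling step re-allocates the mark ‘dead’ over all individuals while positions and species stay fixed.

```python
import numpy as np

def cumulative_kd(xy, species, phylo_dist, r, first=None, second=None):
    """Cumulative analogue of kd(r): all heterospecific pairs closer than r.

    first, second : optional boolean masks choosing which stems may serve as
                    the first and second individual of a pair (e.g. dead and
                    surviving trees); by default every stem is eligible.
    """
    n = len(species)
    first = np.ones(n, dtype=bool) if first is None else np.asarray(first)
    second = np.ones(n, dtype=bool) if second is None else np.asarray(second)
    diff = xy[:, None, :] - xy[None, :, :]
    spatial = np.sqrt((diff ** 2).sum(axis=-1))
    marks = phylo_dist[species[:, None], species[None, :]]
    hetero = species[:, None] != species[None, :]
    pairs = hetero & first[:, None] & second[None, :]   # eligible ordered pairs
    within_r = pairs & (spatial < r)                    # spatial distance < r
    if not within_r.any():
        return np.nan
    # Assumed normalization: mean distance over all eligible pairs in the plot.
    return marks[within_r].mean() / marks[pairs].mean()

def random_labelling_envelope(xy, species, phylo_dist, dead, r,
                              n_sim=199, seed=0):
    """Percentile envelope when the mark 'dead' is shuffled over all stems."""
    rng = np.random.default_rng(seed)
    dead = np.asarray(dead, dtype=bool)
    sims = np.empty(n_sim)
    for k in range(n_sim):
        shuffled = rng.permutation(dead)
        sims[k] = cumulative_kd(xy, species, phylo_dist, r,
                                first=shuffled, second=~shuffled)
    return np.nanpercentile(sims, [2.5, 97.5])
```

A species-centred analysis follows the same pattern with `first = (species == focal)` and `second = (species != focal)`, and an adult-juvenile analysis with masks derived from a size threshold of the user's choosing.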

Acknowledgments

This work was supported by Sun Yat-sen University and NSERC (Canada) to FH, the ERC advanced Grant 233066 to TW, the NSFC 31170401 to XC and the NSFC 31100309 to GS. We thank Jinlong Zhang for preparing the molecular phylogeny of the GTS plot. We also thank Luke Harmon and five referees for their constructive suggestions and comments.

Ancillary