Bayesian change-point analyses in ecology


Author for correspondence: Brian Beckage Tel: +1 802 6560197 Fax: +1 802 6560440 Email:


  • Ecological and biological processes can change from one state to another once a threshold has been crossed in space or time. Threshold responses to incremental changes in underlying variables can characterize diverse processes from climate change to the desertification of arid lands from overgrazing.
  • Simultaneously estimating the location of thresholds and associated ecological parameters can be difficult: ecological data are often ‘noisy’, which makes the locations of ecological thresholds hard to identify.
  • We illustrate this problem using two ecological examples and apply a class of statistical models well-suited to addressing this problem. We first consider the case of estimating allometric relationships between tree diameter and height when the trees have distinctly different growth modes across life-history stages. We next estimate the effects of canopy gaps and dense understory vegetation on tree recruitment in transects that traverse both canopy and gap conditions.
  • The Bayesian change-point models that we present estimate both threshold locations and the slope or level of ecological quantities of interest, while incorporating uncertainty in the change-point location into these estimates. This class of models is suitable for problems with multiple thresholds and can account for spatial or temporal autocorrelation.


Anthropogenic climate change is likely to transform many ecological communities over the next century. The mean global temperature has risen by c. 0.6°C over the past century, and the rate of warming since 1976 has been greater than any other period during the last 1000 yr (Mann et al., 1998; Easterling et al., 2000; IPCC, 2001). Anthropogenic climate change is likely to continue at the same or an accelerated rate for the foreseeable future (Hansen et al., 2005; Meehl et al., 2005), with global temperatures predicted to rise by another 1.4–5.8°C by the year 2100 (IPCC, 2001; Wigley, 2005). Climate is an important determinant of species’ ranges: rising temperatures associated with anthropogenic greenhouse gas emissions are predicted to lead to species migration poleward or upward in elevation (Krajick, 2004). Forest composition will shift as populations of some species decline and new species become established as regional climates respond to global warming. Climate-linked range shifts have already been observed in a variety of taxa (Walther et al., 2002; Parmesan & Yohe, 2003).

Ecological systems may transition rapidly to altered states as climatic conditions cross critical thresholds, rather than slowly responding to changes in climate. Threshold responses are characteristic of diverse processes from climate change to the desertification of lands from overgrazing (van de Koppel et al., 1997; Higgins et al., 2002; Walker & Meyers, 2004). Threshold behaviors can result from nonlinear responses to incremental changes in underlying processes in which a gradual change in a process causes a disproportionate response once a critical threshold is reached (Maslin, 2004), such as might be caused by a positive feedback loop between an underlying driver and the system response (Hoffman et al., 2002; Crespi, 2004). Transitions between ecological states may also occur along spatial gradients in resource availability or disturbance frequency. Similarly, patterns of growth or allocation within individuals can shift once a critical age or size has been reached (LaDeau & Clark, 2001). In these cases, the systems display a threshold response in which the system switches states once a boundary region has been crossed in the underlying driver (e.g. resource level or age of individual). In this paper, we distinguish two threshold responses: an abrupt change in a rate of a process (Fig. 1a) or a jump in the level of a process (Fig. 1b).

Figure 1.

Conceptual model of abrupt climate change. An abrupt change can occur in the rate (a) or level (b) of a process once a threshold is crossed. An abrupt change in both the rate and level of a process could also occur. We do not specifically consider this case, but the models we present could be easily generalized to this case.

Estimating the location of thresholds in space or time and the corresponding ecosystem responses (e.g. a change in the rate or level of an ecological process; Fig. 1) can be a challenging problem that is likely to become increasingly important as ecosystems respond to anthropogenic climate change. The noisy nature of ecological data tends to obscure the identification of thresholds, introducing potentially large uncertainty into estimates of their location. The subsequent estimation of ecological quantities depends on the location of the underlying spatial or temporal threshold. A modeling approach is needed that simultaneously estimates the location of the threshold and the state of the system on either side of the threshold boundary, and that incorporates uncertainty in the threshold location into estimates of the state variables.

We illustrate this problem using two examples from our own research in forest dynamics and apply a class of models for addressing these challenges. In the first example, we estimate the allometric relationship between tree diameter and tree height in longleaf pine (Pinus palustris Mill.) for use in a forest simulation growth model. Longleaf pine shifts its allometric patterns as a function of life-history stage, producing a distinctly different relationship between tree height and diameter as a function of tree size. Our objective is to estimate the relationship between tree diameter and height across life-history stages, which may necessitate estimating the location of change-points (i.e. thresholds, abrupt changes or discontinuities in this relationship). In the second example, we estimate tree seedling recruitment along a transect from closed canopy forest through gaps in the forest canopy created by the death of overstory trees. Light is often a limiting resource in forest understories (Canham, 1988; Pacala et al., 1994) and its availability plays a central role in tree regeneration (Platt & Strong, 1989). Estimating changes in seedling densities associated with canopy and gap conditions requires identification of the transition from low to elevated light levels in the forest understory. The identification of this transition can be difficult in the absence of spatially extensive measurements of light levels because: (1) light gaps are offset towards the north in the northern hemisphere, with the degree of offset dependent on latitude, slope, and canopy height (Canham et al., 1990); and (2) the canopies of bordering trees are irregular and change with time as they grow into the light gap (Valverde & Silvertown, 1997). Our objective is to estimate the locations of the canopy to gap transition in light levels, and corresponding levels of seedling recruitment of red maple (Acer rubrum L.) associated with gap and closed canopy conditions. In this example, the analysis is further complicated by consideration of spatial correlation between adjacent measurements of seedling recruitment. Spatial correlation in seedling counts in adjacent quadrats can result from patchy soils, local seed sources or other spatially variable processes that affect seedling establishment.

We address these estimation problems using a class of statistical models referred to as change-point models. We first use a simple change-point model to estimate the allometric relationship between tree height and diameter. We next employ a hierarchical change-point model to estimate the probable transitions from canopy to gap conditions with a hierarchical dependence between multiple transects and associated seedling densities in closed canopy and gap environments. Finally, we extend this model to allow for spatial correlation between adjacent seedling counts. We fit these models using Bayesian methods.

Data and model description

Tree allometry

Platt et al. (1988) collected extensive data on the demography of longleaf pine in an old-growth stand in southern Georgia. The sampling of longleaf pine was stratified into two general size classes: stems with a diameter at breast height (d.b.h.) > 2 cm and stems with a d.b.h. < 2 cm. For the larger size class (referred to as ‘trees’ or ‘adults’), a random sample of 399 individuals was selected across the 40 ha study site, and the height and d.b.h. of each sampled stem were recorded. Individual trees ranged up to 75.4 cm d.b.h. and 244 yr of age. For the smaller size class, four 1-ha plots were randomly selected out of a total of 40 1-ha plots. All stems < 2 cm d.b.h. were censused in the four selected plots, resulting in a total of 222 juveniles. These juveniles consisted of individuals that had not yet reached breast height, c. 1.4 m, and so the diameter at the base and the height of their terminal bud were measured.

Our objective is to predict the height of longleaf pine trees as a function of their diameter. Previous studies have used power relationships or other nonlinear model forms to relate tree height to diameter and usually estimate model parameters through linear regression on log transformed scales (O’Brien et al., 1995; Colbert et al., 2002), but we model tree height using a linear model on untransformed tree height and diameter. While a single linear relationship is unlikely to be appropriate across life-history stages (e.g. seedling, sapling, adults), the relationship may be piecewise linear, meaning linear over restricted ranges of tree diameters where the transitions between these linear regions are change-points. Both the slope and the residual variance of the linear model are expected to vary among these different regions. Finally, we expect a discontinuity between the juvenile and adult size classes that reflects the offset introduced by measuring the diameter of these two size classes at different locations on the stem: juveniles are measured at their base while adults are measured at breast height. Our model must account for all these aspects of the data.

Model description  Our change-point model is a piecewise linear regression with the transition points between adjacent linear regions unknown (Fig. 2). We allow for separate slopes and variances within each linear region, and estimate the offset introduced by measurement of stem diameter at different points on the stem. All parameters were estimated simultaneously using Bayesian methodology, allowing us to quantify the uncertainty in both the change-point locations and the parameters defining the allometric relationship by means of probability distributions. In the Bayesian paradigm, before seeing the current data, our knowledge of the model parameters, including change-points, is described by a joint prior probability distribution. Once we have collected and modeled the current data through the change-point model's likelihood function, we revise our previous distributions to posterior distributions in light of this new information. These posterior distributions may then be used to make updated probability statements (i.e. inferences) about the parameters describing the allometric relationship as well as the change-point locations.

Figure 2.

Graphical illustration of our change-point model for tree allometry. The model is piecewise linear with the change-point (cp) locations defining the separate linear regions. Within these regions, β1–β3 are the slopes, σ1–σ3 represent the standard deviations, and β0 represents the intercept. We estimate a separate variance for each linear region. In addition, we estimate an ‘offset’ that represents the difference in stem diameter at the base and at breast height since stems were measured at alternative locations depending on stem height.
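The structure shown in Fig. 2 can be sketched in code. The Python functions below are a minimal illustration and not the authors' R implementation. They assume that the mean is continuous and piecewise linear with a single intercept, three slopes and two change-points, and that a breast-height diameter is placed on the basal-diameter scale by adding the offset k. The function names and all numerical values are hypothetical.

```python
def basal_diameter(dbh, k):
    """Shift a breast-height diameter to the basal scale (additive offset k is an assumption)."""
    return dbh + k


def mean_height(d, beta0, beta1, beta2, beta3, cp1, cp2):
    """Continuous piecewise-linear mean height at basal diameter d."""
    if d < cp1:
        return beta0 + beta1 * d
    if d < cp2:
        return beta0 + beta1 * cp1 + beta2 * (d - cp1)
    return beta0 + beta1 * cp1 + beta2 * (cp2 - cp1) + beta3 * (d - cp2)


def region_sd(d, sigmas, cp1, cp2):
    """Residual standard deviation of the linear region that contains d."""
    if d < cp1:
        return sigmas[0]
    if d < cp2:
        return sigmas[1]
    return sigmas[2]


# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(beta0=0.0, beta1=0.02, beta2=1.0, beta3=0.3, cp1=5.0, cp2=25.0)

# A juvenile is measured at the stem base, so its diameter is used directly.
juvenile = mean_height(2.0, **params)

# An adult is measured at breast height, so its diameter is shifted by k first.
adult = mean_height(basal_diameter(30.0, k=2.0), **params)
```

Because each segment starts where the previous one ends, the fitted line has no jumps at cp1 or cp2; the only discontinuity in the raw data comes from the measurement offset, which k absorbs.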

We modeled the height, hi, of individual i as:


(di is the diameter at the stem base; dbhi is the stem diameter at breast height; k is the offset that converts diameter at breast height to basal diameter for adult data; cp1 and cp2 are the first and second change-points; and the β parameters are regression coefficients). The precision parameter τi, which is the inverse of the variance, is given by the conditional:


The posterior distribution of the model parameters is proportional to


where X represents the observed data and Θ represents the prior parameter vector. The likelihood function is normal conditional on the mean and precision as indicated above. Bayesian analyses require a prior distribution over all unknown parameters. The parametric forms of the prior probability distributions for the model components are:


The form of these prior probability distributions was chosen both to facilitate computations and to represent beliefs regarding model parameters. Our general strategy was to employ diffuse or noninformative prior distributions, so that final inferences depend solely or almost solely on the data. The exception to this was the uniform prior on k, which was constrained to lie between −1 and 5 cm. The prior parameter vector that we used, and that was assigned a priori, was:


The normal prior on the βj coefficients, for example, is centered at 0 with a small precision (equivalent to a large variance), which allows the βj's to be primarily determined by the data. Nevertheless, even ‘noninformative’ prior distributions can have some influence on the posterior distributions, so the sensitivity of the model to the priors should be assessed. We varied both the form of the prior distributions (for example, replacing the gamma priors on cp1 and cp2 with uniform priors over the range (0, 80)) and the parameterization of the prior distributions to assess model sensitivity. Our inferences were insensitive to variation in our diffuse priors. Of course, the priors may be selected to reflect results from previous studies or one's knowledge of the system and be allowed to drive the inference to a greater degree than in the current analysis.

While expression (1) above is proportional to the posterior distribution of the model parameters, normalizing the posterior distribution requires integration of the expression over its entire parameter space. Although this is intractable, an alternative approach is to simulate a large number of random samples from the posterior distribution of model parameters, represented by a vector θ, and to use these random samples to make inferences about θ. We did this using Markov Chain Monte Carlo (MCMC) methods (Gelman et al., 2003). With a large number of vectors of θ, drawn from the joint posterior distribution of θ, we can abstract any component parameter of interest and use the large number of simulated values to approximate the marginal posterior distribution of the parameter. We programmed our MCMC sampler in the open source language R and the code is available from the author.
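The sampling idea can be illustrated with a small random-walk Metropolis sampler. The sketch below is not the authors' sampler: it uses a deliberately simplified model (one change-point, a shift in level rather than slope, a known residual standard deviation of 1, and flat priors), and the data, starting values, step sizes and chain length are all hypothetical.

```python
import math
import random


def log_posterior(theta, x, y, sd=1.0):
    """Log posterior, up to a constant, for a one change-point level-shift model."""
    cp, mu1, mu2 = theta
    if not (min(x) < cp < max(x)):       # flat prior on cp over the sampled range
        return -math.inf
    total = 0.0
    for xi, yi in zip(x, y):
        mu = mu1 if xi < cp else mu2
        total += -0.5 * ((yi - mu) / sd) ** 2    # normal log-likelihood kernel
    return total                          # flat priors on mu1 and mu2 add nothing


def metropolis(x, y, start, steps, n_iter, seed=1):
    """Update one parameter at a time with a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    theta = list(start)
    current = log_posterior(theta, x, y)
    draws = []
    for _ in range(n_iter):
        for j, step in enumerate(steps):
            proposal = list(theta)
            proposal[j] += rng.gauss(0.0, step)
            candidate = log_posterior(proposal, x, y)
            # Accept with probability min(1, posterior ratio).
            if rng.random() < math.exp(min(0.0, candidate - current)):
                theta, current = proposal, candidate
        draws.append(tuple(theta))
    return draws


# Hypothetical data: the level jumps from about 0 to about 3 between x = 19 and x = 20.
x = list(range(40))
y = [(0.0 if xi < 20 else 3.0) + (0.3 if xi % 2 == 0 else -0.3) for xi in x]

draws = metropolis(x, y, start=(10.0, 0.0, 0.0), steps=(1.0, 0.3, 0.3), n_iter=6000)
kept = draws[1000:]                       # discard early iterations as burn-in

# Marginal summaries are obtained by extracting one component from each draw.
cp_mean = sum(d[0] for d in kept) / len(kept)
shift_mean = sum(d[2] - d[1] for d in kept) / len(kept)
```

Any function of the parameters, here the size of the jump, inherits the uncertainty in the change-point location because it is computed draw by draw.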

Seedling recruitment

Beckage et al. (2000) studied tree recruitment with respect to canopy gaps and presence of the understory shrub Rhododendron maximum L. The research was conducted in mixed-oak forests at the Coweeta Hydrologic Laboratory located near Franklin, NC, USA (in the Southern Appalachians). The mixed oak forests are dominated by Quercus prinus L., Quercus coccinea Muenchh., Quercus rubra L., and Quercus velutina Lam., but Acer rubrum dominates the seedling bank. Much of the forest understory is dominated by R. maximum, an ericaceous, evergreen shrub that occurs at all elevations in the Coweeta basin (Swank & Crossley, 1988). Rhododendron forms a dense subcanopy layer 3–7 m in height. Stem densities range from 5000 to 17 000 ha⁻¹ (Baker & van Lear, 1998) and leaf area indices (LAI) range from approximately 4.8 to 6.6. Stem densities in our plots were approx. 8900 ha⁻¹ with diameters most frequently ranging from 4 to 7 cm d.b.h., but sometimes > 10 cm d.b.h.

Beckage et al. (2000) created a series of 12 artificial gaps under two understory conditions: half of the plots had a dense Rhododendron understory, while the remaining plots lacked Rhododendron. Rhododendron has a patchy distribution at these sites, permitting experimental gaps (with and without Rhododendron) to be located in close proximity, thus allowing for consistent overstory composition, slope, soils and microclimate. Up to five canopy trees were girdled to create each gap. Gaps were approx. 20 m in diameter (for expanded gap definition see Runkle, 1981) with standing dead trees and, thus, resulted in minimal disturbance to the understory.

A transect comprising 40 contiguous 1-m² quadrats was established across each planned gap before its creation. Transects included 20 central quadrats spanning the diameter of the gap and 20 outer quadrats beneath the surrounding canopy. However, the border between gap and canopy conditions is irregular, reflecting the canopy shapes of bordering trees, so that all 20 central quadrats might not subtend the canopy gap (Fig. 3). Consistency of transect orientation is an important consideration in the present analysis because elevated light levels, offset toward the north side of canopy gaps in the northern hemisphere (Canham et al., 1990), affect the expected distribution of change-point locations. Seven of the twelve transects were oriented in a general north–south direction and these were the focus of our analysis. Only three of these transects contained Rhododendron in their understory. Beckage et al. (2000) conducted annual surveys of tree seedlings in the transects in mid to late summer. Recruits that germinated in the year in which the census was conducted (i.e. seedlings) were distinguished from older recruits by the presence of cotyledons and a lack of terminal bud scale scars. Acer rubrum was the only species with abundant seedlings that was widely distributed across transects. In the current analysis, we focus on seedlings of A. rubrum that occurred in the seven transects, identified above, in a single year (1997).

Figure 3.

(a) Layout of a typical experimental gap. The long axis of the transect was oriented in a north–south direction. The experimental gaps were designed so that the central 20 quadrats should subtend the canopy opening but the irregular shapes of the surrounding canopies make this unlikely. (b) The change-point model for a single transect. The first change-point occurs at quadrat Si for transect i while the second change-point occurs at quadrat Ti. λi,c is the rate parameter of a Poisson distribution describing the seedling counts in canopy conditions while λi,g describes the rates in gap conditions.

Model description  The Bayesian change-point model for seedling counts has two components: a model describing seedling abundance and a model describing change-point locations. The seedling model is a hierarchical Bayesian model, so that similar transects or portions of transects (e.g. with Rhododendron, gap conditions, etc.) were united by higher levels of the model. This is the Bayesian analogue to classical random effects models.

The hierarchical model can be described in three stages or levels. We represent the seedling counts in quadrats by a vector Xi,j, with subscript i designating the transect and j representing the quadrat within transect i. At the first level of the hierarchy, seedling counts in transect i are assumed to be independent and to follow Poisson distributions with respective rate parameters λi,g and λi,c depending on whether the portion of the transect occurs under gap or canopy conditions (Fig. 3). The second level of the hierarchy links the λi's for similar conditions (e.g. gap vs canopy and Rhododendron vs nonRhododendron) across transects and assumes that the λi's follow a Gamma distribution. We choose the Gamma distribution because it is commonly used to represent variability in the means of Poisson random variates. This level allows for the ‘borrowing of strength’, that is, the use of information from separate sampling units at a lower level in the hierarchy to estimate a higher level parameter that spans sampling units, which is an important advantage of hierarchical models. While the seedling counts in individual quadrats are independent Poisson variates, the means of these Poisson variables are assumed to be similar, but not identical, across transects. In general, if only a small amount of data is available for a particular Xi,j, then the parameter value for the mean level in the gap or canopy portion of the transect will be close to the mean level of the other similar transects, there being little evidence to the contrary. Conversely, if many data are available, the estimated mean will be close to the mean of the data. With moderate amounts of data, the estimated mean will be a data-based compromise between the observed mean level and the overall mean levels of all similar transects. At the top level of the hierarchy, the b⁻¹ parameter of the Gamma distribution is itself distributed as an Inverse Gamma. The hierarchical random effects model structure allows global inferences to be made on the two recruitment rate parameters for similar transects while still allowing for individual transect parameters to vary. Within groups of similar transects, inferences are made through lambda stars, λ*'s, which are the expected value of a randomly selected lambda in that treatment combination group. Our seedling model is then


where the subscripts g or c indicate gap or canopy conditions, respectively.
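The ‘borrowing of strength’ described above can be made concrete with the conjugate Gamma–Poisson update. The sketch below is a simplification rather than the full hierarchical model: it treats the Gamma shape and rate of a group of similar transects as fixed and known, whereas the model in the text estimates them. The function names and numbers are hypothetical.

```python
def posterior_rate_mean(counts, shape, rate):
    """Posterior mean of a Poisson rate under a Gamma(shape, rate) prior."""
    return (shape + sum(counts)) / (rate + len(counts))


def data_weight(n, rate):
    """Weight given to the sample mean; the group mean receives 1 - weight."""
    return n / (n + rate)


# Hypothetical group-level prior with mean shape / rate = 2 seedlings per quadrat.
shape, rate = 4.0, 2.0

no_data = posterior_rate_mean([], shape, rate)           # falls back on the group mean
one_quadrat = posterior_rate_mean([6], shape, rate)      # pulled strongly toward the group mean
many_quadrats = posterior_rate_mean([6] * 20, shape, rate)  # dominated by the data
```

The posterior mean is a weighted average of the sample mean and the group mean, with the weight on the data growing as the number of quadrats increases.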

The parameters αc, αg, γc, γg, ςc, and ςg were assigned values a priori and determined the prior distributions of the seedling portion of the model. We used the same priors for both canopy and gap portions of transects, allowing the prior parameter vector to be written as θp = (α, γ, ς). Our set, θp, of prior parameters was (1.05, 0.01, 0.01). We explored the sensitivity of our inferences using two other prior sets: alternative prior set 1 (1.0, 0.01, 0.01) and alternative prior set 2 (12.0, 0.5, 0.5). The prior specification was diffuse for all three prior sets, allowing the current data to play a dominant role in the final inference (i.e. the posterior predominantly reflects the data with the prior exerting little influence). In fact, the prior sets carry the equivalent weight of 0.05, 0 and 11 seedlings, respectively. In the present case, the results from the three prior sets were very similar, indicating that they had little influence on the posterior distribution.

We modeled each transect as having two change-points. In transect i, the first change-point occurs in quadrat Si and represents the unknown point where canopy conditions end and gap conditions begin (Fig. 3). Similarly, Ti is the quadrat where the second change-point occurs and represents the transition back from gap to canopy conditions. Si was modeled as following a multinomial distribution (psj, j = 1 … 20) where psj is the probability of the change occurring in quadrat j. The prior probabilities for the vector ps were given a Dirichlet (asj, j = 1 … 20) distribution: the Dirichlet distribution is conjugate to the multinomial distribution, which facilitates computation of the posterior distribution (Gelman et al., 2003). The second change-point, Ti, was modeled in a similar manner. We set, a priori, the components of the vectors as and at to 1/200 so that these priors were diffuse; their influence on the posterior distribution was equivalent to only 1/10 of a transect (20 × 1/200 = 0.1). It is apparent from our model description that we have constrained the first change-point to occur in the first half of each transect and the second change-point to occur in the second half of the transect. Furthermore, we have, a priori, said that the change-point is equally likely to occur in any quadrat within this region. The change-point portion of our model is given by


(Si and Ti refer to the first and second change-points in transect i; aj and a′j are the Dirichlet priors on the probabilities, p and p′, of the multinomial distribution for the first and second change-points).
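To show how the data inform a change-point location, the sketch below computes the posterior probability of each candidate quadrat for the first change-point, conditional on fixed canopy and gap rates. This corresponds to one conditional step of a sampler, not to the full joint model, in which the rates and the multinomial probabilities are also unknown. It assumes the convention that quadrats before the change-point are canopy and that the change-point quadrat and those after it are gap. The counts and rates are hypothetical.

```python
import math


def poisson_logpmf(x, lam):
    """Log probability of count x under a Poisson distribution with rate lam."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def first_changepoint_posterior(counts, lam_canopy, lam_gap, prior=None):
    """P(S = s | counts, rates) for s = 1..m, where m is the number of candidate quadrats.

    Convention (an assumption): quadrats 1..s-1 are canopy, quadrats s..m are gap.
    """
    m = len(counts)
    if prior is None:
        prior = [1.0 / m] * m            # every candidate quadrat equally likely a priori
    log_post = []
    for s in range(1, m + 1):
        ll = sum(poisson_logpmf(x, lam_canopy) for x in counts[:s - 1])
        ll += sum(poisson_logpmf(x, lam_gap) for x in counts[s - 1:])
        log_post.append(math.log(prior[s - 1]) + ll)
    top = max(log_post)                  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical counts for the first 20 quadrats of a transect; the gap starts at quadrat 13.
counts = [0, 1, 0, 0, 0, 1, 0, 0, 2, 0, 0, 1, 4, 6, 3, 5, 7, 4, 5, 6]
posterior = first_changepoint_posterior(counts, lam_canopy=0.4, lam_gap=5.0)
most_probable = posterior.index(max(posterior)) + 1
```

When the contrast between canopy and gap rates is large, the posterior concentrates on one or two quadrats; when the contrast is small, the probability spreads over many quadrats.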

We combined the Poisson and multinomial portions of our model to estimate the posterior distribution of all model parameters. The posterior distribution is proportional to


We again estimated model parameters using MCMC, but in this case we sampled the posterior distribution using the freely distributed WinBUGS software, which generates samples from the posterior distribution of a user-specified model (Spiegelhalter et al., 1995).

Spatial correlation  The model in the preceding section assumes conditional independence between adjacent quadrats within each transect. This assumption may be reasonable, since, given the mean values that apply to a pair of adjacent quadrats, the actual seedling recruitment within each quadrat may be independent from neighboring quadrats. Nevertheless, it is also possible that, for example, soils are more similar in adjacent areas within a transect compared with quadrats further away, so that some correlations may appear across adjacent quadrats, even given the overall mean level. Therefore, we extended our model to account for potential correlation in adjacent quadrats. The change-point model remains identical to that above, but we now allow correlations between seedling counts in adjacent quadrats by including a Gaussian Markov random field (GMRF) prior for random effects on the λ across quadrats within a transect (Besag & Kooperberg, 1995; Besag & Higdon, 1999).

We modified the structure of the seedling model to incorporate the GMRF prior by adding a linear model for treatment effects that allowed for modeling of the spatial effect, zj, of being in quadrat j:


(Y is the design matrix that codes for gap and canopy conditions as well as for transects; β is the vector of regression parameters describing canopy, gap and transect effects). The transect effect is assumed to be random with mean 0 and precision τ0, where τ0 has a Gamma prior distribution (a hierarchical model structure). Our design matrix is not full rank, but the model is nevertheless estimable using Bayesian methods. The prior vectors were given values of (a0, b0) = (0.1, 0.1) and (τc, τg) = (1.0 × 10⁻⁵, 1.0 × 10⁻⁵). We placed a locally linear GMRF prior on the zj:


where the spatial effect of being in quadrat j depends only on the quadrats immediately adjacent to the focal quadrat j. The spatial effects (z's) model spatially structured residual variance in seedling abundance that remains after the main effects (e.g. gaps, Rhododendron, transects) have been estimated. The prior on the precision parameter τz is


with (az, bz) = (0.001, 0.001). The diffuse specification of the prior on τz means that its estimated value will be primarily determined by the data. We fit this spatial model using the ‘car.normal’ functionality within the GeoBUGS module of WinBUGS.
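The dependence of a spatial effect on its immediate neighbours can be sketched as follows. The code assumes a first-order intrinsic conditional autoregression with unit weights between adjacent quadrats along a transect, in which the conditional mean of a spatial effect is the average of its neighbours and the conditional precision is the precision parameter multiplied by the number of neighbours. Whether this matches the exact prior used in the analysis is an assumption, and the function name and values are hypothetical.

```python
def icar_conditional(z, j, tau):
    """Conditional mean and variance of z[j] given the adjacent quadrats.

    z is the list of spatial effects along one transect (at least two quadrats),
    j is a zero-based quadrat index and tau is the precision parameter.
    """
    neighbours = [z[k] for k in (j - 1, j + 1) if 0 <= k < len(z)]
    mean = sum(neighbours) / len(neighbours)
    variance = 1.0 / (tau * len(neighbours))
    return mean, variance


# Hypothetical spatial effects for a short run of quadrats.
z = [0.2, -0.1, 0.4, 0.0]
interior_mean, interior_var = icar_conditional(z, 1, tau=2.0)   # two neighbours
edge_mean, edge_var = icar_conditional(z, 0, tau=2.0)           # one neighbour
```

End quadrats have a single neighbour, so their conditional variance is twice that of interior quadrats under this form.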


Tree allometry

Our two change-point model fit the observed data well, with the regression line and change-point locations bisecting the cloud of observed data points (Fig. 4b). The discontinuity created by the diameter measurements being taken alternately at the stem base or breast height has been accounted for through estimation of an offset (i.e. ‘k’ in Table 1): the data points in Fig. 4b have been adjusted by k. The estimates of variance among the three change-point regions were very different, confirming the need to allow for separate variances (Table 1). Summary statistics for all model parameter estimates are presented in Table 1. We also fit a three change-point model to these data but there was no strong support for a third change-point, as shown by a flat likelihood and subsequent strong correlation between change-point locations and k in this model.

Figure 4.

Allometry of longleaf pine in an old growth forest. (a) Heights and diameters of individual trees. Stem diameter is measured at the base for stems < 1.4 m in height and at 1.4 m for all other stems. Note the discontinuity this imposes on the plots of raw data. (b) Model fit with location of change-points (vertical dotted lines) and estimated tree height (solid line). Note that the discontinuity has been adjusted for in the model fit (data points have been adjusted by the parameter ‘k’ that estimates the difference in diameter between stem base and breast height).

Table 1.  Parameter estimates for the two change-point model of tree allometry
  1. Estimates are based on 40 000 samples following a burn in of 10 000 samples. ‘Burn in’ is the practice of discarding early Markov Chain Monte Carlo (MCMC) iterations to allow convergence to the target distribution before using samples to learn about model parameters (Gelman et al., 2003). Although our model was parameterized in terms of precisions (inverse of variance), we report results in terms of variances for ease of interpretation.

inline image    0.000552    0.000442    0.000688
inline image    1.15        0.793       1.59
inline image    11.5        9.51        13.9

Seedling recruitment

Our models captured the observed variability in seedling counts across transects (Figs 5 and 6). Three of the four nonRhododendron transects showed evidence of a strong gap effect (e.g. higher recruitment rates in the center portion of the transect) with a much weaker effect in the remaining transect (Fig. 5). All three of the Rhododendron transects had higher recruitment rates in the central region of the transect where the gap was located (Fig. 6). The region of elevated seedling density was offset to the north in the Rhododendron transects, as expected in the northern hemisphere. The seedling model that included spatial correlation captured much of the local variation in seedling density within transects outside of the main gap-canopy effects (Figs 5b and 6b).

Figure 5.

The mean λ's or ‘seedling recruitment rates’ (solid lines) along with 95% credible interval (broken lines) for each 1-m² quadrat (1–40) in the four transects that lack Rhododendron. The four transects are displayed in separate vertical panels. The open circles represent the observed seedling counts. The left column (a) displays results for the nonspatial model while the right column (b) displays results for the spatial model.

Figure 6.

The mean λ's or ‘seedling recruitment rates’ (solid lines) along with 95% credible interval (broken lines) for each 1-m² quadrat (1–40) in three transects with Rhododendron. The three transects are displayed in separate vertical panels. The left column (a) displays results for the nonspatial model while the right column (b) displays results for the spatial model.

The expected seedling recruitment rate λ* was greater in gaps compared with closed canopy conditions, with a probability between 0.92 and 1.00 depending on model and Rhododendron presence (Table 2). The spatial model resulted in a higher probability of a gap effect than the nonspatial model: 0.99 compared with 0.92, and 1.00 compared with 0.97 for nonRhododendron and Rhododendron conditions, respectively (Table 2). The number of seedlings was 2.8 times greater in gaps without Rhododendron compared with adjacent canopy, but 5.4–7.4 times greater in gaps with Rhododendron. Despite the larger gap effect with Rhododendron, seedling density was greater without Rhododendron by a factor of 2.6 (gap conditions) to 5.1 (canopy conditions). Seedling density was greater without Rhododendron regardless of canopy condition with probabilities ranging from 0.90 to 0.98 (Table 2).
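Quantities of this kind are simple summaries of posterior draws. The sketch below shows how a probability that one rate exceeds another, a multiplicative factor and a credible interval could be computed from paired draws. It is an illustration only: the draws are hypothetical stand-ins for MCMC output, the function names are invented, and the interval uses a simple order-statistic rule rather than any particular software's quantile definition.

```python
import math


def prob_greater(a, b):
    """Fraction of paired posterior draws in which a exceeds b."""
    return sum(1 for ai, bi in zip(a, b) if ai > bi) / len(a)


def credible_interval(draws, level=0.95):
    """Equal-tailed interval taken from the ordered draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[math.floor((1 - level) / 2 * (n - 1))]
    upper = ordered[math.ceil((1 + level) / 2 * (n - 1))]
    return lower, upper


# Hypothetical paired draws of the gap and canopy recruitment rates.
gap = [4.0, 5.0, 6.0, 3.0, 5.5]
canopy = [2.0, 1.5, 2.5, 3.5, 1.0]

p_gap_effect = prob_greater(gap, canopy)

# The multiplicative factor is computed draw by draw and then summarized.
ratios = [g / c for g, c in zip(gap, canopy)]
mean_of_ratios = sum(ratios) / len(ratios)
ratio_of_means = (sum(gap) / len(gap)) / (sum(canopy) / len(canopy))
interval = credible_interval(ratios)
```

The mean of the draw-wise ratios generally differs from the ratio of the posterior means, which is one way a range such as 5.4–7.4 can arise for the same comparison.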

Table 2. Results for seedling counts in gap and canopy conditions in the presence or absence of the understory shrub Rhododendron

                         NonRhododendron (R−)                 Rhododendron (R+)
                         Nonspatial model   Spatial model     Nonspatial model   Spatial model
P(λ*gap > λ*canopy)      0.92               0.99              0.97               1.00
λ*gap                          4.82 (1.65, 12.9)                    1.83 (0.50, 5.89)
λ*canopy                       1.73 (0.60, 4.69)                    0.34 (0.08, 1.13)
Multiplicative factor          2.84 (1.68, 4.11)                    7.45 (2.48, 21.5)

                         Canopy                               Gap
P(R− > R+)               0.98               0.97              0.90               0.91

λ*gap and λ*canopy refer to the expected recruitment rate of seedlings in gap or closed canopy conditions, while R− and R+ refer to the absence or presence of Rhododendron, respectively. The estimates presented are means, while the values in parentheses give the 95% credible intervals, based on 40 000 samples following a burn-in of 10 000 samples. The ‘Multiplicative factor’ represents the factor by which the number of seedlings found in quadrats is increased in gap relative to closed canopy conditions.
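Summaries of the kind reported in Table 2 (a probability that one rate exceeds another, and a multiplicative factor with a 95% credible interval) can be computed directly from posterior draws. The following is a minimal sketch, assuming paired draws of the gap and canopy rates are available as lists; the function name `summarize_gap_effect` and the synthetic gamma-distributed draws are hypothetical stand-ins and are not the MCMC output behind Table 2.

```python
import random
import statistics


def summarize_gap_effect(lam_gap, lam_canopy):
    """Summarize paired posterior draws of gap and canopy recruitment rates.

    Returns (P(gap > canopy), mean of gap/canopy, (2.5%, 97.5%) quantiles).
    """
    pairs = list(zip(lam_gap, lam_canopy))
    n = len(pairs)
    # Posterior probability of a gap effect: share of draws with gap > canopy.
    prob_gap_greater = sum(g > c for g, c in pairs) / n
    # Multiplicative factor: the ratio is formed draw by draw, then summarized.
    ratios = sorted(g / c for g, c in pairs)
    lower = ratios[int(0.025 * (n - 1))]
    upper = ratios[int(0.975 * (n - 1))]
    return prob_gap_greater, statistics.fmean(ratios), (lower, upper)


# Synthetic draws for illustration only (assumed, not the study's samples).
rng = random.Random(1)
lam_gap = [rng.gammavariate(8.0, 0.6) for _ in range(40000)]
lam_canopy = [rng.gammavariate(6.0, 0.3) for _ in range(40000)]

prob, factor_mean, factor_ci = summarize_gap_effect(lam_gap, lam_canopy)
print(f"P(gap > canopy) = {prob:.2f}")
print(f"multiplicative factor = {factor_mean:.2f} "
      f"({factor_ci[0]:.2f}, {factor_ci[1]:.2f})")
```

Because the ratio is computed for each draw before averaging, the mean multiplicative factor need not equal the ratio of the two posterior means.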

Our estimates of seedling recruitment rate integrated across the uncertainty in change-point locations (Fig. 7). The change-point locations were better defined in the Rhododendron compared with the nonRhododendron transects (Fig. 7b,d vs 7a,c). Most of the probability was distributed in narrow regions for both the first and second change-points in the Rhododendron plots but was more diffusely distributed in the nonRhododendron plots. This result is consistent with the larger gap effect estimated in the Rhododendron plots (Table 2).

Figure 7.

The probable locations of the first and second change-points for new (first-year) seedlings along transects in 1997, estimated for (a) nonRhododendron plots using the nonspatial model, (b) Rhododendron plots using the nonspatial model, (c) nonRhododendron plots using the spatial model and (d) Rhododendron plots using the spatial model. The first change-point is constrained to occur in the first half of each transect (left of broken line) while the second change-point must occur in the second half of the transect. The height of the bars represents the probability of the change-point occurring in a given quadrat.
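The idea of averaging recruitment rates over uncertain change-point locations can be sketched for a single transect. This is a simplified illustration, not the hierarchical or spatial model fitted in the paper: it assumes Poisson counts, one shared canopy rate outside the gap, one gap rate inside it, Gamma(a, b) priors on both rates and a uniform prior on the change-point pair, so that the posterior over locations can be enumerated exactly. The function names and the example counts are hypothetical.

```python
import math


def log_marginal(total, m, a=1.0, b=1.0):
    """Log marginal likelihood, up to a constant, of m Poisson counts that sum
    to `total`, under a Gamma(shape=a, rate=b) prior on the shared rate."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + total) - (a + total) * math.log(b + m))


def changepoint_posterior(counts, a=1.0, b=1.0):
    """Posterior probability of each (c1, c2): quadrats c1..c2-1 form the gap.

    c1 is restricted to the first half of the transect, c2 to the second half.
    """
    n = len(counts)
    half = n // 2
    prefix = [0]
    for y in counts:
        prefix.append(prefix[-1] + y)
    log_post = {}
    for c1 in range(1, half + 1):
        for c2 in range(half + 1, n):
            gap_sum, gap_m = prefix[c2] - prefix[c1], c2 - c1
            can_sum, can_m = prefix[n] - gap_sum, n - gap_m
            log_post[(c1, c2)] = (log_marginal(gap_sum, gap_m, a, b)
                                  + log_marginal(can_sum, can_m, a, b))
    peak = max(log_post.values())  # subtract the peak for numerical stability
    weights = {k: math.exp(v - peak) for k, v in log_post.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


def averaged_rates(counts, posterior, a=1.0, b=1.0):
    """Posterior mean rate per quadrat, averaged over change-point locations."""
    n, total = len(counts), sum(counts)
    rates = [0.0] * n
    for (c1, c2), p in posterior.items():
        gap_sum = sum(counts[c1:c2])
        gap_mean = (a + gap_sum) / (b + (c2 - c1))
        can_mean = (a + total - gap_sum) / (b + n - (c2 - c1))
        for i in range(n):
            rates[i] += p * (gap_mean if c1 <= i < c2 else can_mean)
    return rates


# Hypothetical seedling counts along a 20-quadrat transect with a central gap.
counts = [0, 1, 0, 0, 1, 0, 4, 6, 5, 7, 5, 6, 4, 1, 0, 0, 1, 0, 0, 0]
posterior = changepoint_posterior(counts)
rates = averaged_rates(counts, posterior)
best = max(posterior, key=posterior.get)
print("most probable change-points:", best)
print("averaged rates:", [round(r, 2) for r in rates])
```

Quadrats near a poorly resolved boundary receive a rate intermediate between the gap and canopy values, which is how location uncertainty propagates into the rate estimates in this sketch.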


Change-point models provide a methodology for concurrently estimating the location of thresholds in ecological or biological processes, while also estimating parameters that describe that process on either side of the threshold. We were able to incorporate the uncertainty in our change-point locations into estimates of parameters of interest (e.g. slopes and variance of an allometric relationship, and recruitment rates of seedlings). We anticipate that this methodology will be particularly appropriate for modeling ecological responses to current and past global climate change. Anthropogenic forcing of the climate system is increasingly expected to result in large nonlinear system responses as thresholds are crossed (CACC, 2002): increasing evidence suggests that ecological systems will display similar threshold responses (Higgins et al., 2002; Maslin, 2004). The change-point methodology presented here provides a technique for modeling these threshold processes.

Tree allometry

Our two-change-point model captured the relationship between tree height and diameter for longleaf pine (Fig. 4b). We believe that the piecewise linear formulation of our change-point model is a biologically appropriate model for this process because the change-points reflect significant life-history events in longleaf pine. The first change-point, for example, represents the transition from the ‘grass’ stage during which longleaf pine juveniles experience growth in stem diameter and in their root system but little growth in stem height. This is an adaptation to frequent fires: the juveniles have dense tufts of needles (superficially resembling a grass clump) that protect their apical meristems from fire damage until they have stored sufficient energy reserves for rapid height growth. This ‘bolting’ strategy safely removes the apical meristem from the region in the understory where fire damage is most likely (Platt et al., 1988; Platt & Rathbun, 1993). The second change-point reflects the entry of the longleaf into the canopy, after which height growth slows with respect to diameter growth. Once the stem transitions into the canopy, height growth is less important as competition for light is reduced (i.e. the stem has captured a region of the canopy with unrestricted exposure to sunlight). We note that inspection of Fig. 4b suggests that an additional change-point may be justified in the diameter range of 60–70 cm, where height growth appears to cease (perhaps as a result of recurrent hurricane damage; Platt & Rathbun, 1993). There are currently insufficient data to support definitively calling this a change-point: the likelihood is flat and a third change-point tends to be placed at smaller diameters. One advantage of Bayesian methods is that we can use our own judgment to place a strong prior distribution on the location of the change-point in the region of 60–70 cm, forcing a change-point in this region.
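The effect of a strong location prior on a nearly flat likelihood can be illustrated on a grid of candidate diameters. This is a minimal sketch under stated assumptions: the log-likelihood values are invented to be nearly flat with a mild preference for smaller diameters, the normal prior centred at 65 cm with a standard deviation of 2 cm is an arbitrary choice, and none of the names or numbers come from the fitted allometry model.

```python
import math


def grid_posterior(locations, log_lik, log_prior):
    """Normalize likelihood x prior over a grid of candidate change-points."""
    log_post = [ll + log_prior(x) for x, ll in zip(locations, log_lik)]
    peak = max(log_post)  # subtract the peak for numerical stability
    weights = [math.exp(v - peak) for v in log_post]
    z = sum(weights)
    return [w / z for w in weights]


def flat_prior(x):
    """Uniform prior over the grid (log density up to a constant)."""
    return 0.0


def strong_prior(x, center=65.0, sd=2.0):
    """Informative normal prior on location (log density up to a constant)."""
    return -0.5 * ((x - center) / sd) ** 2


def mass_in(locations, post, lo, hi):
    """Posterior probability that the change-point lies in [lo, hi]."""
    return sum(p for x, p in zip(locations, post) if lo <= x <= hi)


# Candidate third change-point locations (cm) and an assumed, nearly flat
# log-likelihood that mildly favors smaller diameters.
locations = list(range(30, 81))
log_lik = [-0.02 * (x - 30) for x in locations]

flat_post = grid_posterior(locations, log_lik, flat_prior)
strong_post = grid_posterior(locations, log_lik, strong_prior)

print(f"P(60-70 cm), flat prior:   {mass_in(locations, flat_post, 60, 70):.2f}")
print(f"P(60-70 cm), strong prior: {mass_in(locations, strong_post, 60, 70):.2f}")
```

With so little information in the likelihood, the posterior under the strong prior is essentially the prior itself, so the resulting change-point reflects the analyst's judgment rather than the data.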

Seedling recruitment

Our change-point analysis found strong evidence for an effect of both gaps and Rhododendron on seedling recruitment. The density of seedlings was greater in gaps compared with closed canopy regardless of the presence of Rhododendron (Table 2). Seedling density was greater in areas without Rhododendron than in areas with the shrub, in both gap and canopy conditions (Table 2). The change-point locations were more clearly identified in the Rhododendron transects than in transects without Rhododendron (Fig. 7), which was consistent with a stronger gap effect, relative to canopy conditions, within Rhododendron transects (Table 2). Light levels are generally much lower beneath Rhododendron than in forests lacking the shrub (Beckage et al., 2000); our results suggest that even modest increases in light levels associated with small overstory gaps can increase seedling recruitment under very low light conditions, such as occur beneath Rhododendron. The more equivocal identification of the change-point locations in areas without Rhododendron may be because background light levels beneath the intact canopy in areas lacking Rhododendron were relatively high (Beckage et al., 2000), which would tend to obscure the gap/canopy boundary when the gap effect on light levels is not large. Our canopy gaps produced only modest increases in light levels because of their relatively small size and because of the presence of standing dead trees, which would tend to reduce increases in insolation (Beckage et al., 2000).

Bayesian models

We fit our change-point models using Bayesian methodology because of advantages associated with the Bayesian approach compared with classical approaches. The Bayesian analysis easily allows inclusion of information from previous studies through the prior distribution (Gelman et al., 2003). This provides a simple way of building on the results of previous work rather than basing all inferences on the current data only. Furthermore, while some change-point models are estimable using classical statistical methodology, extending these methods to more complex problems that are hierarchical or that incorporate spatial autocorrelation can be problematic. Bayesian methods can easily accommodate a large class of complex models (Clark, 2005). Finally, the types of inferences available from Bayesian models are better able to address questions of direct scientific interest (Ellison, 2004). The results of Bayesian analyses are straightforward to interpret (i.e. they are probability statements about model parameters conditional on the data and any available prior information, which is not the case in classical statistics).

Change-point models have been used extensively in the statistical literature, but have not been commonly used in ecology. Edge detection methods have been used by ecologists to identify regions of rapid change; this is a similar objective to change-point analyses (Fortin & Drapeau, 1995). Ver Hoef (1996) presented a change-point model for vertical cover in a single transect through a grassland, and fitted the model using an empirical Bayes procedure. The methodology used here differs from this application in several respects. Our transect model accommodates multiple transects by combining them into several similar groups and using a hierarchical model to combine information from across groups. Our Bayesian approach also avoids the practice of using the current data to estimate parameters of the prior distribution, advocated in empirical Bayes methods, but generally not accepted by Bayesians (Ver Hoef, 1996). Finally, we explicitly model spatial dependence between adjacent quadrats: these methods could also be used to model temporal dependence in time series data.


We thank Jay Ver Hoef for his insightful comments on an earlier version of this manuscript. This paper was prepared as a contribution to an Ecological Society of America Symposium on ‘Mucking through multifactor experiments: design and analysis of multifactor studies in global change research’ in Memphis, Tennessee, USA.