Estimating mean plant cover from different types of cover data: a coherent statistical framework

Plant cover is measured by different methods, and it is important to be able to estimate mean cover and to compare estimates of plant cover across different sampling methods in a coherent statistical framework. Here, a framework that incorporates (1) pin-point cover data, (2) visually determined cover data, and (3) ordinal cover classification systems (e.g., Braun-Blanquet cover data) is presented and tested on simulated plant cover data. The effect of measurement error when applying a visual determination of plant cover is considered. Generally, the estimation of the mean plant cover was well-behaved and unbiased for all three methods, whereas the estimate of the intra-plot correlation tended to be upward biased, especially if the plant cover data were collected using the Braun-Blanquet method. It was surprising that the Braun-Blanquet sampling procedure provided mean plant cover estimates that were comparable to the other sampling schemes. This method shows promise in the attempt to use the large amount of historic Braun-Blanquet plant cover data in the investigation of the underlying causes for observed vegetation changes.


INTRODUCTION
In many basic and applied plant ecological studies, plant abundance is measured by plant cover, i.e., the relative projected area covered by a species. Plant cover takes the size of individuals into account and is an important and often measured characteristic of the composition of herbaceous plant communities (Kent and Coker 1992). Depending on the type of vegetation analysis, the requirement of plant cover data differs. When the dynamics of individual species or functional types are investigated among treatments or along time or space, it is essential that the estimates of plant abundance are both accurate and unbiased, whereas for habitat classification or ordination of plant communities, the accuracy of abundance is less important than the correct determination of the species pool.
Plant cover is measured by different methods. The most common way to measure plant cover in herbaceous plant communities is to make a visual assessment of the relative area covered by the different species in a small circle or quadrat (Kent and Coker 1992), and often the visual estimates of cover percentages are categorized using different ordinal classification schemes (e.g., Braun-Blanquet 1964). However, the visual assessment of cover has been criticized for being too subjective, i.e., too strongly dependent on the person who makes the observation, and can be quite variable (Floyd and Anderson 1987, Kennedy and Addison 1987, Klimeš 2003, Milberg et al. 2008, Vittoz et al. 2010). An alternative, more objective methodology, called the pin-point method (or point-intercept method), has been widely employed. In the pin-point method, a thin pin is inserted vertically into the vegetation a number of times in a fixed design, and the cover of a species is measured by the proportion of the inserted pins that touch the species (Levy and Madden 1933, Kent and Coker 1992). However, the pin-point method is not relevant for measuring the abundance of rare species and has been shown to underestimate species richness (Bråkenhielm and Qinghong 1995).
In the near future, it is expected that plant cover estimation will be automated using, e.g., image analysis, and new methods of measuring plant cover are currently being developed (e.g., Seefeldt and Booth 2006). Nevertheless, there is an enormous amount of visually determined plant cover data that have been classified by different methods (e.g., Dengler et al. 2012, Walker et al. 2013), and there exists a need (1) to analyze visually determined plant cover data in a well-behaved statistical framework and (2) to estimate relevant sample statistics of plant cover data that have been sampled by different methods and to place these statistics in a coherent statistical framework.
Many plant species have been shown to have an aggregated spatial pattern due to, e.g., the size of the plant, clonal growth, and limited seed dispersal (Pacala and Levin 1997, Herben et al. 2000, Stoll and Weiner 2000, Chen et al. 2006, 2008). At the local scale, it has been demonstrated that pin-point plant cover data, typically, are over-dispersed relative to the binomial distribution for many dominating species (Damgaard 2008, 2009). It has been demonstrated that conclusions of statistical inferences on plant cover data depend critically on whether local spatial aggregation has been taken into account (Damgaard 2013), and it is consequently important that the statistical framework is able to accommodate local spatial aggregation. Here, the within-site spatial aggregation of plant cover data is modeled using the flexible beta distribution, which previously has been shown to model plant cover data adequately (e.g., Chen et al. 2006, 2008, Damgaard 2012, 2013).
The aim of this study is to develop a statistical framework that is suited for modeling different types of plant cover measurement in such a way that it will be possible to compare samples of plant cover data that have been obtained by different methods through time or space. More specifically, the statistical framework will incorporate the modeling of (1) pin-point cover data, (2) visually determined cover data, and (3) ordinal cover classification systems (e.g., Braun-Blanquet cover data) and will be readily generalizable to other measures of plant cover. Furthermore, the effect of measurement error when visually determining plant cover will be briefly touched upon.

MATERIALS AND METHODS
The distribution of plant cover data
Since the distribution of plant species at a site typically is spatially aggregated, the true plant cover, m, is assumed to be beta-distributed with mean parameter q and intra-plot correlation parameter d,

$$m \sim \mathrm{Beta}\left(\frac{q}{d} - q,\; \frac{(1 - q)(1 - d)}{d}\right) \tag{1}$$

where both q and d are bounded between 0 and 1 (Damgaard 2012). This parameterization of the beta distribution is the same as used in the beta-binomial mixture distribution (or Pólya-Eggenberger distribution), which, in a number of cases, has been used successfully to model pin-point plant cover data in a pin-point frame with n pins,

$$f(y; n, q, d) = \binom{n}{y} \frac{u\left(\frac{q}{d} - q,\; y\right)\, u\left(\frac{(1 - q)(1 - d)}{d},\; n - y\right)}{u\left(\frac{1}{d} - 1,\; n\right)}$$

where u is the Pochhammer function, $u(a, k) = a(a + 1) \cdots (a + k - 1)$ (Damgaard 2008, 2009, 2012, 2013).
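The parameterization in Eq. 1 and the beta-binomial probabilities for pin-point data can be checked numerically. The following is a minimal Python sketch, not the Mathematica code used in the study; the function names `beta_params`, `pochhammer`, and `beta_binomial_pmf` are illustrative assumptions. It uses the standard beta-binomial form with rising factorials.

```python
from math import comb


def beta_params(q, d):
    """Shape parameters (a, b) of Eq. 1 for mean q and intra-plot correlation d."""
    if not (0.0 < q < 1.0 and 0.0 < d < 1.0):
        raise ValueError("q and d must lie strictly between 0 and 1")
    a = q / d - q
    b = (1.0 - q) * (1.0 - d) / d
    return a, b


def pochhammer(x, k):
    """Rising factorial x (x + 1) ... (x + k - 1); equals 1 when k == 0."""
    result = 1.0
    for i in range(k):
        result *= x + i
    return result


def beta_binomial_pmf(y, n, q, d):
    """Probability that y of n pins touch the species (beta-binomial)."""
    a, b = beta_params(q, d)
    # a + b simplifies to 1/d - 1 under this parameterization
    return comb(n, y) * pochhammer(a, y) * pochhammer(b, n - y) / pochhammer(a + b, n)


if __name__ == "__main__":
    n, q, d = 16, 0.3, 0.2  # illustrative values, not taken from the study
    pmf = [beta_binomial_pmf(y, n, q, d) for y in range(n + 1)]
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    print(f"sum={sum(pmf):.6f} mean={mean:.4f} var={var:.4f}")
```

Under this parameterization the expected number of hits is nq and the variance is nq(1 - q)(1 + (n - 1)d), so d acts as the correlation among pins within a plot; d close to 0 approaches the ordinary binomial distribution.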
As mentioned above, the visually measured plant cover data are often classified into a system of ordinal cover classes, e.g., the Braun-Blanquet classification, but it could be argued that even when the plant cover percentages are recorded, plant cover is not a truly continuous stochastic variable, since only integer values are used for recording the observed plant cover percentages. Furthermore, some values, such as "32%," tend to be less frequently recorded than, e.g., "35%." Consequently, it is suggested that visually determined plant cover data generally be analyzed using the same statistical model based on a discretized beta distribution and that the different types of visually determined data, e.g., percentage data and Braun-Blanquet data, only differ with respect to the cover classification system used.
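If c is the visually determined plant cover and $\mathbf{d} = \{d_1, \ldots, d_i, d_j, \ldots, d_m\}$ is a vector that specifies the internal boundaries in the classification system used (written in bold to distinguish it from the intra-plot correlation parameter d), e.g., $\mathbf{d}_{BB} = \{0.01, 0.05, 0.25, 0.5, 0.75\}$ specifies the Braun-Blanquet classification system and $\mathbf{d}_m = \{0.01, 0.02, \ldots, 0.99\}$ specifies a system in which percentage plant cover is recorded as integer values, then the parameters of interest q and d (Eq. 1) may be estimated from the following density function

$$g(c; \mathbf{d}, q, d) = \begin{cases} B(d_1, a, b) & c \le d_1 \\ B(d_j, a, b) - B(d_i, a, b) & d_i < c \le d_j \\ 1 - B(d_m, a, b) & c > d_m \end{cases}$$

with $a = q/d - q$ and $b = (1 - q)(1 - d)/d$ as in Eq. 1, where B(z, a, b) is the regularized incomplete beta function $I_z(a, b)$ (Wolfram 2013). The parameters used are summarized in Table 1.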
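The discretized beta model can be illustrated by computing the probability of each cover class as a difference of the regularized incomplete beta function at adjacent class boundaries. The sketch below is an assumption-laden illustration in Python rather than the study's Mathematica code: `reg_inc_beta` is a hand-written continued-fraction evaluation of $I_z(a, b)$ (the textbook Lentz scheme), and `class_probabilities` is a hypothetical helper name.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step of the recurrence
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd step of the recurrence
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(x, a, b):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x))
    # use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def class_probabilities(boundaries, q, d):
    """Probability of each cover class given internal class boundaries, q, and d."""
    a = q / d - q
    b = (1.0 - q) * (1.0 - d) / d
    cdf = [reg_inc_beta(z, a, b) for z in (0.0, *boundaries, 1.0)]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


if __name__ == "__main__":
    bb = [0.01, 0.05, 0.25, 0.5, 0.75]  # Braun-Blanquet boundaries from the text
    print(class_probabilities(bb, q=0.3, d=0.2))
```

The log-likelihood of a sample of recorded classes is then the sum of the logarithms of the corresponding class probabilities. The same function covers the integer-percentage system by passing the boundaries 0.01, 0.02, ..., 0.99.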

Error model of visually determined plant cover data
In order to investigate the effect of measurement errors (Milberg et al. 2008, Vittoz et al. 2010) when estimating plant cover from a sample of visually determined plant cover observations, a simple error model was constructed in which the magnitude of the error was assumed to be unbiased and largest for intermediary plant cover values, where $e \sim N(0, r_e^2)$ and $m_e \in [0, 1]$.

Simulated plant cover data
The three different methods for obtaining plant cover data were compared using simulated plant cover data of 100 plots (the "true" plant cover) that were randomly generated using the beta distribution, Eq. 1, for different fixed values of q and d. The pin-point cover data of the 100 plots were generated by sampling from the "true" simulated plant cover of the plot, m, using 16 pins, $y \sim \mathrm{Binomial}(16, m)$. The visual cover data were generated by applying classification system $\mathbf{d}_m$ to the "true" simulated plant cover data of the 100 plots subjected to measurement error, $m_e$, where $r_e = 0.5$. The Braun-Blanquet cover data were generated by applying classification system $\mathbf{d}_{BB}$ to the "true" simulated plant cover data of the 100 plots subjected to measurement error, $m_e$, where $r_e = 0.5$.
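The simulation design can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `simulate_plots` and the parameter values are assumptions, and the measurement-error step is deliberately left out because the form of the error model is not reproduced here, so the classified data below are error-free.

```python
import random
from bisect import bisect_left

BB_BOUNDARIES = [0.01, 0.05, 0.25, 0.5, 0.75]          # Braun-Blanquet system
PERCENT_BOUNDARIES = [i / 100 for i in range(1, 100)]  # integer-percentage system


def simulate_plots(q, d, n_plots=100, n_pins=16, seed=None):
    """Simulate 'true' cover and the three derived data types for n_plots plots."""
    rng = random.Random(seed)
    a = q / d - q
    b = (1.0 - q) * (1.0 - d) / d
    # 'true' plant cover of each plot, drawn from the beta distribution of Eq. 1
    true_cover = [rng.betavariate(a, b) for _ in range(n_plots)]
    # pin-point data: number of pins (out of n_pins) that touch the species
    pin_hits = [sum(rng.random() < m for _ in range(n_pins)) for m in true_cover]
    # class index = number of boundaries lying strictly below the cover value
    # (assumption: no measurement error is applied before classification)
    percent_class = [bisect_left(PERCENT_BOUNDARIES, m) for m in true_cover]
    bb_class = [bisect_left(BB_BOUNDARIES, m) for m in true_cover]
    return true_cover, pin_hits, percent_class, bb_class


if __name__ == "__main__":
    cover, hits, pct, bb = simulate_plots(q=0.3, d=0.2, seed=1)
    print(sum(cover) / len(cover), hits[:5], pct[:5], bb[:5])
```

Class indices start at 0 for the lowest class, so the Braun-Blanquet data take values 0 to 5 and the percentage data take values 0 to 99.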

Estimation
The joint posterior distribution of the two parameters for the three different likelihood functions was estimated using a Bayesian MCMC algorithm (Metropolis-Hastings), where both parameters were assumed to have a uniform prior distribution in their domain. The MCMC iterations converged relatively quickly to a stable joint posterior distribution (results not shown), and the estimations were made from ... All calculations were done using Mathematica version 9 (Wolfram 2013), and all the used functions, distributions and calculations may be downloaded from the author's homepage: "Applied plant ecology: Mathematica notebooks."
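The estimation step can be sketched as a random-walk Metropolis-Hastings sampler for (q, d) with uniform priors on the unit interval. The Python code below is a minimal illustration for pin-point data, not the Mathematica implementation used in the study; the starting values, proposal standard deviation, chain length, and burn-in are arbitrary assumptions, and the names `log_likelihood` and `metropolis_hastings` are hypothetical.

```python
import random
from math import lgamma, log


def log_likelihood(hits, n_pins, q, d):
    """Beta-binomial log-likelihood of pin-point counts.

    The binomial coefficient is omitted because it does not depend on q or d.
    """
    a = q / d - q
    b = (1.0 - q) * (1.0 - d) / d
    const = lgamma(a + b) - lgamma(a) - lgamma(b) - lgamma(a + b + n_pins)
    return sum(const + lgamma(a + y) + lgamma(b + n_pins - y) for y in hits)


def metropolis_hastings(hits, n_pins, n_iter=5000, burn_in=1000, step=0.05, seed=None):
    """Sample the joint posterior of (q, d) under uniform priors on (0, 1)."""
    rng = random.Random(seed)
    q, d = 0.5, 0.5  # arbitrary starting values (assumption)
    current = log_likelihood(hits, n_pins, q, d)
    samples = []
    for i in range(n_iter):
        # symmetric Gaussian random-walk proposal for both parameters
        q_new = q + rng.gauss(0.0, step)
        d_new = d + rng.gauss(0.0, step)
        # proposals outside (0, 1) have zero prior density and are rejected
        if 0.0 < q_new < 1.0 and 0.0 < d_new < 1.0:
            proposed = log_likelihood(hits, n_pins, q_new, d_new)
            if log(1.0 - rng.random()) < proposed - current:
                q, d, current = q_new, d_new, proposed
        if i >= burn_in:
            samples.append((q, d))
    return samples
```

Calling `metropolis_hastings(hits, 16, seed=1)` on a list of pin counts returns the retained (q, d) draws; the marginal posterior of each parameter can then be summarized by, e.g., the median of its draws. For the visually determined and Braun-Blanquet data, the same sampler applies with the log-likelihood replaced by the sum of log class probabilities from the discretized beta model.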

RESULTS
The joint posterior distribution of mean cover (q) and intra-plot correlation (d) was calculated from 100 simulated plots when assuming a uniform prior distribution in their domain for (1) the simulated "true" plant cover, (2) the generated pin-point cover data, (3) the generated visual cover data, and (4) the generated Braun-Blanquet cover data (Fig. 1). The marginal posterior distribution of mean cover was well-behaved and almost independent of the sampling method, whereas the marginal posterior distribution of the intra-plot correlation parameter varied considerably among the sampling methods (Fig. 1). These results were more or less identical across the domain of the two parameters, where it was found that the estimate of the median cover was unbiased for all three methods (Fig. 2), whereas the estimate of the median intra-plot correlation tended to be upward biased, especially if the plant cover data were collected using the Braun-Blanquet method (Fig. 2).
The visual classification scheme had almost no effect on the estimated parameters. If the classification system $\mathbf{d}_m$ was applied to m, the "true" simulated plant cover data, then the marginal posterior distributions of the mean cover (q) and intra-plot correlation (d) were almost indistinguishable from the marginal posterior distributions of the parameters when the classification system $\mathbf{d}_m$ was not applied (black lines in Fig. 1; result not shown).

DISCUSSION
The presented statistical framework may be used to test ecological hypotheses on mean plant cover and the spatial distribution of plant cover. For example, it may be used to investigate the effect of an abiotic environmental variable, e.g., climate change, on the mean cover or within-site spatial aggregation of a plant species, or to investigate the change in mean cover over time.
Generally, the estimation of the mean plant cover (q) was found to be well-behaved and unbiased for all three investigated sampling methods. It was surprising that the rather rough Braun-Blanquet sampling procedure provided plant cover estimates that were comparable in accuracy to the other sampling methods. This is promising in the attempt to use the large amount of historic Braun-Blanquet plant cover data (e.g., Dengler et al. 2012, Walker et al. 2013) to investigate the underlying causes of vegetation changes.
Most plant species show a significant amount of both within- and among-site spatial variation in abundance, and it is important to take this spatial variation into account when analyzing species cover (Damgaard 2013). The intra-plot correlation parameter (d) is expected to depend on plot size and, typically, it will not be relevant to compare the estimated intra-plot correlation across different sampling designs. Furthermore, the intra-plot correlation depends on plant size, and it is still an open question whether the intra-plot correlation parameter of a plant species is expected to vary in a predictable manner among sites (Damgaard and Ejrnaes 2009, Damgaard et al. 2013). For example, it may be expected that the intra-plot correlation parameter at a site will increase with mean cover.
Fig. 2. Density plots of the difference between the median of the estimated marginal posterior distribution of mean cover (q) and intra-plot correlation (d) and the true simulated plant cover from 100 simulated plots for variable values of q and d. Note that the scale of differences varies among the plots.
In this study, a preliminary investigation of the effect of measurement errors due to the visual estimation of plant cover was made. However, relatively little general knowledge exists on the magnitude and possible bias of this measurement error (Floyd and Anderson 1987, Kennedy and Addison 1987, Klimeš 2003, Milberg et al. 2008, Vittoz et al. 2010). Thus, the present investigation of the effect of measurement error on plant cover estimation is incomplete and should only be considered an approximate indication of the effect of measurement errors. Naturally, the effect of measurement errors will increase with the magnitude of the error and will be larger if the errors are biased. Measurement errors are known to depend on the investigator, and if the performances of the individual investigators are measured (Milberg et al. 2008, Vittoz et al. 2010), then this information may be used to correct the plant cover estimates.
It is possible to model different plant cover data in a coherent statistical framework with the same pair of ecologically relevant parameters. In the case in which plant cover needs to be modeled from plant cover data obtained by different sampling methods, it is suggested to model mean plant cover by a latent variable and use the presented statistical framework for measuring the sampling variation and the intra-plot correlation for the different sampling designs in a coherent state-space modeling or structural equation modeling approach (e.g., Damgaard 2012, Damgaard et al. 2014).