Spatially modelling native vegetation condition


  • Andre Zerger,

  • Philip Gibbons,

  • Simon Jones,

  • Stuart Doyle,

  • Julian Seddon,

  • Sue V. Briggs,

  • David Freudenberger

  • Andre Zerger and David Freudenberger are in the Agricultural Landscapes Program, CSIRO Sustainable Ecosystems (GPO Box 284, Canberra, ACT 2601, Australia. Tel. +61 2 6242 1691. Fax +61 2 6242 1565). Phil Gibbons, Sue Briggs, Julian Seddon and Stuart Doyle are with the Department of Environment and Conservation (New South Wales) (GPO Box 284, Canberra, ACT 2601, Australia). Simon Jones is based at the School of Mathematical and Geospatial Sciences, RMIT University (GPO Box 2476V, Melbourne, Vic. 3001, Australia). The research team is currently building on these findings and addressing earlier limitations through a major vegetation condition mapping research project in the Murray Catchment of New South Wales.


Summary  The assessment of vegetation condition is seen as an increasingly important requirement for effective biodiversity conservation in Australia. Condition assessments that operate at the scale of the site are well established. However, there is a need for mapped representations of vegetation condition at regional scales to: (i) assist with regional planning and target setting; (ii) provide regional context for site-based assessment; and (iii) monitor the change in vegetation condition at multiple scales. This paper describes a methodology for converting site condition data collected in plots into maps of vegetation condition across entire regions using a predictive statistical modelling framework (Generalized Additive Modelling) combined with a GIS. The research demonstrates how explanatory variables including topographic position, terrain roughness, landscape connectivity and remote sensing derived indices can be used to map the condition of native vegetation at the scale of a subcatchment. The inclusion of indices derived from remotely sensed imagery (SPOT4) as explanatory variables in the modelling is a novel component of this research. Although the methodology generates statistically and ecologically plausible models of vegetation condition, there are limitations associated with the way plot data were collected and with some of the explanatory variables, and these limitations reduce model utility. We discuss how these problems can be minimized when embarking upon studies of this type. We demonstrate how maps produced from exercises such as this could be used for conservation planning and discuss the limitations of these data for monitoring.


The extent, configuration, type and condition of native vegetation represent important surrogates of biodiversity. Although there are established methods for mapping the extent (e.g. Caccetta et al. 2003), configuration (e.g. McGarigal et al. 2002) and type (Cawsey et al. 2002; Ferrier et al. 2002) of native vegetation, operational methods for mapping surrogates of vegetation condition at the scale of the stand remain in their infancy. Noss (1990) suggested that vegetation condition encompasses vegetation structure, composition and function and numerous surrogates have been suggested (e.g. Tongway & Hindley 1995; McElhinny et al. (in press); Oliver 2003). Depending on specific objectives, different combinations of these surrogates have been combined into indices of vegetation condition (reviewed by McElhinny et al. 2005).

Vegetation condition is typically assessed on-site. However, there is a need for mapped representations of vegetation condition at regional scales (i) to provide regional context for site-based assessment; (ii) to assist with regional planning and target setting; and (iii) to monitor the change in vegetation condition at multiple scales.

The aims of this research are (i) to spatially predict an index of vegetation condition over a large area by combining remotely sensed imagery with commonly available GIS datasets in a spatial modelling framework; and (ii) to examine the applicability of the methodology and results for management. This paper provides only a cursory overview of the spatial prediction methodology and devotes more attention to management uses of mapped vegetation condition information and the limitations of the methodology from a nonanalytical perspective. Given this is one of the first attempts to map vegetation condition using these methods (see also Newell et al. this issue), the findings provide an important starting point for operationalizing such tools and for asking important questions about management requirements.


Study area

This study was undertaken over an area of 260 000 ha within the Little River subcatchment, approximately 30 km south of Dubbo in Central West New South Wales (148.33–148.88 E and 32.52–33.05 S). The study area occurs within the Central West Catchment Management Area, South-west Slopes Bioregion and Murray-Darling Basin. Mean annual rainfall ranges from approximately 530 to 750 mm and mean annual temperature from 13 to 17°C (Seddon et al. 2002). Most of the subcatchment consists of gently undulating slopes and flats, grading to steep terrain that reaches an elevation of over 700 m around the edges of the subcatchment (Seddon et al. 2002). Major land uses in the study area are cropping (20%), livestock grazing (63%) and national parks (14%). The remainder is primarily other crown land (e.g. road reserves, travelling stock reserves) and urbanized areas.

Nineteen per cent of the subcatchment supports native vegetation (Seddon et al. 2002). Most of the intact remnant vegetation occurs on the less fertile and steeper terrain. Remnant vegetation on the relatively fertile, lower slopes and flats occurs predominantly as small, discrete, narrow linear strips and isolated scattered trees. Seddon et al. (2002) identified six broad vegetation communities: (i) ridge communities dominated by several species including Black Cypress Pine (Callitris endlicheri), Mugga Ironbark (Eucalyptus sideroxylon) and Tumbledown Gum (Eucalyptus dealbata); (ii) woodland dominated by Grey Box (Eucalyptus microcarpa) and White Cypress Pine (Callitris glaucophylla); (iii) woodland dominated by Fuzzy Box (Eucalyptus conica); (iv) woodland dominated by Yellow Box (Eucalyptus melliodora) and Blakely's Red Gum (Eucalyptus blakelyi); (v) forest dominated by River Red Gum (Eucalyptus camaldulensis); and (vi) woodland dominated by White Box (Eucalyptus albens).

Development of a simple vegetation condition score

We calculated a vegetation condition score for 239 plots of 0.1 ha (50 m × 20 m) across the study area. These plot data were collected for a project principally designed to map vegetation communities (Seddon et al. 2002). Nine plots were selected randomly within each of 27 strata defined by mean annual rainfall, mean annual temperature and great soil group, but sampling was restricted to the three largest patches of native vegetation within each stratum (Seddon et al. 2002). The condition score was derived from seven variables measured on each plot. Five principally represent vegetation structure (canopy cover, woody cover < 4 m, non-woody ground vegetation cover (< 0.5 m), number of hollow-bearing trees, and mature trees (stems > 60 cm d.b.h.)) and two represent aspects of function (tree dieback, and tree regeneration (stems < 10 cm d.b.h.)). We calculated the condition score for each plot using an approach similar to the method described by Parkes et al. (2003), that is (i) assigning a score from 0 to 3 to the value recorded for each variable on each plot; (ii) summing these scores across all variables for each plot; and (iii) scaling the summed values within a possible range from 0 to 10. The score from 0 to 3 for each variable was calculated by comparing the measured value for each variable with benchmarks derived from an independent dataset of 405 plots in comparable vegetation types, but restricted to sites in relatively unmodified condition (P. Gibbons pers. comm., 2003). We were confined to using variables that were collected consistently in the two datasets, so some indicators that would otherwise be useful (e.g. native plant species richness, weed cover) were not included in our condition score.
The measured value for each variable on a plot was scored from 0 to 3 as follows: (i) if the value for that variable lay within the 95% confidence interval of the mean benchmark then it was given three points; (ii) if the value lay outside that interval but was either > 50% of the lower 95% confidence limit, or < 150% of the upper 95% confidence limit of the benchmark then that variable was given two points; (iii) if the value of that variable was either < 50% of the lower 95% confidence limit, or > 150% of the upper 95% confidence limit of the benchmark then that variable was given one point; and (iv) zero points were allocated if the value for that variable was zero, unless zero fell within the 95% confidence interval of the benchmark.
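
The scoring rules described here can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the treatment of a value that falls exactly on a 50% or 150% threshold (scored 1 here) is an assumption because the text does not define it, and the final rescaling to 0–10 is assumed to be linear.

```python
from typing import Sequence


def score_variable(value: float, ci_low: float, ci_high: float) -> int:
    """Score one measured variable from 0 to 3 against its benchmark 95% CI."""
    if ci_low <= value <= ci_high:
        return 3  # within the benchmark interval (also covers a benchmark that includes zero)
    if value == 0:
        return 0  # attribute absent, and zero lies outside the benchmark interval
    if value < ci_low:
        # Below the benchmark: 2 points if above half the lower limit, else 1.
        return 2 if value > 0.5 * ci_low else 1
    # Above the benchmark: 2 points if below 1.5 times the upper limit, else 1.
    return 2 if value < 1.5 * ci_high else 1


def condition_score(scores: Sequence[int], max_per_variable: int = 3) -> float:
    """Sum the per-variable scores and rescale linearly to 0-10 (assumed scaling)."""
    return 10.0 * sum(scores) / (max_per_variable * len(scores))
```

With seven variables the summed score ranges from 0 to 21, so a plot scoring 3, 3, 2, 2, 1, 3 and 0 would receive 10 × 14 / 21, or about 6.7.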

Modelling the condition score – From site data to mapping

A number of methods exist for interpolating site-based data, such as our condition score calculated on 239 plots across the study area, to continuous spatial representations or a map. These include distance-weighted algorithms, neural network-based methods, expert-system approaches and statistical modelling (generalized linear models, generalized additive models, regression trees). In the vegetation sciences, most methods rely on quantifying the association between a response variable (in this case our vegetation condition score) at a site and several explanatory variables (e.g. topography, climate, land use, geology) (Austin 2002). An understanding of this relationship at discrete locations is then used to infer the pattern across the entire region. The choice of explanatory variables for predicting vegetation condition was limited to those that are readily available in GIS format.

The full suite of explanatory variables tested in the vegetation condition model is shown in Table 1. Terrain roughness is derived from a 25-m resolution digital elevation model and is a measure of the topographic variability (standard deviation of elevation) within a moving window. Landscape connectivity is a GIS layer that integrates both patch size and proximity to other patches in the landscape into one spatially explicit index (M. Drielsma, pers. comm., 2005). We used the algorithm of Gallant & Wilson (2000) to derive topographic position, defined as the difference between the elevation at the focal cell of a moving window and the mean elevation in that window. The output is an index from 0 to 1 where 1 is considered a position high in the landscape (i.e. a hilltop).
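
To make the two terrain variables concrete, the sketch below computes a moving-window standard deviation (roughness) and a focal-minus-mean elevation difference (topographic position) on a small grid held as nested lists. It is an illustration only and is not the Gallant & Wilson (2000) algorithm: the function names are hypothetical, windows are assumed to be odd-sized and clipped at the grid edge, the population standard deviation is assumed, and the min–max rescaling of topographic position to 0–1 is an assumption because the text does not say how the index is scaled.

```python
from statistics import mean, pstdev
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def window_values(dem: Grid, row: int, col: int, half: int) -> List[float]:
    """Elevations in a (2*half+1)-cell square window centred on (row, col), clipped at edges."""
    rows, cols = len(dem), len(dem[0])
    return [dem[r][c]
            for r in range(max(0, row - half), min(rows, row + half + 1))
            for c in range(max(0, col - half), min(cols, col + half + 1))]


def terrain_roughness(dem: Grid, half: int = 1) -> List[List[float]]:
    """Standard deviation of elevation within the moving window around each cell."""
    return [[pstdev(window_values(dem, r, c, half)) for c in range(len(dem[0]))]
            for r in range(len(dem))]


def topographic_position(dem: Grid, half: int = 1) -> List[List[float]]:
    """Focal elevation minus window mean, rescaled so 0 is lowest and 1 is highest."""
    diff = [[dem[r][c] - mean(window_values(dem, r, c, half))
             for c in range(len(dem[0]))] for r in range(len(dem))]
    lo = min(min(row) for row in diff)
    hi = max(max(row) for row in diff)
    if hi == lo:  # perfectly flat terrain: no cell is higher than its neighbours
        return [[0.5 for _ in row] for row in diff]
    return [[(d - lo) / (hi - lo) for d in row] for row in diff]
```

On a grid with a single raised cell, that cell receives a topographic position of 1, and a perfectly flat grid has a roughness of 0 everywhere.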

Table 1.  Explanatory variables tested in the GRASP/S-Plus framework for the Little River subcatchment
Variable: Description (range of values)
Elevation: Derived from the 25 m New South Wales (NSW) state digital elevation model (DEM) (Department of Infrastructure, Planning and Natural Resources (DIPNR)) (263–803 m AHD)
Slope: Derived from the 25 m NSW state DEM (DIPNR) (0–39 degrees)
Topographic position: Derived from the 25 m NSW state DEM and processed using a topographic position algorithm (Gallant & Wilson 2000). The algorithm allocates each pixel to a topographic position class ranging from 0 (valleys) to 1 (hilltops)
Neighbourhood terrain roughness (3 × 3, 5 × 5, 10 × 10 and 20 × 20 grid cell windows): Standard deviation of elevation within a neighbourhood of cells (0–16.5)
Percent vegetation cover (5 × 5, 10 × 10 and 20 × 20 grid cell windows): Percentage of a moving raster window occupied by vegetation mapped from SPOT4 satellite imagery (0–100% cover)
Vegetation patch area: Area of each vegetation patch derived from woody/non-woody mapping from the SPOT4 satellite image (400 to > 100 000 m2)
Landscape connectivity: Landscape connectivity index developed by Drielsma et al. (2005)
Cadastral parcel area: Cadastral layer containing an area estimate for each cadastral unit (3.2 to > 100 000 m2)
Town distance: Distance to nearest town (20–18 000 m)
SPOT NDVI: Normalized Difference Vegetation Index derived from SPOT4 satellite imagery (−0.1–0.1)
SPOT SAVI: Soil Adjusted Vegetation Index derived from SPOT4 satellite imagery (−0.1–0.1)
Landsat ETM NDVI: Normalized Difference Vegetation Index derived from Landsat 7 ETM+ satellite imagery (−0.1–0.1)
Landsat ETM SAVI: Soil Adjusted Vegetation Index derived from Landsat 7 ETM+ satellite imagery (−0.1–0.1)
Land use: Aggregated land use categories from DIPNR land use mapping
Vegetation type: Modelled, precleared vegetation communities (Seddon et al. 2002)
Rainfall: Mean annual rainfall surface developed using ESOCLIM (520–789 mm)
Temperature: Mean annual temperature surface developed using ESOCLIM (13–18°C)

A novel element of this research was the inclusion of remotely sensed data as potential explanatory variables in the predictive model. Remote sensing is rarely used as an explanatory variable in predictive modelling. Its use is typically restricted to assisting modellers delineate extant vegetation or as a starting point to derive patch metrics to be used in a model (e.g. patch area, isolation metrics and measures of fragmentation). However, it also measures other attributes of vegetation. For instance, the normalized difference vegetation index (NDVI) derived from SPOT4 satellite imagery provides an indication of vegetation photosynthetic activity and is one of the most commonly used indexes in remote sensing (Mather 1999).
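
For readers unfamiliar with these indices, the sketch below gives the standard definitions of NDVI and the Soil Adjusted Vegetation Index (SAVI) for a single pixel. It assumes inputs are red and near-infrared reflectances on a 0–1 scale and uses the commonly quoted soil adjustment factor of 0.5; the function names are illustrative, and the source does not state which scaling or soil factor was applied to the SPOT4 and Landsat imagery.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    total = nir + red
    # Guard against division by zero for pixels with no signal in either band.
    return 0.0 if total == 0 else (nir - red) / total


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil Adjusted Vegetation Index with soil adjustment factor L (assumed 0.5)."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)
```

Green vegetation reflects strongly in the near-infrared and absorbs red light, so both indices rise as green leaf cover increases.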

To construct the final predictive model, the value for each potential explanatory variable at each of the 239 vegetation plots was extracted from the GIS and remote sensing data (SPOT4, Landsat 7 ETM+). This process was made more efficient by the development of the GridSampler pre-processor, which is freely available from the authors.
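
The extraction step amounts to converting each plot's map coordinates to a raster row and column and reading the cell value. The sketch below shows that idea only; it is not the GridSampler tool, whose interface is not described here. The function name and arguments are hypothetical, and the grid is assumed to be north-up with its origin at the upper-left corner and square cells.

```python
from math import floor
from typing import Optional, Sequence


def sample_grid(grid: Sequence[Sequence[float]], x_min: float, y_max: float,
                cell_size: float, x: float, y: float) -> Optional[float]:
    """Return the grid value under map coordinate (x, y), or None if outside the grid."""
    col = floor((x - x_min) / cell_size)   # columns increase eastwards
    row = floor((y_max - y) / cell_size)   # rows increase southwards from the top edge
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None
```

Calling this once per plot and per layer yields the table of response and explanatory values from which the model is fitted.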

In this research we built a model of vegetation condition using generalized additive modelling (GAM) with the Generalized Regression Analysis for Species Prediction (GRASP) framework (Lehman et al. 2003a,b). GRASP is an add-in to S-PLUS (Version 6.1) but can also operate under the publicly available R statistical analysis package. A key feature of GRASP is the ability to generate look-up-tables, which allow a GAM model built using spatial explanatory variables to be converted to a map using GIS software (ArcView GIS Version 3.2). GRASP uses a stepwise procedure that excludes variables if they are deemed not to make a statistically significant contribution to the final model. In such cases, we assume either that the variable is not a significant predictor (P ≥ 0.05) or that the GIS layer representing the variable is inadequate in either resolution or thematic detail. Further, explanatory variables that are correlated above some user-defined threshold (0.75 in this case) can also be systematically removed from further modelling; the order in which variables are assessed determines which variable of a correlated pair is removed. This is an important analysis step because the environmental variables typically used in such modelling are often strongly correlated with one another.
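
The correlation screening can be illustrated with a short sketch. It is an assumption about how such a filter might work, not a description of GRASP's internal code: variables are examined in the order supplied, and a variable is dropped if its absolute Pearson correlation with any variable already kept exceeds the threshold. The function and variable names are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = sqrt(sxx * syy)
    return 0.0 if denom == 0 else sxy / denom


def drop_correlated(variables: Dict[str, Sequence[float]],
                    threshold: float = 0.75) -> List[str]:
    """Keep variables in the order given, skipping any too correlated with one already kept."""
    kept: List[str] = []
    for name, values in variables.items():  # dicts preserve insertion order
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Because the filter is order dependent, supplying elevation before temperature retains elevation, whereas reversing the order retains temperature.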

Results and discussion

Observed values for the vegetation condition score calculated on the 239 plots across the study area ranged from 3.8 to 9.5 out of a possible range of 0–10 and were approximately normally distributed (Fig. 1). Thus, very highly modified native vegetation was not sampled. This was because these plot data were originally collected for a different objective (i.e. to map extant vegetation communities and predict pre-European vegetation communities) that required relatively intact examples of remnant native vegetation to be sampled (Seddon et al. 2002). This may be a recurring issue for other studies if sampling is not representative of all remnant native vegetation. Existing data collected for an alternative purpose, particularly if based on biased sampling, are likely to be inadequate for accurate spatial modelling of vegetation condition across the entire landscape. This may often be the case where data collected for mapping vegetation communities are used to map vegetation condition, as plots are typically located in ‘type sites’ or remnants that contain most elements of the original community rather than sampling the full condition spectrum.

Figure 1.

Condition score histogram for the Little River subcatchment. The condition score is scaled to a possible range of 0–10 (239 plots).

Key predictors of vegetation condition

The significant explanatory variables in the generalized additive model predicting vegetation condition using the condition score were neighbourhood terrain roughness (window size of 10 × 10 cells), landscape connectivity (window size of 20 × 20 cells), topographic position and SPOT4 NDVI. With other variables held at their mean values, vegetation condition was predicted to be highest on sites with moderate values of terrain roughness; to increase with habitat intactness; to be higher in higher parts of the landscape; and to increase with increasing NDVI values. Landscape connectivity made the most important contribution to this model, followed by topographic position, terrain roughness and SPOT4 NDVI, respectively. Most of these results are as expected. That is, parts of the landscape that are not generally developed for intensive agricultural production (e.g. areas with high terrain roughness and positioned on higher points in the landscape) and large remnants of native vegetation that have more habitat unaffected by edge-effects (i.e. high landscape connectivity index) were, on average, predicted to be in better condition. The positive statistical association between NDVI and vegetation condition suggests that higher quality vegetation is associated with higher levels of photosynthetic activity. This relationship may be explained by the fact that the site condition assessment also measures the amount of vegetation cover in each stratum, which is linked to green leaf biomass and consequently to NDVI.

Predictions at the margins of observed data (i.e. vegetation condition scores of 3–5 and 8–10) were unreliable, as indicated by large standard errors. This reflects the small number of plots sampled at these margins and indicates that programs aiming to provide an unbiased map of vegetation condition across a region must adequately sample native vegetation in very poor condition (e.g. scattered paddock trees). The small number of plots with scores > 8 probably reflects a scarcity of vegetation in this condition.

As the total number of plots was relatively low, independent data were not withheld for validating the model. Model validation instead consisted of a cross-validation between predicted and observed condition scores at the 239 plots. Observed versus predicted scores had a correlation coefficient of 0.47. For comparison, Newell et al. (this issue) achieved a correlation coefficient of 0.51 for their neural network-based model of vegetation condition, although this was based on a substantially larger number of plots (n > 3000) and utilized different explanatory variables.
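
The general idea of cross-validation can be sketched as follows: each plot is predicted by a model fitted without it, and the pooled out-of-fold predictions are then correlated with the observed scores. The example is illustrative only. A one-predictor least-squares line stands in for the GAM, the fold assignment (every fifth plot) and the number of folds are assumptions, and all names are hypothetical; the source does not describe the exact cross-validation scheme used.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return 0.0 if denom == 0 else sxy / denom


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept (a stand-in for fitting the GAM)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def cross_validated_predictions(xs: Sequence[float], ys: Sequence[float],
                                folds: int = 5) -> List[float]:
    """Predict every plot from a model fitted to the plots in the other folds."""
    preds = [0.0] * len(xs)
    for k in range(folds):
        held_out = set(range(k, len(xs), folds))  # every folds-th plot, starting at k
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            preds[i] = intercept + slope * xs[i]
    return preds
```

The validation statistic is then `pearson(observed, cross_validated_predictions(predictor, observed))`, which corresponds in spirit to the correlation coefficient reported in this section.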

The statistical relationship between vegetation condition and the explanatory variables was converted to a GIS-compatible spatial prediction using look-up-tables provided by GRASP. Figure 2 shows the input explanatory variables for a small region of the study area to highlight the spatial variation in each explanatory variable that contributes to the final model. The final spatial prediction of the condition score across the entire subcatchment is shown in Fig. 3 after masking to show only extant native vegetation. Masking was conducted using a SPOT4 satellite image that was classified to identify woody vegetation.
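
One way to picture the look-up-table step is that an additive model's prediction at a cell is an intercept plus one tabulated contribution per explanatory variable, so a map can be produced by table look-ups alone. The sketch below illustrates that idea together with the masking step. It does not reproduce GRASP's table format: the table layout (sorted pairs of variable value and partial effect), the linear interpolation, the identity link and all names are assumptions.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Sequence, Tuple

Table = Sequence[Tuple[float, float]]  # (variable value, partial effect), sorted by value


def lookup(table: Table, x: float) -> float:
    """Linearly interpolate a partial effect, clamping outside the tabulated range."""
    xs = [p[0] for p in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, x)  # first tabulated value that is >= x
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def predict_map(layers: Dict[str, Sequence[Sequence[float]]], tables: Dict[str, Table],
                intercept: float,
                woody_mask: Sequence[Sequence[bool]]) -> List[List[Optional[float]]]:
    """Sum intercept and partial effects per cell; cells outside the mask stay None."""
    rows, cols = len(woody_mask), len(woody_mask[0])
    out: List[List[Optional[float]]] = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if woody_mask[r][c]:  # predict only where woody vegetation was mapped
                out[r][c] = intercept + sum(
                    lookup(tables[name], layers[name][r][c]) for name in tables)
    return out
```

Cells that fail the woody vegetation mask are left empty, which corresponds to showing predictions only for extant native vegetation.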

Figure 2.

Ikonos panchromatic data for one area of the study site and explanatory variables in the model of vegetation condition. (A) Ikonos panchromatic image, (B) vegetation condition model, (C) terrain roughness, (D) SPOT4 NDVI, (E) landscape connectivity, and (F) topographic position. All images are shown as greyscale where darker shades indicate higher values for that variable. Table 1 provides a description of each of these variables.

Figure 3.

Vegetation condition model for the Little River Catchment – condition score. The map also shows the location of field plots used to calibrate the predictive model.

There was little variation in the vegetation condition scores for each vegetation type in the Little River subcatchment. We believe this does not reflect actual patterns but is due to the sampling strategy used. As Fig. 1 shows, the majority of condition scores for the 239 plots fell between about 6.5 and 7.5, with very few sites in the low and high regions of the distribution. Consequently, the model has limited predictive power at the upper and lower limits of the Gaussian distribution of the condition data. Operationally this means that the full spectrum of possible vegetation conditions needs to be captured to enable effective modelling. Further, it highlights the limitation of treating only statistical significance in modelling, or positive results from cross-validation, as indicators of model effectiveness.

Using vegetation condition mapping to inform management

It is generally accepted that there is a need to spatially extend site-based assessments of vegetation condition to broader scales. For instance, it has been suggested that the availability of maps of vegetation condition can be used to better inform the delivery and prioritization of vegetation enhancement activities and incentive payments. But how should this information be used to inform decision-making? To explore this, we provide some examples of how a mapped vegetation condition surface could be used to support native vegetation management at regional scales.

Provide regional context to site assessments

Can we assess the relative significance of a site score without mapped data for that vegetation community? Mapped vegetation condition data may be most useful in providing regional context for site-based assessments of vegetation condition. For instance, is a condition score of 6.5 at a particular site considered relatively high or low for that community across the region? Should a vegetation management officer be concerned by such a score? This strikes us as a critical question, which cannot be answered without either an extremely dense site-based sampling intensity, or modelled vegetation condition data. The former is unrealistic, and hence mapped condition information is particularly useful. For illustration, we have conducted such an analysis for three vegetation communities in the Little River subcatchment and for three sites. Figure 4 shows the in situ condition score measured at one plot for each of these communities compared with the full range of modelled condition scores found in that vegetation type. This example shows that the absolute condition scores for the three sites would be interpreted differently if considered relative to the overall modelled condition for that vegetation type. For example, a site with a condition score of 6.5 in a ridge community would be in relatively poor condition, whereas a site with the same score in a Fuzzy Box community would be in about average condition. Such data allow managers to contextualize their site assessments relative to regional patterns.

Figure 4.

Frequency histogram showing condition scores for the Ridge, White Cypress Pine and Fuzzy Box vegetation communities present in the Little River subcatchment. Individual sampling sites are shown as dots in the frequency histogram at their respective condition score (Ridge communities = Site 12, White Cypress Pine communities = Site 225, Fuzzy Box communities = Site 164). Frequency histograms have been derived using a 100-m sampling resolution across the study domain.
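
A simple way to express this regional context numerically is the percentile rank of a site score within the modelled scores for its vegetation community. The sketch below is one possible formulation, not a calculation reported in the study; the function name is hypothetical and ties are counted as half below the site.

```python
from typing import Sequence


def percentile_rank(site_score: float, modelled_scores: Sequence[float]) -> float:
    """Percentage of modelled cells in the community scoring below the site (ties count half)."""
    below = sum(1 for s in modelled_scores if s < site_score)
    equal = sum(1 for s in modelled_scores if s == site_score)
    return 100.0 * (below + 0.5 * equal) / len(modelled_scores)
```

With invented numbers, a site score of 6.5 ranks at the 0th percentile against modelled scores of 7.0, 7.5, 8.0 and 8.5, but at the 50th percentile against 5.5, 6.0, 6.5, 7.0 and 7.5, mirroring the contrast between the ridge and Fuzzy Box examples.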

Assist with regional planning or target setting

A map of the pre-European extent of vegetation communities was available for the study area (Seddon et al. 2002). Many regional-scale vegetation management decisions are guided largely by the percentage that each vegetation type has been cleared since European settlement. The availability of spatial information on vegetation condition can add value to this information. One such approach is to calculate the mean mapped condition score for a particular vegetation type, land tenure or land use type to assist with a risk assessment.

A comparable, but conceptually different approach for integrating mapped vegetation condition data for vegetation management is to develop a vegetation risk matrix using percentage cleared and mean condition scores derived for each vegetation type. Table 2 shows this using theoretical vegetation types (A–F) where each vegetation type has been allocated to a position in the matrix. The risk categories are user defined and involve expert judgement regarding the interaction of both extent of clearing and current mean condition. Irrespective of the class intervals used, the matrix provides a means for systematically ranking and comparing vegetation. For a more detailed treatment of such approaches, see McIntyre and Hobbs (1999, 2001) and Hobbs and Harris (2001).

Table 2.  A hypothetical vegetation risk matrix incorporating both percentage cleared and mean condition scores for each vegetation type (A–F). Darker shaded cells represent high risk and lighter cells represent lower risk.
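
The risk matrix logic can be sketched as follows: compute the mean mapped condition score for each vegetation type, then place each type in a matrix cell according to its percentage cleared and mean condition. Everything specific in this sketch is hypothetical, including the class breaks (30% and 70% cleared; condition scores of 4 and 7), the three risk labels and the rule that combines the two classes. In practice these are user-defined and rest on expert judgement.

```python
from statistics import mean
from typing import Dict, Iterable, List, Sequence, Tuple


def mean_condition_by_type(cells: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average the mapped condition score over all cells of each vegetation type."""
    grouped: Dict[str, List[float]] = {}
    for veg_type, score in cells:
        grouped.setdefault(veg_type, []).append(score)
    return {veg_type: mean(scores) for veg_type, scores in grouped.items()}


def classify(value: float, breaks: Sequence[float]) -> int:
    """Index of the class interval a value falls in (0 = below the first break)."""
    return sum(value >= b for b in breaks)


def risk_category(percent_cleared: float, mean_condition: float,
                  cleared_breaks: Sequence[float] = (30.0, 70.0),
                  condition_breaks: Sequence[float] = (4.0, 7.0)) -> str:
    """Combine clearing and condition classes into a risk label (hypothetical rule)."""
    cleared_class = classify(percent_cleared, cleared_breaks)     # 0 = little cleared
    condition_class = classify(mean_condition, condition_breaks)  # 0 = poor condition
    rank = cleared_class + (len(condition_breaks) - condition_class)
    # Labels assume two breaks per axis, so rank runs from 0 to 4.
    return ("low", "low", "moderate", "high", "high")[rank]
```

Under these invented settings a type that is 90% cleared with a mean condition of 3.0 is ranked high risk, while one that is 10% cleared with a mean condition of 8.5 is ranked low risk.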

Application to monitoring: Limitations and improvements

A vegetation condition index, or score, of the type developed and modelled in this study may not be a suitable tool for monitoring. There are a number of reasons for this. First, the raw data for each variable are summarized into one of four classes (0–3). Thus, there may need to be a substantial change to a variable before it is detected in the scoring system. Second, we modelled the total vegetation condition score, so it is not possible to determine the contribution of individual variables to changes in condition across the study area. Using the raw data for each variable and modelling each variable individually may be a more appropriate approach so: (i) the modelled vegetation condition information is continuous and therefore more sensitive to change over time; and (ii) the individual contribution of each variable to the condition score can be assessed.

There are also a number of limitations associated with the spatial modelling methodology that affect our ability to monitor condition. First, model uncertainty may be so great that differentiating model uncertainty from temporal variation in condition may be difficult. Further research into the role of model uncertainty and its potential impact on monitoring is required. Second, ‘static’ topographic factors (e.g. topographic position and terrain roughness) play an important role in predicting current condition. However, if catchment-scale vegetation condition improves in future years, this is likely to occur due to changes in land-use management and vegetation enhancement activities, rather than any changes in topographic factors. As such, any future modelling would need to account for this. Obtaining management and disturbance data is likely to be a major limitation. One solution may be to use satellite imagery, which can provide finer resolution vegetation cover estimates that can, in turn, identify subtle temporal changes in cover. This may alter the results obtained for landscape connectivity metrics in the model. However, given the current importance of topographic factors, the value of this may be limited.

Given that historic disturbance data are unlikely to become readily available, a useful approach may be to apply a ‘baseline’ rationale. Namely, the modelling methodology developed for the Little River subcatchment can establish a current vegetation condition baseline, given that it is mapping the consequences of over 150 years of intensive modification. In other words, topographic factors and landscape arrangement are useful surrogates for capturing the historical impact of disturbance in agricultural landscapes; however, future modelling will need to account for vegetation enhancement activities, which may violate the assumptions inherent in the current predictive model.

Establishing baseline vegetation condition spatially may be a useful starting point for long-term vegetation management priority setting. With the addition of new data, the methodology could be improved to account for vegetation enhancement activities if information pertaining to time of planting and survival rates were available. However, mitigation strategies such as revegetation may not be of a sufficient spatial extent, or have adequate temporal variability, to be discriminated by satellite-based imaging systems on an annual basis. Consequently, the time interval between assessments is likely to be an important factor when assessing the potential for monitoring.


This paper has explored the potential of using spatial modelling, integrated with explanatory GIS variables and remote sensing, to interpolate site-based vegetation condition indexes over regions such as subcatchments. The method is appealing as the explanatory variables identified as significant in the model can be readily obtained for most regions of New South Wales and Australia. With the addition of appropriately stratified plot-based condition data, condition surfaces could be readily developed for other regions to address biodiversity conservation decision-making needs. However, a fundamental limitation of the methodology is its limited ability to act as a temporal vegetation condition-monitoring tool. We have argued that its value may be in mapping ‘baseline’ vegetation condition status from which vegetation enhancement activities can commence.

An important conclusion from this research is the need to develop a ‘tool-box’ approach for the ongoing mapping of vegetation condition. Given that the methodology generates useful results, there is a need to develop a framework that allows a model to be refined rapidly and continually as new condition and GIS data are included. This may include the development of relational database technologies that provide a central store for site-based condition data and GIS data, which can be readily accessed to build a model for a new region, for a larger region, or as additional in situ site data become available. This requires a conceptual move away from a static view of vegetation mapping to a dynamic one that facilitates the staged update of models as data and methods are refined.

Spatially explicit vegetation condition data such as those developed for the Little River Catchment are likely to become an increasingly important biodiversity decision support tool. However, based on the findings from the National Vegetation Condition Mapping Workshop, methods for mapping vegetation condition in fragmented remnant-dominated agricultural landscapes in Australia are still in their infancy. Similarly, further research is required to determine how modelled condition information could be used to support management objectives. This research has also raised a number of important issues relating to optimum vegetation survey designs for condition modelling and has highlighted the potential of remote sensing as an input to spatial models of vegetation condition. An ongoing project is addressing some of the limitations encountered in the Little River study with a view to improving both the accuracy of the modelling and its relevance to decision-makers.