• Australia;
  • conservation;
  • prediction;
  • species distribution models


  1. Top of page
  2. Summary
  3. Introduction
  4. Materials and methods
  5. Results
  6. Discussion
  7. Acknowledgements
  8. References
  9. Appendix
  • 1
    Abiotic environmental predictors and broad-scale vegetation have been used widely to model the regional distributions of faunal species within forested regions of Australia. These models have been developed using stepwise statistical procedures but incorporate only limited expert involvement of the type sometimes advocated in distribution modelling. The objectives of this study were twofold. First, to evaluate techniques for incorporating fine-scaled vegetation and growth-stage mapping into models of species distribution. Secondly, to compare methods that incorporate expert opinion directly into statistical models derived using stepwise statistical procedures.
  • 2
    Using faunal data from north-east New South Wales, Australia, logistic regression models using fine-scale vegetation and expert opinion were compared with models employing only abiotic and broad vegetation variables.
  • 3
    Vegetation and growth-stage information was incorporated into models of species distribution in two ways, both of which used expert opinion to derive new explanatory variables. The first approach amalgamated fine-scaled vegetation classes into broader classes of ecological relevance to fauna. In the second approach, ordinal habitat indices were derived from vegetation and growth-stage mapping using rules specified by an expert panel. These indices described habitat features thought to be relevant to the faunal groups studied (e.g. tree hollow availability, fleshy fruit production). Landscape composition was calculated using these new variables within a 500-m and 2-km radius of each site. Each habitat index generated a spatially neutral variable and two spatial context variables.
  • 4
    Expert opinion was incorporated during the pre-modelling, model-fitting and post-modelling stages. At the pre-modelling stage experts developed new explanatory variables based on mapped fine-scale vegetation and growth-stage information. At the model-fitting stage an expert panel selected a subset of potential explanatory variables from the available set. At the post-modelling stage expert opinion modified or refined maps of predicted species distribution generated by statistical models. For comparative purposes expert opinion was also used to develop maps of species distribution by defining rules within a geographical information system, without the aid of statistical modelling.
  • 5
    Predictive accuracy was not improved significantly by incorporating habitat indices derived by applying expert opinion to fine-scaled vegetation and growth-stage mapping. Use of expert input at the pre-modelling stage to derive and select potential explanatory variables therefore does not provide more information than that provided by remotely mapped vegetation.
  • 6
    The incorporation of expert opinion at the model-fitting or post-modelling stages resulted in small but insignificant gains in predictive accuracy. The predictive accuracy of purely expert models was less than that achieved by approaches based on statistical modelling.
  • 7
    The study, one of few available evaluations of expert opinion in models of species distribution, suggests that expert modification of fitted statistical models should be confined to species for which models are grossly in error, or for which insufficient data exist to construct solely statistical models.


Introduction

Regional assessments of the adequacy of conservation reserve systems require detailed information on species distribution across large regions. Due to practical constraints this information cannot be obtained through biological surveys alone (Austin & Heyligers 1989; Ferrier 1997). Statistical modelling of biological survey data in relation to mapped environmental variables can provide a cost-effective surrogate for direct species distributional data (Austin, Cunningham & Fleming 1984; Margules & Stein 1989; Nicholls 1989; Lawton & Woodroffe 1991; Austin & Meyers 1996; Austin et al. 1996; Neave et al. 1996; Neave, Norton & Nix 1996a,b; Scott, Tear & Davis 1996; Manel et al. 1999; Palma, Beja & Rodriguez 1999; Suárez, Balbontín & Ferrer 2000).

Several Australian studies have demonstrated the importance of abiotic environmental variables (climate, terrain, substrate) in statistical modelling of species distribution, and in predicting faunal habitat at the regional scale (Austin, Cunningham & Fleming 1984; Braithwaite et al. 1989; Mackey et al. 1989; Margules & Stein 1989; McKenzie et al. 1989; Nicholls 1989, 1991; Lindenmayer et al. 1991b; Nix & Switzer 1991). In particular, recent studies aimed at facilitating regional conservation planning in south-eastern (Austin & Meyers 1996; Mills et al. 1996; Neave et al. 1996; Neave, Norton & Nix 1996a,b) and north-eastern New South Wales (NSW) (NSW NPWS 1994a, 1995) have built on the methodology of earlier predictive modelling that determined species distribution in relation to abiotic predictors.

Extensive modelling of faunal species distribution has already been undertaken in forested north-east NSW as part of the North-East Forests Biodiversity Study (NEFBS) (NSW NPWS 1994a). The predictive variables used in these models included abiotic environmental variables, broad-scaled vegetation variables derived from Landsat TM imagery and broad-scaled disturbance variables (clearing and logging) derived from Landsat imagery and logging history maps. In a recent evaluation of these models, Pearce, Ferrier & Scotts (in press) found that 89% of the models provided predictions that were significantly better than those from a random model, with 70% of the models providing high levels of discrimination accuracy.

The accuracy and spatial resolution of these models might, however, be improved by incorporating other types of available information, such as finer-scaled vegetation and growth-stage mapping and expert opinion of species’ habitat requirements. These variables may provide additional information on important habitat attributes (e.g. nesting sites, protective cover, food resources) known to be of critical importance to specific faunal groups, including arboreal mammals (Lindenmayer et al. 1991a; Lindenmayer, Cunningham & Donnelly 1993; Pausas, Braithwaite & Austin 1995), small terrestrial mammals (Dickman 1991), nectivorous birds (Scotts 1991), ground-foraging and nesting birds (Recher 1991) and bark-foraging birds (Clode & Burgman 1997).

Fine-scale vegetation information may be derived from aerial photograph interpretation (e.g. the forest-type classification; State Forests of NSW 1989), modelling of floristic plot data in relation to abiotic environmental variables, or air-borne videography (Catling & Coops 1999), and can be based on either dominant canopy species (NSW NPWS 1994b) or full floristic composition (Keith, Bedwood & Smith 1995). In north-east NSW a forest-type map at the scale of 1 : 25 000 is available, based on a combination of aerial photograph interpretation and modelling of forest types to fill in gaps in the aerial photograph interpreted coverage. Growth-stage maps derived through aerial photograph interpretation are also available at this scale (NSW NPWS 1996).

The use of fine-scaled vegetation mapping in statistical modelling of faunal distributions presents special challenges due to the large number of vegetation categories typically defined in such classifications. For example, in north-east NSW there are 110 forest types mapped. Regression analysis would normally treat each of these types as a separate class of a factor variable. Consequently, using raw forest types as an explanatory variable in statistical modelling results in significant problems of data sparsity due to the large number of factor classes relative to the total number of surveyed sites. Many of the mapped forest types contain no, or only a very small number of, faunal survey sites. To avoid these problems in regression, the information contained within the finer-scaled vegetation and growth-stage mapping needs to be generalized to provide meaningful and relevant predictors for use in model development. Examples of this approach include the development of habitat indices (Braithwaite et al. 1989) or the amalgamation of forest types into ecologically meaningful classes based on their structural and floristic characteristics (Scotts 1996).

Statistical modelling of species distribution using biological survey data and abiotic and biotic predictors is not the only approach to the predictive mapping of distributions for regional conservation planning. For example, habitat suitability index models are widely used in the United States to describe species distribution (Cole & Smith 1983). These indices are based on qualitative accounts and general statements about a species’ habitat preferences. Maps of species distribution are developed by combining vegetation or habitat classes within a geographical information system (GIS) with expert-based rules that link faunal distribution data to these classes. In this study, we investigated ways in which these two approaches, statistical models and expert-defined habitat suitability indices, may be integrated. First, we evaluated techniques for incorporating fine-scaled vegetation type and growth-stage variables into statistical models at both on-site and landscape scales. Two previously used techniques were evaluated: the development of habitat indices and the amalgamation of forest types into ecologically meaningful classes. Both these techniques require expert opinion to define subjectively the levels of the indices or classes.

Secondly, we investigated a number of other approaches to incorporating expert opinion into statistical models of species distribution. We evaluated the incorporation of expert opinion at the model-fitting stage through selection of relevant explanatory variables. We also evaluated the contribution that experts can make at the post-modelling stage through expert refinement or modification of predictions from statistical models. The predictive accuracy of models developed using these approaches was compared with that of models for which expert input was confined to the derivation of habitat indices (i.e. at the pre-modelling stage) and models developed purely from expert opinion (without any statistical analysis).

The study was undertaken for a region of north-east NSW, Australia, using databases collated over the past 8 years. The evaluation was intended to provide guidelines for the immediate needs of conservation planning in the region.

Materials and methods


This study was conducted in two stages. In the first stage, two approaches to incorporating fine-scaled vegetation and growth-stage information into models of species distribution were evaluated using data for 93 fauna species in north-east NSW. In the second stage, a subset of these species was used to evaluate different strategies for incorporating expert opinion into models of species distribution. Each of these components is described in detail below.

Incorporating fine-scaled vegetation and growth-stage mapping

As indicated earlier, fine-scale vegetation mapping usually contains too many classes to allow a sufficient number of sites to be surveyed in every class for the purposes of faunal modelling. The classification used to map forest types in NSW (State Forests of NSW 1989) contains 110 unique forest types for north-east NSW alone, mapped at a scale of 1 : 25 000. Aerial photograph interpretation has recently (1995–96) been used to subdivide further each of these forest types into seven growth-stage classes (NSW NPWS 1996): old-growth forest, mature forest, mature disturbed forest, young forest, rainforest, cleared land, and an unmapped category. A total of 770 combinations of forest type and growth stage can therefore potentially occur within the region.

In this study we evaluated two approaches to pre-processing the forest-type/growth-stage classifications to reduce complexity prior to development of faunal models. The first approach involved amalgamating forest types to form generalized vegetation classes. For this purpose we employed a scheme devised by Scotts (1996), which amalgamated forest types into 27 groups (Table 1). These forest-type groupings represent ecological classes deemed to be relevant to fauna, and were devised by considering the structural, floristic and environmental characteristics of each of the forest types.

Table 1.  Grouping of forest types mapped in north-east NSW according to Scotts (1996)

Vegetation group | Forest types | Description (character species)
1 | 1, 2, 3, 4, 5, 6, 7, 8 | Subtropical rainforest
2 | 10, 11, 12, 13, 14, 15 | Warm temperate rainforest
3 | 16, 17, 18, 19, 20 | Cool temperate rainforest
4 | 21, 22, 23, 24, 25, 26 | Dry and depauperate rainforest
Coastal wet sclerophyll (typical rainforest understorey; > 35 – > 55 m height)
5 | 36 | Eucalyptus pilularis
6 | 42, 45, 46, 47 | E. microcorys, E. saligna
7 | 48, 49 | E. grandis, Syncarpia glomulifera
8 | 51 | E. dunnii
9 | 53, 55 | Lophostemon confertus
10 | 60, 87 | E. acmenoides, E. resinifera, E. siderophloia, E. biturbinata
Tablelands wet sclerophyll (typical mesic understorey; > 35 – > 55 m height)
11 | 150–155, 164 | E. obliqua, E. fastigata
12 | 163a, 170 | E. andrewsii
13 | 131, 168 | E. laevopinea, E. dalrympleana/E. viminalis
14 | 52, 158, 162, 98 | E. deanei, E. saligna, E. fraxinoides
Coastal grassy dry sclerophyll (frequent fires; < 25 – < 35, maybe 45 m height)
15 | 37, 39 | E. pilularis dominant, E. maculata
16 | 56, 61, 62, 64 | E. propinqua, E. siderophloia, E. acmenoides, E. carnea, E. saligna
17 | 70, 71, 72, 74, 76, 81, 82, 83, 84, 215 | E. maculata, E. siderophloia, E. crebra, E. sideroxylon, E. propinqua, E. pilularis, E. moluccana
18 | 85, 92, 93, 65, 211 | E. tereticornis, E. amplifolia, E. moluccana
Coastal heathy dry sclerophyll (typically < 20 – < 30, maybe to 35 m height)
19 | 38, 41 | E. pyrocarpa, E. pilularis, E. gummifera, Angophora floribunda
20 | 40, 97, 117, 119, 115, 129, 126, 130, 105, 106, 107 | E. planchoniana, E. globoidea, E. tindaliae, E. intermedia, E. signata, E. rossii, E. pilularis
Tablelands grassy dry sclerophyll (drier tablelands; much cleared; < 20 – < 30 m height)
21 | 122, 167 | E. laevopinea, E. caliginosa, E. cameronii
22 | 140–142, 159, 160, 161, 103, 138, 111 | E. pauciflora, E. dalrympleana, E. rubida, E. nova-anglica, E. viminalis, E. caliginosa, E. laevopinea, E. cameronii, E. deanei
Tablelands heathy dry sclerophyll (very poor granite)
23 | 163b | E. andrewsii ssp. campanulata
Swamp sclerophyll forest
24 | 30, 31, 32 | E. robusta, Melaleuca spp., Casuarina glauca
25 | 138, 136, 137 | E. pauciflora, E. stellulata
26 | 171, 175, 176, 177, 172 | E. melliodora, E. albens, E. dealbata, E. blakelyi, E. macrorhyncha, E. caliginosa
27 | 180, 182, 203, 204, 213 | Callitris endlicheri, E. albens, E. polyanthemos, E. conica, E. crebra, E. melanophloia, E. sideroxylon

The amalgamated forest-types and the growth-stage classes were treated as two unordered factor variables (one for vegetation and the other for growth stage). These variables were then considered for entry into the statistical models along with existing abiotic environmental variables.

The second approach to incorporating fine-scale vegetation and growth-stage mapping into faunal modelling involved the derivation of habitat attribute indices. These indices predict habitat attributes thought to be of relevance in shaping the distribution of fauna (e.g. tree hollow index, nectar index). The indices were derived by an expert panel of three faunal ecologists who assigned a value for each index to each of the 770 unique combinations of forest type and growth stage. Each index consisted of a discrete number of ordered levels. The 10 habitat indices developed by this process are summarized in Table 2.

Table 2.  Description and derivation of the expert-defined habitat indices

Habitat index | Description and derivation
Predator index | The relative exposure of terrestrial and scansorial fauna to predation by foxes. Index based on the size of the predator population (represented by elevation classes of forest type) and the influence of understorey structure (represented by forest type and growth-stage classes) on the foraging patterns of foxes and the avoidance strategies of prey.
Nectar index | Index derived from published and expert knowledge of nectar volume of overstorey species (represented by forest type), floral density (represented by growth stage) and the duration of nectar flow.
Fine litter index | Index of relative invertebrate availability throughout the year. Fine litter index defined as the product of the accumulation rate of litter (represented by forest type and to a small extent growth stage) and its decomposition rate (a product of moisture, nutrients and soil depth). Invertebrate availability influenced by episodic drying (represented by forest type) which results in migration to soil layer and therefore low availability.
Coarse litter index | As for fine litter, but also includes other foraging and basking substrates such as logs. Growth stage exerts a stronger influence on this index.
Fruit index | Fleshy fruit index based on overstorey and understorey floristic composition (represented by forest type) and production rates (represented by growth stage).
Bark index | Aerial bark index developed as an index of invertebrate microhabitat and vertebrate foraging substrate provided by higher order branches and the tree trunk. The index is based on annual production (defined by growth stage), bark form and tree architecture (defined by forest type).
Eucalypt foliage index | Foliage nutrient index based on published nitrogen levels of dominant overstorey eucalypt species (represented by forest type) and production rates (defined by a combination of forest type and growth stage). Other factors such as palatability and the presence of polyphenols and tannins were not considered.
Non-eucalypt foliage index | As above for non-eucalypt component of canopy.
Structural complexity index | An index of structural complexity defined as the number of strata plus gaps between and within strata. The index is based on site quality (defined by forest type) and growth stage.
Hollow index | An index of the availability of hollows based on the number and quality of hollows produced by dominant tree species (forest type) and the age at which hollows begin developing in each species (growth stage).

For the purposes of deriving faunal models, each habitat index was expressed as a spatially neutral variable, and as two spatially explicit contextual variables. The spatially neutral variable was assigned the value of the habitat index at each site in the landscape. The two spatially explicit variables were assigned the mean value of the index within a square centred on the site of interest, with an edge length of 1 km and 4 km, respectively.
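The derivation of the three variables for one habitat index can be illustrated with a minimal sketch. It assumes the index has already been rasterized to a regular grid; the lookup values, the function names and the half-widths are hypothetical and are not taken from the study. With 1-ha (100-m) cells, a half-width of about 5 cells approximates the 1-km square and about 20 cells the 4-km square.

```python
# Hypothetical lookup: (forest type, growth stage) -> ordinal index level.
# The study's expert panel assigned such levels to all 770 combinations;
# the three entries here are invented for illustration only.
HOLLOW_INDEX = {
    ("36", "old-growth"): 3,
    ("36", "young"): 1,
    ("51", "mature"): 2,
}


def index_raster(cells, lookup, default=0):
    """Convert a grid of (forest type, growth stage) pairs to index levels."""
    return [[lookup.get(cell, default) for cell in row] for row in cells]


def window_mean(grid, row, col, half_width):
    """Mean index value in a square window centred on (row, col).

    The window is clipped at the grid edge, so edge cells are averaged
    over fewer neighbours (an assumption; the source does not say how
    edges were handled).
    """
    r0, r1 = max(0, row - half_width), min(len(grid), row + half_width + 1)
    c0, c1 = max(0, col - half_width), min(len(grid[0]), col + half_width + 1)
    values = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)


def index_variables(grid, row, col, near=5, far=20):
    """Spatially neutral value plus the two contextual means for one site."""
    return (grid[row][col],
            window_mean(grid, row, col, near),
            window_mean(grid, row, col, far))
```

For a site at `(row, col)`, `index_variables` returns the on-site value followed by the two window means, which correspond to the one spatially neutral and two spatially explicit variables described above.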

For each approach, predictive models were developed using field survey data from north-east NSW collected during the NEFBS (NSW NPWS 1994a). Presence/absence models were fitted using generalized additive modelling (GAM; Hastie & Tibshirani 1990) with forward stepwise selection of variables and using the logit link. Models were derived for 93 species: eight small reptiles, seven arboreal marsupials, 58 diurnal birds, eight nocturnal birds and 12 microchiropteran bats using a sample size of 672, 738, 528, 611, and 427 survey sites, respectively. These data were collected between 1992 and 1993. Explanatory variables considered are listed in the Appendix and discussed below. Continuous variables were tested for entry into models as smoothed functions with 4 degrees of freedom. At each step in the selection process, any variables not significantly related to the response at the 5% level (incorporating a Bonferroni correction) were not considered further in later steps. After completing this selection process, each variable in the model was re-examined to determine whether a reduction in function complexity (degrees of freedom) could be achieved without a significant increase in deviance. Degrees of freedom of 3, 2 and 1 were considered. Pearce & Ferrier (2000a) found this stepwise selection strategy (excluding the Bonferroni correction) to provide the most accurate predictive models compared with less stringent strategies.

Each species model was spatially interpolated by applying the fitted generalized additive model for that species to environmental data held within a GIS for each and every 4-ha grid cell within the region. Each of these interpolated distributions provided a map of probability of occurrence.
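As a rough illustration of this interpolation step, the sketch below applies an additive predictor with a logit link to every cell of a grid. The function names, the dictionary representation of a grid cell and the example smooth terms are assumptions made for illustration; they do not reproduce the fitted models of the study.

```python
from math import exp


def predict_probability(cell, intercept, smooth_terms):
    """Inverse-logit of the additive predictor for one grid cell.

    cell maps variable names to values; smooth_terms maps the same names
    to fitted functions f_j, so that eta = intercept + sum_j f_j(x_j).
    """
    eta = intercept + sum(f(cell[name]) for name, f in smooth_terms.items())
    return 1.0 / (1.0 + exp(-eta))


def probability_map(grid, intercept, smooth_terms):
    """Apply the model to each cell to obtain a map of predicted probability."""
    return [[predict_probability(cell, intercept, smooth_terms) for cell in row]
            for row in grid]
```

In practice each `f_j` would be the smooth function estimated by the GAM for that variable; here any callable that accepts a single value will do.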

The accuracy of models derived using one or both of the above approaches to incorporating fine-scale vegetation and growth-stage mapping was compared with that of models employing only abiotic environmental variables, or abiotic variables and broad-scale vegetation variables. Abiotic variables considered are listed in the Appendix. Each abiotic and broad vegetation variable was stored in a GIS at a 4-ha grid resolution, while the fine-scaled vegetation and growth-stage variables were stored at 1-ha grid resolution.

Broad-scale vegetation systems were treated as a factor variable with three classes: rainforest, wet sclerophyll forest and dry sclerophyll forest, mapped from Landsat TM imagery (NSW NPWS 1994b). In the original NEFBS project some refinement of this vegetation system information was undertaken by modelling each of the three forest classes independently as a function of climate, terrain, substrate and point-sampled forest-type data within each mapped vegetation system (NSW NPWS 1994b). Three continuous variables were produced that predicted the probabilities of rainforest, wet sclerophyll forest, and dry sclerophyll forest occurring in each grid cell. These refined vegetation system variables were also used to derive spatially explicit contextual variables by averaging, with inverse distance weighting, the modelled probability of each vegetation class within a 500-m and 2-km radius of the grid cell of interest (NSW NPWS 1994a). In addition to vegetation system information, the severity of logging disturbance at each site was expressed on a three-point scale of light, moderate and heavy, based on maps of logging history. Contextual variables describing the average level of logging within the surrounding landscape were developed using radii of 500 m and 2 km. This resulted in three highly correlated variables for each vegetation system and the logging history variables. These groups of three variables were treated as correlated groups in the stepwise selection procedure. In the first step of variable selection, the significance of each variable was calculated, and the most significant variable of the correlated set retained for consideration in subsequent steps of stepwise selection. The percentage of forest cleared within a 2-km radius was also considered as a contextual variable. The logging history and percentage of forest-clearing variables were based on data collected in 1992–93.
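The handling of correlated groups in the first step of variable selection can be sketched as follows. The variable names and p-values are hypothetical, and the function is only one plausible reading of the rule described above (retain the most significant member of each correlated set).

```python
def retain_most_significant(p_values, correlated_groups):
    """Keep, from each correlated group, the variable with the smallest p-value.

    p_values maps variable name -> p-value from the first selection step.
    Variables that belong to no group are retained unchanged.
    """
    grouped = {name for group in correlated_groups for name in group}
    retained = [name for name in p_values if name not in grouped]
    for group in correlated_groups:
        present = [name for name in group if name in p_values]
        if present:
            retained.append(min(present, key=p_values.get))
    return retained


# Hypothetical first-step results for one species.
p_values = {
    "rainfall": 0.04,
    "rainforest_site": 0.20,
    "rainforest_500m": 0.03,
    "rainforest_2km": 0.01,
}
groups = [("rainforest_site", "rainforest_500m", "rainforest_2km")]
candidates = retain_most_significant(p_values, groups)
```

Only the retained member of each group then competes with the remaining variables in later steps of the stepwise procedure.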

For each species used in this evaluation, 10 separate models were developed using different combinations of spatially neutral and spatially explicit abiotic environmental variables, broad vegetation system variables, expert-derived habitat indices and amalgamated forest-type classes. As indicated previously, these variables are not contemporaneous. The growth-stage variables were derived from information collected 3–5 years after the species data and other environmental descriptors. This may therefore cause problems in modelling due to habitat change over the time period. However, we believe the extent of this problem is likely to be limited in our data sets, as only a very small proportion of the study area has been logged in the period 1991–96. The combinations of predictive variables examined were as follows:

  • 1
    abiotic variables only;
  • 2
    abiotic variables and amalgamated forest-type classes;
  • 3
    abiotic variables and spatially neutral habitat indices;
  • 4
    abiotic variables and spatially explicit habitat indices;
  • 5
    abiotic variables and spatially neutral vegetation systems;
  • 6
    abiotic variables and spatially explicit vegetation systems;
  • 7
    abiotic variables, spatially neutral vegetation system and habitat index variables;
  • 8
    abiotic variables, spatially explicit vegetation system and habitat index variables;
  • 9
    spatially neutral habitat index information only;
  • 10
    spatially explicit habitat index information only.

The predictive accuracy of the 10 models developed for each species was evaluated using independent survey data supplemented by jack-knifing techniques. Independent data were available to validate models developed for small reptile and diurnal bird species (NSW NPWS 1995; Clode & Burgman 1997; NSW NPWS, unpublished data). In total, 497 small reptile and 359 bird survey sites were available for evaluation using data collected in 1995–96. A small amount of independent data was also available for the arboreal marsupial, nocturnal bird and microchiropteran bat species (NSW NPWS, unpublished data). However, these data were strongly biased in their environmental coverage, being predominantly restricted to high-elevation, high-rainfall areas of north-east NSW. Consequently, a five-group jack-knife technique (Efron 1982) was used to develop pseudo-independent data from the model development survey sites to validate models for these fauna groups. In this procedure, the data were divided into five groups. Four groups were combined and used to develop a predictive model. This model was then applied to the withheld fifth group to calculate the predicted values for this set. This procedure was carried out five times in total, each time developing a model based on four of the groups of data and applying the model to the withheld group, until all the sites were assigned a predicted probability of occurrence.
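The five-group procedure can be sketched generically as below. The random assignment of sites to groups, the seed and the `fit` and `predict` callables are assumptions; the source does not state how sites were allocated to groups, and the callables stand in for whatever modelling routine is used.

```python
import random


def five_group_jackknife(sites, fit, predict, n_groups=5, seed=0):
    """Return one out-of-sample prediction for every site.

    fit(train_sites) -> model and predict(model, site) -> value are
    placeholders supplied by the caller.
    """
    order = list(range(len(sites)))
    random.Random(seed).shuffle(order)          # assumed: random grouping
    groups = [order[g::n_groups] for g in range(n_groups)]
    predictions = [None] * len(sites)
    for held_out in groups:
        held = set(held_out)
        # Develop the model on the other four groups.
        train = [sites[i] for i in order if i not in held]
        model = fit(train)
        # Apply it to the withheld group only.
        for i in held_out:
            predictions[i] = predict(model, sites[i])
    return predictions
```

Each site is predicted exactly once, by a model that did not see it, so the pooled predictions can be evaluated as if they came from independent data.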

The predictive accuracy of each model was assessed using the Mann–Whitney statistic as a measure of discrimination ability (for further details see Pearce & Ferrier 2000b). This statistic measures the ability of a model to discriminate correctly between occupied and unoccupied sites in an evaluation data set. This statistic can be interpreted as the probability that a model will correctly distinguish between observations at two sites, one observed occupied and the other observed vacant. In other words, if an occupied site and an unoccupied site are selected at random the index is an estimate of the probability that the model will predict a higher likelihood of occurrence for the occupied than for the unoccupied site. The Mann–Whitney statistic ranges between 0·5 for a model performing no better than random to 1 for a model exhibiting perfect discrimination ability.
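The pairwise interpretation given above translates directly into a short calculation. The sketch below is a plain-Python illustration with an assumed function name; ties between an occupied and an unoccupied site are counted as half, which is the usual convention for this statistic.

```python
def mann_whitney_discrimination(occupied, unoccupied):
    """Proportion of (occupied, unoccupied) site pairs ranked correctly.

    occupied and unoccupied hold the predicted values at sites where the
    species was and was not observed. Ties count as half a correct pair.
    """
    if not occupied or not unoccupied:
        raise ValueError("both groups of sites must be non-empty")
    correct = 0.0
    for p in occupied:
        for q in unoccupied:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(occupied) * len(unoccupied))


# Eight of the nine possible pairs are ordered correctly here.
score = mann_whitney_discrimination([0.9, 0.6, 0.4], [0.5, 0.2, 0.1])
```

Because only the ordering of predictions matters, the statistic is unchanged by any monotonic rescaling of the predicted values.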

The Mann–Whitney statistics obtained by applying the 10 modelling approaches to the 93 species were subjected to an analysis of variance (ANOVA) to test the effect of each of the 10 combinations of abiotic, vegetation and growth-stage variables on model performance. In the ANOVA, species was treated as an error term (Chambers, Freeny & Heiberger 1992). The effects listed under ‘error: species’ are those relating to differences in model performance between different species. These effects are not of direct relevance to this study. The effects listed under ‘error: within’ are those relating to differences in performance between different types of models fitted to the same species, controlling for differences between species. The effects found to be significant in the ANOVA were compared using Scheffé's test for unplanned comparisons (Day & Quinn 1989).

Incorporating expert opinion

Expert opinion may be incorporated into models of species distribution at a number of stages in the modelling process. Experts can provide input at the data preparation, or pre-modelling, stage through the development of habitat indices, or by grouping vegetation types into broader ecological classes. These sources of expert input were evaluated in the first part of this study (described above). At the model-fitting stage, experts can provide input by selecting a subset of the available predictors that are of particular ecological relevance to the species concerned. This restricts the number of potentially correlated variables being considered for inclusion in a model, and ensures that only biologically relevant variables are selected for a given species. Expert opinion can also be used at the post-modelling stage to constrain, modify or refine a statistically derived model, by incorporating additional information not included in the model. Alternatively, predictive models can be formulated based on expert opinion alone, without any statistical procedure used to calculate model coefficients.

The above approaches were evaluated for a selected set of 16 species: two small reptiles, four diurnal birds, five nocturnal birds, two arboreal marsupials and three microchiropteran bats. These species were selected to represent a range of taxonomic groups and guilds, rare and common species and species with statistical models of varying accuracy.

An expert panel of three faunal ecologists, highly familiar with the species and habitats present within the north-east NSW region, was convened to undertake the following tasks.

  • 1
    Modify or refine predicted distributions from statistical models. The experts specified additional rules based on any of the available GIS layers (forest types, growth stages, habitat indices, abiotic environmental variables) to refine predicted distributions to better reflect their knowledge of species distribution. The statistical models with the highest predictive accuracy from the first stage of the study were used as the basis for this task.
  • 2
    Derive models based purely on expert opinion without any statistical analysis. For each species, the experts specified a habitat model in terms of rules based on any combination of the available GIS layers (forest types, growth stages, habitat indices, abiotic environmental variables). The expert panel also derived a new vegetation variable for nine of the 16 species as described in task 4 below. This variable was only available to ‘expert-only’ models.
  • 3
    Select a subset of predictors of particular ecological relevance to each species, and specify the likely functional form (shape) of the relationship between the species and each of these selected variables. The GAM model for each species was then refitted using only these predictors.
  • 4
    Derive specific vegetation index maps for each species, based on expert opinion. For species for which the expert panel recognized a strong relationship with individual forest types, forest types were ranked on a four-point scale (core habitat, intermediate habitat, marginal habitat and non-habitat) according to their perceived value for each individual fauna species. A new GIS layer was developed for each species, describing the forest-type habitat value indices defined by the expert panel. This variable was then considered along with abiotic variables for inclusion in a statistical model.

The above four models were developed sequentially for each species at a single expert workshop, convened by K. Cherry. Models were developed in the order 1, 4, 3, and then 2. It was recognized that, in some cases, this sequential approach may have resulted in a lack of independence between the four models. However, time and resource constraints prohibited separate workshops being convened for each type of model. It was thought that this bias would positively favour the development of expert-only models, as the expert panel could evaluate the performance of the statistical models as their first task. However, given the subsequent results of this study, the effect of this bias on the relative predictive accuracy of the models is unclear.

The first two models were developed interactively by the expert panel, in collaboration with an experienced GIS operator (M. Drielsma), using the ArcView GIS software package. The expert panel chose to combine explanatory variables in a multiplicative fashion. Variables were expressed either in their original continuous form or converted to ordinal classes according to the decisions of the expert panel. While predictions of habitat suitability were generated on a continuous scale, this scale was not necessarily expressed in terms of probability and could contain values greater than one. This was not a problem for the subsequent evaluations of predictive accuracy because we were interested only in discrimination performance, which effectively treated the predictions from a given model as a relative index of occurrence rather than as true probabilities.
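The multiplicative combination of rules can be sketched in a few lines of Python. The variable names follow the Appendix (hollow, rain, topo), but the break points, the weights and the three example grid cells are hypothetical and serve only to show how such an index can exceed one while still ranking cells.

```python
def ordinal_class(value, breaks):
    """Convert a continuous value to an ordinal class (0, 1, 2, ...)
    by counting how many ascending break points it reaches."""
    return sum(value >= b for b in breaks)


def expert_index(cell, rules):
    """Multiply the score returned by each expert rule to give a
    relative (not probabilistic) habitat suitability index."""
    index = 1.0
    for rule in rules:
        index *= rule(cell)
    return index


# Hypothetical rules for one species; none are taken from the workshop.
rules = [
    lambda c: c["hollow"],                                   # ordinal index used as is
    lambda c: ordinal_class(c["rain"], breaks=[800, 1200]),  # rainfall recoded to 0-2
    lambda c: 1.0 if c["topo"] == "gully" else 0.5,          # down-weight non-gully cells
]

cells = [
    {"hollow": 3, "rain": 1500, "topo": "gully"},
    {"hollow": 1, "rain": 900, "topo": "ridge"},
    {"hollow": 2, "rain": 600, "topo": "gully"},
]
scores = [expert_index(c, rules) for c in cells]
```

Because discrimination measures depend only on the ordering of the scores, rescaling them to lie between zero and one would not change the evaluation.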

Two statistical models were developed for the third type of expert input, using a stepwise GAM procedure with only those variables nominated by the expert panel considered for inclusion. In the first of these models, each variable was expressed in the functional form nominated by the expert panel. Once the final model was obtained, each variable in the model was re-examined to determine whether a change in complexity was warranted given the significance of the resulting change in deviance. The second model considered all nominated variables with 4 degrees of freedom. Once the model was obtained, each variable was re-examined to determine whether a reduction in degrees of freedom could be achieved without a significant increase in deviance.
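The check on degrees of freedom can be illustrated with a deviance comparison. The text does not name the test used, so the sketch below assumes the common chi-square approximation for a change in deviance and handles only integer differences in degrees of freedom. The function names and the deviance values in the example are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    # Closed forms for df = 1 and df = 2, then step up two df at a time.
    if df % 2 == 1:
        k, sf = 1, math.erfc(math.sqrt(x / 2.0))
    else:
        k, sf = 2, math.exp(-x / 2.0)
    while k < df:
        sf += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def simplification_ok(dev_complex, df_complex, dev_simple, df_simple, alpha=0.05):
    """True if moving to the simpler term does not raise the residual
    deviance significantly (assumed chi-square test, assumed alpha)."""
    p = chi2_sf(dev_simple - dev_complex, df_complex - df_simple)
    return p > alpha


# Hypothetical residual deviances for one variable fitted with 4 d.f. and 1 d.f.
keep_linear = simplification_ok(210.0, 4, 213.2, 1)   # small increase in deviance
keep_smooth = not simplification_ok(210.0, 4, 222.0, 1)  # large increase in deviance
```

In the first case the linear term would be retained; in the second the 4 d.f. smooth would be kept.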

The predictive accuracy of the first three models developed for each species was assessed using the evaluation data and techniques described for the first stage of the study. The relative performance of these models was then compared with that obtained for models developed without any expert input during or after the model-fitting process.

The fourth model was evaluated by comparing the predictive performance of models that contained only abiotic variables, with models developed in which the vegetation variable was considered for inclusion in the model. The modified z-test of Hanley & McNeil (1983) was used to compare the predictive performance of the two models for each species. This test accounts for the correlation between measures that arises due to the same data being used to calculate each statistic.
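The form of this comparison can be sketched as follows. The z statistic divides the difference between the two Mann–Whitney statistics by a standard error that is reduced by the correlation r between them. In Hanley & McNeil's method r is derived from the correlation between the two models' predictions; here it is simply passed in, and the value of 0.9 in the example is an illustrative assumption that is not reported in the text. The accuracy values and standard errors are those listed for the greater glider in Table 5.

```python
import math


def hanley_mcneil_z(auc1, se1, auc2, se2, r):
    """z statistic for the difference between two areas under the ROC curve
    (Mann-Whitney statistics) estimated from the same evaluation sites."""
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2
    return (auc1 - auc2) / math.sqrt(var_diff)


def two_sided_p(z):
    """Two-sided P-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Greater glider values from Table 5; r = 0.9 is an assumed, illustrative value.
z_correlated = hanley_mcneil_z(0.8253, 0.0210, 0.7953, 0.0212, r=0.9)
z_independent = hanley_mcneil_z(0.8253, 0.0210, 0.7953, 0.0212, r=0.0)
p_correlated = two_sided_p(z_correlated)
```

Ignoring the correlation (r = 0) gives a much smaller z for the same pair of values, which is why a test that accounts for the shared evaluation data is needed.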



Incorporating fine-scaled vegetation and growth-stage mapping

Predictive performance differed significantly between models derived using different combinations of explanatory variables (Table 3). However, this effect was not consistent across all biological groups, as indicated by the significant interaction term in Table 3.

Table 3.  Results of anova analysing the discrimination capacity of models developed for species within several faunal groups, using different combinations of explanatory variables
Source                                    d.f.  Sum of squares  Mean square  F        P
Error: species
  Faunal group                               4   0·6576         0·1644        0·9419  0·4437
  Residuals                                 88  15·3588         0·1745
Error: within
  Variable combination                       9   0·8398         0·1003       38·1895  0
  Variable combination × biological group   36   0·2682         0·0083        3·1472  0
  Residuals                                797   2·1716         0·0026

Models based solely on habitat index information (spatially explicit or neutral) performed significantly worse in terms of discrimination accuracy than did the other variable combinations, containing abiotic variables alone or in combination with vegetation and growth-stage information. There was no significant difference between the performance of the other variable combinations (Fig. 1). Habitat index information therefore did not improve model predictive accuracy over models containing abiotic variables and remotely mapped vegetation systems. Models containing spatially explicit vegetation information performed better than those containing spatially neutral information. Although the ranking of models changed slightly when individual faunal groups were considered separately, the overall patterns remained unchanged.


Figure 1. Effect of the choice of explanatory variables on the overall accuracy of the species distribution models. Groups of explanatory variables considered were abiotic environmental variables (Abiotic), habitat index variables (Index), vegetation system and disturbance variables (Vegsystem), and grouping of forest types and growth stages according to Scotts (1996) (Veggroup). Habitat indices and vegetation systems are also represented as spatially explicit variables (Spatial). The adjusted Mann–Whitney statistics are residuals from a linear model containing species as an explanatory term, in order to remove the effect of variation among species.
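The adjustment described in the caption to Figure 1 can be sketched in Python. If species is the only term in the linear model, the residual for each model is its Mann–Whitney statistic minus the mean for that species. That reading of the caption, the function name and the example values are assumptions made for illustration.

```python
from collections import defaultdict


def adjust_for_species(records):
    """Subtract each species' mean Mann-Whitney statistic from its models.

    records: list of (species, variable_combination, mann_whitney) tuples.
    Returns the same tuples with the statistic replaced by its residual.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for species, _, stat in records:
        totals[species][0] += stat
        totals[species][1] += 1
    means = {sp: total / n for sp, (total, n) in totals.items()}
    return [(sp, combo, stat - means[sp]) for sp, combo, stat in records]


# Hypothetical statistics for two species and two variable combinations.
records = [
    ("species A", "Abiotic", 0.80),
    ("species A", "Index", 0.70),
    ("species B", "Abiotic", 0.60),
    ("species B", "Index", 0.50),
]
adjusted = adjust_for_species(records)
```

After adjustment both species show the same difference between variable combinations, even though their raw accuracies differ.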


Incorporating expert opinion

For each of the 16 faunal species considered by the expert panel, four models were developed that incorporated different types of expert input: two statistical models developed by including expert input at the model-fitting stage, a model derived by incorporating expert input at the post-modelling stage, and a model derived solely using expert-defined GIS rules. The decisions made by the expert panel to create these four models for each species are available from the authors.

The mean performance of each of the four models incorporating expert opinion was compared with that of the best-performing statistical model from the first stage of this study. The anova results (Table 4) suggested that mean predictive accuracy varied significantly between these five models. Multiple pairwise comparisons using Scheffé's technique (Day & Quinn 1989) suggested that models developed using only expert-defined rules performed significantly worse than the other four techniques. The techniques incorporating statistical modelling, with or without expert input, performed equivalently (Fig. 2).

Table 4.  Results of anova comparing the performance of techniques used to incorporate expert knowledge into models of species distribution
Source         d.f.  Sum of squares  Mean square  F      P
Expert method     4  0·055           0·014        4·706  0·002

Figure 2. Discrimination accuracy (and standard error) of models developed for 16 species with various types of expert input. The results are presented as adjusted Mann–Whitney statistics in order to remove the effect of variation among species.


The forest types that compose each of the forest-type habitat indices derived by the expert panel for each species are available from the authors. The panel recognized a strong relationship with vegetation type for only nine of the 16 species. The mean accuracy values and their standard errors for each of these nine species are shown in Table 5. The new species-specific vegetation variable entered only three of the nine models, and significantly improved only the model for the greater glider Petauroides volans. This improvement was not significantly greater than that provided by the addition of expert-derived habitat index information.

Table 5.  Comparison of discrimination performance of models containing only abiotic environmental variables with those additionally containing the species-specific habitat quality variable developed for eight species by the expert panel during the derivation of expert-based GIS models. Models for which the habitat quality variables did not enter the predictive model are marked by –
                                                    Mann–Whitney statistic (± SE)
Species common name      Species scientific name   Abiotic only      Abiotic + expert habitat classes   Significance of difference
Common brushtail possum  Trichosurus vulpecula     0·7667 ± 0·0364   –
Greater glider           Petauroides volans        0·7953 ± 0·0212   0·8253 ± 0·0210                    0·001
Little cave vespadelus   Vespadelus darlingtoni    0·7828 ± 0·0293   –
Large forest vespadelus  Vespadelus pumilus        0·8136 ± 0·0274   –
Sooty owl                Tyto tenebricosa          0·5977 ± 0·0370   –
Marbled frogmouth        Podargus ocellatus        0·9026 ± 0·0651   0·9299 ± 0·0672                    NS
Black-faced monarch      Monarcha melanopsis       0·8143 ± 0·0355   –
Spectacled monarch       Monarcha trivirgatus      0·6771 ± 0·0647   0·6583 ± 0·0615                    NS



The results suggest that fine-scaled vegetation and growth-stage information often provide important additional information to abiotic environmental and broad vegetation variables when modelling the distribution of faunal species across the landscape. However, for this information to be useful, the results suggest that each attribute needs to be spatially explicit. Habitat indices and broad-scale vegetation systems both provide important information on vegetation and disturbance. However, it appears that development of habitat indices by an expert panel may not provide sufficient additional information, over that obtained using remotely mapped vegetation systems, to improve model predictive ability significantly.

Two schemes for amalgamating forest types using expert opinion were evaluated in this study, the first based on Scotts (1996) and the second derived individually for species by the expert panel during this study. Neither scheme appeared to improve model predictions substantially over a model containing only abiotic variables. Although other schemes for the amalgamation of forest types are available and may prove useful, this study suggests that limited resources would best be directed to the development of expert-derived habitat indices rather than schemes to amalgamate forest types.

As expected, the addition of expert opinion into statistical models of species distribution proved effective, but only at a single stage in the modelling process: the pre-modelling stage, through the development of habitat indices. Other forms of expert input proved insignificant (model refinement by an expert panel and specification of explanatory variables) or even detrimental (expert-derived GIS rules).

A number of factors may have contributed to the low (or equivalent) predictive accuracy of expert models relative to statistical models in this study. First, all maps of species distribution were developed at a 4-ha grid cell resolution as this is the resolution at which species distribution models were being developed for regional planning activities in north-east NSW. Statistical models, given adequate data, can model the distribution of a species at this level of resolution quite accurately (Pearce et al., in press). However, in a region of approximately 7·9 million hectares (almost two million 4-ha grid cells), it is substantially more difficult for an expert panel to consider the landscape at such a fine scale. In this study the experts developed or refined statistical models to better represent the broad regional distribution of the species when the map was viewed at a lower resolution. For example, the panel combined environmental variables in a manner that provided a good match with their knowledge of broad patterns of species distribution across the whole region. They then examined parts of the region in more detail, to ensure that the patterns within these areas accorded with their knowledge. However, it was not possible to examine the predicted distribution of each species at the resolution of individual grid cells across the entire region.

Certainly there are advantages to having each statistical model checked by an expert panel. The expert panel can identify areas of potential model weakness and therefore suggest possible refinements, further explanatory variables, or identify geographical barriers not considered by the statistical procedure. Experts can also identify problems with models for species for which there has been taxonomic uncertainty in the past or for which a number of distinct subspecies may have been represented in a single model. However, the results of this study suggest that the large amount of expert time involved in manually refining statistically derived models could be more efficiently directed towards only modifying statistical models that are grossly in error, or to the development of GIS-based models for species for which there are insufficient data to construct a statistical model, using expert-specified rules. Species for which the data may be unreliable should be identified at the data preparation stage, when expert opinion should be used to identify suspect records or subspecific status prior to modelling.

Based on this study, the following modelling procedure for integrating expert opinion and statistical modelling may provide the most accurate models of faunal species distribution.

  • 1
    Use experts to vet faunal data used to develop statistical models to ensure that only reliable data are used.
  • 2
    Use experts to develop habitat indices based on fine-scaled vegetation, growth-stage and disturbance information.
  • 3
    Develop statistical models of species distribution for species for which adequate data exist, using both habitat indices and other abiotic environmental and broad-scaled vegetation variables. Develop expert models for species with insufficient data.
  • 4
    Validate all models using statistical techniques to provide a measure of confidence in model performance. Independent data should be used where available.
  • 5
    Subject each predictive model to examination by experts. Experts should only modify models for which model predictions are grossly in error.

Recommendations for further research

This study has highlighted the need for further research in two key areas. First, a more thorough evaluation of the performance of expert input at the model development and post-modelling stages is required. This might require the development of a GIS toolkit to assist the expert panel in manipulating explanatory variables, provision of an adequate familiarization period prior to development of models, clear definition of the requirements and applications of the predictive models (including spatial scale), and consideration of a larger range of species.

Secondly, significant effort needs to be devoted to researching the role of spatial scale in modelling faunal distributions, considering the uses to which models will be applied. The issue of scale at which species are modelled, and the scale at which conservation planning decisions are made, is important and has not been addressed fully in north-east NSW. All modelling within the region has been undertaken at the finest grid resolution available, which was 4 ha. Research must be undertaken to determine the most appropriate spatial resolution for modelling individual species to obtain accurate predictive models of species distribution at the spatial scale of relevance to the life history of the species concerned (Turner, Dale & Gardner 1989; Roloff & Haufler 1997). The relative performance of expert vs. statistical models at coarser spatial scales also needs to be determined. It is expected that expert models may perform significantly better at a coarser scale.



The hard work and enthusiasm of the expert panel, David Scotts, Sandy Gilmore and Dave Milledge, was very much appreciated. Thank you for sharing your knowledge with us. The work described in this paper was performed as part of a consultancy funded by Environment Australia. We thank Andrew Taplin and Dave Barratt from this agency for their support and encouragement, and David Meagher for commenting on an early draft of the manuscript. The hard work and dedication of the survey teams that collected the data used in this study is greatly appreciated.


  • Austin, G.E., Thomas, C.J., Houston, D.C. & Thompson, D.B.A. (1996) Predicting the spatial distribution of buzzard Buteo buteo nesting areas using a geographical information system and remote sensing. Journal of Applied Ecology, 33, 1541–1550.
  • Austin, M.P. & Heyligers, P.C. (1989) Vegetation survey design for conservation: gradsect sampling of forests in north-eastern New South Wales. Biological Conservation, 50, 13–32.
  • Austin, M.P. & Meyers, J.A. (1996) Current approaches to modelling the environmental niche of eucalypts: implications for management of forest biodiversity. Forest Ecology and Management, 85, 95–106.
  • Austin, M.P., Cunningham, R.B. & Fleming, P.M. (1984) New approaches to direct gradient analysis using environmental scalars and curve fitting procedures. Vegetatio, 55, 11–27.
  • Braithwaite, L.W., Austin, M.P., Clayton, M., Turner, J. & Nicholls, A.O. (1989) On predicting the presence of birds in Eucalyptus forest types. Biological Conservation, 50, 33–50.
  • Catling, P.C. & Coops, N. (1999) Prediction of the distribution and abundance of small mammals in the eucalypt forests of south-eastern Australia from airborne videography. Wildlife Research, 26, 641–650.
  • Chambers, J.M., Freeny, A.E. & Heiberger, R.M. (1992) Analysis of variance. Statistical Models in S (eds J.M. Chambers & T.J. Hastie), pp. 145–194. Wadsworth and Brooks/Cole, Pacific Grove, CA.
  • Clode, D. & Burgman, M. (1997) Joint Old Growth Forests Project: Summary Report. NSW National Parks and Wildlife Service and NSW State Forests, Sydney, Australia.
  • Cole, C.A. & Smith, R.L. (1983) Habitat suitability indices for monitoring wildlife populations. Transactions of the North American Wildlife and Natural Resources Conference, 48, 347–375.
  • Day, R.W. & Quinn, G.P. (1989) Comparisons of treatments after an analysis of variance in ecology. Ecological Monographs, 59, 433–463.
  • Dickman, C.R. (1991) Use of trees by ground-dwelling mammals; implications for management. Conservation of Australia's Forest Fauna (ed. D. Lunney), pp. 125–136. Royal Zoological Society NSW, Mossman, Australia.
  • Efron, B. (1982) The Jackknife, the Bootstrap, and other Resampling Plans. CBMS-NSF Regional Conference Series in Applied Mathematics No. 38. SIAM, Philadelphia, USA.
  • Ferrier, S. (1997) Biodiversity data for reserve selection: making best use of incomplete information. National Parks and Protected Areas: Selection, Delimitation and Management (eds P.J. Pigram & R.C. Sundell), pp. 315–329. University of New England, Armidale, New South Wales, Australia.
  • Hanley, J.A. & McNeil, B.J. (1983) A method of comparing the areas under relative operating characteristic curves derived from the same cases. Radiology, 148, 839–843.
  • Hastie, T.J. & Tibshirani, R. (1990) Generalized Additive Models. Chapman & Hall, London, UK.
  • Keith, D.A., Bedwood, M. & Smith, J. (1995) Vegetation of the Eden Negotiation Area of New South Wales. NSW National Parks and Wildlife Service, Sydney, Australia.
  • Lawton, J.H. & Woodroffe, G.L. (1991) Habitat and distribution of water voles: why are there gaps in a species’ range? Journal of Animal Ecology, 60, 79–91.
  • Lindenmayer, D.B., Cunningham, R.B. & Donnelly, C.F. (1993) The conservation of arboreal marsupials in the montane ash forests of the central highlands of Victoria, south-east Australia. IV. The presence and abundance of arboreal marsupials in retained linear habitats (wildlife corridors) within logged forests. Biological Conservation, 66, 207–221.
  • Lindenmayer, D.B., Cunningham, R.B., Tanton, M.T., Nix, H.A. & Smith, A.P. (1991a) The conservation of arboreal marsupials in the montane ash forests of the central highlands of Victoria, south-east Australia. III. The habitat requirements of Leadbeater's possum Gymnobelideus leadbeateri and models of the diversity and abundance of arboreal marsupials. Biological Conservation, 56, 295–315.
  • Lindenmayer, D.B., Nix, H.A., McMahon, J.P., Hutchinson, M.F. & Tanton, M.T. (1991b) The conservation of Leadbeater's possum, Gymnobelideus leadbeateri (McCoy): a case study of the use of bioclimatic modelling. Journal of Biogeography, 18, 371–383.
  • McKenzie, N.L., Belbin, D.L., Margules, C.R. & Keighery, G.J. (1989) Towards more representative reserve systems in remote areas: case-study of the Nullarbor region, Australia. Biological Conservation, 50, 239–261.
  • Mackey, B.G., Nix, H.A., Stein, J.A. & Cork, S.E. (1989) Assessing the representativeness of the wet tropics of Queensland world heritage property. Biological Conservation, 50, 279–303.
  • Manel, S., Dias, J.M., Buckton, S.T. & Ormerod, S.J. (1999) Alternative methods for predicting species distribution: an illustration with Himalayan river birds. Journal of Applied Ecology, 36, 734–747.
  • Margules, C.R. & Stein, J.L. (1989) Patterns in the distribution of species and the selection of nature reserves: an example from Eucalyptus forests in south-eastern New South Wales. Biological Conservation, 50, 219–238.
  • Mills, D.J., Norton, T.W., Parnaby, H.E., Cunningham, R.B. & Nix, H.A. (1996) Designing surveys for microchiropteran bats in complex forest landscapes – a pilot study from south-eastern Australia. Forest Ecology and Management, 85, 149–161.
  • Neave, H.M., Cunningham, R.B., Norton, T.W. & Nix, H.A. (1996) Biological inventory for conservation evaluation. III. Relationship between birds, vegetation and environmental attributes in southern Australia. Forest Ecology and Management, 85, 197–218.
  • Neave, H.M., Norton, T.W. & Nix, H.A. (1996a) Biological inventory for conservation evaluation. I. Design of a field survey for diurnal, terrestrial birds in southern Australia. Forest Ecology and Management, 85, 107–122.
  • Neave, H.M., Norton, T.W. & Nix, H.A. (1996b) Biological inventory for conservation evaluation. II. Composition, functional relationships and spatial prediction of bird assemblages in southern Australia. Forest Ecology and Management, 85, 123–148.
  • Nicholls, A.O. (1989) How to make biological surveys go further with generalised linear models. Biological Conservation, 50, 51–76.
  • Nicholls, A.O. (1991) Examples of the use of generalised linear models in analysis of survey data for conservation evaluation. Nature Conservation: Cost Effective Biological Surveys and Data Analysis (eds C.R. Margules & M.P. Austin), pp. 54–63. CSIRO, Canberra, Australia.
  • Nix, H.A. & Switzer, M.A. (1991) Kowari No. I. Rainforest Animals: Atlas of Vertebrates Endemic to Australia's Wet Tropics. Australian National Parks and Wildlife Service, Canberra, Australia.
  • NSW NPWS (1994a) Fauna of North-East NSW Forests. North East Forests Biodiversity Study Report No. 3. New South Wales National Parks and Wildlife Service, Sydney, Australia.
  • NSW NPWS (1994b) Vegetation Systems of North-East NSW Forests. North East Forests Biodiversity Study Report No. 2b. New South Wales National Parks and Wildlife Service, Sydney, Australia.
  • NSW NPWS (1995) Vertebrates of Upper North-East NSW. New South Wales National Parks and Wildlife Service, Sydney, Australia.
  • NSW NPWS (1996) Broad Old Growth Mapping Project: Final Report. New South Wales National Parks and Wildlife Service, Sydney, Australia.
  • Palma, L., Beja, P. & Rodriguez, M. (1999) The use of sighting data to analyse Iberian lynx habitat and distribution. Journal of Applied Ecology, 36, 812–824.
  • Pausas, J.G., Braithwaite, L.W. & Austin, M.P. (1995) Modelling habitat quality for arboreal marsupials in the south coastal forests of New South Wales, Australia. Forest Ecology and Management, 78, 39–49.
  • Pearce, J.L. & Ferrier, S. (2000a) An evaluation of alternative algorithms for fitting species distribution models using logistic regression. Ecological Modelling, 128, 127–147.
  • Pearce, J.L. & Ferrier, S. (2000b) Evaluating the predictive performance of habitat models developed using logistic regression. Ecological Modelling, 133, 225–245.
  • Pearce, J.L., Ferrier, S. & Scotts, D. (in press) An evaluation of the predictive performance of flora & fauna distributional models in North-East New South Wales. Journal of Environmental Management.
  • Recher, H.F. (1991) The conservation and management of eucalypt forest birds: resource requirements for nesting and foraging. Conservation of Australia's Forest Fauna (ed. D. Lunney), pp. 25–34. Royal Zoological Society NSW, Mossman, Australia.
  • Roloff, G.J. & Haufler, J.B. (1997) Establishing population viability planning objectives based on habitat potential. Wildlife Society Bulletin, 25, 895–904.
  • Scott, J.M., Tear, T.H. & Davis, F.W. (1996) Gap Analysis: A Landscape Approach to Biodiversity Planning. American Society for Photogrammetry and Remote Sensing, Bethesda, MD.
  • Scotts, D. (1991) Old-growth forests: their ecological characteristics and value to forest-dependent fauna of south-east Australia. Conservation of Australia's Forest Fauna (ed. D. Lunney), pp. 147–159. Royal Zoological Society NSW, Mossman, Australia.
  • Scotts, D. (1996) Vertebrate Fauna of the Northern Study Area – Deriving Predictive Models and Habitat Deferral Targets. New South Wales National Parks and Wildlife Service, Coffs Harbour, Australia.
  • State Forests of NSW (1989) Forest Types in New South Wales. Research Note No. 17. State Forests of New South Wales, Sydney, Australia.
  • Suárez, S., Balbontín, J. & Ferrer, M. (2000) Nesting habitat selection by booted eagles Hieraaetus pennatus and implications for management. Journal of Applied Ecology, 37, 215–223.
  • Turner, M.G., Dale, V.H. & Gardner, R.H. (1989) Predicting across scales: theory development and testing. Landscape Ecology, 3, 245–252.

Received 2 December 1999; revision received 28 September 2000



Explanatory variables used to develop models of species distribution

Variable name  Description (transformation)                          Arboreal  Reptile  Bird

Abiotic environmental variables
rain           Mean annual rainfall (log[x])
temp           Mean annual temperature
rad            Mean annual solar radiation
topo           Topographic position
mi             Moisture index (x^3)
wi             Wetness index
sd             Soil depth
fert           Soil fertility
north          Australian map grid northing
effort         Survey effort
month          Survey month

Vegetation system variables
veg            Vegetation system mapped from Landsat TM imagery
rf             Probability of rainforest at site
dry            Probability of dry sclerophyll forest at site
wet            Probability of wet sclerophyll forest at site
log            Level of logging disturbance at site
rf500          Probability of rainforest within 500 m
dry500         Probability of dry sclerophyll forest within 500 m
wet500         Probability of wet sclerophyll forest within 500 m
log500         Level of logging disturbance within 500 m
rf2k           Probability of rainforest within 2 km
dry2k          Probability of dry sclerophyll forest within 2 km
wet2k          Probability of wet sclerophyll forest within 2 km
log2k          Level of logging disturbance within 2 km
clr2k          Amount of clearing within a 2-km radius (log[x])

Habitat index variables
struct         Structural complexity at site
bark           Aerial bark accumulation at site
fruit          Fleshy fruit production at site
fine           Fine litter availability at site
coarse         Coarse litter availability at site
pred           Exposure to predators at site
nect           Nectar production at site
hollow         Hollow availability at site
eucfol         Eucalypt foliage nutrients at site
nonfol         Non-eucalypt foliage nutrients at site
struct500      Structural complexity within 500 m
bark500        Aerial bark accumulation within 500 m
fruit500       Fleshy fruit production within 500 m
fine500        Fine litter availability within 500 m
coarse500      Coarse litter availability within 500 m
pred500        Exposure to predators within 500 m
nect500        Nectar production within 500 m
holl500        Hollow availability within 500 m
euc500         Eucalypt foliage nutrients within 500 m
non500         Non-eucalypt foliage nutrients within 500 m
struct2k       Structural complexity within 2 km
bark2k         Aerial bark accumulation within 2 km
fruit2k        Fleshy fruit production within 2 km
fine2k         Fine litter availability within 2 km
coarse2k       Coarse litter availability within 2 km
pred2k         Exposure to predators within 2 km
nect2k         Nectar production within 2 km
holl2k         Hollow availability within 2 km
euc2k          Eucalypt foliage nutrients within 2 km
non2k          Non-eucalypt foliage nutrients within 2 km