A review of statistical methods for the evaluation of aquatic habitat suitability for instream flow assessment



Habitat models serve three main purposes: first, to predict species occurrences on the basis of abiotic and biotic variables; second, to improve the understanding of species-habitat relationships; and third, to quantify habitat requirements. The use of statistical models to predict the likely occurrence or distribution of species based on relevant variables is becoming an increasingly important tool in conservation planning and wildlife management. This article aims to provide an overview of the current status of development and application of statistical methodologies for analysing species-environment associations, with a clear emphasis on aquatic habitat. It describes the main types of univariate and multivariate techniques available for the analysis of species-environment associations, and focuses specifically on assessing the strengths and weaknesses of the available statistical methods for estimating habitat suitability. A second objective of this article is to propose new approaches using existing statistical methods. A wide array of statistical habitat models has been developed to analyse habitat-species relationships. Generally, physical habitat depends on more than one variable (e.g. depth, velocity, substrate, cover), so several suitability indices must be combined to define a composite index. Multivariate approaches are more appropriate for the analysis of aquatic habitat because they inherently consider the interrelation and correlation structure of the environmental variables. Ordinary multiple linear regression and logistic regression are popular methods for modelling species and their relationships with the environment. Ridge regression and principal component regression are particularly useful when the independent variables are highly correlated. More recent regression modelling paradigms such as generalized linear models (GLMs) offer advantages in dealing with non-normal environmental variables.
Generalized additive models (GAMs) and artificial neural networks are better suited to the analysis of non-linear relationships between species distribution and environmental variables. The fuzzy logic approach offers advantages in dealing with the uncertainties that often exist in habitat modelling. Appropriate methods for the analysis of multi-species data are also presented. Finally, the few existing comparative studies of predictive modelling are reviewed, and the advantages and disadvantages of the different methods are discussed. Copyright © 2006 John Wiley & Sons, Ltd.
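
The combination of several univariate suitability indices into a composite index, noted above for variables such as depth, velocity, substrate and cover, can be illustrated with a minimal sketch. It is not taken from any study reviewed here. The three aggregation rules shown (product, geometric mean, minimum) are assumed as commonly used options, and the function name `composite_suitability` and the sample values are hypothetical.

```python
import math


def composite_suitability(indices, method="product"):
    """Combine univariate suitability indices (each in [0, 1]) into one value.

    The aggregation rules are illustrative assumptions, not a prescribed method.
    """
    if not indices:
        raise ValueError("at least one suitability index is required")
    if any(not 0.0 <= s <= 1.0 for s in indices):
        raise ValueError("suitability indices must lie in [0, 1]")

    if method == "product":
        # Treats the variables as acting independently; penalises every low index.
        return math.prod(indices)
    if method == "geometric_mean":
        # Product rescaled so the result does not shrink as more variables are added.
        return math.prod(indices) ** (1.0 / len(indices))
    if method == "minimum":
        # Limiting-factor view: the least suitable variable sets the composite value.
        return min(indices)
    raise ValueError(f"unknown method: {method}")


# Hypothetical indices for depth, velocity, substrate and cover at one stream cell.
cell = [0.8, 0.5, 1.0, 0.9]
for method in ("product", "geometric_mean", "minimum"):
    print(method, round(composite_suitability(cell, method), 3))
```

All three rules treat each variable separately and ignore the correlation among variables, which is the limitation that motivates the multivariate approaches discussed in this review.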