• expected probabilities;
• replicate sampling;
• River Invertebrate Prediction and Classification System;
• statistical power;
• stress


1. An overall aim in freshwater bioassessment is to use biological methods, metrics and indices that are precise, in that they give repeatable results between replicate samples, but that are also sensitive to changes in environmental impacts and stresses. Here we studied the effects of excluding taxa whose site-specific River Invertebrate Prediction and Classification System (RIVPACS)-type model expected probabilities are less than or equal to a threshold Pt (0.0, 0.1, 0.2, …, 0.9). We assessed the effects on the value, the precision and the power to detect biological effects of environmental stress of the observed to expected ratios (O/E) of the biotic indices used to assess the ecological status of U.K. river sites.
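The thresholding described above can be sketched in a few lines of Python. This is a minimal illustration, not the RIVPACS implementation: the function name `oe_ratios`, the dictionary inputs and the choice to compute expected ASPT as a probability-weighted mean of taxon scores are all assumptions made for the example.

```python
def oe_ratios(expected_probs, taxon_scores, observed_taxa, pt=0.0):
    """Return (O/E for number of taxa, O/E for ASPT) using only taxa with p > pt.

    expected_probs: hypothetical dict of taxon -> site-specific expected probability
    taxon_scores:   hypothetical dict of taxon -> biotic score (e.g. a BMWP score)
    observed_taxa:  set of taxa found in the sample
    """
    # Exclude taxa whose expected probability is less than or equal to Pt.
    retained = {t: p for t, p in expected_probs.items() if p > pt}
    if not retained:
        raise ValueError("no taxa retained at this threshold")

    # Expected number of taxa: sum of the retained probabilities.
    e_taxa = sum(retained.values())
    # Observed number of taxa: retained taxa that were actually found.
    found = [t for t in retained if t in observed_taxa]
    o_taxa = len(found)

    # Assumption: expected ASPT is the probability-weighted mean score.
    e_aspt = sum(p * taxon_scores[t] for t, p in retained.items()) / e_taxa
    # Observed ASPT: mean score of the retained taxa that were found.
    o_aspt = sum(taxon_scores[t] for t in found) / o_taxa if found else float("nan")

    return o_taxa / e_taxa, o_aspt / e_aspt
```

Calling the function for each Pt in 0.0, 0.1, …, 0.9 on the same sample shows how the ratio moves as locally rare taxa are dropped from both the observed and the expected side.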

2. Amongst the 614 high-quality GB RIVPACS reference sites, excluding taxa with low expected probabilities of occurrence gave less total variation (i.e. lower SD) in the estimates of O/E for number of taxa (O/ETAXA) and for the average score per taxon (O/EASPT).

3. A separate analysis of a replicated sampling study of sites from a wide range of physical types and qualities revealed that sampling variances in O/E for reference condition sites decreased as more locally rare taxa were excluded (but only up to Pt = 0.5 for O/EASPT). However, for moderately impacted and poor quality sites, estimates of both O/ETAXA and O/EASPT based on all (Pt = 0.0) or most taxa (i.e. Pt ≤ 0.3) had lower sampling variances and were more precise.
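One common way to summarise sampling variance from replicated samples is a pooled within-site variance. The sketch below assumes that estimator purely for illustration; the summary does not state which estimator was used, and the function name and input layout are hypothetical.

```python
from statistics import mean


def pooled_replicate_variance(oe_by_site):
    """Pooled within-site variance of replicate O/E values.

    oe_by_site: hypothetical dict of site id -> list of replicate O/E values.
    Sites with fewer than two replicates carry no information and are skipped.
    """
    sum_squares = 0.0
    degrees_of_freedom = 0
    for values in oe_by_site.values():
        if len(values) < 2:
            continue
        site_mean = mean(values)
        # Squared deviations of each replicate from its own site mean.
        sum_squares += sum((v - site_mean) ** 2 for v in values)
        degrees_of_freedom += len(values) - 1
    if degrees_of_freedom == 0:
        raise ValueError("need at least one site with two or more replicates")
    return sum_squares / degrees_of_freedom
```

Computing this separately for each threshold Pt and each site quality class would give the kind of comparison described above.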

4. Within a very large independent set of test sites with a wide range of perceived levels of environmental stress, increasing the threshold Pt led to systematic compression of the realised O/E scale towards unity. Specifically, with increasing threshold, O/E values >1 are on average reduced, while O/E values <1 have a tendency to be higher and closer to unity (with the exception of O/EASPT for the most severely stressed sites).

5. Accuracy and statistical power to detect environmental stress (measured by the percentage of stressed sites with O/E below the lower 10th percentile value for reference sites) were very similar using O/ETAXA for Pt up to 0.7. Using O/EASPT, power to detect overall general stress decreased slowly as Pt was increased; the rate of fall in power was slightly faster when the analysis was restricted to sites subject to moderate or severe stress from organic inputs.
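The power measure defined above can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the percentile is computed by linear interpolation between order statistics, which may differ from the method used in the study.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def detection_power(reference_oe, stressed_oe, q=10):
    """Percentage of stressed sites whose O/E falls below the reference q-th percentile."""
    threshold = percentile(reference_oe, q)
    detected = sum(1 for value in stressed_oe if value < threshold)
    return 100 * detected / len(stressed_oe)
```

Repeating the calculation with O/E values recomputed at each Pt gives a power curve against threshold for each index.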

6. Taxa which are more sensitive to (organic) stresses [i.e. have high Biological Monitoring Working Party (BMWP) scores] tend to be naturally less widespread (i.e. amongst reference sites) and thus were found to have considerably lower average site-specific expected probabilities; this may explain why the use of higher thresholds Pt can exclude more such sensitive taxa and lead to underestimation of the extent of impacts.

7. The standard U.K. RIVPACS sampling and sample processing procedures aim to identify all taxa within a sample. This may lead to a longer distribution tail of rarer (low probability) taxa than sampling methods based on a fixed-count subsample, and may therefore influence the practical effects of excluding rare taxa with low expected probabilities from bioassessments.