1. The predictive modelling approach to bioassessment estimates the macroinvertebrate assemblage expected at a stream site if it were in a minimally disturbed reference condition. The difference between expected and observed assemblages then measures the departure of the site from reference condition.
2. Most predictive models employ site classification, followed by discriminant function (DF) modelling, to predict the expected assemblage from a suite of environmental variables. Stepwise DF analysis is normally used to choose a single subset of DF predictor variables with a high accuracy for classifying sites. An alternative is to screen all possible combinations of predictor variables, in order to identify several ‘best’ subsets that yield good overall performance of the predictive model.
3. We applied best-subsets DF analysis to assemblage and environmental data from 199 reference sites in Oregon, U.S.A. Two sets of 66 best DF models containing between one and 14 predictor variables (that is, having model orders from one to 14) were developed, for five-group and 11-group site classifications.
4. Resubstitution classification accuracy of the DF models increased consistently with model order, but cross-validated classification accuracy did not improve beyond seventh or eighth-order models, suggesting that the larger models were overfitted.
5. Overall predictive model performance at model training sites, measured by the root-mean-squared error of the observed/expected species richness ratio, also improved steadily with DF model order. But high-order DF models usually performed poorly at an independent set of validation sites, another sign of model overfitting.
6. Models selected by stepwise DF analysis showed evidence of overfitting and were outperformed by several of the best-subsets models.
7. The group separation strength of a DF model, as measured by Wilks’Λ, was more strongly correlated with overall predictive model performance at training sites than was DF classification accuracy.
8. Our results suggest improved strategies for developing reliable, parsimonious predictive models. We emphasise the value of independent validation data for obtaining a realistic picture of model performance. We also recommend assessing not just one or two, but several, candidate models based on their overall performance as well as the performance of their DF component.
9. We provide links to our free software for stepwise and best-subsets DF analysis.