METHODOLOGICAL INSIGHTS: Increasing the value of principal components analysis for simplifying ecological data: a case study with rivers and river birds
Article first published online: 9 JUN 2005
DOI: 10.1111/j.1365-2664.2005.01038.x
How to Cite
VAUGHAN, I. P. and ORMEROD, S. J. (2005), METHODOLOGICAL INSIGHTS: Increasing the value of principal components analysis for simplifying ecological data: a case study with rivers and river birds. Journal of Applied Ecology, 42: 487–497. doi: 10.1111/j.1365-2664.2005.01038.x
Publication History
- Issue published online: 9 JUN 2005
- Received 7 December 2004; final copy received 7 February 2005
- Editor: Rob Freckleton
Keywords:
- habitat data;
- multicollinearity;
- ordinal variables;
- PCA;
- qualitative data;
- river habitat survey;
- variable clustering
Summary
- 1. Two priorities for applied ecologists are to (i) maintain quantitative rigour with minimal resources and (ii) ensure that multivariate results are readily understood by end users. Habitat descriptions and other complex data present particular challenges.
- 2. Principal components analysis (PCA) is often used to reduce data and stabilize subsequent statistical analyses. Interpretation can be difficult, however, and PCA is optimized for quantitative (cf. categorical) data. Moreover, future applications (e.g. in predicting species’ distributions) require the recording of all contributing variables irrespective of cost or importance.
- 3. We considered the potential benefits of two PCA variants. First, we considered whether a cluster analysis on the correlation matrix of independent variables (i.e. variable clustering), followed by a PCA within each cluster, produced a more easily interpreted output than conventional PCA, while simultaneously reducing costs. Secondly, we considered whether a generalized PCA capable of analysing qualitative data could out-perform conventional PCA when ecological data include ordinal variables. As a case study, we used data from river habitat survey (RHS), a key applied tool in river ecology that uses more than 100 variables to describe river structure and relies heavily on three-point ordinal scales. In distribution models that linked river birds to RHS, we compared the interpretability and efficiency of variable clustering and generalized PCA against conventional PCA.
- 4. While variable clustering gave similar predictive performance to PCA, habitat factors generated by the former were more readily interpreted than conventional principal components. Of the two cluster-scoring methods, optimally scaled PCA explained 24% more variance in the first principal component and marginally improved the accuracy of distribution models.
- 5. Synthesis and applications. Initial variable clustering makes PCA more interpretable and will benefit the understanding of research results and their translation into management. Variable clustering should also reduce costs, as variables contributing to unused clusters need not be recorded in future (cf. PCA). Optimal scaling further increases the versatility of PCA: qualitative ecological data (e.g. habitat categories) can be analysed in the same way as quantitative data, with real benefits to applied research. Because cost constraints and the need for dissemination are key applied issues, our results offer a potentially important advance.
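
The first variant described in the summary (cluster the variables on their correlation matrix, then run a PCA within each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the average-linkage rule on absolute correlations, the `threshold` of 0.5, the function names and the four synthetic habitat variables are all assumptions made for illustration, and the sketch treats the data as quantitative, so it does not cover the optimally scaled PCA for ordinal variables. It uses only the Python standard library to stay self-contained; a real analysis would normally rely on a numerical library.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_variables(columns, threshold=0.5):
    """Average-linkage clustering of variables on |r| (assumed linkage rule).

    Clusters are merged until the closest pair has mean |r| below `threshold`.
    """
    names = list(columns)
    sim = {(a, b): abs(pearson(columns[a], columns[b]))
           for a in names for b in names if a != b}
    clusters = [[name] for name in names]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                links = [sim[a, b] for a in clusters[i] for b in clusters[j]]
                mean_sim = sum(links) / len(links)
                if mean_sim > best:
                    best, pair = mean_sim, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def first_component(columns, members, iterations=200):
    """First principal component of one cluster's correlation matrix.

    Uses power iteration; returns (loadings, proportion of variance explained).
    """
    k = len(members)
    corr = [[1.0 if a == b else pearson(columns[a], columns[b])
             for b in members] for a in members]
    v = corr[0][:]  # start from the first row rather than a vector of ones
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * corr[i][j] * v[j]
                     for i in range(k) for j in range(k))
    return v, eigenvalue / k


def component_scores(columns, members, loadings):
    """Site scores: loadings applied to the standardized cluster variables."""
    n = len(columns[members[0]])
    scores = [0.0] * n
    for name, weight in zip(members, loadings):
        col = columns[name]
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        for s in range(n):
            scores[s] += weight * (col[s] - mean) / sd
    return scores


# Synthetic example: two hypothetical latent gradients, two variables each.
rng = random.Random(42)
n_sites = 200
flow = [rng.gauss(0, 1) for _ in range(n_sites)]
vegetation = [rng.gauss(0, 1) for _ in range(n_sites)]


def noisy(latent):
    """Add independent noise to a latent gradient."""
    return [x + 0.5 * rng.gauss(0, 1) for x in latent]


data = {
    "riffles": noisy(flow),
    "boulders": noisy(flow),
    "bank_trees": noisy(vegetation),
    "shading": noisy(vegetation),
}

clusters = cluster_variables(data)
factors = {}
for members in clusters:
    loadings, explained = first_component(data, members)
    factors[tuple(members)] = (explained,
                               component_scores(data, members, loadings))
    print(members, round(explained, 2))
```

Each resulting factor depends only on the variables in its own cluster, which is why, under this scheme, variables belonging to clusters that a later model does not use would not need to be recorded again.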

