Distance-based multivariate analyses confound location and dispersion effects

Authors

  • David I. Warton,

    Corresponding author
    1. School of Mathematics and Statistics and Evolution & Ecology Research Centre
      Correspondence author. E-mail: david.warton@unsw.edu.au
  • Stephen T. Wright,

    1. School of Mathematics and Statistics and Evolution & Ecology Research Centre
  • Yi Wang

    1. School of Mathematics and Statistics and Evolution & Ecology Research Centre
    2. School of Computer Science and Engineering, The University of New South Wales, NSW 2052, Australia

Summary

1. A critical property of count data is its mean–variance relationship, yet this is rarely considered in multivariate analysis in ecology.

2. This study considers what is being implicitly assumed about the mean–variance relationship in distance-based analyses – multivariate analyses based on a matrix of pairwise distances – and what the effect is of any misspecification of the mean–variance relationship.

3. It is shown that distance-based analyses make implicit assumptions about the mean–variance relationship that are typically out of step with what is observed in real data, and that this has major consequences.

4. Potential consequences of this mean–variance misspecification are: confounding of location and dispersion effects in ordinations; misleading results when trying to identify the taxa in which an effect is expressed; and failure to detect a multivariate effect unless it is expressed in high-variance taxa.

5. Data transformation does not solve the problem.

6. A solution is to use generalised linear models and their recent multivariate generalisations, which are shown here to have desirable properties.
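The points above can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the analysis used in the paper: the three taxa, their mean abundances, the negative binomial model and its size parameter `k` are all hypothetical choices made for illustration. It shows that when the variance increases with the mean, squared Euclidean distances computed on raw counts are dominated by the most abundant (highest-variance) taxon.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 taxa with very different mean abundances.
means = np.array([1.0, 10.0, 100.0])
k = 2.0          # assumed negative binomial size; variance = mu + mu**2 / k
n_samples = 200

# NumPy parameterises the negative binomial by (n, p);
# choosing p = k / (k + mu) gives mean mu for each taxon.
p = k / (k + means)
counts = rng.negative_binomial(k, p, size=(n_samples, means.size))

# Sample variance per taxon: it grows with the mean.
sample_var = counts.var(axis=0, ddof=1)

# Contribution of each taxon to the squared Euclidean distance,
# summed over all pairs of samples.
diffs = counts[:, None, :] - counts[None, :, :]
sq_by_taxon = (diffs ** 2).sum(axis=(0, 1))
share = sq_by_taxon / sq_by_taxon.sum()

print("sample variances:", sample_var)
print("share of squared distance per taxon:", share)
```

Each taxon's share of the summed squared distance equals its share of the total sample variance, so under a model of this kind an effect expressed only in a low-variance taxon contributes very little to the distance matrix.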
