Shifting spaces: which disparity or dissimilarity metrics best summarise occupancy in multidimensional spaces?

Multidimensional analyses of traits are now a common toolkit in ecology and evolution. They are based on trait-spaces in which each dimension summarises the observed trait combinations (a morphospace or an ecospace). Observations of interest will typically occupy a subset of this trait-space, and researchers will apply one or more metrics to quantify the way in which organisms “inhabit” that trait-space. In macroevolution and ecology these metrics are referred to as disparity or dissimilarity metrics and can be generalised as space occupancy metrics. Researchers use these metrics to investigate how space occupancy changes through time, in relation to other groups of organisms, and in response to global environmental changes, such as global warming events or mass extinctions. However, the mathematical and biological meaning of most space occupancy metrics is vague, with the majority of widely used metrics lacking formal description. Here we propose a broad classification of space occupancy metrics into three categories that capture changes in volume, density, or position. We analyse the behaviour of 25 metrics to study changes in trait-space volume, density and position on a series of simulated and empirical datasets. We find that no one metric describes all of trait-space, but that some metrics are better than others at capturing certain aspects and that their performance depends on both the trait-space and the hypothesis analysed. However, our results confirm the three broad categories (volume, density and position) and allow us to relate changes in any of these categories to biological phenomena. Since the choice of space occupancy metric should be specific to the data and question at hand, we introduce moms, a user-friendly tool based on a graphical interface that allows users to both visualise and measure changes in space occupancy for any metric in simulated or imported trait-spaces. Users are also provided with tools to transform their data in space (e.g. contraction, displacement, etc.). This tool is designed to help researchers choose the right space occupancy metrics, given the properties of their trait-space and their biological question.


[Table columns: In mathematics | In ecology | In macroevolution | In this paper]

Ecologists and evolutionary biologists also often use trait-spaces with respect to the same fundamental

Figure 1: Different types of information captured by space occupancy metrics. A - Volume (e.g. sum of ranges); B - Density (e.g. average squared pairwise distances); C - Position (e.g. median distance from centroid).

build a trait-space, three broad occupancy metrics can be measured: the volume, which will approximate the amount of space occupied; the density, which will approximate the distribution in space; and the position, which will approximate the location in space (Fig. 1; Villéger et al. 2008). Of course any combination of these three aspects is always possible.
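As a minimal sketch of the three categories, the example metrics named in Fig. 1 can be computed on a small matrix of observations (rows) by dimensions (columns). The function names below are illustrative and are not taken from moms. The average squared pairwise distance is taken here over all unique pairs, which is an assumption about the normalisation.

```python
import math
from itertools import combinations
from statistics import median


def sum_of_ranges(space):
    """Volume example: sum over dimensions of (max - min)."""
    return sum(max(dim) - min(dim) for dim in zip(*space))


def average_squared_pairwise_distance(space):
    """Density example: mean squared Euclidean distance over all
    unique pairs of observations (assumed normalisation)."""
    pairs = list(combinations(space, 2))
    return sum(math.dist(a, b) ** 2 for a, b in pairs) / len(pairs)


def median_distance_from_centroid(space):
    """Position example from Fig. 1: median Euclidean distance between
    each observation and the centroid of the group."""
    centroid = [sum(dim) / len(space) for dim in zip(*space)]
    return median(math.dist(row, centroid) for row in space)


# Four observations on the corners of a 2 x 2 square (toy data).
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
volume = sum_of_ranges(square)
density = average_squared_pairwise_distance(square)
position = median_distance_from_centroid(square)
```

Each function returns a single number per group, so two groups in the same trait-space can be compared metric by metric.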

Density metrics measure the distribution of a group in the trait-space. They can be interpreted as the

the mammalian trait-space (adaptive radiation; Halliday and Goswami 2016) but more specific questions can be answered by looking at other aspects of trait-space occupancy: does the radiation expand on previously existing morphologies (elaboration, increase in density; Endler et al. 2005) or does it explore new regions of the trait-space (innovation, change in position; Endler et al. 2005)? Similarly, in ecology, if two groups occupy the same volume in the trait-space, it can be interesting to look at differences in density within these two groups: different selection pressures can lead to different densities within groups of equal volume.

Here, we provide the first interdisciplinary review of 25 space occupancy metrics that uses the broad classification of metrics into volume, density and position to capture pattern changes in trait-space. We assess the behaviour of metrics using simulations and six interdisciplinary empirical datasets covering a wide range of potential data types and biological questions. We also introduce a tool for measuring occupancy in multidimensional space (moms), which is a user-friendly, open-source, graphical interface to allow the tailored testing of metric behaviour for any use case. moms will allow workers to comprehensively assess the properties of their trait-space and the metrics associated with their specific biological question.

We tested how 25 different space occupancy metrics relate to each other, are affected by modifications of trait-space and affect group comparisons in empirical data. To do so, we performed the following steps (explained in more detail below):

1. We simulated 13 different spaces with different sets of parameters;
2. We transformed these spaces by removing 50% of the observations following four different scenarios corresponding to different empirical scenarios: randomly, by limit (e.g. expansion or reduction of niches), by density (e.g. different degrees of competition within a guild) and by position (e.g. ecological niche shift);
3. We measured occupancy on the resulting transformed spaces using eight different space occupancy metrics;
4. We applied the same space occupancy metrics to six empirical datasets (covering a range of disciplines and a range of dataset properties).
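The simulate, transform and measure steps listed above can be sketched as follows. This is a simplified illustration, not the authors' implementation. It assumes a single normal space centred on the origin. The limit rule keeps the half closest to the centre and the position rule keeps the half with the largest values on the first dimension, both of which are our simplifications. The density scenario is left out because its selection rule is not spelled out in the text. All function names are hypothetical.

```python
import math
import random
from statistics import variance


def simulate_normal_space(n_obs, n_dim, rng):
    """Step 1 (simplified): uncorrelated normal space with unit variance."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_dim)] for _ in range(n_obs)]


def remove_half(space, scenario, rng):
    """Step 2 (simplified): keep 50% of the observations.

    'random'   keeps a random half;
    'limit'    keeps the half closest to the centre (origin);
    'position' keeps the half with the largest first-dimension values.
    """
    n_keep = len(space) // 2
    if scenario == "random":
        return rng.sample(space, n_keep)
    if scenario == "limit":
        origin = [0.0] * len(space[0])
        return sorted(space, key=lambda row: math.dist(row, origin))[:n_keep]
    if scenario == "position":
        return sorted(space, key=lambda row: row[0], reverse=True)[:n_keep]
    raise ValueError(f"unknown scenario: {scenario}")


def sum_of_variances(space):
    """Step 3: one example occupancy metric (sum of per-dimension variances)."""
    return sum(variance(dim) for dim in zip(*space))


rng = random.Random(42)
space = simulate_normal_space(100, 3, rng)
reduced = {s: remove_half(space, s, rng) for s in ("random", "limit", "position")}
occupancy = {s: sum_of_variances(group) for s, group in reduced.items()}
```

In this sketch the limit rule is equivalent to choosing a radius equal to the median distance from the centre, which guarantees that exactly half of the observations are removed.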

Note that the paper contains the results for only eight metrics; the results for the additional 17 metrics are available in Supplementary material 4.

Generating spaces

We generated trait-spaces using the following combinations of size, distributions, variance and correlation:

[Table fragment] the density of pairs of observations in the trait-space

[Table fragment] Position | This paper | the ratio between the observations' position from their centroid and the centre of the trait-space. A value of 1 indicates that the observations' centroid is the centre of the trait-space

The algorithm to select ρ or D is described in greater detail in the Supplementary material 1.
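The position metric described in the table fragment above can be illustrated with a short sketch. It assumes that the ratio is the distance of each observation to the centre of the trait-space divided by its distance to the group centroid, averaged over observations, and that the centre is the origin unless stated otherwise. The direction of the ratio and the averaging are our reading of the definition, and the function name is hypothetical.

```python
import math


def average_displacement(space, centre=None):
    """Mean over observations of
    (distance to the centre of the trait-space) / (distance to the centroid).

    Equals 1 when the centroid coincides with the centre.
    Undefined if an observation sits exactly on the centroid (division by zero).
    """
    if centre is None:
        centre = [0.0] * len(space[0])  # assumed centre: the origin
    centroid = [sum(dim) / len(space) for dim in zip(*space)]
    ratios = [math.dist(row, centre) / math.dist(row, centroid) for row in space]
    return sum(ratios) / len(ratios)


# A group centred on the origin, and the same group shifted by 5 units.
centred = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
shifted = [(x + 5.0, y) for x, y in centred]
```

Under these assumptions the centred group scores exactly 1 and the shifted group scores above 1, because every observation is then further from the centre than from its own centroid.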

To measure the effect of space reduction, distribution and dimensionality on the metric, we scaled the metric to

[Figure legend: metric score from the removal displayed in blue (Limit, Density or Position); metric score from the removal displayed in orange (Limit, Density or Position); metric score from the random removal.]

Each group (orange and blue) is generated using the following algorithm: A - randomly; B - by limit (maximum and minimum limit); C - by density (high and low); and D - by position (positive and negative).

Panel E represents a typical display of the reduction results displayed in Table 5: the dots represent the median space occupancy values across all simulations for each scenario of trait-space change (Table 2)

This probability decreases as a product of the number of dimensions. Therefore, the "curse" can make the interpretation of high dimensional data counter-intuitive. For example, if a group expands in multiple dimensions (i.e. increase in volume), the actual hypervolume can decrease (Fig. 3 and Tables 6, 7).

We measured the effect of space distribution and dimensionality using an ANOVA (occupancy ∼ distribution and occupancy ∼ dimensions) by using all spaces with 50 dimensions and the uniform and normal spaces with equal variance and no correlation with 3, 15, 50, 100 and 150 dimensions (Table 2)

Empirical examples

We analysed the effect of the different space occupancy metrics on six different empirical studies covering a broad range of fields that employ trait-space analyses (palaeobiology, macroevolution, evo-devo, ecology, etc.).

For each of these studies we generated trait-spaces from the data published with the papers. We divided each

Here we tested 25 metrics of trait-space occupancy on simulated and empirical datasets to assess how each metric captures changes in trait-space volume, density and position. Our results show that the correlation between metrics can vary both within and between metric categories (Fig. 3), highlighting the importance of

Furthermore, the fact that we have such a range of correlations for normal distributions suggests that each metric can capture different summaries of space occupancy ranging from obvious differences (for metrics not strongly correlated) to subtle ones (for metrics strongly correlated).

Space shifting

Most metrics capture no changes in space occupancy for the "null" (random) space reduction (in grey in Table 5). This is a desirable behaviour for space occupancy metrics since it will likely avoid false positive errors in empirical studies that estimate biological processes from space occupancy patterns (e.g. competition

median. This is not necessarily a bad property, but it should be kept in mind that even random processes can increase or decrease the values of these metrics.
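A quick way to check how a metric responds to the random ("null") reduction is to repeat the removal many times and look at the metric relative to its value on the full space. The sketch below uses the sum of variances and the sum of ranges as illustrative metrics, which is our choice here. The space size, number of replicates and function names are assumptions rather than the settings used in the study.

```python
import random
from statistics import median, variance


def sum_of_variances(space):
    """Sum of the per-dimension sample variances."""
    return sum(variance(dim) for dim in zip(*space))


def sum_of_ranges(space):
    """Sum of the per-dimension ranges (max - min)."""
    return sum(max(dim) - min(dim) for dim in zip(*space))


def random_removal_ratios(metric, space, n_replicates, rng):
    """Metric on a random half of the space divided by the metric on the
    full space, repeated n_replicates times. Ratios near 1 mean the metric
    is insensitive to random removal."""
    full = metric(space)
    n_keep = len(space) // 2
    return [metric(rng.sample(space, n_keep)) / full for _ in range(n_replicates)]


rng = random.Random(1)
space = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
variance_ratios = random_removal_ratios(sum_of_variances, space, 100, rng)
range_ratios = random_removal_ratios(sum_of_ranges, space, 100, rng)
median_variance_ratio = median(variance_ratios)
median_range_ratio = median(range_ratios)
```

A range computed on a subset can never exceed the range of the full set, so the range ratios cannot go above 1. This is one simple reason why a metric may shift under a purely random removal.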

Regarding the changes in volume, the sum of variances and the average distance from centroid are good descriptors (Table 5). However, as illustrated in the 2D examples in Fig. 2-B, only the blue change (maximum limit; Table 5) should not result in a direct change in volume, since in that case the trait-space is merely "hollowed" out. That said, "hollowing" is harder to conceptualise in many dimensions and the metrics can still be interpreted for comparing groups (orange has a smaller volume than blue).
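The "hollowing" effect can be illustrated on a toy 2D grid, a construction of ours rather than one of the simulated spaces. Removing the central observations leaves the extent of the space unchanged (the sum of ranges stays the same), while the sum of variances and the average distance from centroid both increase. Function names are illustrative.

```python
import math
from statistics import variance


def sum_of_ranges(space):
    """Sum of the per-dimension ranges (max - min)."""
    return sum(max(dim) - min(dim) for dim in zip(*space))


def sum_of_variances(space):
    """Sum of the per-dimension sample variances."""
    return sum(variance(dim) for dim in zip(*space))


def average_distance_from_centroid(space):
    """Mean Euclidean distance between each observation and the centroid."""
    centroid = [sum(dim) / len(space) for dim in zip(*space)]
    return sum(math.dist(row, centroid) for row in space) / len(space)


# 5 x 5 grid of observations centred on the origin.
grid = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
# "Hollow out" the space: drop every observation closer than 1.5 to the centre.
hollowed = [p for p in grid if math.dist(p, (0.0, 0.0)) >= 1.5]

for name, metric in [("sum of ranges", sum_of_ranges),
                     ("sum of variances", sum_of_variances),
                     ("average distance from centroid", average_distance_from_centroid)]:
    print(f"{name}: full = {metric(grid):.3f}, hollowed = {metric(hollowed):.3f}")
```

The three metrics respond differently to the same change, which is why the comparison between groups needs to be read against what each metric actually measures.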

Regarding changes in density, the average nearest neighbour distance and the minimum spanning tree average distance consistently detect changes in density, with more precision for low density trait-spaces (in blue in Table 5). However, we can observe some degree of correlation between the changes in density and the changes in volume for most metrics picking up either signal. This could be due to the use of normally distributed spaces, where a change in density often leads to a change in volume. This is not necessarily the case with empirical data.
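The two density metrics named above can be sketched in a few lines, assuming Euclidean distances. The minimum spanning tree is built with Prim's algorithm on the complete distance graph, and its total length is averaged over the n - 1 edges. That normalisation is an assumption, and the exact definition used in the study may differ. Function names are illustrative.

```python
import math


def average_nearest_neighbour_distance(space):
    """Mean, over observations, of the distance to the closest other observation."""
    nearest = [
        min(math.dist(row, other) for j, other in enumerate(space) if j != i)
        for i, row in enumerate(space)
    ]
    return sum(nearest) / len(nearest)


def minimum_spanning_tree_average_distance(space):
    """Total length of the Euclidean minimum spanning tree (Prim's algorithm)
    divided by its number of edges, n - 1 (assumed normalisation)."""
    n = len(space)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    best[0] = 0.0          # start the tree from the first observation
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(space[u], space[v]))
    return total / (n - 1)


# Three observations on a line, and the same observations spread twice as far apart.
line = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
spread = [(2.0 * x, 2.0 * y) for x, y in line]
```

Both metrics double when the observations are spread twice as far apart, so in this toy case a drop in density and a gain in volume go together, which mirrors the correlation noted above for normally distributed spaces.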

Regarding the changes in position of the trait-space, all metrics but the average displacement seem unable to distinguish between a random change and a displacement of the trait-space (Table 5)

We insist that no metric is better than the next one and that researchers should use the most appropriate metrics based on the metric and trait-space properties as well as their specific biological question. However, following the findings of this study, we suggest several points:

First, we suggest using multiple metrics to tackle different aspects of the trait-space. This follows the same logical thinking that the mean might not be sufficient to describe a distribution (e.g. the variance might be

Third, we suggest not naming metrics after the biological aspect they describe (e.g. "disparity" or "functional dispersion") but rather after what they measure (e.g. "sum of dimensions variance"). We believe this will allow both a clearer understanding of what is measured and better communication between ecology and evolution research, where metrics can be similar but have different names (Fig. 3).

Multidimensional analyses have been acknowledged to be an essential toolkit in modern biology but can often be counter-intuitive (Chávez et al. 2001). It is thus crucial to accurately describe patterns in multidimensional trait-spaces to be able to link them to biological processes. When summarising trait-spaces, it is important to remember that a pattern captured by a specific space occupancy metric is often dependent on the properties of the trait-space and of the particular biological question of interest. We believe that having a clearer understanding of both the properties of the trait-space and the associated space occupancy metrics (e.g. using moms), as well as using novel space occupancy metrics to answer specific questions, will be of great use to study biological processes in a multidimensional world.

Acknowledgements

We thank Natalie Jones and Kevin Healy for helping with the empirical ecological datasets. We acknowledge funding from the Australian Research Council DP170103227 and FT180100634 awarded to VW.