Biases with the Generalized Euclidean Distance measure in disparity analyses with high levels of missing data
Data archiving statement:
Data for this study, including data sets, scripts, and complete graphical results are available in the Dryad Digital Repository: https://doi.org/10.5061/dryad.4cv1421
Abstract
The Generalized Euclidean Distance (GED) measure has been extensively used to conduct morphological disparity analyses based on palaeontological matrices of discrete characters. This is in part because some implementations allow the use of morphological matrices with high percentages of missing data without needing to prune taxa for a subsequent ordination of the data set. Previous studies have suggested that this way of using the GED may generate a bias in the resulting morphospace, but a detailed study of this possible effect has been lacking. Here, we test whether the percentage of missing data for a taxon artificially influences its position in the morphospace, and whether missing data affect pre‐ and post‐ordination disparity measures. We find that this use of the GED creates a systematic bias, whereby taxa with higher percentages of missing data are placed closer to the centre of the morphospace than those with more complete scorings. This bias extends into pre‐ and post‐ordination calculations of disparity measures and can lead to erroneous interpretations of disparity patterns, especially if specimens present in a particular time interval or clade have distinct proportions of missing information. We suggest that this implementation of the GED should be used with caution, especially in cases with high percentages of missing data. Results recovered using an alternative distance measure, Maximum Observed Rescaled Distance (MORD), are more robust to missing data. As a consequence, we suggest that MORD is a more appropriate distance measure than GED when analysing data sets with high proportions of missing data.
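The mechanism behind this bias can be illustrated with a toy calculation. The sketch below is not taken from the archived scripts or from any published implementation. It assumes unweighted binary characters, and it assumes a GED variant in which every character that cannot be compared between two taxa is assigned the mean difference observed across the whole matrix. The function names (`comparable_diffs`, `mord`, `ged`, `matrix_mean_diff`) and the three example taxa are hypothetical. Under these assumptions MORD reduces to the summed differences divided by the number of comparable characters.

```python
import math

def comparable_diffs(a, b):
    """Absolute differences for characters scored (not None) in both taxa."""
    return [abs(x - y) for x, y in zip(a, b) if x is not None and y is not None]

def mord(a, b):
    """Rescaled distance over comparable characters only.

    Assumes unweighted binary characters, so the maximum possible
    difference per comparable character is 1.
    """
    diffs = comparable_diffs(a, b)
    if not diffs:
        return None  # no shared characters: distance is not calculable
    return sum(diffs) / len(diffs)

def matrix_mean_diff(matrix):
    """Mean per-character difference over all comparable pairs in the matrix."""
    pooled = []
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            pooled.extend(comparable_diffs(matrix[i], matrix[j]))
    return sum(pooled) / len(pooled)

def ged(a, b, fill):
    """Euclidean distance in which each non-comparable character
    contributes the substitute difference `fill` (an assumed GED variant)."""
    diffs = comparable_diffs(a, b)
    n_missing = len(a) - len(diffs)
    return math.sqrt(sum(d * d for d in diffs) + n_missing * fill ** 2)

# Hypothetical taxa: A and B are fully scored and maximally different;
# C matches B in every scored character but is 75% missing data.
taxon_a = [0] * 8
taxon_b = [1] * 8
taxon_c = [1, 1] + [None] * 6

fill = matrix_mean_diff([taxon_a, taxon_b, taxon_c])

ged_ab = ged(taxon_a, taxon_b, fill)
ged_ac = ged(taxon_a, taxon_c, fill)
ged_bc = ged(taxon_b, taxon_c, fill)

mord_ac = mord(taxon_a, taxon_c)  # 1.0: C differs from A wherever it is scored
mord_bc = mord(taxon_b, taxon_c)  # 0.0: C matches B wherever it is scored

print(f"GED  A-B={ged_ab:.3f}  A-C={ged_ac:.3f}  B-C={ged_bc:.3f}")
print(f"MORD A-C={mord_ac:.3f}  B-C={mord_bc:.3f}")
```

In this toy case the incomplete taxon C is scored identically to B wherever it is observed, yet the substituted mean differences give it a non-zero GED from B and a smaller GED from A than B has. Its distances are pulled toward intermediate values, which is the kind of shift toward the centre of the morphospace described above. The MORD values depend only on the characters that can actually be compared.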




