Filling the gap in functional trait databases: use of ecological hypotheses to replace missing data
Article first published online: 25 FEB 2014
© 2014 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Ecology and Evolution
Volume 4, Issue 7, pages 944–958, April 2014
How to Cite
Ecology and Evolution 2014; 4(7):944–958
- Issue published online: 7 APR 2014
- Manuscript Accepted: 16 JAN 2014
- Manuscript Revised: 15 JAN 2014
- Manuscript Received: 23 JUL 2013
Funding Information
- European Community's Seventh Framework Program. Grant Number: FP7/2007-2013
- MULTISWARD. Grant Number: FP7-244983
Keywords
- Functional diversity
- imputation methods
- LEDA database
- missing data
- plant functional trait
Functional trait databases are powerful tools in ecology, but most of them contain large amounts of missing values. This study tested how imputation methods affect the estimation of trait values at the species level and the subsequent calculation of functional diversity indices at the community level. Two simple imputation methods (average and median), two methods based on ecological hypotheses, and one multiple imputation method were tested on a large plant trait database, together with the influence of the percentage of missing data and of differences between functional traits. At the community level, the analysis also included the complete-case approach and three functional diversity indices calculated from grassland plant communities. At the species level, one of the methods based on ecological hypotheses was more accurate than imputation with average or median values for all traits, but the multiple imputation method was superior for most traits. The method based on functional proximity between species was the best method for traits with an unbalanced distribution, whereas the method based on relationships between traits was the best for traits with a balanced distribution. With the complete-case approach, the ranking of the grassland communities by their functional diversity indices was not robust, even at low percentages of missing data. With the imputation methods based on ecological hypotheses, functional diversity indices could be computed with up to 30% missing data without affecting the ranking of the grassland communities. The multiple imputation method performed well but, for the functional identity and range of the communities, no better than single imputation based on ecological hypotheses and adapted to the distribution of the trait values.
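To make the contrast between the simple methods and the functional-proximity idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in the study: the trait table, the species names, the unscaled Euclidean distance and the choice of two donor species are all hypothetical.

```python
from statistics import mean, median

# Hypothetical toy trait table (not data from the study):
# species -> {trait: value, or None when the value is missing}.
traits = {
    "sp_a": {"height": 0.30, "sla": 22.0},
    "sp_b": {"height": 0.35, "sla": 24.0},
    "sp_c": {"height": 1.20, "sla": 12.0},
    "sp_d": {"height": 0.32, "sla": None},  # value to impute
}


def impute_simple(table, trait, stat=mean):
    """Fill missing values of `trait` with the average (or median) of observed values."""
    observed = [row[trait] for row in table.values() if row[trait] is not None]
    fill = stat(observed)
    return {sp: row[trait] if row[trait] is not None else fill
            for sp, row in table.items()}


def impute_by_proximity(table, trait, k=2):
    """Fill missing values with the mean of the k functionally closest species.

    Assumption: proximity is the Euclidean distance over the other traits,
    left unscaled here to keep the sketch short.
    """
    other_traits = [t for t in next(iter(table.values())) if t != trait]
    filled = {}
    for sp, row in table.items():
        if row[trait] is not None:
            filled[sp] = row[trait]
            continue
        donors = []
        for donor_row in table.values():
            if donor_row[trait] is None:
                continue  # a donor must have the target trait observed
            shared = [t for t in other_traits
                      if row[t] is not None and donor_row[t] is not None]
            if not shared:
                continue
            dist = sum((row[t] - donor_row[t]) ** 2 for t in shared) ** 0.5
            donors.append((dist, donor_row[trait]))
        donors.sort()
        nearest = [value for _, value in donors[:k]]
        filled[sp] = mean(nearest) if nearest else None
    return filled


sla_mean = impute_simple(traits, "sla", stat=mean)
sla_median = impute_simple(traits, "sla", stat=median)
sla_proximity = impute_by_proximity(traits, "sla", k=2)
```

In this toy table the tall species `sp_c` pulls the average down, whereas the proximity-based fill for `sp_d` borrows only from the two short species that resemble it, which is the intuition behind preferring such a method when trait values are unevenly distributed.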
Ecological studies using functional trait databases must handle missing data with imputation methods that match their specific needs and make the most of the information available in the databases. Within this framework, this study indicates the possibilities and limits of single imputation methods based on ecological hypotheses and concludes that they could be useful when studying the ranking of communities by their functional diversity indices.
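The question of whether imputation preserves the ranking of communities can be sketched in a few lines of Python. This is a hedged illustration only: it assumes functional identity is measured as the community-weighted mean of a trait (a common choice that the text above does not spell out), and the plots, abundances and trait values are hypothetical.

```python
def community_weighted_mean(abundances, trait_values):
    """Abundance-weighted mean trait value (assumed measure of functional identity)."""
    total = sum(abundances.values())
    return sum(ab * trait_values[sp] for sp, ab in abundances.items()) / total


def functional_range(abundances, trait_values):
    """Difference between the largest and smallest trait value among present species."""
    present = [trait_values[sp] for sp, ab in abundances.items() if ab > 0]
    return max(present) - min(present)


def ranking(scores):
    """Community names ordered from the highest to the lowest index value."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical trait values: reference values versus a table in which the
# value for sp_d was missing and has been imputed.
sla_reference = {"sp_a": 22.0, "sp_b": 24.0, "sp_c": 12.0, "sp_d": 23.5}
sla_imputed = {"sp_a": 22.0, "sp_b": 24.0, "sp_c": 12.0, "sp_d": 23.0}

# Hypothetical grassland plots: species -> relative abundance.
plots = {
    "plot_1": {"sp_a": 0.5, "sp_d": 0.5},
    "plot_2": {"sp_b": 0.6, "sp_c": 0.4},
    "plot_3": {"sp_c": 0.7, "sp_d": 0.3},
}

cwm_reference = {p: community_weighted_mean(ab, sla_reference) for p, ab in plots.items()}
cwm_imputed = {p: community_weighted_mean(ab, sla_imputed) for p, ab in plots.items()}

# The index values shift slightly, but the ordering of the plots is what matters here.
ranking_preserved = ranking(cwm_reference) == ranking(cwm_imputed)
```

Comparing rankings rather than raw index values reflects the use case named in the conclusion: an imputation method can be acceptable for ordering communities even when individual imputed trait values are not exact.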