Making better Maxent models of species distributions: complexity, overfitting and evaluation
Article first published online: 6 DEC 2013
© 2013 John Wiley & Sons Ltd
Journal of Biogeography
Volume 41, Issue 4, pages 629–643, April 2014
How to Cite
Radosavljevic, A. and Anderson, R. P. (2014), Making better Maxent models of species distributions: complexity, overfitting and evaluation. Journal of Biogeography, 41: 629–643. doi: 10.1111/jbi.12227
- Issue published online: 17 MAR 2014
- Manuscript Accepted: 26 AUG 2013
- Manuscript Revised: 7 AUG 2013
- Manuscript Received: 15 MAR 2013
- US National Science Foundation. Grant Numbers: NSF DEB-0717357, DEB-1119915
- International Biogeography Society
- Cross validation
- Heteromys
- Maxent
- South America
Models of species niches and distributions have become invaluable to biogeographers over the past decade, yet several outstanding methodological issues remain. Here we address three critical ones: selecting appropriate evaluation data, detecting overfitting, and tuning program settings to approximate optimal model complexity. We integrate solutions to these issues for Maxent models, using the Caribbean spiny pocket mouse, Heteromys anomalus, as an example.
Location: North-western South America.
We partitioned data into calibration and evaluation datasets via three variations of k-fold cross-validation: randomly partitioned, geographically structured and masked geographically structured (which restricts background data to regions corresponding to calibration localities). Then, we carried out tuning experiments by varying the level of regularization, which controls model complexity. Finally, we gauged performance by quantifying discriminatory ability and overfitting, as well as via visual inspections of maps of the predictions in geography.
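The three partitioning schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of longitudinal bands as the geographic structure, and the (longitude, latitude) tuple format are all assumptions made for illustration, and the masking step is simplified to a longitude range.

```python
import random


def random_folds(localities, k, seed=0):
    """Assign each locality to one of k folds at random, with near-equal fold sizes."""
    rng = random.Random(seed)
    order = list(range(len(localities)))
    rng.shuffle(order)
    folds = [0] * len(localities)
    for rank, idx in enumerate(order):
        folds[idx] = rank % k
    return folds


def geographic_folds(localities, k):
    """Assign folds as k west-to-east bands holding near-equal numbers of localities.

    Assumption: localities are (longitude, latitude) tuples, and longitude
    bands stand in for whatever geographic structure a real study would use.
    """
    n = len(localities)
    order = sorted(range(n), key=lambda i: localities[i][0])
    folds = [0] * n
    for rank, idx in enumerate(order):
        folds[idx] = rank * k // n
    return folds


def masked_background(background, localities, folds, held_out):
    """Drop background points that fall in the held-out band's longitude span.

    This keeps background data only in regions that correspond to the
    calibration localities (a simplification of geographic masking).
    """
    lons = [lon for (lon, _), f in zip(localities, folds) if f == held_out]
    west, east = min(lons), max(lons)
    return [pt for pt in background if not (west <= pt[0] <= east)]


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    locs = [(-72.5, 8.1), (-70.2, 9.4), (-67.9, 10.3), (-66.0, 10.5),
            (-64.1, 10.2), (-62.3, 10.6)]
    bg = [(-73.0, 7.0), (-68.0, 9.0), (-65.0, 10.0), (-61.0, 10.0)]
    folds = geographic_folds(locs, k=3)
    for held_out in range(3):
        kept = masked_background(bg, locs, folds, held_out)
        print(held_out, folds, len(kept))
```

In each iteration one fold is held out for evaluation and the rest are used for calibration. With random folds, evaluation localities sit close to calibration localities in space; with geographic folds they do not, which is what makes the evaluation data more independent.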
Performance varied among data-partitioning approaches and among regularization multipliers. The randomly partitioned approach inflated estimates of model performance and the geographically structured approach showed high overfitting. In contrast, the masked geographically structured approach allowed selection of high-performing models based on all criteria. Discriminatory ability showed a slight peak in performance around the default regularization multiplier. However, regularization levels two to four times higher than the default yielded substantially lower overfitting. Visual inspection of maps of model predictions coincided with the quantitative evaluations.
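The trade-off between discriminatory ability and overfitting can be made concrete with a short sketch. The abstract does not name the metrics used, so the choices here are assumptions: discrimination is measured with a rank-based AUC (presence versus background scores), and overfitting with the difference between calibration and evaluation AUC. The function names are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence scores higher than a random background point.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def auc_difference(calib_presence, eval_presence, background):
    """Calibration AUC minus evaluation AUC; larger values suggest more overfitting."""
    return auc(calib_presence, background) - auc(eval_presence, background)
```

In a tuning experiment, these two quantities would be computed for each regularization multiplier and each held-out fold, then averaged across folds, so that a setting can be chosen that keeps evaluation AUC high while keeping the difference small.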
Species-specific tuning of model parameters can improve the performance of Maxent models. Further, accurate estimates of model performance and overfitting depend on using independent evaluation data. These strategies for model evaluation may be useful for other modelling methods as well.