Assessing model sensitivity in ancestral area reconstruction using Lagrange: a case study using the Colchicaceae family
Likelihood analyses of ancestral ranges require a parameterized model that consists of a time-calibrated phylogeny, an ‘adjacency matrix’ of allowed or forbidden area connections, and an ‘area–dispersal’ matrix with probabilities for discrete periods of time. The approach is implemented in the software Lagrange. Because it can incorporate information about past continental positions, the approach has been used in historical biogeographical studies of relatively old clades. Surprisingly, no study has evaluated the interactions among these input matrices. Here we use the lily family Colchicaceae and artificial data to study the relative effect of the input matrices on final estimates.
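As a rough illustration of how an adjacency matrix constrains the analysis, the sketch below counts the candidate ancestral ranges under an unconstrained and a constrained matrix. It is a minimal sketch under stated assumptions, not Lagrange's own code: the connections in `CONSTRAINED` are hypothetical and are not the matrix used in this study, the helper names `is_connected` and `allowed_ranges` are invented for illustration, and the rule that a range is allowed only when its areas form a connected group is a simplifying assumption.

```python
from itertools import combinations

# The five areas named in the study's location statement.
AREAS = ["Africa", "Australia", "Eurasia", "North America", "South America"]

# Hypothetical constrained adjacency (illustration only, NOT the study's matrix).
CONSTRAINED = {
    frozenset(pair)
    for pair in [
        ("Africa", "Eurasia"),
        ("Africa", "Australia"),
        ("Australia", "Eurasia"),
        ("Eurasia", "North America"),
        ("North America", "South America"),
    ]
}

# Unconstrained: every pair of areas is treated as connected.
UNCONSTRAINED = {frozenset(pair) for pair in combinations(AREAS, 2)}


def is_connected(areas, adjacency):
    """Return True if the areas form one connected group under `adjacency`."""
    areas = list(areas)
    seen = {areas[0]}
    frontier = [areas[0]]
    while frontier:
        current = frontier.pop()
        for other in areas:
            if other not in seen and frozenset((current, other)) in adjacency:
                seen.add(other)
                frontier.append(other)
    return len(seen) == len(areas)


def allowed_ranges(areas, adjacency, max_size=2):
    """List candidate ranges of up to `max_size` areas (assumed connectivity rule)."""
    ranges = []
    for size in range(1, max_size + 1):
        for combo in combinations(areas, size):
            if is_connected(combo, adjacency):
                ranges.append(combo)
    return ranges


if __name__ == "__main__":
    print(len(allowed_ranges(AREAS, UNCONSTRAINED)))  # all single areas and pairs
    print(len(allowed_ranges(AREAS, CONSTRAINED)))    # only connected pairs survive
```

Because the adjacency matrix removes ranges from the state space altogether, while the dispersal matrix only rescales rates between the ranges that remain, a sketch like this gives some intuition for why the two inputs can differ in influence.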
Africa, Australia, Eurasia, North America and South America.
Using eight plastid, mitochondrial and nuclear DNA regions from 85 of the c. 280 species of Colchicaceae (representing all genera and the entire geographical range) and relevant outgroups, we obtained a well-resolved phylogeny dated with a molecular clock. We then assigned species to six geographical distributions and carried out 22 Lagrange runs in which we modified the adjacency and dispersal matrices, the latter with zero, two or four time periods and one, three or five dispersal probabilities. For a second data set, the areas at deep nodes in the empirical tree were modified by shuffling species distributions. Models were compared based on global log-likelihoods.
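The comparison of runs by global log-likelihood can be sketched as follows. This is a hedged illustration only: the function `rank_models`, the run names and the log-likelihood values are hypothetical placeholders rather than results from this study, and the threshold of 2 log-likelihood units is a commonly used convention assumed here, not a value stated in the text.

```python
def rank_models(log_likelihoods, threshold=2.0):
    """Rank runs by global log-likelihood (higher is better).

    Returns a list of (name, lnL, delta, worse) tuples, where `delta` is the
    difference from the best run and `worse` is True when `delta` reaches the
    assumed threshold of log-likelihood units.
    """
    best = max(log_likelihoods.values())
    ranked = sorted(log_likelihoods.items(), key=lambda item: item[1], reverse=True)
    return [(name, lnl, best - lnl, (best - lnl) >= threshold) for name, lnl in ranked]


# Placeholder values for illustration only; NOT taken from the study.
EXAMPLE_RUNS = {
    "unconstrained_adjacency": -120.0,
    "constrained_adjacency": -112.5,
    "constrained_adjacency_four_time_periods": -113.0,
}

if __name__ == "__main__":
    for name, lnl, delta, worse in rank_models(EXAMPLE_RUNS):
        print(f"{name}: lnL={lnl:.1f}, delta={delta:.1f}, clearly worse={worse}")
```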
The adjacency matrix strongly determined the outcome, while time slices and dispersal probability categories had minor effects. Ancestral areas reconstructed at most nodes were unaffected by the different input matrices. Colchicaceae are likely to have originated in Cretaceous East Gondwana, initially diversified in Australia (c. 67 Ma), reached southern Africa during the Palaeocene–Eocene, and from there extended their range to Southeast Asia (probably through Arabia) and then North America (through Beringia).
At least in small data sets, Lagrange models should be tested with sensitivity analyses like those carried out here, concentrating on constrained versus unconstrained adjacency matrices. It should also be good practice to report the set-up of both input matrices, not just the dispersal matrix, which is the less decisive of the two.