Aim Various methods are employed to recover patterns of area relationships in extinct and extant clades. The fidelity of these patterns can be adversely affected by sampling error in the form of missing data. Here we use simulation studies to evaluate the sensitivity of an analytical biogeographical method, namely tree reconciliation analysis (TRA), to this form of sampling failure.
Location Simulation study.
Methods To approximate varying degrees of taxonomic sampling failure within phylogenies varying in size and in redundancy of biogeographical signal, we applied sequential pruning protocols to artificial taxon–area cladograms displaying congruent patterns of area relationships. Initial trials assumed equal probability of sampling failure among all areas. Additional trials assigned weighted probabilities to each of the areas in order to explore the effects of uneven geographical sampling. Pruned taxon–area cladograms were then analysed with TRA to determine if the optimal area cladograms recovered match the original biogeographical signal, or if they represent false, ambiguous or uninformative signals.
Results The results indicate a period of consistently accurate recovery of the true biogeographical signal, followed by a nonlinear decrease in signal recovery as more taxa are pruned. At high levels of sampling failure, false biogeographical signals are more likely to be recovered than the true signal. However, randomization testing for statistical significance greatly decreases the chance of accepting false signals. The primary inflection of the signal recovery curve, and its steepness and slope depend upon taxon–area cladogram size and area redundancy, as well as on the evenness of sampling. Uneven sampling across geographical areas is found to have serious deleterious effects on TRA, with the accuracy of recovery of biogeographical signal varying by an order of magnitude or more across different sampling regimes.
Main conclusions These simulations reiterate the importance of taxon sampling in biogeographical analysis, and attest to the importance of considering geographical, as well as overall, sampling failure when interpreting the robustness of biogeographical signals. In addition to randomization testing for significance, we suggest the use of randomized sequential taxon deletions and the construction of signal decay curves as a means to assess the robustness of biogeographical signals for empirical data sets.