Spatial phylogenetics

The metric called phylogenetic diversity (PD) has been employed over the last 30 years to add an evolutionary dimension to the exploration of biodiversity. However, the application of PD was until recently limited by both technology and methodology. Newly available distributional data from global museum databasing efforts, rapidly increasing coverage of DNA sequence data and improvements to computer hardware and software have enabled a new ‘big data’ approach to the application of PD‐based metrics and randomization‐based hypothesis tests called ‘spatial phylogenetics’. It can be defined most simply as turning a phylogeny into a GIS layer, which can then be used with other GIS layers to understand drivers of phylodiversity patterns and for conservation prioritization. Alpha and beta phylodiversity can be measured using different ways of representing branch lengths on a given topology (called ‘facets’), each yielding a different, interesting perspective; these perspectives are best viewed in combination. Challenges posed by available data need to be addressed through careful cleaning and gathering further data in a targeted manner. Spatial phylogenetics is only in its infancy, showing much promise but with many elements awaiting expansion to address further questions.


| INTRODUCTION
Biogeography has been overwhelmingly dominated by species-centric approaches. Species distributions are mapped to assess biodiversity, ecological interactions are discussed on a species-by-species basis and climate change is viewed as affecting a species as a whole entity. For example, a search of the top 50 Journal of Biogeography articles (ranked by average annual citations) shows that at least 40 papers take species as the main unit of measurement or analysis (https://onlinelibrary.wiley.com/doi/toc/10.1111/(ISSN)1365-2699.top-50-cited-papers). But there is much more to biodiversity, ecology and climate change than species. Species are at best just one arbitrary level on the tree of life; there are important lineages below and above the named species level that need to be studied for a complete understanding of biogeographical patterns and processes. At worst, named species are often not clades at all, in groups that have not been well studied phylogenetically. For these reasons, we need to shift away from a reliance on species taxa and take a broader phylogenetic view.
The species-centric approach might have been appropriate in Humboldt's day, when species were viewed as specially created entities. But in the Darwinian era, we should take an evolutionary approach. Darwin himself (1859) argued that the species level is nothing special; it is just some arbitrary point in the divergence of lineages that we give a name to out of convenience. While many ecologists still view species as real entities as compared to higher taxonomic levels, in truth species are only as real as genera are (and only if they are clades; Mishler, 2021). For Darwin, and hopefully for us in the modern day, the divergence process is what is important, in all its hierarchical glory.
Divergence and reticulation of lineages take place at multiple, nested levels, all of which are potentially important, and none of which should be singled out a priori as the fundamental level (Mishler, 2022).

Spatial phylogenetics is one answer to the question of how to take an evolutionary approach to address many biogeographical questions.
Instead of species, it uses the full tree of life, as far as currently known.
It can be defined most simply as turning a phylogeny into a GIS layer, and employs spatial randomizations to test hypotheses statistically.
We ask what proportion of the total branch length of a particular phylogenetic topology is present in a given pixel, employing various nuances as to how those branch lengths are measured. These nuances can involve modifying the original inferred molecular branch lengths to reflect different factors, as discussed in the 'facets' section below.
Spatial phylogenetics is fundamentally free of taxonomic ranks, which allows it to avoid the effects of splitting vs. lumping debates that plague species-counting measures of biodiversity. Given limitations of current databases, it is often the case that named species are the terminals represented in a phylogeny, but this is not necessary. We can include smaller clades if they are known or use more inclusive clades if we do not have fine-scale phylogenetic information. We can mix and match different taxonomic levels. As long as the terminals are hypothesized to represent monophyletic groups (clades), a phylogeny can logically be built to connect them.

| HISTORY
The spatial phylogenetics approach is based on the fundamental idea of phylogenetic diversity (PD) proposed by Faith (1992). Initial applications of PD in the 1990s and 2000s were hampered by several technical limitations. There were not enough distributional or DNA sequence data to represent the ranges and relationships of a large chunk of the biota present in an area. Existing phylogenetic software and computer hardware were not then up to the task of calculating phylogenies with thousands of terminals. In turn, these data and technical limitations dampened the exploration of new methods. Thus, PD was perforce initially applied in limited phylogenetic and geographical contexts, without the 'big data', statistical approaches available today.
However, given the rapid accumulation of DNA data in GenBank and geographical data in GBIF (due to efforts around the world to digitize museum data), coupled with improvements in phylogenetic software and computer hardware, it has been possible over the last decade to apply the concept of PD to large geographical regions for large branches of the tree of life. These studies, in turn, stimulated methodological advances in the metrics and the development of randomization tests to judge statistical significance, giving rise to the emergent field of spatial phylogenetics.
Interestingly, the earliest studies were on plants. It is not clear why that was, possibly because the species level is more entrenched in zoology. Botanists have always been more sceptical about species, and botanists also adopted an organized, cooperative approach to building big phylogenies much earlier than zoologists did (Mishler, 2014). Whatever the reason, there are many opportunities available to apply spatial phylogenetics to animals, given that phylogenies and distributional data are readily available. These methods have been applied in some animal groups: bats (López-Aguirre et al., 2018), birds (Garcia et al., 2019), snakes (Azevedo et al., 2020), vertebrates (Murali et al., 2021), ants (Camacho et al., 2021) and butterflies (Earl et al., 2021), and there is clearly much room for further studies on animals.
These spatial phylogenetic methods are highly appropriate for microbes as well, given the large phylogenies and corresponding spatial data that are becoming available due to metagenomic and microbiome studies. Amazingly, given that no one really thinks a traditional species concept applies to bacteria, these studies generally spend their energy trying to 'bin' data into 'species' and count those up as their only measure of biodiversity! PD-based measures thus seem even more important to use with microbes (Faith et al., 2009); hopefully, spatial phylogenetics will be applied soon to microbes, which after all comprise by far most of the PD in the tree of life.

| METHODS FOR ASSESSING ALPHA PHYLODIVERSITY
The first task in any biodiversity assessment is to understand alpha diversity: how much diversity is in each location (usually, but not necessarily, a standard-sized grid cell). To assess alpha phylodiversity, we want to know which branches (both terminal and internal) of an encompassing phylogenetic tree built for an entire study region are present locally. The classic measure PD adds up the branch lengths for the minimum path connecting all the terminals in a locality to the base of the tree.
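As a minimal sketch of the PD calculation just described, the following Python snippet sums the branch lengths on the paths that connect the terminals found in a location to the base of the tree. The toy tree, its branch lengths and the function names are hypothetical and chosen only for illustration; a real analysis would read the tree and the occurrence data from files.

```python
# Hypothetical toy tree: node -> (parent, branch length); the root has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def branches_to_root(tree, tip):
    """Yield each branch (named by its child node) on the path from tip to root."""
    node = tip
    while node in tree:
        yield node
        node = tree[node][0]


def phylogenetic_diversity(tree, tips_present):
    """Sum the lengths of all branches connecting the given tips to the root."""
    branches = set()
    for tip in tips_present:
        branches.update(branches_to_root(tree, tip))
    return sum(tree[b][1] for b in branches)


total_length = sum(length for _, length in TREE.values())
pd_cell = phylogenetic_diversity(TREE, {"A", "B"})
# PD of the location and the proportion of the whole tree it represents.
print(pd_cell, pd_cell / total_length)
```

Because branches are collected in a set, an internal branch shared by several terminals is counted only once, which is what distinguishes PD from a simple sum over terminals.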
A second task in any biodiversity assessment is to assess relative endemism: how range restricted are the entities in each location (Crisp et al., 2001). Phylogenetic endemism (PE; Rosauer et al., 2009) is an analogous measure to weighted endemism of species: it takes into account the range size of the branches (both terminal and internal) of an encompassing phylogeny that are present locally. This is fundamentally a PD-based measure, but carried out on a tree whose branch lengths have been divided by range size, thus downweighting widespread branches.
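The range weighting behind PE can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the toy tree and grid cells are hypothetical, and range size is measured simply as the number of cells in which a branch occurs (a branch occurs in a cell if any of its descendant terminals does).

```python
from collections import Counter

# Hypothetical toy tree: node -> (parent, branch length); the root has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}
# Hypothetical occurrences: grid cell -> terminals recorded there.
CELLS = {"cell1": {"A", "B"}, "cell2": {"A", "C"}, "cell3": {"A"}}


def branches_present(tree, tips):
    """Return the branches (named by child node) linking the tips to the root."""
    branches = set()
    for node in tips:
        while node in tree:
            branches.add(node)
            node = tree[node][0]
    return branches


def phylogenetic_endemism(tree, cells):
    """PE per cell: each branch length is divided by the branch's range size."""
    present = {cell: branches_present(tree, tips) for cell, tips in cells.items()}
    # Range size of a branch = number of cells in which it is present.
    range_size = Counter(b for branches in present.values() for b in branches)
    return {
        cell: sum(tree[b][1] / range_size[b] for b in branches)
        for cell, branches in present.items()
    }


pe = phylogenetic_endemism(TREE, CELLS)
print(pe)
```

In this toy case the cell holding the range-restricted long branch C scores highest, and the PE values summed over all cells equal the total length of the branches present in the study area, since each branch's length is shared out among the cells it occupies.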
Two derived measures of alpha phylodiversity have proven useful as well. Relative PD (RPD) is a ratio comparing PD measured on the original tree, with PD measured on a comparison tree that contains the same nodes but has each branch length adjusted to be of equal length. Likewise, relative PE (RPE) is a ratio comparing PE measured on the original tree, with PE measured on the same comparison tree.
In both cases, the only thing varying between the numerator and denominator of the ratio is the relative branch lengths; thus, if the ratio is large, there must be many long branches present, while if it is small there must be many short branches present.
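A small sketch of the RPD ratio may make this concrete. It assumes that PD on each tree is expressed as a proportion of that tree's total length, so that the two values are comparable; the toy tree and the function names are hypothetical.

```python
# Hypothetical toy tree: node -> (parent, branch length); the root has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def branches_present(tree, tips):
    """Return the branches (named by child node) linking the tips to the root."""
    branches = set()
    for node in tips:
        while node in tree:
            branches.add(node)
            node = tree[node][0]
    return branches


def relative_pd(tree, tips):
    """Ratio of PD on the original tree to PD on an equal-branch-length tree."""
    present = branches_present(tree, tips)
    if not present:
        raise ValueError("RPD is undefined for a location with no terminals")
    # Numerator: proportion of the original tree's total length that is present.
    total = sum(length for _, length in tree.values())
    pd_original = sum(tree[b][1] for b in present) / total
    # Denominator: the same proportion when every branch has equal length.
    pd_comparison = len(present) / len(tree)
    return pd_original / pd_comparison


print(relative_pd(TREE, {"C"}))       # long branch only: ratio above 1
print(relative_pd(TREE, {"A", "B"}))  # mostly short branches: ratio below 1
```

RPE could be sketched the same way by dividing each branch's contribution by its range size in both the numerator and the denominator.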
Finding concentrations of PD and PE, or concentrations of long and short branches, is an important step towards tests of ecological and evolutionary hypotheses, and clearly relevant to conservation concerns. However, we need a way of judging whether such concentrations are statistically significant. How are these metrics expected to be distributed on the landscape under a null model?

| STATISTICAL TESTS
Richness and PD are expected to be correlated, since every time you add a tip to a tree you add a branch. Thus, mapping raw patterns of PD tells us little beyond mapping richness. It is necessary to develop hypothesis tests that use this expected correlation as the basis for a biogeographical null hypothesis representing a neutral model of terminal taxon distribution: if it does not matter what terminal taxa occur together in a locality, then there should be a tight correlation between richness and PD. Thus, a spatial randomization can provide a useful test of a hypothesis that it does matter what terminal taxa occur together. The randomization that is usually applied is one that randomizes terminal taxon occurrences subject to two constraints: the richness of each locality remains constant, and the range size of each terminal taxon remains constant (called 'rand_structured' in Biodiverse; Laffan et al., 2010). We randomize the occurrences many times to generate a distribution of this null expectation, and compare the observed measure to it, often using a two-tailed test since both extremes of the null distribution can be of interest.
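The logic of such a test can be sketched in Python. Note that the shuffling step below is not Biodiverse's rand_structured algorithm: it is a simple pairwise swap procedure, used here as a stand-in only because it honours the same two constraints (cell richness and terminal range size stay constant). The toy tree, cells, function names and default parameters are all hypothetical.

```python
import random

# Hypothetical toy tree: node -> (parent, branch length); the root has no entry.
TREE = {
    "A": ("n1", 1.0), "B": ("n1", 1.0), "n1": ("root", 2.0),
    "C": ("n2", 1.0), "D": ("n2", 1.0), "n2": ("root", 2.0),
}
# Hypothetical occurrences: grid cell -> terminals recorded there.
CELLS = {"c1": {"A", "B"}, "c2": {"C", "D"}, "c3": {"A", "C"}, "c4": {"B", "D"}}


def pd(tree, tips):
    """Sum the lengths of all branches connecting the given tips to the root."""
    branches = set()
    for node in tips:
        while node in tree:
            branches.add(node)
            node = tree[node][0]
    return sum(tree[b][1] for b in branches)


def swap_randomize(cells, n_swaps, rng):
    """Shuffle occurrences, keeping cell richness and taxon range size fixed."""
    shuffled = {cell: set(taxa) for cell, taxa in cells.items()}
    names = sorted(shuffled)
    for _ in range(n_swaps):
        c1, c2 = rng.sample(names, 2)
        only1 = sorted(shuffled[c1] - shuffled[c2])
        only2 = sorted(shuffled[c2] - shuffled[c1])
        if only1 and only2:
            # Exchange one taxon unique to each cell; all totals are unchanged.
            t1, t2 = rng.choice(only1), rng.choice(only2)
            shuffled[c1].remove(t1)
            shuffled[c1].add(t2)
            shuffled[c2].remove(t2)
            shuffled[c2].add(t1)
    return shuffled


def pd_significance(tree, cells, n_iter=999, n_swaps=100, seed=0):
    """Per cell: fractions of null PD values below and above the observed PD."""
    rng = random.Random(seed)
    observed = {cell: pd(tree, taxa) for cell, taxa in cells.items()}
    below = dict.fromkeys(cells, 0)
    above = dict.fromkeys(cells, 0)
    for _ in range(n_iter):
        null = swap_randomize(cells, n_swaps, rng)
        for cell, taxa in null.items():
            value = pd(tree, taxa)
            below[cell] += value < observed[cell]
            above[cell] += value > observed[cell]
    return {cell: (below[c] / n_iter, above[c] / n_iter) for c, cell in zip(cells, cells)}


ranks = pd_significance(TREE, CELLS, n_iter=199)
print(ranks)
```

Under a two-tailed test at the 0.05 level, a cell would be called significantly high in PD when the first fraction exceeds 0.975 and significantly low when the second fraction exceeds 0.975. Reporting the two fractions separately keeps ties with the observed value from counting towards either tail.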
FIGURE 1 An illustration of phylogenetic measures of alpha diversity for North American butterflies and seed plants. The bottom two maps show the patterns of relative phylogenetic diversity in the two groups, indicating major differences in geographical concentrations of short branches (in red; which may indicate regions of recent diversification), and in geographical concentrations of long branches (in blue; which may indicate regions containing refugial taxa). Prepared by Chandra Earl and reprinted by permission from Earl et al. (2021).

The same randomization can be used to test hypotheses using all the alpha phylodiversity metrics described above. For example, significantly high PD means that the taxa occurring together are more distantly related to each other than expected by chance ('phylogenetic overdispersion'; Webb et al., 2002), while significantly low PD means that the taxa occurring together are more closely related to each other than expected by chance ('phylogenetic clustering'; Webb et al., 2002). The former could indicate that an ecological process of competitive exclusion among close relatives has occurred, while the latter could indicate a different ecological process of habitat filtering, when close relatives have the same habitat preferences and are drawn to the same locations (Webb et al., 2002). A different hypothesis is tested with RPD (see Figure 1, which shows two examples of RPD significance). Significantly high RPD means that there is a concentration of long branches in a location, while significantly low RPD means that there is a concentration of short branches in a location. The former could indicate that a region is a refuge containing a number of long branches pruned by extinction, while the latter could indicate a region where recent diversification has happened.
Of course, while such patterns of significance are important, they are only part of a complete explanation that also requires study of the traits of taxa, their evolution and their ecological interactions.
Employing the same randomization, PE and RPE are assessed together in a two-step approach called Categorical Analysis of Neo- and Paleo-Endemism. The first step is a one-tailed test of PE, to find locations that are significantly high in PE (i.e. centres of endemism), as measured on either the original tree or the comparison tree described above. The reason why both trees are used is to allow range-restricted short branches to have a chance to be included: since PE confounds the original branch length with endemism, it is biased towards finding range-restricted long branches when measured on the original tree. The second step is applied only to those locations that were significant in the first step, and is a two-tailed test using RPE (see Figure 2). Significantly high RPE means that there is a concentration of range-restricted long branches in a location (i.e. a centre of paleoendemism), while significantly low RPE means that there is a concentration of range-restricted short branches in a location (i.e. a centre of neoendemism). Locations that are neither significantly high nor low in the second step are called centres of mixed endemism, and if they are highly significant in PE they have been called centres of super-endemism.
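The two-step decision rule can be written down as a short function. This is a sketch under stated assumptions: each input is a randomization rank in [0, 1], meaning the fraction of null values that the observed value exceeds; the 0.05 level is a hypothetical default; and 'highly significant in PE' is interpreted here as both PE ranks passing a stricter 0.01 level, which is an assumption of this example rather than something fixed by the text.

```python
def canape_category(pe_original, pe_comparison, rpe, alpha=0.05, super_alpha=0.01):
    """Classify one location from three randomization ranks in [0, 1].

    Each rank is the fraction of null values that the observed value exceeds.
    """
    # Step 1: one-tailed test for high PE on either the original or comparison tree.
    if max(pe_original, pe_comparison) <= 1 - alpha:
        return "not significant"
    # Step 2: two-tailed test on RPE, only for locations that passed step 1.
    if rpe > 1 - alpha / 2:
        return "paleo-endemism"
    if rpe < alpha / 2:
        return "neo-endemism"
    # Assumption: 'highly significant' means both PE ranks pass super_alpha.
    if min(pe_original, pe_comparison) > 1 - super_alpha:
        return "super-endemism"
    return "mixed endemism"


for ranks in [(0.50, 0.60, 0.99), (0.97, 0.50, 0.99), (0.50, 0.97, 0.01),
              (0.97, 0.96, 0.50), (0.995, 0.999, 0.50)]:
    print(ranks, canape_category(*ranks))
```

The first example shows why step 2 is conditional: an extreme RPE rank is ignored when the location is not a centre of endemism to begin with.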
The significance tests of these metrics each tell you something different about ecology, evolution and biogeographical history and are best examined in combination. For example, Thornhill et al. (2016, 2017) were able to use comparisons among the metrics to characterize different regions of Australia and California, respectively, both for academically interesting insights into processes driving biodiversity patterns and for applied conservation purposes. Kling et al. (2018) made further comparisons among California locations, applied an algorithm that used a phylogenetic measure of complementarity and produced a rigorous conservation prioritization.

| METHODS FOR ASSESSING BETA PHYLODIVERSITY
Even once patterns of alpha diversity have been established, it is important to study patterns of beta diversity: what are the differences among locations? To assess beta phylodiversity, we want to know which branches (both terminal and internal) of an encompassing phylogenetic tree are shared between two local regions (called 'phyloturnover'). The metrics that are used are modifications of familiar species turnover measures such as the Jaccard, Sørensen and Simpson indices. But instead of counting how many species are shared or not shared between two locations, the phyloturnover equivalent of these metrics counts the lengths of the tree branches that are shared or not shared between two locations (Faith et al., 2009; Graham & Fine, 2008; Laffan, Rosauer, et al., 2016).
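As an illustration of how the familiar indices carry over, the sketch below computes Jaccard-style and Sørensen-style phyloturnover (as dissimilarities) by replacing counts of shared and unshared species with summed lengths of shared and unshared branches. The toy tree and function names are hypothetical, and the function assumes that at least one of the two locations contains a terminal.

```python
# Hypothetical toy tree: node -> (parent, branch length); the root has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def branches_present(tree, tips):
    """Return the branches (named by child node) linking the tips to the root."""
    branches = set()
    for node in tips:
        while node in tree:
            branches.add(node)
            node = tree[node][0]
    return branches


def phylo_turnover(tree, tips1, tips2):
    """Return (Jaccard, Sorensen) dissimilarities based on branch lengths."""
    b1 = branches_present(tree, tips1)
    b2 = branches_present(tree, tips2)

    def length(branches):
        return sum(tree[b][1] for b in branches)

    shared = length(b1 & b2)   # branch length found in both locations
    only1 = length(b1 - b2)    # branch length found only in the first
    only2 = length(b2 - b1)    # branch length found only in the second
    jaccard = 1 - shared / (shared + only1 + only2)
    sorensen = 1 - 2 * shared / (2 * shared + only1 + only2)
    return jaccard, sorensen


print(phylo_turnover(TREE, {"A", "B"}, {"A", "C"}))
```

Two locations with no terminals in common can still share deep internal branches, so their phyloturnover can be well below 1 even when species turnover is complete.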
Phyloturnover using the original tree could also be called PD turnover. Laffan, Rosauer, et al. (2016) presented a new measure of phyloturnover that uses a range-weighted tree, as discussed above when introducing PE. This is a tree in which the original branch lengths have been divided by their range size, thus down-weighting branches that are widespread on the landscape. Phyloturnover using the range-weighted tree could also be called PE turnover. This appears to be a useful metric for such purposes as establishing bioregions (e.g. Figure 3), because it fits our intuitions in that for locating biotic breaks we are not interested in lineages found all over the map but rather those with limited ranges.
Thornhill et al., 2017). This is because branch lengths tend to be tiny for the branches that differ among topologies that are close competitors according to an optimality criterion such as maximum likelihood; thus, they do not strongly affect PD-based measures.

| FACETS OF PHYLODIVERSITY
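There are an indefinitely large number of potential facets for any given topology; three major ones are discussed here.
1. Phylograms, where the branch lengths are the original inferred molecular branch lengths. Significance of PD measured using this facet relates to 'feature diversity' (as discussed by Faith, 1992).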
2. Chronograms, where the branch lengths are scaled by the inferred time that elapsed on them. Unlike phylograms, chronograms are ultrametric, meaning that all tips reach the present at the same time, so sister groups always have the same branch length. There are an indefinite number of possible chronograms for the same topology, based on different methods for time-calibrating phylogenies. It is also important to note that the raw data used in the calibration inference include the original molecular branch lengths-those lengths are being pushed and pulled to make the tree ultrametric. So this facet is not entirely independent of the phylogram. Significance of PD measured using this facet relates to temporal diversity, that is the survival time of the lineages present (Kling et al., 2018).
3. Cladograms, where the branch lengths are scaled so as to make them of equal length, that is each branch has an equal proportion of the total length of the tree (i.e. the facet used in the denominator of the RPD and RPE ratio). Significance of PD measured using this facet relates to net diversification, that is how many divergence events are represented among the lineages present (Kling et al., 2018). This is a net measure, of course, since extinction can also affect the number of divergences observed.
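The relationship among the three facets can be illustrated with a short sketch. The two toy trees and their branch lengths are hypothetical; the chronogram is simply supplied as data, since time calibration itself is outside the scope of the example. The code checks the ultrametric property that separates chronograms from phylograms and derives the cladogram facet by giving every branch an equal share of the total length.

```python
# Hypothetical facets of one topology: node -> (parent, branch length).
PHYLOGRAM = {   # lengths in inferred molecular change
    "A": ("n1", 1.0), "B": ("n1", 0.4), "n1": ("root", 2.0), "C": ("root", 3.5),
}
CHRONOGRAM = {  # lengths in inferred elapsed time
    "A": ("n1", 5.0), "B": ("n1", 5.0), "n1": ("root", 15.0), "C": ("root", 20.0),
}


def root_to_tip_distance(tree, tip):
    """Sum branch lengths on the path from a tip back to the root."""
    total = 0.0
    node = tip
    while node in tree:
        total += tree[node][1]
        node = tree[node][0]
    return total


def is_ultrametric(tree, tol=1e-9):
    """True if every tip is the same distance from the root."""
    internal = {parent for parent, _ in tree.values()}
    tips = [node for node in tree if node not in internal]
    depths = [root_to_tip_distance(tree, tip) for tip in tips]
    return max(depths) - min(depths) <= tol


def to_cladogram(tree):
    """Same topology, but each branch gets an equal share of the total length."""
    share = 1.0 / len(tree)
    return {node: (parent, share) for node, (parent, _) in tree.items()}


print(is_ultrametric(PHYLOGRAM), is_ultrametric(CHRONOGRAM))
print(to_cladogram(PHYLOGRAM))
```

Only the branch lengths differ between the facets; the parent of every node is the same in all three, so any PD-based metric can be recomputed on each facet without changing the occurrence data.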

FIGURE 3 An illustration of using range-weighted phylogenetic turnover to detect biotic regions for Chilean vascular plants. The colours in the cluster diagram correspond to the colours on the map, and indicate locations that share a similar part of the phylogeny, weighting range-restricted branches. Reprinted by permission from Scherson et al. (2017).

Just as described above for different metrics measured on the same facet, significance tests of the same metric measured on different facets of the tree each tell you something different about ecology, evolution and biogeographical history. Thus, to gain maximum insights on the distribution of biodiversity and its causes, it is best to look at all the different metrics on different facets and interpret results comparatively (e.g. see Figure 5). Phylograms tell you about trait evolution, chronograms tell you about elapsed time and cladograms tell you about divergence events. When it comes to conservation assessment, it is important to look at PD and PE on these three main types of facets, each of which represents a different goal for conservation. Conservation based on phylograms emphasizes preserving genetic diversity, conservation based on chronograms emphasizes preserving the combined experience of the lineages present, and conservation based on cladograms emphasizes preserving places where high diversification in the future might be predicted (Kling et al., 2018).

| DATA CHALLENGES
As with any 'big data' research, one must be cognizant of potential issues with data quality. Many of the challenges faced by spatial phylogenetics are not unique to it. Any spatial analysis of biodiversity, including all traditional species-based methods, relies on having reasonably accurate and unbiased spatial data. The two main kinds of distributional data, recorded observations (such as iNaturalist records) and vouchered data taken from museum/herbarium specimens, are both often spatially biased in terms of collecting locations and times (Daru et al., 2018); they are both also fraught with taxon name-matching issues. They have contrasting advantages and disadvantages, however: iNaturalist and other recent observations have a more accurate georeference than older museum/herbarium specimens, yet have a more uncertain identification in many cases since there is no voucher specimen to examine and dissect. All types of spatial data need careful cleaning to fix georeferencing errors and misidentifications, as well as to account for taxonomic synonymy problems (Mishler et al., 2020).

FIGURE 4 An illustration of three major facets of phylodiversity for the California vascular plants. The branch connections are the same in each, but the branch lengths have been adjusted to reflect different measures as explained in the text (the tree topology is from Thornhill et al., 2017).
Once the location data are as clean as possible, one popular approach is to then use them as the basis for ecological niche modelling to try to fill in gaps in the record and come up with reasonable hypotheses of taxon distributions (Sillero, 2011). While this approach is useful, it does assume that the distribution of a taxon is controlled by the macroenvironmental factors usually used to model the range, which may not be the case (i.e. rather than macroenvironment, a taxon's distribution could be caused by microenvironment, soil type or biotic interactions). Thus, Thornhill et al. (2017) recommended doing analyses twice, using point data and niche models, respectively, to check for robustness of results.
A challenge unique to spatial phylogenetics, as compared to traditional species-based methods, is the need for reasonably accurate and unbiased phylogenetic sampling. Sampling of the phylogeny does not have to be 'complete'; that goal is not achievable in any case given that many extant lineages are likely unknown and most lineages in a clade are now extinct. But sampling of the phylogeny should be unbiased, rather than concentrated in only a few major clades; that is, as one goes up the tree from the base, node by node, the percent of the known terminal taxa sampled should be about even for both sister clades descended from each node. Some have been tempted to add unsampled species to the phylogeny based on their generic placement (e.g. Cai et al., 2023; Qian et al., 2021) but this not only adds nothing to the analysis, it introduces artefacts to the randomization tests. The only valid route to improve sampling of terminal taxa is to add new sequencing data.
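One way to audit the sister-clade balance described above is sketched below. It assumes that a reference hierarchy containing all known terminal taxa is available (for example from a taxonomy), which is used only to count what has and has not been sampled and is not itself analysed; the toy hierarchy, the set of sampled terminals and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical reference hierarchy of all known terminals: child -> parent.
PARENT = {
    "A": "n1", "B": "n1",
    "C": "n2", "n3": "n2",
    "D": "n3", "E": "n3",
    "n1": "root", "n2": "root",
}
# Hypothetical set of terminals that have sequence data in the phylogeny.
SAMPLED = {"A", "B", "C"}


def sampling_balance(parent_of, sampled):
    """For each internal node, the fraction of known tips sampled per child clade."""
    grouped = defaultdict(list)
    for child, parent in parent_of.items():
        grouped[parent].append(child)
    children = dict(grouped)

    def counts(node):
        """Return (known tips, sampled tips) in the clade descending from node."""
        if node not in children:
            return 1, int(node in sampled)
        known = hit = 0
        for child in children[node]:
            k, h = counts(child)
            known += k
            hit += h
        return known, hit

    report = {}
    for node, kids in children.items():
        report[node] = {}
        for kid in kids:
            known, hit = counts(kid)
            report[node][kid] = hit / known
    return report


for node, fractions in sampling_balance(PARENT, SAMPLED).items():
    print(node, fractions)
```

Nodes where the fractions for the sister clades differ strongly (here, the root and n2) point to the clades where new sequencing effort would do most to even out the sampling.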
Gathering new occurrence data to redress spatial biases should be targeted carefully to fill in the geographical ranges of terminal taxa and eliminate any regional sampling biases. Gathering new sequence data to redress phylogenetic biases should follow priorities based on the sister clade balancing approach mentioned above. That said, it appears the strongest results of spatial phylogenetics can be quite robust to sampling biases (Scherson et al., 2017), so one should not hesitate to do preliminary analyses with the data at hand while working to gather more.

| FUTURE DIRECTIONS
This is a new field, and there are many directions in which it should be expanded. New facets are certainly possible and could be useful for different purposes. For example, systematic exploration of phylograms representing different types of characteristics would be useful for exploring morphological, chemical or physiological differences. There is an important distinction between diversity (how many lineages are present in a place) and disparity (how different are they from each other; Guillerme et al., 2020). Spatial phylogenetics would provide an appropriate evolutionary framework to measure and compare these distinct but related concepts in the future.
Randomization approaches need much further exploration. The best approach to use for spatial randomizations will certainly vary depending on the specific biogeographical question. The rand_structured approach described above, which has been used in most studies to date and seems rather robust, is suitable for cases where it is reasonable to postulate as a null expectation that any terminal taxon might have been expected to occur anywhere in the study area. When large regions are taken into account, at scales where dispersal limitations and historical factors become more important in determining distributions, this assumption is probably not realistic. In such cases, the randomization method should be spatially structured, but it is not clear exactly how. Two ways of spatially structuring the randomization have been explored (both are implemented in Biodiverse; Laffan et al., 2010; http://shawnlaffan.github.io/biodiverse/). One also could restrict the randomization to only happen within biotic regions (e.g. Mishler et al., 2020), but then one has to be able to objectively define the constraining regions. There might be a danger of circularity if, for example, one of the goals of a study was to define biotic regions. So this should be an active area of future investigation.

FIGURE 5 Comparing the three major facets of phylodiversity for the California vascular plants. The axes of the cube at left are the three facets; the selection of taxa present in a location could fall anywhere in this 3D space and thus lead to different interpretations, some of which are indicated by different shades of colour on the map at right. The cube, map and analysis are from Kling et al. (2018), using the tree from Thornhill et al. (2017), and used with permission. Diagram was kindly prepared by Matthew Kling.
Because they are weighted by the range size of lineages, PE measures are obviously constrained by the choice of region for study (Daru et al., 2020). It is often seen that some centres of endemism are close to the edges of a study area, which may be at least partly because of lineages that are range-limited in that study area yet occur more widely outside. There is nothing wrong with studying local patterns of endemism, of course, as long as one is aware of this issue. Very often a management unit, such as a state within the United States, does care about lineages that are range-restricted within its boundaries regardless of whether they occur elsewhere. One can partially correct for such an edge effect by running an analysis based solely on terminal taxa that are completely restricted to the study area (e.g. Thornhill et al., 2017), or by including the global range size of a terminal taxon in the PE calculation within a local study area (as discussed by Mishler et al., 2020). But neither approach fully deals with this issue, since it is not only the terminal taxa in the local study region that affect global PE, but also their relatives that are not present in the study region. Being able to use a global estimate of range size for all clades awaits global spatial phylogenetic analyses, which are being contemplated but are hampered by the lack of good spatial data from many parts of the world.
So far, spatial phylogenetic methods have only been applied at fairly large geographical and phylogenetic scales, and there has been a disconnect with approaches called 'landscape genetics' or 'phylogeography'. This disconnect has been largely caused by the distinction made in those approaches below the species level. The methods they use within species are often phenetic (i.e. based on genetic distances), such as AMOVA (Meirmans, 2012) or structure plots (Ramasamy et al., 2014), rather than phylogenetic (based on gene trees). Since spatial phylogenetics does not make this distinction between within-species versus between-species analyses, a unified set of phylogeny-based methods can in principle be applied to all scales allowing a unified approach. However, there is work to do before this can happen.
To apply spatial phylogenetics at very fine spatial scales, fine-scale distributional data are required. If available from surveys or plot data, quite interesting studies could be done on small areas.
For example, it could be very useful to know the locations having the most unusual concentrations of local PD and PE when managing conservation within a single park. There is also no reason why spatial phylogenetics cannot be applied at very fine phylogenetic scales. It is the case that as one works at and below the traditional species level one tends to see more incongruence among gene trees than at higher levels. That need not be a hindrance; gene trees are still phylogenies having branch lengths. PD and PE could be assessed on incongruent gene trees present in the genomes of the organisms at a location, and then summed to get an ensemble score. An exciting area for future development will be applications of spatial phylogenetics at the population level, at fine scales for both geography and phylogeny.
So far, these methods have mostly been applied to spatial patterns of biodiversity. But the methods work just as well for temporal comparisons. The alpha and beta phylodiversity measures discussed above could be used to look at changes over time, for example, before and after future projected climate change (as was done by González-Orozco et al., 2016), giving rise to what might be called 'temporal phylogenetics'! Even though spatial phylogenetics has only been applied to neontological data to date, it would be exciting to extend the methods to palaeontological data, to track spatial and phylogenetic changes over time, for example, before and after major extinction events in earth history. Unlike standard species-counting methods, there is no need to have species be the terminal units on the phylogeny, which lends itself to palaeontological data, where getting to a species-level identification is often difficult.
Thus, extending spatial phylogenetics to both macroevolutionary scales on the one hand, and microevolutionary scales on the other hand, are important goals for the future. The exponential growth of genomic data, and the ever-increasing knowledge of organismal distributions, bodes well. Spatial phylogenetics is one of the areas of biogeography that bridge across the fields of systematics, evolutionary biology and ecology, and thus can provide interdisciplinary integration.