Keywords:

  • Aesculus;
  • biogeography;
  • DIVA;
  • fossil wildcards;
  • MrBayes;
  • phylogenetic uncertainty

Abstract

Abstract  We propose a simple statistical approach for using Dispersal–Vicariance Analysis (DIVA) software to infer biogeographic histories without fully bifurcating trees. In this approach, ancestral ranges are first optimized for a sample of Bayesian trees. The probability P of an ancestral range r at a node is then calculated as $P(r_Y) = \sum_{t=1}^{n} F(r_Y)_t \, P_t$, where Y is a node, and F(rY) is the frequency of range r among all the optimal solutions resulting from DIVA optimization at node Y, t is one of n topologies optimized, and Pt is the probability of topology t. Node Y is a hypothesized ancestor shared by a specific crown lineage and the sister of that lineage “x”, where x may vary due to phylogenetic uncertainty (polytomies and nodes with posterior probability <100%). Using this method, the ancestral distribution at Y can be estimated to provide inference of the geographic origins of the specific crown group of interest. This approach takes into account phylogenetic uncertainty as well as uncertainty from DIVA optimization. It is an extension of the previously described method called Bayes-DIVA, which pairs Bayesian phylogenetic analysis with biogeographic analysis using DIVA. Further, we show that the probability P of an ancestral range at Y calculated using this method does not equate to pp*F(rY) on the Bayesian consensus tree when both variables are <100%, where pp is the posterior probability and F(rY) is the frequency of range r for the node containing the specific crown group. We tested our DIVA-Bayes approach using Aesculus L., which has major lineages unresolved as a polytomy. We inferred the most probable geographic origins of the five traditional sections of Aesculus and of Aesculus californica Nutt. and examined range subdivisions at parental nodes of these lineages. Additionally, we used the DIVA-Bayes data from Aesculus to quantify the effects on biogeographic inference of including two wildcard fossil taxa in phylogenetic analysis. 
Our analysis resolved the geographic ranges of the parental nodes of the lineages of Aesculus with moderate to high probabilities. The probabilities were greater than those estimated using the simple calculation of pp*F(rY) at a statistically significant level for two of the six lineages. We also found that adding fossil wildcard taxa in phylogenetic analysis generally increased P for ancestral ranges including the fossil's distribution area. The ΔP was more dramatic for ranges that include the area of a wildcard fossil with a distribution area underrepresented among extant taxa. This indicates the importance of including fossils in biogeographic analysis. Examination of range subdivision at the parental nodes revealed potential range evolution (extinction and dispersal events) along the stems of A. californica and sect. Parryana.

Studies in historical biogeography based on phylogeny have accumulated rapidly due to the recent increase in availability of molecular phylogenetic data (see Xiang et al., 1998a, 2004, 2005, 2006; Wen, 1999; Sanmartín et al., 2001; Donoghue & Smith, 2004; Sanmartín & Ronquist, 2004; Soltis et al., 2006). One of the most widely used methods of inferring biogeographic histories based on phylogeny is Dispersal–Vicariance Analysis (DIVA) (Ronquist, 1997, 2001). DIVA is a method of reconstructing biogeographic history that falls under the broad heading of event-based methods, in which biogeographic processes that help drive speciation are incorporated a priori into the methodology (Ronquist, 1996, 1997; Sanmartín et al., 2001). Specifically, DIVA uses a parsimony approach that minimizes extinctions and dispersals and assumes vicariance as the null hypothesis (Ronquist, 1996). The program estimates distributions of hypothesized ancestors at internal nodes on a fully bifurcating phylogenetic tree based on the distributions of terminal taxa (Ronquist, 1996). Results of biogeographic analysis using DIVA are optimized ancestral ranges at each internal node under the parsimony criterion. Frequently, multiple equally parsimonious biogeographic pathways (MP pathways) are obtained from a given tree, and these are summarized as multiple optimal solutions at some or all internal nodes of the tree. Although new model-based likelihood and Bayesian methods of reconstructing biogeographic histories have recently been developed (Ree et al., 2005; Ree & Smith, 2008; Sanmartín et al., 2008; also Lemmon & Lemmon, 2008), a quick, advanced search using Google Scholar for 2008 published reports containing the words “biogeography” and “DIVA” illustrates that DIVA continues to be widely used in historical biogeographic studies. The primary advantage of DIVA over the likelihood method of Ree et al. (2005) is that less prior information is required (Ree et al., 2005; Ree & Smith, 2008). 
DIVA is also fast, simple, and user-friendly and gives results congruent to the model-based likelihood method Lagrange (http://code.google.com/p/lagrange/) for most lineages that have been compared (Ree et al., 2005; Burbrink & Lawson, 2007; Ree & Smith, 2008; Velazco & Patterson, 2008; Xiang & Thomas, 2008; Xiang et al., 2009) when analyses using DIVA included outgroups that are not widely distributed or the root range was used for area coding for outgroups at higher rank than species (see Ronquist, 1996).

Running the DIVA program requires that two parameters are defined; the phylogeny and the distributions of terminal taxa. Aside from any questions that might arise regarding the underlying assumptions implemented in the program, uncertainty in the results of DIVA arises from two areas, phylogenetic uncertainty and uncertainty in DIVA optimization. Biogeographic reconstruction using DIVA is typically carried out using a single tree topology; the author's “best” tree representing the true phylogeny (e.g., Fiz et al., 2008; Jeandroz et al., 2008; Lim, 2008). The single tree approach is a common practice in phylogenetic biogeography using many methods including Component analysis (Page, 1993a, 1993b), Bremer's ancestral area analysis (Bremer, 1992), and the model-based likelihood methods of Ree et al. (2005) and Ree & Smith (2008). Of the five reports published in the American Journal of Botany and Systematic Biology in 2008, in which a primary research goal was to reconstruct historical biogeography, five used DIVA, four used a single tree (Calviño et al., 2008; Hines, 2008; Huttunen et al., 2008; Mansion et al., 2008) and one showed that alternative resolutions of polytomies had no effect on biogeographic reconstruction (Mast et al., 2008). Using a single tree rarely accounts for the full range of possible, slightly less optimal topologies given the data. Additionally, the “best” phylogeny is not always fully resolved or strongly supported for all nodes; some clades may be weakly supported or there may be polytomies. Polytomies are particularly problematic. The backbone phylogeny used in DIVA analysis must be fully bifurcating as the program is unable to accept polytomies, but polytomies present a problem for most methods of biogeographic analysis using phylogeny, as reconstruction necessarily breaks down at these unresolved nodes. The other area of uncertainty from DIVA is the multiple, equally parsimonious biogeographic scenarios for a given phylogeny. 
The program does not provide any quantifiable method of selecting between the multiple possibilities. However, authors can use information from area connections and divergence times to rule out certain hypotheses or to favor one hypothesis over another, as also discussed by Ronquist (1996). Both types of uncertainty in DIVA have been recognized and handled by Nylander et al. (2008) using posterior probabilities (pp).

Nylander et al. (2008) recently showed the utility of a probabilistic approach to DIVA in reconstructing the biogeographic history of the avian genus Turdus L. Specifically, they optimized 20,000 Bayesian trees in DIVA and used the results of these optimizations to determine the marginal distributions of alternative ancestral ranges at each node of interest, dependent on the node's occurrence in the sampled topologies. Thus, alternative ancestral ranges at each node in the tree (Fig. 1a of Nylander et al., 2008) can be assumed to have a probability equal to the product of the clade pp (phylogenetic uncertainty) and the occurrence of the alternative ranges for the clade in DIVA (the uncertainty in the biogeographic reconstruction). The occurrence of each alternative range was determined as a fraction of all optimal ranges; that is, for a given tree, a node with three optimal ancestral ranges “A, B, or AB”, the occurrence of each range was recorded “A:1/3, B:1/3, AB:1/3”. This approach accounts for both uncertainty in the location of a node in the broader tree topology (i.e., phylogenetic uncertainty) and uncertainty in ancestral range reconstructions (multiple, equally parsimonious DIVA optimizations). Nylander et al. (2008) referred to this as a Bayes-DIVA analysis. Using a subset of Bayesian trees to account for uncertainty in phylogeny has been used before (e.g., Lutzoni et al., 2001; Pagel et al., 2004). In biogeography, this methodology was also suggested by Lemmon and Lemmon (2008) and was previously used by Huelsenbeck and Immenov (2002). Nylander et al. (2008) were the first to apply this approach to use with DIVA.
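The bookkeeping behind this kind of summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Nylander et al. (2008): the input layout (one list of equally optimal ranges per sampled tree, or None when the clade is absent from that tree) and the function name are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def bayes_diva_marginals(optimal_ranges_per_tree):
    """Marginal probability of each ancestral range at one clade's node.

    optimal_ranges_per_tree: one entry per sampled tree; either a list of the
    equally optimal DIVA ranges at the node, or None if the clade is absent
    from that tree (hypothetical input layout).
    """
    n_trees = len(optimal_ranges_per_tree)
    totals = defaultdict(Fraction)
    for ranges in optimal_ranges_per_tree:
        if not ranges:
            continue  # clade absent from this tree: contributes nothing
        share = Fraction(1, len(ranges))  # each optimal range counts 1/N
        for r in ranges:
            totals[r] += share
    return {r: total / n_trees for r, total in totals.items()}


# Four hypothetical sampled trees; the clade is missing from the last one,
# so its posterior probability in this sample is 3/4.
sample = [["A", "B", "AB"], ["A"], ["A", "AB"], None]
marginals = bayes_diva_marginals(sample)
```

In this toy sample the marginals sum to the clade's frequency among the sampled trees (3/4), which is the sense in which the range probabilities are conditional on the node's occurrence.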

Here, we extend the Bayes-DIVA method to allow estimation of the geographic origin of a lineage in a polytomy. We first redefined a node as the parent node (parent node, hereafter) of a crown group node, where a crown group node (crown node, hereafter) represents the last shared common ancestor of all constituents of a crown group with an undefined sister (x) (Fig. 1). Therefore, the parent node is inherently present on every tree in the posterior distribution of phylogenetic trees in which the crown group occurs, regardless of the relationship of the crown group to other groups. Using this definition allows for estimation of the ancestral range of the stem lineage of a highly supported terminal taxon or crown group even if the lineage is resolved as a member of a polytomy in the phylogeny (Fig. 1). The probability (P) of an ancestral range r at a node of interest is calculated as

  $$P(r_Y) = \sum_{t=1}^{n} F(r_Y)_t \, P_t \qquad (1)$$

where Y is the parent node, t is one of the randomly selected Bayesian trees, n is the total number of sampled trees, F(rY)t is the occurrence of an ancestral range r at node Y for tree t, and Pt is the probability of tree t, that is, the proportion of that tree in the pool of sampled trees (which can be extended to its proportion in the pool of the entire posterior distribution of trees). F(rY) is calculated as the actual frequency of r within the pool of biogeographic pathways optimized using DIVA for each sampled tree: $F(r_Y)_t = i / R_t$.

Figure 1. Graphical explanation of parent nodes, crown nodes, and unspecified sister groups. A, Hypothetical phylogeny containing well-supported crown groups marked by triangular symbols and incomplete resolution of relationships among them. Open circles indicate crown nodes of crown groups 1–4. Closed circles indicate parent nodes (node, sensu this study). Numbered parent nodes corresponding to numbered crown groups. B, Unspecified sister groups (x) for crown groups 1–4. Node numbers in closed circles correspond to those in A.

The value i is the number of times a range (r) occurs in the total number of MP pathways (Rt) over the tree. The actual frequency can be obtained by using the command “printrecs” in DIVA. An alternative estimation of F(rY) is using the method of Nylander et al. (2008) as 1/N, where N is the total number of alternative ancestral distributions at node Y. An example of this method of probability calculation and both methods of deriving F(rY) are illustrated in Fig. 2. This revised Bayes-DIVA approach can provide statistical confidence on inferred biogeographic origins of lineages of interest with unresolved or poorly supported phylogenetic placement, for which the traditional DIVA analysis or the Bayes-DIVA approach used by Nylander et al. (2008) are uninformative.
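As a rough illustration of Equation 1 with both ways of deriving F(rY), the sketch below sums F(rY)t * Pt over a small set of trees. The data are invented, the function names are hypothetical, and the input layout (for each tree, its probability Pt and the range found at node Y in each MP pathway) is an assumption about how DIVA output might be tabulated, not a description of the program's own format.

```python
from collections import Counter


def f_actual(pathway_ranges):
    """F(rY)t = i / Rt: frequency of each range among the MP pathways."""
    counts = Counter(pathway_ranges)
    total = len(pathway_ranges)
    return {r: i / total for r, i in counts.items()}


def f_equal(pathway_ranges):
    """F(rY)t = 1 / N: every distinct optimal range weighted equally."""
    distinct = set(pathway_ranges)
    return {r: 1 / len(distinct) for r in distinct}


def range_probabilities(trees, f_method):
    """Equation 1: P(rY) = sum over sampled trees of F(rY)t * Pt."""
    probs = {}
    for p_t, pathway_ranges in trees:
        for r, f in f_method(pathway_ranges).items():
            probs[r] = probs.get(r, 0.0) + f * p_t
    return probs


# Hypothetical data: (Pt, range at node Y in each MP pathway of tree t).
trees = [
    (0.5, ["A", "A", "AB"]),  # three pathways, A optimal in two of them
    (0.3, ["A"]),
    (0.2, ["B", "AB"]),
]
p_actual = range_probabilities(trees, f_actual)
p_equal = range_probabilities(trees, f_equal)
```

The two methods differ only for trees in which some range recurs across MP pathways (the first tree here); where every optimal range appears in the same number of pathways, i/Rt and 1/N coincide.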

Figure 2. Example of calculation of P(rY) and of F(rY) using two methods. A, Hypothetical sample of three Bayesian trees, T1–T3. Node Y (circles) is parent node of Lineage 1. A, B, C, and D are distribution areas. Ranges of terminals are given below lineage names. Possible ranges for node Y include A, B, C, D and widespread areas including two or more of these. In B and C, only areas with F(rY) > 0 for at least one tree are shown. B, Calculation of F(rY) using actual frequency of areas from dispersal–vicariance analysis output (i.e., $i / R_t$). C, Calculation of F(rY) assuming all optimal areas equally probable for each t (i.e., 1/N).

The parent node Y in this study is similar to the floating node described by Pagel et al. (2004) in that both Y and the floating node do not always include the same crown groups. However, the floating node must include two specific crown groups of interest, although it may contain other clades or taxa as well (Pagel et al., 2004). Y differs in that it is the parent of exactly two groups: a specific crown group of interest and its sister x, which is undefined. Another important difference is that the two clades of interest at a floating node of Pagel et al. (2004) can have any level of support, whereas the Y applies to only the nodes connecting the well-supported crown clade and its unspecified sister. Therefore, the floating node is not suitable as a substitute for Y.
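The definition of Y can be made concrete with a small tree-walking sketch. It assumes fully bifurcating trees written as nested Python tuples with taxon names as strings; this representation, the function names, and the taxon labels are all hypothetical choices for the example rather than anything used in the study.

```python
def leaves(node):
    """Return the set of terminal names below a node (nested-tuple tree)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def find_parent_node(tree, crown):
    """Locate node Y for a crown group on a fully bifurcating tree.

    Returns (Y, x), where x is the sister subtree of the crown group, or
    None if the crown group is not monophyletic on this tree.
    """
    crown = frozenset(crown)
    if isinstance(tree, str):
        return None
    left, right = tree
    for child, sister in ((left, right), (right, left)):
        if leaves(child) == crown:
            return tree, sister  # this node is Y; the other child is x
        found = find_parent_node(child, crown)
        if found is not None:
            return found
    return None


# The same crown group (p1 + p2) with different sisters on two trees,
# and broken up on a third.
t1 = ((("p1", "p2"), "cal"), ("as1", "as2"))
t2 = (("p1", "p2"), ("cal", ("as1", "as2")))
t3 = (("p1", "cal"), ("p2", ("as1", "as2")))
crown = {"p1", "p2"}
x1 = find_parent_node(t1, crown)[1]
x2 = find_parent_node(t2, crown)[1]
missing = find_parent_node(t3, crown)
```

On the first tree x is the single terminal cal, and on the second it is the clade cal + as1 + as2, yet in both cases Y is found and its range can be tallied; only trees on which the crown group itself is not recovered contribute nothing.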

Using simulated data, we tested whether the range probabilities of a parent node can be accurately inferred as the product of the pp at the node containing the crown group and a defined sister, and the frequency of occurrence of the range at that node optimized by DIVA on the Bayesian consensus tree topology, that is,

  $$P(r_Y) = pp \times F(r_Y) \qquad (2)$$

We further tested the utility of our approach using data from Aesculus L., a genus of woody trees and shrubs with a disjunct Laurasian distribution. We also illustrate two additional applications of this method. First, we estimated the impact of two fossil wildcard taxa (sensu Nixon & Wheeler, 1992) on biogeographic reconstruction of Aesculus. Second, we examined range subdivisions at the parental nodes of lineages of interest and estimated the most probable ranges inherited by these lineages (referred to as post-Y range hereafter) to gain some insights into range evolution along the stem branches. The primary goals of this study are: (i) to describe an alternative method of using the Bayes-DIVA analysis under phylogenetic uncertainty, which can provide estimation of geographic origin for crown groups with unknown sister relationships; and (ii) to test the method and its possible applications using Aesculus L.

Aesculus (Sapindales, Sapindaceae) is a genus of 13–19 species belonging to six major lineages, which are supported by phylogenetic studies using molecular and morphological data: sect. Aesculus (2 species), sect. Macrothyrsus (1 species), sect. Parryana (1 species), sect. Pavia (4 species), an Asian clade (3–10 species), and the species Aesculus californica Nutt. (Xiang et al., 1998b; Forest et al., 2001; Harris et al., 2009). Extant Aesculus species are distributed across the Northern Hemisphere and each lineage is restricted to one of the following areas: East Asia (EA); western North America (wNA); eastern North America (eNA); and Europe (EU), except sect. Aesculus, which is disjunct in EA and EU. Aesculus has a rich fossil record from EA, EU, and wNA and with fossils found in strata ranging from the Paleocene to the Quaternary (Hu & Chaney, 1940; Condit, 1944; Puri, 1945; Szafer, 1947, 1954; Tanai, 1952; Schloemer-Jäger, 1958; Prakash & Barghoorn, 1961; Axelrod, 1966; Budantsev, 1983; de Lumley, 1988; Mai & Walther, 1988; Wehr, 1998; Golovneva, 2000; Manchester, 2001; Jeong et al., 2004; Dilhoff et al., 2005).

Aesculus is an ideal genus for biogeographic study owing to its small number of species, pan-Northern Hemisphere distribution, extensive fossil record, and the continental endemism of most lineages and all species. However, molecular phylogenetic studies of Aesculus using several DNA regions (Xiang et al., 1998b; Harris et al., 2009) have resulted in poorly supported or unresolved relationships among the six major lineages despite strong support for the polytypic lineages (i.e., crown groups). Thus, the utility of DIVA applied in the traditional way for biogeographic reconstruction of the genus is limited. In addition to deep node polytomies, biogeographic reconstruction of Aesculus presents another challenge due to uncertainties in positions of some fossil species. Recently, many authors have cited the need for inclusion of fossils in phylogenetic reconstruction and phylogeny-based biogeographic analyses (Manchester, 1999; Rothwell, 1999; Wen, 1999; Lieberman, 2003; Crane et al., 2004; Donoghue & Smith, 2004; Xiang et al., 2005, 2006, 2009; Hilton & Bateman, 2006; Rothwell & Nixon, 2006). Excluding fossils can produce a false or incomplete biogeographic history of a group (Manchester, 1999; Lieberman, 2003; Crane et al., 2004). The limitations of including fossils, for which often only incomplete morphological data and rarely ancient DNA data is available, have been discussed (Nixon & Wheeler, 1992; Kearney, 2002; Kearney & Clark, 2003; Wiens, 2003, 2006) and observed in empirical studies (e.g. Rothwell & Nixon, 2006; Harris et al., 2009; but see Manos et al., 2007). Fossil taxa for which little informative data is available may act as wildcard taxa (Nixon & Wheeler, 1992) in phylogenetic analysis. Wildcard taxa are defined as those that, due to significant missing characters, may be placed algorithmically at many or all nodes on the tree topology (Nixon & Wheeler, 1992; Kearney & Clark, 2003). 
Two geographically and temporally important complete leaf (leaflets attached to a petiole) fossil species of Aesculus offer few phylogenetically informative characters. These are Aesculus longipedunculus Schloemer-Jäger (Eocene, EU) and Aesculus “magnificum” (Budantsev, 1983; Manchester, 2001) (Paleocene, EA). In preliminary analyses, these fossil species behave as wildcards, limiting phylogenetic resolution for the fossils and for otherwise well-supported groups. In the example using Aesculus, we use the revised Bayes-DIVA to provide a statistical measure of shifts in ancestral range probabilities when fossils are included versus excluded.

1 Material and methods

1.1 Assessing the difference between Equation 1 and Equation 2

It is of interest to determine if the product of the pp and the frequency of a range (F(rY)) derived from DIVA analysis of the Bayesian consensus tree (with compatible groupings below 50% allowed) effectively reflects the estimation using Equation 1 of the revised Bayes-DIVA method (i.e., Equation 2 versus Equation 1) because the former is so much simpler. To accomplish this, 10 random DNA sequences of 200 bp in length were generated using a JavaScript sequence generator (http://www.faculty.ucr.edu/~mmaduro/random.htm) (M. Maduro, pers. comm., 2008). These sequences were used to represent 10 hypothetical lineages, Lineage 1–Lineage 10. These lineages represent 10 unique operational taxonomic units where each might be a species or a clade containing multiple species with 100% pp. This is a simplistic example, but our analysis of Aesculus L. provides an example of data calculation for clades supported by pp less than 100% of the data. The 10 simulated sequences were treated as aligned and placed in a data matrix. Knowledge of any true relationship between these sequences was unknown and inessential as the objective was not to test the utility of Bayesian analysis in recovering true relationships. The random sequences were expected to provide phylogenetic uncertainty sufficient to test the hypothesis whether Equation 1 results in ancestral range probabilities at a node significantly different from that resulting from Equation 2. The 10 random sequences are available from the authors by request.

Phylogenetic analysis of the simulated data was carried out using MrBayes 3.1.2 (Huelsenbeck & Ronquist, 2001; Huelsenbeck & Ronquist, 2003). The program was run using default priors for two simultaneous runs of 22 million generations each. Each run used one hot chain and two cold chains with default settings. Burnin was set to 2,200,000 (or 10%) and trees were sampled every 2000 generations. Resulting post-burnin trees were assembled into a PHYLIP format file and a majority rule consensus with compatible groupings >50% was generated using Consense in the PHYLIP 3.68 package (Felsenstein, 1989; Felsenstein, 2008). Lineage 3 was randomly selected as an outgroup. The consensus tree was used to identify four lineages, two sister groups, that would be used to test our hypothesis: Lineages 1 and 8; and Lineages 4 and 9 (Fig. 3: A).
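The tree counts implied by these settings can be checked with simple arithmetic. The sketch below uses only the values stated above and ignores any tree written for the starting state of each run; the variable names are ours.

```python
generations = 22_000_000        # per run
sample_every = 2_000            # sampling interval in generations
burnin_generations = 2_200_000  # 10% of each run
runs = 2

trees_per_run = generations // sample_every        # trees sampled per run
burnin_trees = burnin_generations // sample_every  # trees discarded per run
post_burnin_total = runs * (trees_per_run - burnin_trees)
```

This gives 11,000 sampled trees and 1100 burnin trees per run, and 19,800 post-burnin trees across the two runs.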

Figure 3. Results of Bayesian analysis of simulated data. A, Consensus trees for 19,800 (left) and 100 (right) Bayesian trees. Values of posterior probability support are shown above branches, actual occurrences are given in parentheses. Geographic ranges of terminals subtend terminal names. Parent and crown nodes used in Bayes-dispersal–vicariance analysis simulation are highlighted, expanded in B. MJ, majority. B, Explanation of nodes of interest for Bayes-dispersal–vicariance analysis simulation.

One hundred trees from the 19,800 post-burnin dataset were randomly selected using RandomTree (Kauff, 2005). One of four ancestral areas, A, B, C, or D, was randomly assigned to each of the 10 lineages, with each area used at least once, so that each lineage was endemic to a single area. The 100 trees were optimized using DIVA 1.1 for Windows (Ronquist, 1996, 1997) with default settings. The ancestral ranges of the parent nodes were recorded in a Microsoft Excel 2007 spreadsheet. The spreadsheet was used to calculate ancestral range probabilities at each node of interest and for statistical testing.
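For readers who want to reproduce a comparable setup, the sampling and area-assignment steps might be sketched as follows. This is not the RandomTree program, and the text does not say how the random area assignment was carried out; the functions, the seed, and the pad-and-shuffle scheme are assumptions chosen only to satisfy the stated constraints.

```python
import random


def sample_tree_indices(n_trees, k, rng):
    """Pick k distinct tree indices uniformly at random, without replacement."""
    return sorted(rng.sample(range(n_trees), k))


def assign_areas(lineages, areas, rng):
    """Give each lineage one area, with every area used at least once."""
    if len(lineages) < len(areas):
        raise ValueError("need at least as many lineages as areas")
    # One copy of each area, padded with random draws, then shuffled.
    extra = [rng.choice(areas) for _ in range(len(lineages) - len(areas))]
    pool = list(areas) + extra
    rng.shuffle(pool)
    return dict(zip(lineages, pool))


rng = random.Random(2008)  # arbitrary seed so the sketch is repeatable
picked = sample_tree_indices(19_800, 100, rng)
lineages = [f"Lineage {i}" for i in range(1, 11)]
coding = assign_areas(lineages, ["A", "B", "C", "D"], rng)
```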

Lineages 1, 4, 8, and 9 (Fig. 3: A) were used to compare the probabilities calculated using Equation 1 and Equation 2. Probabilities of ancestral ranges for the node shared by Lineage 1 + Lineage 8 and the node shared by Lineage 4 + Lineage 9 (occurring in the 50% consensus topology) were first calculated using Equation 2 to provide an estimation of ancestral origin of these lineages. The results were then compared to those estimated using Equation 1, in which the sisters of Lineages 1, 4, 8, and 9 were undefined (x). A two-tailed z-test was used to determine whether there was a significant difference between the probabilities for ranges obtained using the two methods. The goal of these comparisons, and of similar comparisons made in the empirical example using Aesculus, was to determine whether Equation 1 could recover more informative range data than Equation 2 for a parental node that has <100% pp in the Bayesian consensus tree. Any significant difference between Equation 1 and Equation 2 indicates that the subset of Bayesian trees contains additional useful range information that is discarded by using Equation 2. In all DIVA analyses, constraints on maximum areas (“maxareas” command) were not implemented.
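The text does not specify the form of the z-test. One common choice, shown here purely as an assumption, is a pooled two-proportion z-test in which each probability is treated as a proportion of the 100 sampled trees; the function name and the input values are hypothetical.

```python
import math


def two_proportion_z_test(p1, p2, n1, n2):
    """Two-tailed z-test for a difference between two proportions (pooled SE)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return z, p_value


# Hypothetical probabilities from Equation 1 and Equation 2, 100 trees each.
z, p_value = two_proportion_z_test(0.70, 0.50, 100, 100)
```

With these invented inputs the difference would be judged significant at the 0.01 level; other test formulations (for example, an unpooled standard error) would give slightly different values.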

1.2 Reconstructing ancestral ranges in Aesculus L.

1.2.1 DNA and morphological data  DNA sequences from matK, the rps16 intron, and internal transcribed spacer (ITS), available from a previous study for 16 species of Aesculus as well as for outgroup taxa Handeliodendron bodinieri Redhr., Billia columbiana Planch. & Linden ex Triana & Planch. and Billia hippocastanum Peyr., were used in this study (Appendix I). For information on outgroup selection see Hardin (1957a), Judd et al. (1994), Xiang et al. (1998b), Forest et al. (2001), Harrington et al. (2005), and Harris et al. (2009). DNA sequences were aligned manually using MacClade 4.02 (Maddison & Maddison, 2001). The 39-character morphological matrix of Forest et al. (2001) was modified by: (i) excluding all outgroup taxa used in their study except those noted above; (ii) eliminating Aesculus glabra Willd. var. arguta (Buckley) B.L. Rob.; and (iii) combining the species of Billia into a single taxonomic entry, Billia sp.

Fossil taxa, A. longipedunculus and A. “magnificum”, were scored based on published reports (Schloemer-Jäger, 1958; Budantsev, 1983; Golovneva, 2000; Manchester, 2001) for three characters: petiolulate leaflets (as opposed to sessile); serrate margins (as opposed to entire); and having palmately compound leaves (as opposed to ternate). The presence of petiolulate leaflets is a parsimony informative character in Aesculus (Hardin, 1957a; Forest et al., 2001; Manchester, 2001; Harris et al., 2009). All extant species of Aesculus except (arguably) Aesculus parryi (sect. Parryana) have some degree of leaf serration (Hardin, 1957a; Forest et al., 2001). Outgroup taxa Handeliodendron and Billia have entire leaflets (Wiggins, 1932; Hardin, 1957a, 1957b; Forest et al., 2001; Harris et al., 2009). Palmately compound leaves are common to all extant Aesculus and Handeliodendron, whereas leaves of Billia are ternate (Forest et al., 2001; Hardin, 1957a, 1957b, 1960).

1.2.2 Phylogenetic analysis  Three independent phylogenetic analyses were carried out. In Analysis 1, gaps in matK were coded using ambiguous region coding (ARC) (Kauff et al., 2003) for ambiguously aligned regions and simple gap coding for unambiguous gaps. In Analysis 2, ARC and simple gap coding were applied for all genes in the concatenated sequences. Analysis 3 included the extant species as well as the two fossil species A. longipedunculus and A. “magnificum” and was carried out using a matrix of combined morphological and molecular data with the same ARC and gap codings as Analysis 2. Analyses were carried out using MrBayes 3.1.2. Data were partitioned into four sets: matK, rps16, ITS, and morphology, the last including the modified morphological matrix of Forest et al. (2001) and the standard states from ARC and simple gap coding. For each gene region, ModelTest 3.0 (Posada & Crandall, 1998) was used to determine the best model of evolution. Although character state ratios and other specific information were dependent on the use of ARC and simple gap coding, the basic models were not affected by use of these coding methods. The Akaike Information Criterion in ModelTest returned the following models: TVM + I + G for matK, TRN + I for ITS, and K81uf for rps16. Models were implemented in MrBayes using the PRSET and LSET commands.

For each analysis, two simultaneous, independent Markov chains were run for 22 million generations to check convergence. Trees were sampled every 2000 generations. Burnin was set to 2.2 million generations or 1100 trees, and was checked using Tracer 1.3 (Rambaut & Drummond, 2003). The 19,800 post-burnin trees from each analysis were combined independently and summarized by generating a 50% majority rule consensus tree in PAUP* 4.0b10 (Swofford, 2002).

1.2.3 Biogeographic analysis using the revised Bayes-DIVA method  Nine nodes of interest were identified on the Bayesian consensus tree from analysis of combined data with gaps in matK coded using ARC and simple gap coding (Analysis 1). These were the parent nodes of sect. Aesculus, sect. Macrothyrsus, sect. Parryana, sect. Pavia, the Asian clade, A. californica, and the crown nodes of each of the polytypic lineages; sect. Aesculus, sect. Pavia, and the Asian clade. One hundred trees from the combined post-burnin Bayesian tree files from each analysis were randomly sampled using RandomTree. Terminals were coded as belonging to one of five ancestral areas: Europe (A), East Asia (B), eastern North America (C), western North America (D), and Latin America (E) to cover distributional ranges of Aesculus and its outgroup Billia. Trees were optimized using default settings in DIVA 1.1 for Macintosh. Results from DIVA for each of the nine nodes of interest were recorded in a Microsoft Excel spreadsheet which was used for subsequent calculations. Ancestral range probability at each node of interest was calculated using Equation 1. For those nodes present in the Bayesian consensus topology, the probability of alternative ancestral ranges was also calculated using Equation 2 for comparison. Individual topologies of sampled trees were examined using TreeView 1.6.6 (Page, 1996, 2001) and PAUP* 4.0b10 (Swofford, 2002).

Biogeographic analysis of the Analysis 3 phylogenies using the revised Bayes-DIVA method considered only the ancestral ranges of the six parent nodes and did not include the three crown group nodes. We used the floating node of Pagel et al. (2004) in cases where crown clades contained fossil species. The floating node allowed that crown clade existed on the tree as long as the floating node included only the crown clade alone or only the crown clade plus one or both fossil species. The floating node was not a substitute for Y. Instead Y included the crown clade (plus any fossils) and x. On some topologies for some crown clades of interest, x was a fossil and this was perfectly acceptable. The revised Bayes-DIVA analysis including fossils was done using a sample of 100 trees from the post-burnin posterior distribution of trees. This analysis was repeated for the same set of 100 trees with fossils pruned from the topologies. Z statistics were used to compare the results of these two analyses (fossils included and fossils pruned) and the results from Analysis 1 including only extant species.

Post-Y range analyses for each of the six major lineages of Aesculus were carried out using Bayes-DIVA results from Analysis 1 data. For each node Y of the six major lineages, all possible ranges that the branch leading to the crown group of interest could inherit from ranges at Y with a P > 0 were determined. Inheritance of each possible range from splitting of ranges at Y was considered equally probable and was then weighted by the probability of the ancestral range at Y. The probability of each possible range inherited from node Y by each of the descendant branches was calculated as the sum of the probabilities of that range over all ranges with a P > 0 at Y. For example, if Lineage L has Y P(A) = 0.50 and P(AB) = 0.50, for range A at node Y, the probability of inheritance of range A by the two descendant lineages is 1.0. For range AB at node Y, the descendant lineages may inherit A, B, or AB, each with a probability of 1/3. The post-Y probability of range A for Lineage L is, therefore, post-YP(A) = 0.5 * 1.0 + 0.50 * 0.333 = 0.667. Post-Y range probability calculations were carried out using RAD@Y, a Python 2.5 user interface program developed by the authors for this purpose and available upon request. The post-Y ranges provide information on range inheritance of the descendant lineages and range evolution along the stem of crown groups.
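The worked example above can be reproduced with a short sketch. This is not RAD@Y; it is an independent illustration that assumes every non-empty subset of the areas in an ancestral range is an inheritable range (which matches the A, B, or AB case given in the text, but is our extrapolation for ranges of three or more areas). Ranges are written as strings of single-letter area codes, and the function names are hypothetical.

```python
from itertools import combinations


def inheritable_ranges(ancestral):
    """All non-empty subsets of the areas in a range, e.g. AB -> A, B, AB."""
    areas = sorted(ancestral)
    return ["".join(c) for k in range(1, len(areas) + 1)
            for c in combinations(areas, k)]


def post_y_probabilities(p_at_y):
    """Weight equally probable inherited ranges by P of each range at Y."""
    post = {}
    for ancestral, p in p_at_y.items():
        options = inheritable_ranges(ancestral)
        for r in options:
            post[r] = post.get(r, 0.0) + p / len(options)
    return post


# Lineage L from the text: P(A) = 0.50 and P(AB) = 0.50 at node Y.
post = post_y_probabilities({"A": 0.50, "AB": 0.50})
```

For Lineage L this gives a post-Y probability of about 0.667 for range A, as in the text, with the remaining probability split equally between B and AB.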

For all comparisons between Equation 1 and Equation 2 in the empirical example using Aesculus, a quick and conservative approach was used by allowing F(rY) to take its largest possible value, that is, F(rY) = 1 (as when there is no uncertainty from DIVA optimization for node Y), so that Equation 2 = pp. If the maximized values of Equation 2 are still significantly smaller than those found by using Equation 1, it can be concluded that Equation 2 does not effectively reflect the probability estimated by Equation 1.

2 Results

2.1 Equation 1 vs. Equation 2 in simulated data

Relationships between all lineages were poorly supported (Fig. 3: A). The highest pp support was observed for the sister relationships between Lineages 1 and 8 (pp = 52%) and Lineages 4 and 9 (pp = 48%). In the randomly selected subset of Bayesian trees, the monophyly of Lineages 1 + 8 was supported in 58% of the trees and the monophyly of Lineages 4 + 9 was supported in 47% of the trees (Fig. 3: A). Results from DIVA using the 50% majority rule tree with compatible nodes (Fig. 3: A) indicated that the geographic range for the node shared by Lineages 1 and 8 was A only (no alternative solutions), thus F(rY) = 1.0. For the node shared by Lineages 4 and 9, results from DIVA showed an ancestral range of BD only with F(rY) = 1.0. Therefore, the probabilities of ancestral ranges for these nodes based on Equation 2 were P(A) = 0.58 * 1.0 for Lineages 1 + 8 and P(BD) = 0.47 * 1.0 for Lineages 4 + 9 (Table 1), implying that the geographic origins of both Lineage 1 and Lineage 8 are most likely to have occurred in A with a probability of 0.58, whereas the geographic origins of Lineages 4 and 9 are both most likely to have occurred in BD with a probability of 0.47.

In the revised Bayes-DIVA approach applying Equation 1, the most probable ancestral ranges at four parent nodes (Fig. 3: B), Lineage 1 + x, Lineage 4 + x, Lineage 8 + x, and Lineage 9 + x, inferred from the sample of 100 Bayesian trees were A (P = 0.744), BD (P = 0.484), A (P = 0.755), and BD (P = 0.643), respectively (Fig. 4: A). For each parent node of interest, the probability of the most highly supported ancestral range was significantly greater than that of the second most highly supported range (Fig. 4: A) and significantly greater than the value obtained using Equation 2, except in the case of Lineage 4 + x (Table 1). The probability of BD was significantly higher for Lineage 9 than for Lineage 4 (Fig. 4: B), whereas P(BD) was equal for Lineages 4 and 9 when using Equation 2.

Figure 4. Results of Bayes-dispersal–vicariance analysis of simulated data. A, Relative frequency graphs showing probability (P) of ancestral ranges for the parent nodes of Lineages 1, 4, 8, and 9 and their unspecified sisters (x). Circled numbers correspond to numbered lineages. Ranges are shown above graphs. Results of Z-test comparing the most highly supported range to the second most highly supported range shown below frequency boxes. Arrows point to bars compared in B. B, Comparison of P(BD) as ancestral range of Lineages 4 and 9.

Table 1. Comparison of pp*F(rY) (Equation 2) and Σt F(rY)t·Pt (Equation 1) for ancestral areas of lineages of interest from the JavaScript-simulated data

| Lineage | Sister in Bayesian consensus tree | Ancestral area from consensus tree optimization‡ | pp support for sister in consensus of 19,800 | Equation 2 results† | Most highly supported ancestral area from Bayes-DIVA analysis‡ | Equation 1 results | z statistic for comparison of EQ1 and EQ2 | Significant difference at α/2 = .005 | p value |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Lineage 8 | A | 0.52 | 0.58 | A | 0.744 | 3.90 | yes | <0.0001 |
| 4 | Lineage 9 | BD | 0.48 | 0.47 | BD | 0.484 | 0.286 | no | 0.779 |
| 8 | Lineage 1 | A | 0.52 | 0.58 | A | 0.755 | 4.37 | yes | 0.0002 |
| 9 | Lineage 4 | BD | 0.48 | 0.47 | BD | 0.643 | 3.80 | yes | <0.0001 |

†pp*F(rY) was estimated with F(rY) = 1 for a more conservative test on the majority rule tree of the 100 sampled trees; ‡A, B, C, and D were used in dispersal–vicariance analysis (DIVA) of simulated data to represent four hypothetical, unique areas. pp, posterior probability.

2.2 Results of analyses using Aesculus

Phylogenetic analyses of different data partitions showed strong support for the monophyly of polytypic groups but poor resolution of relationships among the major lineages (Fig. 5). In the analysis including fossils, support for the monophyly of all polytypic lineages greatly decreased (Fig. 6). Fossil species were observed to ally variously with all major lineages and, rarely, with outgroup species, with low support (Fig. 6).

Figure 5. Bayesian trees from phylogenetic Analyses 1 and 2 of Aesculus. Consensus trees were condensed, showing major lineages. Values of posterior probability support are above branches, and bootstrap support values are below branches. Modern ranges subtend terminal names and correspond to areas indicated in C. A, Results of analysis of extant taxa only (Analysis 1 with ambiguous region coding and simple gap coding in matK). Numbered nodes correspond to nine nodes of interest considered in Bayes-dispersal–vicariance analysis. 1, Asian clade + x; 2, sect. Aesculus + x; 3, Aesculus californica + x; 4, sect. Macrothyrsus + x; 5, sect. Pavia + x; 6, sect. Parryana + x; 7–9, last shared ancestor of species of polytypic lineages, that is, crown nodes. B, Results of Analysis 2 including extant species only and with ambiguous region coding and simple gap coding for all gene regions. C, Geographic map indicating areas used in Bayes-dispersal–vicariance analysis, created using Online Map Creation (Weinelt, 1999). EA, East Asia; eNA, eastern North America; EU, Europe; LA, Latin America; wNA, western North America.

Figure 6. Results from phylogenetic analysis of Aesculus including extant species and wildcard fossils (Analysis 3). A, Bayesian consensus trees from Analysis 3. Values of posterior probability support from 19,800 trees are shown above branches and those from 100 randomly sampled trees are below branches. Fossils are highlighted in gray. Dashed lines indicate placement of Aesculus “magnificum” in the consensus of 19,800 trees (lower) and 100 trees (upper). Distributional ranges are provided to the right of terminals and correspond to those indicated in B. B, Geographic map indicating areas used in Bayes-dispersal–vicariance analysis, created using Online Map Creation (Weinelt, 1999). EA, East Asia; eNA, eastern North America; EU, Europe; LA, Latin America; wNA, western North America.

Results from the modified Bayes-DIVA analysis (below) are not presented on the consensus tree or on any other graphical representation of the relationships between clades of Aesculus, for three reasons. First, Y cannot be accurately reflected on a consensus tree or other single topology. Second, the probabilities of ancestral ranges calculated using Equation 1 do not depend on the position of the clades on the tree or on the pp support shown on the tree for clades, though they are weighted by these values. Finally, assuming that x is best represented by the sister group indicated on the consensus tree topology would limit confidence in the results of the revised Bayes-DIVA analysis to confidence in nodal support.

Using the revised Bayes-DIVA analysis, the ancestral ranges of the crown nodes of interest (Fig. 5: A, nodes 7–9), sect. Aesculus, sect. Pavia, and the Asian clade, were estimated to be EA-EU, eNA, and EA, respectively, in all sampled trees. Therefore, the probabilities for these ranges at these crown nodes are all equal to 1.0. In this case, there is no difference between Equation 1 and Equation 2 because all groups in question were supported by 100% pp and there was no optimization uncertainty in DIVA for all sampled trees.

In contrast, the ancestral ranges at the parent nodes of the Asian clade, sect. Aesculus, A. californica, sect. Macrothyrsus, sect. Pavia, and sect. Parryana (Fig. 5: A, nodes 1–6, respectively) were sensitive to topological rearrangements. More than one optimal geographic range was resolved for each of these parent nodes using Bayes-DIVA (Fig. 7). For five of the six lineages, a most probable range with P ≥ 0.5 was recovered. The most probable range for the parent node of the Asian clade was EA with P = 0.755 (Table 2, Fig. 7: A). An EA distribution was also the most likely for the parent node of sect. Aesculus (P = 0.832) (Table 2, Fig. 7: B). For the parent nodes of A. californica and sect. Pavia, the most likely ancestral ranges were widespread in EA-wNA and eNA-wNA, respectively (P = 0.76 and 0.663) (Table 2, Fig. 7: C, D), whereas the parent nodes of sect. Parryana and sect. Macrothyrsus were both shown to be widespread in eNA-wNA (P = 0.90 and 0.395, respectively) (Table 2, Fig. 7: E, F). Some of these probability values are greater than the pp support for the nodes shared by these lineages and a specific sister and all are greater than the Equation 2 values for nodes present in the 50% majority rule Bayesian consensus (Table 3). The probabilities obtained for ranges at the parent nodes of sect. Parryana + x and sect. Aesculus + x were significantly different from those obtained using Equation 2 (Table 2), for which the relationships sect. Parryana + sect. Pavia and sect. Aesculus + the Asian clade were used for DIVA analysis (Table 3).

Figure 7. Probabilities of ancestral ranges for the six major lineages of Aesculus L. Highest probabilities are given in black text in beveled slices. A, Asian clade. B, Section Aesculus. C, Aesculus californica. D, Section Pavia. E, Section Parryana. F, Section Macrothyrsus. EA, East Asia; eNA, eastern North America; EU, Europe; LA, Latin America; wNA, western North America.

Table 2. Most probable ancestral ranges of the stem node of six lineages inferred from analysis without fossils and comparison between Equation 1 and Equation 2 calculations of probability

| Lineage | Most probable range | Equation 1 results | Equation 2 results† | Z-statistic for comparison of Eqn 1 to Eqn 2‡ | P value§ |
|---|---|---|---|---|---|
| Asian clade | EA | 0.755 | 0.73 | 0.580 | 0.5619 |
| Aesculus | EA | 0.832 | 0.73 | 2.719 | 0.0065 |
| Aesculus californica | EA-wNA-eNA | 0.760 | 0.75 | 0.234 | 0.815 |
| Pavia | eNA-wNA | 0.663 | 0.70 | −0.781 | 0.4348 |
| Parryana | eNA-wNA | 0.900 | 0.70 | 6.628 | <0.0001 |
| Macrothyrsus | eNA-wNA | 0.395 | none ≥ 0.50 | n/a | n/a |

†pp*F(rY) of Equation 2 was estimated with F(rY) = 1 (no optimization uncertainty), leading to pp*F(rY) = pp for a conservative test. See Material and Methods, 1.2.3; ‡Two-tailed z-test; §Highlighting indicates significance at Zα/2, α = 0.01; n/a, Eqn 2 not used for calculation of ancestral range probability for sect. Macrothyrsus as no posterior probability (pp) value is available from the 50% majority rule consensus of Bayesian topologies. EA, East Asia; eNA, eastern North America; wNA, western North America.

Table 3. Differences between pp and pp*F(rY) of Equation 2 for Aesculus stem lineage nodes on the Bayesian consensus tree derived from analysis without fossils (Analysis 1). F(rY) was calculated using 1/N

| Lineage | Sister lineage in 50% majority rule consensus | pp supporting relationship to sister in consensus topology | Most probable range(s) from DIVA analysis of consensus tree | Equation 2 results |
|---|---|---|---|---|
| Asian clade | Aesculus | 0.73 | EA | 0.365 |
| | | | EU-EA | 0.365 |
| Aesculus | Asian clade | 0.73 | EA | 0.365 |
| | | | EU-EA | 0.365 |
| A. californica | (Aesculus + Asian clade) | 0.75 | EU-EA | 0.250 |
| | | | EA-wNA | 0.250 |
| | | | EU-EA-wNA | 0.250 |
| Pavia | Parryana | 0.70 | eNA-wNA | 0.70 |
| Parryana | Pavia | 0.70 | eNA-wNA | 0.70 |

Sect. Macrothyrsus was pruned for this analysis to produce a fully bifurcating tree topology; see Fig. 5: A. DIVA, dispersal–vicariance analysis; EA, East Asia; eNA, eastern North America; EU, Europe; pp, posterior probability; wNA, western North America.

When fossils were included in the Bayes-DIVA analysis, the probability of any ancestral range including Europe, P(EU ∈ R), increased significantly for three of six parent nodes (Fig. 8, Table 4) when compared to results from trees with fossils pruned. The value of P(EU ∈ R) increased significantly for all six parent nodes when compared to results from trees resulting from phylogenetic analysis including extant taxa only (Table 4). In contrast, changes in the probability of ranges including East Asia, P(EA ∈ R), were less dramatic when fossils were included vs. excluded (Fig. 8, Table 4).

Figure 8. Comparison of P(EU ∈ R) and P(EA ∈ R) for the six parent nodes of interest when fossils are included, pruned, and excluded. Probability (P; y axis) is the probability of any ancestral area, including widespread areas, that include Europe (left) and East Asia (right).

Table 4. Change in probabilities of ancestral ranges including Europe (EU) and East Asia (EA) when fossils excluded vs. included. A, Comparison of probabilities calculated using Bayes-dispersal–vicariance analysis with fossils pruned vs. fossils included on trees from analysis including both extant and fossil species. Highlighting indicates significant change in P. B, Comparison of probabilities with fossils included (trees from Analysis 3) vs. excluded (trees from analysis including only extant species). Highlighting indicates significant change in P

A

| Lineage | P(EU ∈ R), fossils excl. | P(EU ∈ R), fossils incl. | ΔP(EU ∈ R)† | P value‡ | P(EA ∈ R), fossils excl. | P(EA ∈ R), fossils incl. | ΔP(EA ∈ R)† | P value‡ |
|---|---|---|---|---|---|---|---|---|
| Asian clade | 0.745 | 0.655 | 0.090 ↓ | 0.1648 | 1.000 | 0.992 | 0.008 ↓ | 0.3703 |
| Aesculus | 0.678 | 0.707 | 0.029 ↑ | 0.6599 | 0.674 | 0.568 | 0.106 ↓ | 0.1223 |
| Aesculus californica | 0.000 | 0.183 | 0.183 ↑ | <0.0001 | 0.000 | 0.192 | 0.192 ↑ | <0.0001 |
| Pavia | 0.033 | 0.114 | 0.081 ↑ | 0.0282 | 0.033 | 0.044 | 0.011 ↑ | 0.9681 |
| Parryana | 0.044 | 0.112 | 0.068 ↑ | 0.0735 | 0.033 | 0.062 | 0.029 ↑ | 0.3371 |
| Macrothyrsus | 0.000 | 0.200 | 0.200 ↑ | <0.0001 | 0.000 | 0.179 | 0.179 ↑ | <0.0001 |

B

| Lineage | P(EU ∈ R), fossils excl. | P(EU ∈ R), fossils incl. | ΔP(EU ∈ R)† | P value‡ | P(EA ∈ R), fossils excl. | P(EA ∈ R), fossils incl. | ΔP(EA ∈ R)† | P value‡ |
|---|---|---|---|---|---|---|---|---|
| Asian clade | 0.000 | 0.655 | 0.655 ↑ | <0.0001 | 1.000 | 0.992 | 0.008 ↓ | 0.3703 |
| Aesculus | 0.088 | 0.707 | 0.619 ↑ | <0.0001 | 0.990 | 0.568 | 0.422 ↓ | <0.0001 |
| Aesculus californica | 0.080 | 0.183 | 0.103 ↑ | <0.0001 | 0.765 | 0.192 | 0.573 ↓ | <0.0001 |
| Pavia | 0.000 | 0.114 | 0.114 ↑ | <0.0001 | 0.022 | 0.044 | 0.022 ↑ | 0.3843 |
| Parryana | 0.007 | 0.112 | 0.105 ↑ | 0.0017 | 0.036 | 0.062 | 0.026 ↑ | 0.3953 |
| Macrothyrsus | 0.058 | 0.200 | 0.142 ↑ | 0.0027 | 0.337 | 0.179 | 0.158 ↓ | 0.0107 |

†Absolute value of change, arrow indicating direction of change when fossils included; ‡From z-test comparing two means.

Post-Y range calculations yielded moderate to high support for a post-Y range of EA for the Asian clade and sect. Aesculus (post-Y P(EA) = 0.837 and 0.888, respectively) (Table 5). The most probable post-Y range for sect. Pavia was eNA, but with lower support (post-Y P(eNA) = 0.543) (Table 5). For the other three major lineages of Aesculus, no single post-Y range received support greater than 0.500 (Table 5). These preliminary results, which do not represent all of the available molecular and fossil data (see Harris et al., 2009), revealed possible extinction in EA and migration to wNA of the A. californica lineage, extinction in wNA and dispersal to eNA of the sect. Pavia lineage, and extinction in eNA and dispersal to wNA of sect. Parryana.

Table 5. Possible post-Y ranges inherited from the most probable ancestral range (see Fig. 7) for each of the six major lineages of Aesculus

| Lineage† | Possible post-Y ranges‡ | Probability for each post-Y range |
|---|---|---|
| Asian clade (5) | EA | 0.83700 |
| | EA-wNA | 0.06000 |
| | wNA | 0.06000 |
| Aesculus (11) | EA | 0.88800 |
| | EU | 0.03250 |
| | EA-EU | 0.02905 |
| Aesculus californica (11) | wNA | 0.32500 |
| | EA | 0.26000 |
| | EA-wNA | 0.26000 |
| Pavia (7) | eNA | 0.54300 |
| | wNA | 0.22100 |
| | eNA-wNA | 0.22100 |
| Parryana (23) | wNA | 0.34400 |
| | eNA-wNA | 0.30670 |
| | eNA | 0.30670 |
| Macrothyrsus (15) | eNA | 0.47000 |
| | wNA | 0.15200 |
| | eNA-wNA | 0.15100 |

†Total numbers of non-zero post-Y ranges are shown in parentheses after lineage names; ‡Only the three highest post-Y ranges are shown. EA, East Asia; eNA, eastern North America; wNA, western North America.

3 Discussion

3.1 Accounting for phylogenetic and DIVA optimization uncertainties

Accounting for uncertainties in phylogeny and optimization is a major challenge in biogeographic analysis. The Bayes-DIVA method provides a simple and sound solution to this problem. The Bayes-DIVA method of Nylander et al. (2008) applies to nodes with fixed bipartitions (i.e., the two sister lineages at a node are clearly defined), and only trees containing these fixed nodes are considered. The revised Bayes-DIVA approach extends the method to allow estimation of geographic ranges at a node with only one of the two lineages defined, so that all trees containing the defined lineage contribute to the estimation. This revision to Bayes-DIVA provides a method of estimating, with statistical confidence, the biogeographic origins of lineages with uncertain sister affiliation. Both Bayes-DIVA methods require optimization of a large set of Bayesian topologies and subsequent analyses of the results. It would be easier if the product of the pp value and F(rY) obtained from the 50% majority rule tree (i.e., Equation 2) could accurately reflect the full extent of range information inherent in the sampled Bayesian trees. However, our comparisons showed that this is not the case (e.g., P(BD) for Lineages 4 and 9, Fig. 4: B) and that probabilities calculated using Equation 2, even when F(rY) is equal to its maximum value of 1.0, are usually lower than the probabilities obtained using the revised Bayes-DIVA method. An alternative way to simplify the calculation of Equation 1 is to use 1/N (where N is the number of alternative optimal ranges from DIVA for tree t) for F(rY). However, we found that 1/N (implying occurrence of each unique alternative range with equal frequency) can be very different from the actual frequencies, that is, the observed frequency of each range among the optimal DIVA solutions (Fig. 9). The values of F(rY) calculated from the actual frequencies can be substantially different in two trees showing identical sister relationships at the node of interest but differing elsewhere (Fig. 9).
Using the actual frequencies as a calculation of F(rY) accurately reflects the frequencies of ranges given the data, which is important because the actual frequencies better reflect the uncertainty of DIVA optimization. Because a range with 100% occurrence at node Y on a given tree suggests no uncertainty in DIVA optimization, a range at node Y occurring more frequently in the optimal MP pathways indicates greater certainty of that range in DIVA optimization compared to other ranges occurring at lower frequencies. However, 1/N may be used if one prefers to weight the alternative ranges at a node equally. Software that automates the analysis of Bayes-DIVA results and the calculation of probabilities is desirable because, at present, these steps can be time-consuming. The revised Bayes-DIVA approach is not in disagreement with Nylander et al. (2008) or Huelsenbeck and Imennov (2002), but rather provides an alternative method of accommodating phylogenetic and optimization uncertainties that extends to parent nodes of crown groups with uncertain sisters. The model-based, full Bayesian approach of Sanmartín et al. (2008) has also been developed to account for phylogenetic and optimization uncertainties in inferring biogeographic dispersal events; that approach is well suited for island biogeography, but may not be suitable for continental biogeography.
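The difference between the two ways of obtaining F(rY) for a single tree can be sketched as follows. The function names and the list of reconstructions are hypothetical, and the sketch assumes that the actual frequency of a range is simply its count among the equally optimal DIVA reconstructions at node Y divided by the total number of those reconstructions; the exact formula is not reproduced in this text.

```python
from collections import Counter


def equal_weight_frequencies(optimal_ranges):
    """1/N weighting: every unique alternative range gets the same frequency."""
    unique = sorted(set(optimal_ranges))
    return {r: 1.0 / len(unique) for r in unique}


def observed_frequencies(optimal_ranges):
    """Assumed 'actual frequency': count of each range among all optimal
    reconstructions at node Y, divided by the number of reconstructions."""
    counts = Counter(optimal_ranges)
    total = sum(counts.values())
    return {r: c / total for r, c in counts.items()}


# Hypothetical ranges at node Y across five equally parsimonious
# reconstructions on one tree (not data from this study).
reconstructions = ["B", "B", "B", "BC", "AB"]
f_equal = equal_weight_frequencies(reconstructions)
f_observed = observed_frequencies(reconstructions)
print(f_equal["B"], f_observed["B"])
```

Here range B receives 1/3 under equal weighting but 3/5 under the observed frequencies, which shows how 1/N can understate a range that dominates the optimal reconstructions.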

Figure 9. Comparison of F(rY) for an identical node in two different Bayesian trees from analysis including fossils (Analysis 3). A, B, Two Bayesian trees from the sample of 100 from Analysis 3. Section Aesculus is highlighted in dark gray; sect. Macrothyrsus (Aesculus parviflora) + Aesculus californica are highlighted in light gray. Dots indicate the parent node of sect. Aesculus. The alternative range frequencies at this node are presented in C–E. C, Relative frequencies of nine alternative optimal ancestral areas determined using 1/N. D, Relative frequencies of the alternative ancestral areas determined based on actual occurrences for tree A. E, Relative frequencies determined based on actual occurrences for tree B. In C–E, the arrow indicates area BC (EA – eNA), an example referred to in text. A, Europe; B, East Asia (EA); C, eastern North America (eNA); D, western North America.

Nylander et al. (2008) raised the question of how range probabilities obtained using Bayes-DIVA should be interpreted because the optimal ranges for each node from DIVA represent only the most parsimonious solutions, rather than including all possible solutions that may be statistically equally likely. This is no less of a concern for the revised Bayes-DIVA approach presented here. Nylander et al. (2008) hypothesized that optimal solutions from DIVA might be treated as approximating ML solutions and that the Bayes-DIVA method could then be treated as a non-parametric empirical Bayesian method, that is, a method that relies on empirical observations to approximate the actual stochastic distribution (see Johns, 1957). However, as noted by Nylander et al. (2008), it is not currently possible to determine how effectively DIVA MP solutions approximate ML solutions because there is no stochastic model for DIVA and, thus, no way of estimating the full distribution of solutions. Nonetheless, studies comparing biogeographic inference using DIVA and the model-based likelihood methods (e.g., Ree et al., 2005; Ree & Smith, 2008) have found that results are often largely congruent (e.g., Ree et al., 2005; Xiang & Thomas, 2008; Xiang et al., 2009). This may support the necessary assumption that MP solutions from DIVA are reasonable approximations of the ML solutions.

3.2 Biogeographical inference of extant Aesculus

Conflicting biogeographic hypotheses have been proposed for Aesculus (Hardin, 1957a; Xiang et al., 1998b; Forest et al., 2001; Harris et al., 2009). The most recent hypothesis was proposed by Harris et al. (2009) based on results of DIVA using phylogenies inferred from a combination of DNA sequences, morphology, and fossils. The study of Harris et al. (2009) included more molecular data and more fossils than were included here for testing the Bayes-DIVA method. We do not attempt to describe the biogeographic history of lineages of Aesculus with the data presented here. Rather, this portion of the discussion focuses on the utility of this approach to Bayes-DIVA with respect to ancestral distributions at certain nodes of interest.

Despite low to moderate support for placement of five of six lineages in phylogenetic analysis of extant taxa (Analysis 1), we were able to obtain high to moderate statistical support for the biogeographic origins of these lineages (Fig. 5: A, nodes 1–3, 5–6) using the new approach described here (Fig. 7, Table 3). For example, the placement of sect. Parryana was supported by pp = 70% in phylogenetic analysis (Fig. 5: A) and we obtained higher support (P = 0.90) for its ancestral range in eNA-wNA (Table 2, Fig. 7: E). The support for the ancestral range of eNA-wNA for the section was much lower (P = 0.70) when estimated using Equation 2 (Table 3). The probability of eNA-wNA for sect. Parryana increased under the revised Bayes-DIVA because some alternate placements of the section, for example, sect. Parryana + sect. Macrothyrsus and sect. Parryana + (sect. Macrothyrsus + A. californica), yielded non-zero frequencies F of eNA-wNA at the node Parryana + x. For the node Y including sect. Pavia and x, the ancestral range eNA-wNA is supported weakly to moderately (P = 0.663) (Table 2, Fig. 7: D). However, all four possible ancestral ranges of sect. Pavia contain eNA, resulting in P(eNA ∈ R) = 1.0 and providing high confidence for the inference that the ancestral range of sect. Pavia included eNA. This type of inference can similarly be applied to sect. Macrothyrsus, represented by a single extant species known from southeastern USA, which has an unresolved sister (i.e., part of a polytomy) in the Bayesian consensus topology (Fig. 5: A, node 4). Although no single ancestral range with P ≥ 0.5 emerged for sect. Macrothyrsus, the two ancestral ranges with the highest probabilities, eNA (P = 0.395) and eNA-wNA (P = 0.245) (Fig. 7: F), can be combined for a total probability of P = 0.640 that the range included eNA. Further exploration of the biogeographic history of this group might begin with eNA as a working hypothesis. This finding highlights the utility of the Bayes-DIVA analysis in cases of polytomies.
The ancestral ranges of the lineages of interest inferred in this study are largely congruent with those inferred in Harris et al. (2009), but here we show statistical support deriving from analysis that takes into account topological and optimization uncertainties.
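Combining range probabilities in this way amounts to summing P over every ancestral range that contains the area of interest. A minimal Python sketch follows; the function name `prob_area_in_range` and the tuple encoding of ranges are hypothetical, and only the two sect. Macrothyrsus values quoted in the text are used, with the remaining ranges omitted.

```python
def prob_area_in_range(range_probs, area):
    """Sum the probabilities of all ancestral ranges that include `area`."""
    return sum(p for areas, p in range_probs.items() if area in areas)


# The two most probable ranges quoted in the text for the parent node of
# sect. Macrothyrsus; other, less probable ranges are not listed here.
macrothyrsus = {("eNA",): 0.395, ("eNA", "wNA"): 0.245}
print(round(prob_area_in_range(macrothyrsus, "eNA"), 3))  # 0.64
```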

3.3 Adding fossil wildcards

The addition of a European fossil appears to have had a more significant effect than the addition of an East Asian fossil on ranges estimated for the parent nodes of interest (Table 4, Fig. 8). Bayes-DIVA results inferring P(EU ∈ R) changed significantly for all six parent nodes of interest (Table 4), whereas P(EA ∈ R) increased significantly for only two of the six nodes (Table 4). This phenomenon can be explained by the fact that only one extant species of Aesculus occurs in Europe, Aesculus hippocastanum L., forming sect. Aesculus with Aesculus turbinata Blume (EA) with pp = 100% (Fig. 5), whereas there are several extant species in two major clades occurring in EA. When EU is specified as the range for only one, highly stable terminal taxon, the impact of EU on optimal ancestral ranges for the major lineages is expected to be small compared to EA. Adding a wildcard fossil from Europe to the phylogeny thus heavily influenced the outcomes, that is, it increased the probability of EU in the ancestral ranges of the parent nodes. This finding suggests that including phylogenetically uncertain fossils from species-poor areas will have a dramatic impact on the results of biogeographic analysis. We therefore recommend that special care be taken to reduce a fossil's wildcard behavior (see Kearney, 2002; Kearney & Clark, 2003), especially when introducing a fossil from a geographic area unrepresented or poorly represented by extant species. Lineages most affected by the inclusion of wildcard fossil taxa appear to be those that have a very low probability of a range including the distribution of the fossil (i.e., P(Rfossil ∈ R)) when fossils are not included in the analysis.

3.4 Determining probable post-Y ranges

How an ancestral range is subdivided and inherited by daughter lineages immediately following speciation (i.e., at the base of the internode of a branch) can provide additional information about the total historical biogeographic pathway of a lineage of interest. Recently, range evolution along internodes on a phylogenetic tree has been addressed by, and can be calculated using, the model-based likelihood method of Ree et al. (2005) and Ree and Smith (2008). Here we show that the marginal probabilities obtained using Bayes-DIVA can also be used to make inferences about range inheritance at the base of the internode and evolution along the branches. Our analyses of range divisions at the parental nodes revealed potential extinction and dispersal events along branches of three Aesculus lineages (comparing results of Fig. 7 and Table 5). Future studies could compare the range evolution data from Bayes-DIVA and the likelihood method implemented in Lagrange.

4 Conclusions

As other authors have previously argued, it is best to include all available and relevant information when using phylogeny to reconstruct biogeography (Tiffney & Manchester, 2001; Huelsenbeck & Imennov, 2002; Ree et al., 2005; Nylander et al., 2008). Historical biogeography is a synthetic discipline that produces the most reliable results when analyses include divergence time data, evidence from paleobotany, geological and ecological data, and highly resolved and robust phylogenies (e.g., Tiffney & Manchester, 2001; Emerson & Hewitt, 2005; Ree et al., 2005; Carstens & Richards, 2007; Nylander et al., 2008). Biogeographic analysis using DIVA, which requires little prior information, is perhaps not the best method of biogeographic reconstruction when information in addition to phylogenetic pattern and distributions of extant taxa is available. However, given that DIVA is fast, user friendly, and produces results similar to those from the model-based methods that incorporate prior information into the optimization, our revised Bayes-DIVA approach provides a solution for authors who favor DIVA but face the problem of polytomies. In biogeographic studies using DIVA, prior information on divergence time and area connections can be used to distinguish among multiple optimal solutions. DIVA remains advantageous when working with groups for which little or unreliable prior information is available, because of its ease of use and freedom from potential error associated with the model selection and model parameter determination required for the model-based methods (Ree et al., 2005; Nylander et al., 2008).

Bayes-DIVA offers an advantage over using DIVA in the traditional way as well as over using many methods that require only a single input tree. This is because the Bayes-DIVA analysis provides statistical support for inferred ranges, allows for inference at poorly supported parent nodes of lineages of interest, and allows for other types of analyses of support for biogeographic reconstruction including the two applications we have shown here. As suggested by previous authors (Lemmon & Lemmon, 2008; Nylander et al., 2008; Sanmartín et al., 2008), this approach is not limited to use with DIVA and is applicable to other types of biogeographic analyses.

Acknowledgments

Acknowledgements  The authors are highly indebted to François LUTZONI (Duke University, Durham, NC, USA) for his assistance in developing this approach to use of DIVA software. We also acknowledge Beau DABBS (University of Chicago, Chicago, IL, USA) for his assistance with mathematics and statistics, David THOMAS (formerly North Carolina State University) for helpful discussion, Morris MADURO (University of California, Riverside, CA, USA) for correspondence regarding the random sequence generation script, Holly FORBES (University of California, Berkeley, CA, USA) for collection of fresh leaf materials, and the Gray Herbarium at Harvard University for the loan of herbarium specimens. This manuscript is a part of the thesis of AJ Harris submitted to the NCSU graduate school in 2007. This study has benefited from a National Science Foundation (USA) grant made to Xiang (DEB-0444125). For travel support to workshops and symposia we thank the Deep Time Research Coordination Network, supported by a NSF grant funded to D.E. Soltis (DEB-0090283), and the Phytogeography of the Northern Hemisphere Working Group and the Clock Workgroup supported by NSF through NESCent.

References

  • Axelrod DI. 1966. The Eocene Copper Basin flora of northeastern Nevada. University of California Publications in Geological Science 59: 183.
  • Bremer K. 1992. Ancestral areas: a cladistic reinterpretation of the center of origin concept. Systematic Biology 41: 436–445.
  • Budantsev LJ. 1983. History of the Arctic flora of the early Cenophytic epoch. Nauka, Leningrad. (in Russian).
  • Burbrink FT, Lawson R. 2007. How and when did Old World rat snakes disperse into the New World? Molecular Phylogenetics and Evolution 43: 173–189.
  • Calviño CI, Martínez SG, Downie SR. 2008. Morphology and biogeography of Apiaceae subfamily Saniculoideae as inferred by phylogenetic analysis of molecular data. American Journal of Botany 95: 196–214.
  • Carstens BC, Richards CL. 2007. Integrating coalescent and ecological niche modeling in comparative phylogeography. Evolution 61: 1439–1454.
  • Condit C. 1944. The Remington Hill flora. Washington: Carnegie Institute of Washington Publication 553: 21–55.
  • Crane PR, Herendeen P, Friis EM. 2004. Fossils and plant phylogeny. American Journal of Botany 91: 1683–1699.
  • De Lumley H. 1988. La stratigraphie du remplissage de la Grotte du Vallonnet. L’Anthropologie 92: 407–428.
  • Dilhoff RM, Leopold EB, Manchester SR. 2005. The McAbee flora of British Columbia and its relation to the early-middle Eocene Okanagan Highlands flora of the Pacific Northwest. Canadian Journal of Earth Science 42: 151–166.
  • Donoghue MJ, Smith SA. 2004. Patterns in the assembly of the temperate forest around the Northern Hemisphere. Philosophical Transactions of the Royal Society of London: Biology 359: 1633–1644.
  • Emerson BC, Hewitt GH. 2005. Phylogeography. Current Biology 15: 367–371.
  • Felsenstein J. 1989. PHYLIP (Phylogeny Inference Package) Version 3.2. Cladistics 5: 164–166.
  • Felsenstein J. 2008. PHYLIP (Phylogeny Inference Package) Version 3.68. Distributed by the author. Seattle: Department of Genome Sciences, University of Washington.
  • Forest F, Drouin JN, Charest R, Brouillet L, Bruneau A. 2001. A morphological phylogenetic analysis of Aesculus L. and Billia Peyr. (Sapindaceae). Canadian Journal of Botany 79: 154–169.
  • Fiz O, Vargas P, Alarcón M, Aedo C, Garcia JL, Aldasoro JJ. 2008. Phylogeny and historical biogeography of Geraniaceae in relation to climate changes and pollination ecology. Systematic Botany 33: 326–342.
  • Golovneva L. 2000. Early Paleogene floras of Spitzbergen and North Atlantic floristic exchange. Acta Universitatis Carolinae Geologica 44: 39–50.
  • Hardin JW. 1957a. A revision of the American Hippocastanaceae. Brittonia 9: 145–171.
  • Hardin JW. 1957b. A revision of the American Hippocastanaceae, II. Brittonia 9: 173–195.
  • Hardin JW. 1960. Studies in the Hippocastanaceae, V. Species of the Old World. Brittonia 12: 26–38.
  • Harrington MG, Edwards KJ, Johnson SA, Chase MW, Gadek PA. 2005. Phylogenetic inference in Sapindaceae sensu lato using plastid matK and rbcL DNA sequences. Systematic Botany 30: 366–382.
  • Harris AJ, Thomas DT, Xiang QY. 2009. Phylogeny, origin, and biogeographic history of Aesculus L. (Sapindales): an update from combined analysis of DNA sequences, morphology, and fossils. Taxon 58: 108–126.
  • Hilton J, Bateman RM. 2006. Pteridosperms are the backbone of seed-plant phylogeny. Journal of the Torrey Botanical Society 133: 119–168.
  • Hines HM. 2008. Historical biogeography, divergence times, and diversification patterns of bumble bees (Hymenoptera: Apidae: Bombus). Systematic Biology 57: 58–75.
  • Hu HH, Chaney RW. 1940. A Miocene flora from Shantung Province, China. Washington: Carnegie Institute of Washington Publication 507: 1–147.
  • Huelsenbeck JP, Imennov NS. 2002. Geographic origin of human mitochondrial DNA: accommodating phylogenetic uncertainty and model comparison. Systematic Biology 51: 155–165.
  • Huelsenbeck JP, Ronquist F. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.
  • Huelsenbeck JP, Ronquist F. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19: 1572–1574.
  • Huttunen S, Hedenäs L, Ignatov MS, Devos N, Vanderpoorten A. 2008. Origin and evolution of the Northern Hemisphere disjunction in the moss genus Homalothecium (Brachytheciaceae). American Journal of Botany 95: 720–730.
  • Jeandroz S, Murat C, Wang Y, Bonfante P, Tacon FL. 2008. Molecular phylogeny and historical biogeography of the genus Tuber, the “true truffles”. Journal of Biogeography 35: 815–829.
  • Jeong EK, Kim K, Kim JH, Suzuki M. 2004. Fossil woods from the Janggi Group (Early Miocene) in Pohang Basin, Korea. Journal of Plant Research 117: 183–189.
  • Johns MV Jr. 1957. Non-parametric empirical Bayes procedures. The Annals of Mathematical Statistics 28: 649–669.
  • Judd WS, Saunders RW, Donoghue MJ. 1994. Angiosperm family pairs: preliminary analyses. Harvard Papers in Botany 5: 1–51.
  • Kauff F, Miadlikowska J, Lutzoni F. 2003. ARC: a program for ambiguous region coding. Available online at http://www.lutzonilab.net/ and select “Downloadable Programs” [Accessed 10 October 2006].
  • Kauff F. 2005. RandomTree: random tree sampling. Available online at http://www.lutzonilab.net/ and select “Downloadable Programs” [Accessed 10 October 2006].
  • Kearney M. 2002. Fragmentary taxa, missing data, and ambiguity: mistaken assumptions and conclusions. Systematic Biology 51: 369–381.
  • Kearney M, Clark JM. 2003. Problems due to missing data in phylogenetic analyses including fossils: a critical review. Journal of Vertebrate Paleontology 23: 263–274.
  • Lemmon AR, Lemmon EM. 2008. A likelihood framework for estimating phylogeographic history on a continuous landscape. Systematic Biology 57: 544–561.
  • Lieberman BS. 2003. Paleobiogeography: the relevance of fossils to biogeography. Annual Review of Ecology and Systematics 34: 51–69.
  • Lim K. 2008. Historical biogeography of New World emballonurid bats (tribe Diclidurini): taxon pulse diversification. Journal of Biogeography 35: 1385–1401.
  • Lutzoni F, Pagel M, Reeb V. 2001. Major fungal lineages are derived from lichen symbiotic ancestors. Nature 411: 937–940.
  • Maddison DR, Maddison WP. 2001. MacClade 4: analysis of phylogeny and character evolution. Version 4.02. Sunderland: Sinauer Associates.
  • Mai DH, Walther H. 1988. Die pliozaenen Floren von Thueringen, Deutsche Demokratische Republik. Quartaerpalaeontologie 7: 55–297.
  • Manchester SR. 1999. Biogeographical relationships of North American Tertiary floras. Annals of the Missouri Botanical Gardens 86: 472–522.
  • Manchester SR. 2001. Leaves and fruits of Aesculus (Sapindales) from the Paleocene of North America. International Journal of Plant Sciences 162: 985–996.
  • Manos PS, Soltis PS, Soltis DE, Manchester SR, Oh SH, Bell CD, Dilcher DL, Stone DE. 2007. Phylogeny of extant and fossil Juglandaceae inferred from the integration of molecular and morphological data sets. Systematic Biology 56: 412–430.
  • Mansion G, Rosenbaum G, Schoenenberger N, Bacchetta G, Rosselló JA, Conti E. 2008. Phylogenetic analysis informed by geological history supports multiple, sequential invasions of the Mediterranean Basin by the angiosperm family Araceae. Systematic Biology 57: 269–285.
  • Mast AR, Willis CL, Jones EH, Downs KM, Weston PH. 2008. A smaller Macadamia from a more vagile tribe: inference of phylogenetic relationships, divergence times, and diaspore evolution in Macadamia and relatives (tribe Macadamieae; Proteaceae). American Journal of Botany 95: 843–870.
  • Nixon KC, Wheeler QD. 1992. Extinction and the origin of species. In: Novacek MJ, Wheeler QD eds. Extinction and phylogeny. New York: Columbia University Press. 119–143.
  • Nylander JAA, Olsson U, Alström P, Sanmartín I. 2008. Accounting for phylogenetic uncertainty in biogeography: a Bayesian approach to Dispersal–Vicariance Analysis of the thrushes (Aves: Turdus). Systematic Biology 57: 257–268.
  • Page RDM. 1993a. COMPONENT: tree comparison software for Microsoft Windows, Version 2.0, User's Guide. London: Natural History Museum.
  • Page RDM. 1993b. Genes, organisms, and areas: the problem of multiple lineages. Systematic Biology 42: 77–84.
  • Page RDM. 1996. TREEVIEW: an application to display phylogenetic trees on personal computers. Computer Applications in Bioscience 12: 357–358.
  • Page RDM. 2001. TreeView for Windows. Version 1.6.6. Available online at http://taxonomy.zoology.gla.ac.uk/ and select “Software”.
  • Pagel M, Meade A, Barker D. 2004. Bayesian estimation of ancestral character states on phylogenies. Systematic Biology 53: 673–684.
  • Posada D, Crandall KA. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14: 817–818.
  • Prakash U, Barghoorn ES. 1961. Miocene fossil woods from the Columbia Basalts of Central Washington, II. Journal of the Arnold Arboretum 42: 165–203.
  • Puri GS. 1945. Some fossil leaflets of Aesculus indica Colebr. from the Karewa Beds at Laredura and Ningal Nullah, Pir Panjal, Kashmir. Journal of the Indian Botanical Society 24.
  • Rambaut A, Drummond J. 2003. Tracer. Version 1.3. Available online at http://evolve.zoo.ox.ac.uk/Evolve/Welcome.html.
  • Ree RH, Smith SA. 2008. Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Systematic Biology 57: 4–14.
  • Ree RH, Moore BR, Webb CO, Donoghue MJ. 2005. A likelihood framework for inferring the evolution of geographic range of phylogenetic trees. Evolution 59: 2299–2311.
  • Ronquist F. 1996. Dispersal Vicariance Analysis (DIVA) 1.1. User's manual. Available online at http://www.ebc.uu.se/syszoo/research/diva/diva.html.
  • Ronquist F. 1997. Dispersal–vicariance analysis: a new approach to the quantification of historical biogeography. Systematic Biology 46: 195–203.
  • Ronquist F. 2001. Dispersal Vicariance Analysis (DIVA) 1.2. Available online at http://www.ebc.uu.se/syszoo/research/diva/diva.html.
  • Rothwell GW. 1999. Fossil and ferns in the resolution of land plant phylogeny. Botanical Review 65: 189–218.
  • Rothwell GW, Nixon KC. 2006. How does the inclusion of fossil data change our conclusions about the phylogenetic history of euphyllophytes. International Journal of Plant Sciences 167: 737–749.
  • Sanmartín I, Ronquist F. 2004. Southern Hemisphere biogeography inferred by event–based models: plant versus animal patterns. Systematic Biology 53: 216–243.
  • Sanmartín I, Enghoff H, Ronquist F. 2001. Patterns of animal dispersal, vicariance and diversification in the Holarctic. Biological Journal of the Linnean Society 73: 345–390.
  • Sanmartín I, Van Der Mark P, Ronquist F. 2008. Inferring dispersal: a Bayesian approach to phylogeny-based island biogeography, with special reference to the Canary Islands. Journal of Biogeography 35: 428–449.
  • Schloemer-Jäger A. 1958. Alttertiare pflanzen aus flozen der bragger-halbinsel Spitzbergens. Paleontographica Abt B 39–103.
  • Soltis DE, Morris AB, MacLachlan JS, Manos PS, Soltis PS. 2006. Comparative phylogeography of unglaciated eastern North America. Molecular Ecology 15: 4261–4293.
  • Swofford DL. 2002. PAUP* – Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sunderland: Sinauer Associates.
  • Szafer W. 1947. The Pliocene flora of Kroscienko in Poland. Rozpr Wydz mat-przyr Akad Urn. 72: 91–162. (in Polish and English).
  • Szafer W. 1954. Pliocene flora from the vicinity of Czorsztyn (West Carpathians) and its relationship to the Pleistocene. Institute of Geology of Warzawa 111: 1–238. (in Polish and English).
  • Tanai T. 1952. The fossil vegetation from the coalified basin of Nishitagawa, Prefecture of Yamagata, Japan. Japanese Journal of Geology and Geography 22: 119–135. (in French).
  • Tiffney BH, Manchester SR. 2001. Integration of paleobotanical and neobotanical data in the assessment of phylogeographic history of Holarctic angiosperm clades. International Journal of Plant Sciences 162: S19–S27.
  • Velazco PM, Patterson BD. 2008. Phylogenetics and biogeography of the broad-nosed bats, genus Platyrrhinus (Chiroptera: Phyllostomidae). Molecular Phylogenetics and Evolution 49: 749–759.
  • Wehr WC. 1998. Middle Eocene insects and plants of the Okanagan Highlands. In: Martin JE ed. Contributions to the paleontology and geology of the West Coast: in honor of V. Standish Mallory. Seattle: Thomas Burke Memorial Washington State Museum Research. 99–109.
  • Weinelt M. 1999. Online Map Creation (OMC) version 4.1. Available online at http://www.aquarius.ifm-geomar.de [Accessed 1 Jan 2008].
  • Wen J. 1999. Evolution of the eastern Asian and eastern North American disjunct distributions in flowering plants. Annual Review of Ecology and Systematics 30: 421–455.
  • Wiens JJ. 2003. Missing data, incomplete taxa, and phylogenetic accuracy. Systematic Biology 52: 528–538.
  • Wiens JJ. 2006. Missing data and the design of phylogenetic analyses. Journal of Biomedical Informatics 39: 34–42.
  • Wiggins IL. 1932. The lower California buckeye, Aesculus parryi A. Gray. American Journal of Botany 19: 406–410.
  • Xiang QY, Thomas DT. 2008. Tracking character evolution and biogeographic history through time in Cornaceae – Does choice of methods matter? Journal of Systematics and Evolution 46: 349–374.
  • Xiang QY, Soltis DE, Soltis PS. 1998a. The eastern Asian and eastern and western North American disjunction: congruent phylogenetic patterns in seven diverse genera. Molecular Phylogenetics and Evolution 10: 178–190.
  • Xiang QY, Crawford DJ, Wolfe AD, Tang YC. 1998b. Origin and biogeography of Aesculus L. (Hippocastanaceae): a molecular phylogenetic perspective. Evolution 52: 988–997.
  • Xiang QY, Zhang WH, Ricklefs RE, Qian H, Chen ZD, Wen J, Li JH. 2004. Regional differences in rates of plant speciation and molecular evolution: a comparison between eastern Asia and eastern North America. Evolution 58: 2175–2184.
  • Xiang QY, Manchester SR, Thomas DT, Zhang WH, Fan C. 2005. Phylogeny, biogeography, and molecular dating of cornelian cherries (Cornus, Cornaceae): tracking Tertiary plant migration. Evolution 58: 1685–1700.
  • Xiang QY, Thomas DT, Zhang WH, Manchester SR, Murrell Z. 2006. Species level phylogeny of the genus Cornus (Cornaceae) based on molecular and morphological evidence – implications for taxonomy and Tertiary intercontinental migration. Taxon 55: 9–30.
  • Xiang QY, Smith SA, Harris AJ, Feng C. 2009. Use of fossils in biogeographic analysis – challenges and possible solutions. Abstract. Invited presentation: 4th International conference of the International Biogeography Society, Merida, Mexico. 69.

Appendix

Appendix I: DNA sequences of Aesculus and outgroups.

Notes: Each entry lists the taxon, the voucher or accession number, and the gene sequence data available. GenBank accessions are given in parentheses following gene names. Internal transcribed spacer (ITS) accessions are given in the order ITS1, ITS2, and 5.8S if available. Superscripts correspond to numbered accessions in Fig. 3a of Harris et al. (2009).

Ingroup.—Section Aesculus—. A. hippocastanum L., Kew 00-69.11289-263, rps16 (EU687697) matK (EU687725) ITS (EU687600, EU687637); A. turbinata Blume, D.J. Crawford 4111, rps16 (EU687695) matK (EU687723) ITS (EU687598, EU687635); A. turbinata Blume, J.C. Raulston Arboretum 9500162, rps16 (EU687696) matK (EU687724) ITS (EU687599, EU687636, EU687666); Section Calothyrsus (traditional)—. A. assamica Griff., Mongolia Expedition 10039, rps16 (EU687676) ITS (EU687578, EU687615, EU687651); A. californica (Spach.) Nutt., D.J. Crawford 4061, rps16 (EU687689) matK (EU687715) ITS (EU687590, EU687627, EU687659); A. californica (Spach.) Nutt., T.M. Hardig 27952, rps16 (EU687690) matK (EU687716) ITS (EU687591, EU687628, EU687660); A. californica (Spach.) Nutt., J.C. Raulston Arboretum 9504133, rps16 (EU687691) matK (EU687717) ITS (EU687592, EU687629, EU687661); A. californica (Spach.) Nutt., UC Berkeley 93.12034, rps16 (EU687692) matK (EU687718) ITS (EU687593, EU687630, EU687662); A. californica (Spach.) Nutt., UC Berkeley 93.11165, rps16 (EU687693) matK (EU687719) ITS (EU687594, EU687631, EU687663); A. chinensis Bunge, Q.Y. Xiang 3051, rps16 (EU687678) ITS (EU687580, EU687617, EU687652); A. chinensis Bunge, Q.Y. Xiang 04-C882, rps16 (EU687677) matK (EU687706) ITS (EU687579, EU687616); A. indica (Camb.) Hook, Q.Y. Xiang 3011, rps16 (EU687686) matK (EU687711) ITS (EU687587, EU687624); A. indica (Camb.) Hook, J.C. Raulston Arboretum 0014052, rps16 (EU687687) matK (EU687712) ITS (EU687588, EU687625, EU687658); A. polyneura Hu & Fang, Q.Y. Xiang 02-255, rps16 (EU687681) matK (EU687707) ITS (EU687582, EU687619, EU687654); A. tsiangii Hu & Fang, Q.Y. Xiang 04-C37, rps16 (EU687685) matK (EU687710) ITS (EU687586, EU687623, EU687657); A. wilsonii Rehder, Q.Y. Xiang 02-1051, rps16 (EU687684) ITS (EU687585, EU687622, EU687656); A. wilsonii Rehder, Q.Y. Xiang 04-C92, rps16 (EU687683) matK (EU687709) ITS (EU687584, EU687621, EU687655); A. wangii Hu, Q.Y. Xiang 303, rps16 (EU687682) matK (EU687708) ITS (EU687583, EU687620); Section Macrothyrsus—. A. parviflora Walter, J.C. Raulston Arboretum sene non., rps16 (EU687694) matK (EU687721) ITS (EU687596, EU687633, EU687664); Section Pavia—. A. glabra Willd., D.J. Crawford 413, rps16 (EU687702) matK (EU687734) ITS (EU687607, EU687644, EU687671); A. flava Sol., C.W. DePamphilis F-MI-41, matK (EU687737); A. flava Sol., Q.Y. Xiang 98-1502, rps16 (EU687703) matK (EU687738) ITS (EU687610, EU687647, EU687672); A. pavia L., Q.Y. Xiang 01-541, rps16 (EU687700) matK (EU687732) ITS (EU687605, EU687642, EU687669); A. pavia L., Q.Y. Xiang 98-1352, rps16 (EU687701) matK (EU687733) ITS (EU687606, EU687643, EU687670); A. sylvatica Bart., Q.Y. Xiang 01-2511, rps16 (EU687698) matK (EU687726) ITS (EU687601, EU687638, EU687667); A. sylvatica Bart., Q.Y. Xiang 98-1102, rps16 (EU687699) matK (EU687728) ITS (EU687602, EU687639, EU687668); Section Parryana—. A. parryi Gray, Epling 1936 sene non, rps16 (EU687688) matK (EU687714).

Outgroup.—Handeliodendron bodinieri (Levl.) Rehd., Q.Y. Xiang 302, rps16 (EU687674) ITS (EU687575, EU687612, EU687649); Billia Peyr sp., Q.Y. Xiang 02-12, rps16 (EU687675) matK (EU687705) ITS (EU687577, EU687614, EU687650).