treespace: Statistical exploration of landscapes of phylogenetic trees
Abstract
The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low‐dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group‐specific consensus phylogenies. treespace also provides a user‐friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.
1 INTRODUCTION
Genetic sequence data are becoming an increasingly common and informative resource in a variety of fields including evolutionary biology (Wolfe & Li, 2003), ecology (Hudson, 2008), medicine (Weinshilboum, 2002) and infectious disease epidemiology (Holden et al., 2013; Pybus & Rambaut, 2009). Although specific methods emerge to tackle particular problems in different fields, many analyses of homoplasy, selection and population structure begin with a reconstructed tree. Indeed, phylogenetic reconstruction remains the gold standard for assessing the evolutionary relationships amongst a set of taxa or sampled isolates (Bouckaert et al., 2014; Popescu, Huber, & Paradis, 2012; Ronquist & Huelsenbeck, 2003; Schliep, 2011) in the absence of horizontal gene transfers and recombination events (McInerney, Cotton, & Pisani, 2008).
Ideally, a single phylogenetic tree could be used to visualize the evolutionary history of a set of sequences. In practice, however, a number of biological and statistical factors may lead to phylogenetic uncertainty and incongruence (Jeffroy, Brinkmann, Delsuc, & Philippe, 2006; Kumar, Filipski, Battistuzzi, Kosakovsky Pond, & Tamura, 2012; Som, 2015). In such cases, several phylogenies may be equally supported by the data and need to be examined. Besides horizontal gene transfers (Delsuc, Brinkmann, & Philippe, 2005; McInerney et al., 2008), genomic reassortments (Nelson et al., 2008) and gene loss and acquisition (Page & Charleston, 1997), incomplete lineage sorting can lead different genes to exhibit distinct genealogies (Jeffroy et al., 2006; Pollard, Iyer, Moses, & Eisen, 2006; Som, 2015) and invalidate the idea of a “single evolutionary history” (Jeffroy et al., 2006; McInerney et al., 2008). Statistical uncertainty in tree topology can also arise when using bootstraps (Efron, 1992; Felsenstein, 1985; Newton, 1996; Soltis & Soltis, 2003) or when considering samples of trees in Bayesian approaches (Drummond & Rambaut, 2007; Huelsenbeck, Rannala, & Masly, 2000; Ronquist & Huelsenbeck, 2003).
Because examining multiple phylogenies quickly becomes impractical, this problem is classically addressed by choosing a single reference phylogeny and indicating support for individual nodes in the other trees (Drummond & Rambaut, 2007; Felsenstein, 1985; Paradis, Claude, & Strimmer, 2004; Soltis & Soltis, 2003). Unfortunately, bootstrap or posterior support values can only be easily interpreted when they show high congruence, and considerable effort has been devoted to quantifying the credibility or probability of clades in reconstructed phylogenies (Anisimova, Gil, Dufayard, Dessimoz, & Gascuel, 2011; Drummond, Ho, Phillips, & Rambaut, 2006; Holmes, 2003b; Lemey, Rambaut, Drummond, & Suchard, 2009; Newton, 1996; Wróbel, 2008). Statistically significant results derived from different data sources can differ (Kumar et al., 2012), and while this would usually result in low bootstrap values, anomalously high bootstrap values can result from concatenation of gene sequences (Gadagkar, Rosenberg, & Kumar, 2005; Kumar et al., 2012). While several different phylogenies can be nearly equally supported by the data (Wróbel, 2008), in practice these alternatives often remain unexplored (Felsenstein, 1985; Holmes, 2003a; Newton, 1996). A more satisfying alternative would consist of extracting the essential differences and similarities amongst a set of trees, visualizing these relationships and identifying one or more representative trees (Amenta & Klingner, 2002; Chakerian & Holmes, 2012; Hillis, Heath, & St John, 2005; Holmes, 2003b; Nye, 2014).
Several metrics and measures of dissimilarity between trees have been developed (Table 1), each of which directly compares trees to each other according to certain biological or mathematical properties (Critchlow, Pearl, & Qian, 1996; Estabrook, McMorris, & Meacham, 1985; Hein, Jiang, Wang, & Zhang, 1996; Kendall & Colijn, 2015; Pavoine, Ollier, Pontier, & Chessel, 2008; Robinson & Foulds, 1979, 1981; Williams & Clifford, 1971). Interestingly, these methods of pairwise tree comparison can form the basis of further analyses aiming to visualize and characterize relationships in a whole set of phylogenies. Several studies have also focussed on providing Euclidean visualizations of tree spaces, but typically relied on a single tree metric (Amenta & Klingner, 2002; Chakerian & Holmes, 2012; Hillis et al., 2005; Kendall & Colijn, 2016; Wilgenbusch, Huang, & Gallivan, 2017).
| Metric/tree summary | References | R function (package) |
|---|---|---|
| Robinson–Foulds metric | (Robinson & Foulds, 1979, 1981) | RF.dist (phangorn) (Schliep, 2011); dist.topo (ape) (Paradis et al., 2004) |
| Branch score distance | (Kuhner & Felsenstein, 1994) | KF.dist (phangorn) (Schliep, 2011) |
| Billera–Holmes–Vogtmann metric (BHV) | (Billera et al., 2001) | dist.multiPhylo (distory) (Chakerian & Holmes, 2013) |
| Path difference metric (a.k.a. patristic distance/node distance/tip distance/dissimilarity measure) | (Steel & Penny, 1993); note also the l1‐norm version (Williams & Clifford, 1971) | path.dist (phangorn) (Schliep, 2011); distTips (adephylo) (Jombart et al., 2010a) |
| Kendall–Colijn metric | (Kendall & Colijn, 2015) | treeDist (treespace) |
| Abouheif's dissimilarity | (Pavoine et al., 2008) | distTips (adephylo) (Jombart et al., 2010a) |
| Sum of direct descendents | (Pavoine et al., 2008) | distTips (adephylo) (Jombart et al., 2010a) |
We introduce treespace, an R package providing a comprehensive toolkit for the analysis of phylogenetic incongruence. We generalize a previous approach (Amenta & Klingner, 2002; Hillis et al., 2005) for visualizing relationships between trees in a continuous, low‐dimensional Euclidean space to any tree metric, and implement the most common ones (Table 1). In addition, we provide a range of clustering methods permitting the identification of groups of similar trees commonly known as “tree islands” (Maddison, 1991) and implement a new method for defining summary trees (Kendall & Colijn, 2016). Our R package also implements a user‐friendly web interface giving access to all of the package's features and permitting the interactive visualization and analysis of sets of phylogenetic trees. To maximize data interoperability, it is fully integrated alongside existing standards for phylogenetics (Jombart, Balloux, & Dray, 2010; Popescu et al., 2012; Schliep, 2011) in the R software (R Core Team, 2016).
2 IMPLEMENTED METHODS
treespace generalizes an approach used by Amenta and Klingner (2002) and later by Hillis et al. (2005), implemented as the treesetviz module for mesquite (Maddison & Maddison, 2003). This method used the Robinson–Foulds metric (Robinson & Foulds, 1979, 1981) to visualize relationships between labelled trees with identical tips in a Euclidean space. Here, we generalize this approach to any tree metric, and add the use of multiple clustering approaches to formally identify “tree islands”.
The core idea underlying tree space exploration is to map variability in tree topology or branch length onto a low‐dimensional, Euclidean space, which can then be used for visualizing relationships between the phylogenies and, potentially, to define clusters of similar trees (Figure 1). First, pairwise distances between all pairs of trees in the sample are computed (Figure 1a,b). Typically, measures of distances between trees rely on mapping each phylogeny to a vector of labelled numbers corresponding to pairwise comparisons of tips or internal nodes and then computing the Euclidean distance between the resulting vectors (Figure S1). treespace implements an extensive selection of distances relying on this principle (Kendall & Colijn, 2015; Pavoine et al., 2008; Robinson & Foulds, 1979, 1981; Steel & Penny, 1993; Williams & Clifford, 1971), as well as the BHV metric (Billera, Holmes, & Vogtmann, 2001), which directly computes distances between trees without intermediate feature extraction (Table 1).
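As an illustration of the vector‐based principle described above, the following minimal Python sketch maps a rooted tree to a topological Kendall–Colijn‐style vector and then takes the Euclidean distance between two such vectors. It is not the treespace implementation, which is written in R. The nested‐tuple tree encoding and the function names `kc_vector` and `kc_distance` are assumptions made for this example, and only the topological case (unit pendant edges, no branch lengths) is covered.

```python
from itertools import combinations
from math import sqrt


def kc_vector(tree):
    """Map a rooted tree to (sorted tip labels, topological vector).

    The tree is a nested tuple whose leaves are tip labels (strings),
    a hypothetical encoding chosen for this sketch. The vector holds,
    for every pair of tips, the number of edges between the root and
    the pair's most recent common ancestor (MRCA), followed by one
    unit pendant edge per tip.
    """
    mrca_depth = {}

    def visit(node, depth):
        if isinstance(node, str):
            return [node]
        tips_per_child = [visit(child, depth + 1) for child in node]
        # Tips from different children of this node have their MRCA here.
        for left, right in combinations(tips_per_child, 2):
            for a in left:
                for b in right:
                    mrca_depth[frozenset((a, b))] = depth
        return [tip for tips in tips_per_child for tip in tips]

    tips = sorted(visit(tree, 0))
    pair_part = [mrca_depth[frozenset(pair)] for pair in combinations(tips, 2)]
    pendant_part = [1] * len(tips)
    return tips, pair_part + pendant_part


def kc_distance(tree_a, tree_b):
    """Euclidean distance between the vectors of two trees on the same tips."""
    tips_a, vec_a = kc_vector(tree_a)
    tips_b, vec_b = kc_vector(tree_b)
    if tips_a != tips_b:
        raise ValueError("both trees must have the same tip labels")
    return sqrt(sum((x - y) ** 2 for x, y in zip(vec_a, vec_b)))


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
print(kc_distance(t1, t2))  # 2.0
```

The two four‐tip trees differ in four pairwise MRCA depths (each by one edge), so the squared differences sum to 4 and the distance is 2.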

Once pairwise distances between trees are computed, they are decomposed into a low‐dimensional space using metric multidimensional scaling (MDS), also known as principal coordinate analysis (PCoA, Gower, 1966; Dray & Dufour, 2007; Legendre & Legendre, 2012). This method finds independent (uncorrelated) synthetic variables, the “principal components” (PCs), which represent as well as possible the original distances inside a lower‐dimensional space (Figure 1c). By inspecting the proportion of the total distances between trees represented by specific axes (the “eigenvalues” of the different PCs), one can assess the number of relevant PCs to examine and, ideally, separate structured phylogenetic variation from random noise (Legendre & Legendre, 2012). Importantly, MDS can only be applied to Euclidean distances (Legendre & Legendre, 2012). In the case of non‐Euclidean tree distances (Billera et al., 2001; Robinson & Foulds, 1981), we use Cailliez's transformation (Cailliez, 1983) to render these distances Euclidean before MDS.
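The decomposition step can be sketched as follows. This is a minimal pure‐Python illustration of classical (metric) MDS, assuming the input distances are already Euclidean. It does not apply Cailliez's transformation and is not the routine used by treespace. The function name `classical_mds` and its parameters are hypothetical, and the axes are extracted by simple power iteration with deflation, which is adequate only for small examples.

```python
import random
from math import sqrt


def classical_mds(dist, n_axes=2, n_iter=500, seed=0):
    """Return (coords, eigenvalues) for a symmetric Euclidean distance matrix."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def multiply(vector):
        return [sum(b[i][j] * vector[j] for j in range(n)) for i in range(n)]

    rng = random.Random(seed)
    coords = [[] for _ in range(n)]
    eigenvalues = []
    for _axis in range(n_axes):
        # Power iteration from a random unit vector.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        norm = sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(n_iter):
            w = multiply(v)
            norm = sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variation left to capture
                break
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, multiply(v)))
        eigenvalues.append(eigenvalue)
        # Coordinates on this axis are the eigenvector scaled by sqrt(eigenvalue).
        scale = sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i].append(scale * v[i])
        # Deflate B so that the next pass finds the next axis.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords, eigenvalues


# Four points at the corners of a 3 x 4 rectangle, given only as distances.
points = [(0, 0), (3, 0), (0, 4), (3, 4)]
dist = [[sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) for q in points]
        for p in points]
coords, eigenvalues = classical_mds(dist, n_axes=2)
total = sum(eigenvalues)
print([round(ev / total, 2) for ev in eigenvalues])  # share of variation per axis
```

In this toy case two axes reproduce the input distances exactly, and the eigenvalues indicate how much of the total variation each axis carries, which is the quantity inspected when deciding how many axes to retain.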
Exploring tree spaces using MDS allows the main features of a given phylogenetic landscape to be explored and evaluated. In particular, the resulting typology may exhibit discrete clusters of related trees (the “phylogenetic islands”), indicating that several distinct phylogenies may actually be supported by the data (Figure 1c). To identify such clusters formally, we implemented various hierarchical clustering methods based on the projected distances, including the single linkage, complete linkage, Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and Ward's method (Legendre & Legendre, 2012).
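The clustering step can be illustrated with a short, naive agglomerative procedure. The sketch below is an assumption‐laden toy rather than the treespace code: the function name `agglomerative_clusters` is hypothetical, only single, complete and average (UPGMA) linkage are covered (Ward's method is omitted), and the number of clusters is supplied by the user.

```python
def agglomerative_clusters(dist, n_clusters, linkage="average"):
    """Cut a naive agglomerative clustering at n_clusters groups.

    dist is a symmetric matrix of pairwise distances; the return value
    is a list giving one integer cluster label per item.
    """
    clusters = [[i] for i in range(len(dist))]

    def between(group_a, group_b):
        values = [dist[i][j] for i in group_a for j in group_b]
        if linkage == "single":
            return min(values)
        if linkage == "complete":
            return max(values)
        if linkage == "average":  # UPGMA
            return sum(values) / len(values)
        raise ValueError("unknown linkage: " + linkage)

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        _, p, q = min(
            (between(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy example: four items on a line, forming two well-separated pairs.
positions = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
print(agglomerative_clusters(toy_dist, n_clusters=2))  # [0, 0, 1, 1]
```

In a tree space analysis the input would be the distances between trees in the retained MDS axes, so that each label designates a candidate tree island.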
This approach allows the user to seek representative trees for each cluster separately (Figure 1d). A method for selecting such representative trees is given in Kendall and Colijn (2015) and implemented in treespace as the function “medTree.” This function identifies the geometric median tree(s), which are the tree(s) closest to the mean of the Kendall–Colijn tree vectors for a given cluster. Such trees serve as alternatives to other summary tree approaches such as the consensus tree (Felsenstein, 1985) or the maximum clade credibility (MCC) tree (Drummond & Rambaut, 2007; Ronquist & Huelsenbeck, 2003), with the key advantage that they correspond to specific trees in the sample, thus avoiding implausible negative branch lengths (Heled & Bouckaert, 2013). However, given a collection of trees in a cluster, any summary approach such as MCC could be used.
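The selection rule described above (pick the tree or trees whose vector lies closest to the mean vector of the cluster) can be sketched in a few lines. This is an illustration only, not the code of the “medTree” function: the name `median_tree_indices` and the tolerance parameter are assumptions, and the input is taken to be one numeric vector per tree, all of equal length.

```python
from math import sqrt


def median_tree_indices(vectors, tol=1e-9):
    """Return the indices of the vectors closest to the mean vector.

    Several indices are returned when trees are tied (within tol).
    """
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(vec[k] for vec in vectors) / n for k in range(dim)]
    distances = [
        sqrt(sum((vec[k] - mean[k]) ** 2 for k in range(dim)))
        for vec in vectors
    ]
    best = min(distances)
    return [i for i, d in enumerate(distances) if d - best <= tol]


# Three toy "tree vectors": the third lies closest to their mean (1, 1/3).
print(median_tree_indices([[0, 0], [2, 0], [1, 1]]))  # [2]
```

Because the result is an index into the sample, the summary is always one of the observed trees rather than a constructed one.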
All the functionalities described above are implemented in treespace as standard R functions, fully documented in a vignette tutorial, as well as in a user‐friendly web interface for interactive data analysis. This interface can be started locally (i.e. without Internet connection) from R using a simple instruction (treespaceServer()) and, therefore, demands virtually no knowledge of the R language. Alternatively, we also provide an online instance of the application at http://shiny.imperial-stats-experimental.co.uk/users/mlkendal/treespace
3 WORKED EXAMPLE
As an illustration, we used treespace to analyse 17 publicly available sequences of dengue virus (Drummond & Rambaut, 2007; Lanciotti, Gubler, & Trent, 1997). This analysis is reproduced in a vignette distributed with the package, which can be loaded using the instruction vignette("DengueVignette"). Three types of phylogenetic trees were obtained: (a) a neighbour‐joining (NJ) tree (Figure 2a) created using the R package ape (Paradis et al., 2004); (b) a maximum‐likelihood (ML) tree (Figure 2b) obtained using phangorn (Schliep, 2011); and (c) Bayesian trees using beast v1.8 with the codon‐position‐specific substitution model and relaxed clock priors, as specified in xml file S2 in Drummond and Rambaut (2007). One hundred bootstrap trees were obtained for each of the NJ and ML phylogenies (Holmes, 2003a). For beast, 200 trees were randomly sampled from the posterior distribution after visually assessing the convergence of the MCMC chain with 10,000,000 iterations. Results were qualitatively unchanged using larger samples. The NJ and ML trees were rooted using the “D4Thai63” sequence, seen as the most basal in the beast MCC tree.

Trees inferred using the three methods were different (Figure 2) in the position of the “Philippines clade” (dashed box in Figure 2) and in whether the Tahiti84 tip was sister to PRico86. Bootstrap support values for the NJ tree show considerable phylogenetic incongruence, both near the tips and deep in the tree (Figure 2a). In contrast, the ML tree has high bootstrap support for most nodes (Figure 2b). Interestingly, the ML and NJ trees themselves were quite different (Figure 2a,b), notably with the “Philippines” clade clustered with isolates from Thailand and Sri Lanka (“D4Thai” and “D4SLanka” isolates) in the ML tree and not in the NJ phylogeny. Examination of bootstrap values alone does not indicate whether the NJ and ML bootstrap trees exhibit any common topologies. beast trees visualized using densitree (Bouckaert, 2010) and the beast MCC tree (Figure 2c,d) seemed more similar to the ML phylogeny in the position of the “Philippines” clade, but also showed uncertainty in tree topologies in multiple places. While densitree plots provide intuition about the extent of incongruence amongst these trees, Figure 2c does not reveal whether the topologies of beast phylogenies coincide with any of the other trees.
We used treespace to investigate potential discrepancies in more detail. A three‐dimensional MDS based on the Kendall–Colijn metric (Kendall & Colijn, 2015) revealed differences between the different methods (Figure 3a; see vignette for an interactive version). This analysis revealed that topologies of NJ and ML bootstrap trees were broadly similar, overlapping in three distinct and similar‐sized clusters. However, the NJ trees exhibited slightly more variation, including a few outlying topologies (top right, Figure 3a), which is consistent with the overall lower bootstrap support values than in the ML tree (Figure 2).

beast trees formed a group of their own, with no overlap between their topologies and those of the NJ or ML trees (Figure 3a). A separate analysis of the beast trees revealed four distinct clusters of topologies (function “findGroves,” Figure 3b). Closer examination of the phylogenies revealed that topologies of these sets of trees were indeed all different; no single topology was shared between beast trees and NJ/ML trees. The median trees (function “medTree”) obtained for each cluster (Figure 3c–f) revealed that Bayesian trees largely supported the positioning of the “Philippines” clade of the ML tree (Figure 3d,f), with alternative placements mostly due to a few outlying topologies more akin to the NJ tree (Figure 3c,e). These results also suggested that the position of the root may be disputed, as every phylogenetic island exhibited a different rooting.
4 DISCUSSION
treespace provides a simple framework for exploring landscapes of phylogenetic trees and investigating phylogenetic incongruence using tree–tree distances. Of the various methods for measuring distances between trees, some may be better than others at capturing meaningful topological differences, as is the case when testing phylogenetic signal (Jombart, Pavoine, Devillard, & Pontier, 2010; Münkemüller et al., 2012; Pavoine et al., 2008). There are currently no theoretical descriptions that can determine a priori which tree comparison method will be most revealing for which kind of data. Recognizing this, we have incorporated considerable flexibility into treespace in terms of how trees are compared, by providing a framework which can incorporate any tree‐to‐tree distance, and implementing seven different ones by default. This feature distinguishes treespace from other similar software, like the R package RWTY which re‐implements mesquite's treesetviz module (Robinson–Foulds metric) as part of an excellent toolkit for assessing mixing in Bayesian phylogenetics (Warren, Geneva, & Lanfear, 2017), or treescaper, which puts stronger emphasis on reduced space optimization methods and community detection algorithms (Huang et al., 2016; Wilgenbusch et al., 2017).
treespace combines a fast dimension reduction technique (MDS) with various hierarchical clustering approaches (Legendre & Legendre, 2012) to reveal phylogenetic tree islands. While this approach is computationally efficient, it may sometimes struggle to delineate tree islands in the presence of distortions of the tree space observed in some specific metrics (Hillis et al., 2005). For instance, recent work suggests that the Robinson–Foulds metric is best combined with nonlinear dimension reduction techniques for identifying clusters of similar trees (Wilgenbusch et al., 2017). Further efforts should be devoted to investigating alternative dimension reduction approaches such as t‐SNE implemented with a Barnes–Hut approximation (van der Maaten & Hinton, 2008), and nonlinear classifiers such as support vector machines (Schölkopf & Smola, 2002) or community detection methods (Blondel, Guillaume, Lambiotte, & Lefebvre, 2008; Huang et al., 2016).
Our approach is very different from the “principal component analysis” (PCA) for trees introduced by Aydin, Pataki, Wang, Bullitt, and Marron (2009) and extended to phylogenetic trees by Nye (2011). These methods proceed by analogy to classical PCA (Hotelling, 1933; Pearson, 1901), but do not actually map trees into vector spaces, and are therefore unable to use classical dimension reduction techniques and the corresponding visualizations (Legendre & Legendre, 2012). They produce optimal “tree lines” (Aydin et al., 2009), which are collections of nested trees meant to be representative of the entire tree set. While this concept is undoubtedly interesting, it does not provide a direct geometric representation for the trees, so that it cannot be used to assess relationships between the different phylogenies or identify phylogenetic islands (Maddison, 1991). In fact, while conceptually different, the identification of clusters of trees implemented in treespace is related to the idea of boundaries between tree topologies (Holmes, 2003b), and to the notion of “terraces” in the phylogenetic tree space (Sanderson, McMahon, & Steel, 2011). Both “boundaries” and “terraces” define regions of the tree space inside which trees are closely related through their topology (Holmes, 2003b; Sanderson et al., 2011) and their log‐likelihood under a specific evolutionary model (Sanderson et al., 2011). While we do not currently include the latter, it would be interesting to incorporate information on tree log‐likelihood as weights in the analysis.
Lastly, one of the key advantages of developing treespace within the R software (R Core Team, 2016) is the resulting interoperability with other tools. Indeed, R is becoming a standard for phylogenetic analyses (Jombart et al., 2010, 2017; Kembel et al., 2010; Paradis et al., 2004; Revell, 2012; Schliep, 2011; Warren et al., 2017) and therefore represents an ideal environment for treespace to become a useful tool for the exploration of phylogenetic results. Its development within an open‐source, community‐based platform, together with its availability as a user‐friendly web interface, will hopefully facilitate its adoption by a wide range of scientists and encourage further methodological developments.
ACKNOWLEDGEMENTS
TJ is funded by the Medical Research Council Centre for Outbreak Analysis and Modelling and the National Institute for Health Research—Health Protection Research Unit for Modelling Methodology. MK and CC are supported by the Engineering and Physical Sciences Research Council (EPSRC) EP/K026003/1. We are thankful to github (http://github.com/) and travis (http://travis-ci.org/) for providing great resources for software development. We are thankful to an anonymous editor for very useful comments on an earlier version of this work.
AUTHOR CONTRIBUTION
TJ, MK and JAG developed the package treespace. MK collated and analysed the data. TJ, MK, JAG and CC contributed to writing the manuscript.
SOFTWARE AVAILABILITY
The stable version of treespace is released on the Comprehensive R Archive Network (CRAN): http://cran.r-project.org/web/packages/treespace/index.html and can be installed in R by typing: install.packages("treespace"). The development version of treespace is hosted on github: https://github.com/thibautjombart/treespace and can be installed in R using the devtools package by typing: devtools::install_github("thibautjombart/treespace"). treespace is distributed under the GNU General Public Licence (GPL) version 2 or greater. It is fully documented in a vignette accessible by typing: vignette("treespace"). treespace is also documented on a dedicated website: https://thibautjombart.github.io/treespace/.