Genomics and the animal tree of life: conflicts and future prospects

Providing consistent resolution for the animal tree of life is a major goal of animal systematists and a desire of every zoologist. Towards this goal, many major nodes have been successfully resolved. However, some major controversies and poorly resolved deep nodes still remain. Here, I discuss some of these controversies (e.g. whether Ctenophora or Porifera is sister group to all other animals), clarify others (e.g. the position of Xenacoelomorpha) and identify major clades that still require resolution. Most importantly, I discuss the possible conflict at some of these nodes and its relation to the nature of phylogenomic data by exploring the meaning of total support in phylogenomic analyses, highlighting cases in which a data set can provide total support for contradictory nodes. Finally, our efforts should focus on generating genomic data for key candidate taxa, such as the large disparity of undescribed placozoans, which may in the end add to the current quantity of data the quality needed to resolve the base of the animal tree.


Introduction
Several controversies in the animal tree of life have taken the headlines in animal phylogenetics since the widespread usage of molecular data (see a current working hypothesis in Fig. 1). Perhaps the most prominent cases were the ones on the relative positions of arthropods and annelids (Articulata vs. Ecdysozoa) (e.g. Aguinaldo et al. 1997; Giribet & Ribera 1998; Schmidt-Rhaesa et al. 1998; Wägele et al. 1999; Valentine & Collins 2000; Garey 2001; Wägele & Misof 2001; Zrzavý 2001; Giribet 2003), the position of Lophophorata (in Deuterostomia vs. Protostomia) (e.g. Halanych et al. 1995; Conway Morris et al. 1996; Halanych 1996; Nesnidal et al. 2013; Laumer et al. 2015) or the recognition of Tetraconata (= Pancrustacea) as a clade of arthropods that relates hexapods to crustaceans instead of to myriapods (the Atelocerata or Tracheata hypothesis) (e.g. Boore et al. 1995; Friedrich & Tautz 1995; Giribet et al. 1996, 2001; Shultz & Regier 2000; Dohle 2001; Hwang et al. 2001; Richter 2002; Mallatt et al. 2004; Regier et al. 2005; Bäcker et al. 2008; Ungerer & Scholtz 2008; Strausfeld & Andrew 2011; Wägele & Kück 2014). These three questions generated heated debate, mostly in the late 1990s and early 2000s (and in some cases the debate was more personal than scientific; e.g. Wägele & Misof 2001), but were nevertheless quickly settled, as the resulting hypotheses received total support from all sorts of molecular data sets: arthropods appear more closely related to nematodes than to annelids; lophophorates are closer to annelids and molluscs (in a clade named Lophotrochozoa) than to the deuterostomes; and hexapods (and their subclade Insecta) are nothing other than terrestrial crustaceans. Virtually all analyses of all classes of molecular data support these relationships, as do most carefully analysed morphological data sets.
Arguments to maintain the status quo of Articulata, the Deuterostomia nature of Lophophorata, or Atelocerata were often based on morphological characters that have since been shown to be convergent, have been reinterpreted or have been superseded by other morphological or developmental evidence. There have obviously been other changes in animal phylogeny in recent decades, often related to groups of poor phylogenetic understanding. The placement of Platyhelminthes as derived spiralians (Carranza et al. 1997) excluding acoelomorphs, which were in turn considered to be the sister group to all other bilaterians (Ruiz-Trillo et al. 1999, 2002; Jondelius et al. 2002), is another example, and the removal of the acoelomorphs also stirred up the animal phylogenetics community. But few now question molecular phylogenetic results, as long as they are based on well-sampled studies and highly supported nodes that are also stable across methodologies and homology schemes.
Correction added on Feb 24, 2017, after first online publication: figure 1 Zoologica Scripta
A 'new kid on the block', phylogenomics, is adding another type of controversy never seen before in molecular phylogenetics: highly supported contradictory results. For example, Rokas et al. (2003), in one of the earliest papers exploring genomic data, showed that subsets of genes can give highly supported contradictory results in a yeast phylogeny. Later, Salichos & Rokas (2013) argued against strict concatenation and suggested preselecting genes with the appropriate phylogenetic signal. This has however been tested empirically in several groups of animals by selecting subsets of genes (e.g. Fernández et al. 2014; Andrade et al. 2015), and at least in these cases, results are stable and highly supported irrespective of the data analysed. Nevertheless, another recent study by Nosenko et al. (2013) reported radically contradictory hypotheses for the base of the animal tree of life when analysing sets of different genes for the same taxa using the same methodologies. Since then, other studies of large phylogenomic data sets have explored the effect of a diversity of factors, such as evolutionary rates and compositional biases on phylogenomic reconstruction (e.g. examples related to the early evolution of Metazoa and Bilateria to try to understand why and how analyses may conflict with respect to the position of difficult-to-place taxa.

Xenacoelomorpha: a new controversy?
Xenacoelomorpha is a higher taxon that includes the members of the taxa Acoela, Nemertodermatida (these two are often united in the taxon Acoelomorpha) and Xenoturbellida. Whether one treats Acoela, Nemertodermatida and Xenoturbellida as phyla or considers any of the more inclusive clades, such as Acoelomorpha or even Xenacoelomorpha, as phyla is irrelevant (see for example Giribet et al. 2016). The accepted relationships of the least inclusive taxa are (Xenoturbellida, (Acoela, Nemertodermatida)). As discussed above, Acoela and Nemertodermatida (= Acoelomorpha) were traditionally included as basal members of Platyhelminthes until the molecular analyses of Ruiz-Trillo et al. (1999) and Jondelius et al. (2002) formally removed Acoela and Nemertodermatida from Platyhelminthes, respectively. On another front, Xenoturbella bocki was being shuffled all over the animal tree (Reisinger 1960; Franzén & Afzelius 1987; Pedersen & Pedersen 1988; Israelsson 1997, 1999; Norén & Jondelius 1997) until a series of molecular analyses suggested that it was a deuterostome (Bourlat et al. 2003, 2006, 2009; Perseke et al. 2007), a result that, although difficult to fathom morphologically, had been implicitly suggested based on the structural similarity of the epidermis of Xenoturbella with that of some hemichordates (e.g. Pedersen & Pedersen 1988). Further molecular analyses including large sets of genes and representative Acoelomorpha species grouped these animals (formerly placed as sister group to Nephrozoa, the bilaterians typically with an excretory system) with Xenoturbella, formalizing the taxon Xenacoelomorpha (Telford 2008). However, other analyses using improved data sets failed to place Xenacoelomorpha within Deuterostomia, maintaining the whole clade as the sister group to Nephrozoa, a position much more easily reconcilable with their simple morphology (Hejnol et al. 2009).
This traditional position was in theory contradicted again by a paper analysing several sources of data (mitochondrial genomes, microRNAs and a phylogenomic data set; Philippe et al. 2011), although none of the data sets per se supported the authors' conclusions (Maxmen 2011). Subsequent phylogenomic analyses of Xenoturbellida and Acoelomorpha always placed them as sister group to Nephrozoa, at the base of Bilateria (Ryan et al. 2013; Cannon et al. 2016; Rouse et al. 2016).
One can ask whether there is really a controversy and whether the controversy emerges from some of the intricacies of analysing molecular data. For example, it has long been demonstrated that the initial placement of Xenoturbella within molluscs (Norén & Jondelius 1997) was due to food contamination (Bourlat et al. 2003), but this early spurious result led to a series of studies 'supporting' a molluscan affinity of Xenoturbellida using morphology (Israelsson 1997, 1999; Israelsson & Budd 2005), which prevented the authors from seeing that they were looking at embryos of their prey; the embryos of Xenoturbella are quite different (Nakano et al. 2013). The papers placing Xenoturbellida as sister group to deuterostomes using mitochondrial genome data had non-significant support and did not include acoelomorphs, their potential sister group (Perseke et al. 2007; Bourlat et al. 2009). In Philippe et al. (2011), the miRNA analysis favoured Xenacoelomorpha as sister group to Bilateria over their proposed hypothesis by six parsimony steps (174 vs. 180; ca. 3% longer); the mitochondrial data supported a sister group relationship of Xenacoelomorpha to deuterostomes (Bayesian pp = 0.99), but not a position within deuterostomes; and only the EST analysis found Xenacoelomorpha nested within deuterostomes. Even so, the two nodes placing them inside had bootstrap support of 63% and 78%, values that are negligible in these phylogenomic data sets. Indeed, no data set really supported the position the authors defended, as they even recognized in their paper: 'Difficult phylogenetic questions such as that addressed here must ultimately be solved by the congruent patterns emerging from what, inevitably, are not highly supported results' (Philippe et al. 2011: p. 258).
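The step counts quoted above can be checked directly. A minimal worked calculation, assuming that 174 and 180 are the parsimony tree lengths under the favoured and the proposed hypotheses, respectively:

```latex
\[
\frac{180 - 174}{174} = \frac{6}{174} \approx 0.034
\]
```

The proposed topology is thus about 3.4% longer, consistent with the 'ca. 3%' figure given in the text.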
Considering that virtually all other analyses and data sets support Xenacoelomorpha as sister group to Nephrozoa, once more, this supposed incongruence of phylogenomic data sets is probably no more than a targeted quest for an artefactual result. Xenoturbella is neither a mollusc nor a deuterostome, but, as its morphology indicates, a simple basal bilaterian preceding the emergence of a highly condensed ganglionar nervous system and an excretory system.
What is the sister group of all other animals, Porifera or Ctenophora?
of view and was supported by cladistic analyses of morphological data (e.g. Nielsen et al. 1996). It was however once more altered during the early addition of molecular data, which soon showed that placozoans were much more derived than originally thought (and thus secondarily simplified) and that ctenophores were more basal than cnidarians (e.g. Wainright et al. 1993; Medina et al. 2001; Wallberg et al. 2004). A morphological 'progression rule' was therefore not required any longer, and further cases of simplification of body plans were suggested by molecular data analyses (e.g. Siddall et al. 1995; Monteiro et al. 2002; Mikhailov et al. 2016).
As soon as the first phylogenomic analyses of Metazoa appeared, a further change of the paradigm was proposed; two studies suggested that Ctenophora diverged not only prior to the origin of Cnidaria (as already suggested by the previous standard molecular analyses) but also prior to the separation of Porifera from the rest of extant Metazoa (Dunn et al. 2008; Hejnol et al. 2009). These results were soon met with scepticism, but also triggered unparalleled research in ctenophores. Another study, however, suggested that Porifera were sister group to all other animals, followed by Placozoa, and that Ctenophora formed a clade with Cnidaria, the old clade Coelenterata (Philippe et al. 2009), a much more traditional view of early metazoan evolution. A follow-up article by Pick et al. (2010) re-analysed the data set of Dunn et al. (2008) and found Porifera as the first offshoot of Metazoa, but as in earlier molecular analyses, with Ctenophora diverging earlier than Cnidaria and with Placozoa deriving much later, contradicting their previous paper (Philippe et al. 2009). The position of Porifera and Ctenophora was apparently interchanged only after removing many relevant outgroups (leaving only Choanoflagellata) and after applying a specific CAT + Γ4 model of sequence evolution, apparently to ameliorate problems of long-branch attraction. (Interestingly, Philippe et al. (2011), in their Xenoturbellida paper, excluded Ctenophora from their data set.) Surprisingly, virtually all other analyses of metazoan phylogenomics result in ctenophores being more basal than sponges (Ryan et al. 2013; Moroz et al. 2014; Chang et al. 2015; Whelan et al. 2015).
The problem of the base of Metazoa does not seem to be just a question of model or outgroup taxon sampling, as illustrated in a relatively recent article by Nosenko et al. (2013). These authors found that the same taxa analysed under exactly the same model of sequence evolution (the one suggested by Pick et al. 2010) produced both of the strongly supported conflicting topologies, depending on the set of genes selected for the analyses (from two sets of non-overlapping genes generated from the original set of 122). Accusations of deficient taxon sampling, gene heterogeneity and evolutionary rate issues, outgroup selection and model misspecification have filled journal pages, but these factors cannot be the only ones dictating such conflict if two subsets of the same data set disagree (they contain the same taxa, including the outgroups, and were analysed under the same evolutionary model). More recently, two high-impact papers published in the same journal once more claimed opposite results, Ctenophora-basal (Whelan et al. 2015) or Porifera-basal (Pisani et al. 2015). Unlike the case of Xenoturbellida, the conflict here seems real, and the long branch separating the origin of Ctenophora from its current diversity may be responsible for the difficulties in resolving this early node. The fossil record has not helped resolve this issue either, even though both Ctenophora and Porifera are unambiguously known since the Cambrian. Early records are now suggested to be sponges (Yin et al. 2015), although Lower Cambrian vendobionts similar to some Ediacarans have also been suggested to be of ctenophore relation (Shu et al. 2006).
Biological studies of the anatomy and development of sponges and ctenophores have flourished as a response to this debate, providing new insights into the biology of these interesting animals. One of these studies has postulated a primitive sensory organ in sponges (Ludeman et al. 2014). Another has revisited the question of homology between choanocytes and choanoflagellates (Mah et al. 2014). As nicely put by , 'It is now clear that the phylogenetic placement of Porifera and Ctenophora are not independent questions, and must be addressed together'. We are beginning to assemble the data sets to attempt to resolve this conundrum, and the answer is not a simple one that can be explained solely by a model of sequence evolution or the inclusion/exclusion of certain outgroups. Development of phylogenetic methods and careful analysis of data (including a thoughtful choice of key taxa, not only those available in databases) will be required to continue shedding light on the early evolution of animals. What is now clear is that Porifera and Ctenophora are the sole remaining candidates for the sister group to all other animals, but it is also true that a greater genomic diversity of placozoans may help to settle this debate, as we now know that the group includes enormous genetic diversity (Pearse & Voigt 2007; Eitel & Schierwater 2010; Eitel et al. 2013) while all phylogenomic analyses of animals are restricted to using the genome of Trichoplax adhaerens.
Does total support mean total support in phylogenomic analyses?
As in the case of ctenophores and sponges, the same data set can provide total support for alternative hypotheses simply by including/excluding some relatively distant outgroups (Pisani et al. 2015), and thus at least one of the results must be spurious, an artefact of a series of interacting but poorly identified effects. This has been investigated in detail in at least three recent phylogenomic analyses (Sharma et al. 2014; Andrade et al. 2015; Lemer et al. 2016), which plotted nodal support as genes with faster evolutionary rates were added to the data set. In the arachnid data set, most clades, such as Chelicerata or Tetrapulmonata, rapidly accumulated total support, as expected and as found in most phylogenomic data sets (Fig. 2). However, other well-defined morphological clades, such as Arachnida, showed a radically different behaviour, reaching total support by 500 genes, but decreasing after 600 to reach 0% support after 1,000 genes (Sharma et al. 2014). This behaviour may indicate that only the 700 slowest evolving genes in that data set can resolve Arachnida. Unfortunately, not all nodes of the same tree peak at 600 genes, and perhaps shallower nodes benefit from adding genes with faster evolutionary rates. For example, a clade containing Scorpiones and Pseudoscorpiones has maximum support with 200 genes, Pseudoscorpiones cluster with Acari with nearly total support at 600 genes, and by 900 genes, they reach total support forming a clade with Acariformes (a subclade of Acari). The point is that, depending on which data set one analyses, alternative and contradicting hypotheses may be fully supported, and we still need to better understand how many genes, or which ones, are required to resolve a particular node in the tree of life. Furthermore, two nodes in the same tree may be optimally reconstructed using different information, something beyond current practices in phylogenetics.
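The gene-accumulation design described above (adding genes in order of increasing evolutionary rate and tracking support for a node) can be sketched as follows. This is only an illustrative sketch, not the procedure of the cited studies: the function names, the gene names, the rate values and the support trajectory are all hypothetical, and the tree inference step that would produce the support values is deliberately left out.

```python
from typing import Dict, List


def nested_gene_sets(rates: Dict[str, float], step: int) -> List[List[str]]:
    """Return cumulative gene lists, slowest-evolving genes first.

    Each list extends the previous one by up to `step` faster genes;
    the last list always contains every gene.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if not rates:
        return []
    ordered = sorted(rates, key=rates.get)  # ascending evolutionary rate
    sizes = list(range(step, len(ordered), step)) + [len(ordered)]
    return [ordered[:n] for n in sizes]


def peak_support(support_by_size: Dict[int, float]) -> int:
    """Smallest number of genes at which a node reaches its maximum support."""
    best = max(support_by_size.values())
    return min(n for n, s in support_by_size.items() if s == best)


# Hypothetical per-gene rates (arbitrary units), for illustration only.
toy_rates = {"geneA": 0.9, "geneB": 0.1, "geneC": 0.5, "geneD": 0.3, "geneE": 0.7}
subsets = nested_gene_sets(toy_rates, step=2)

# Invented support values (%) for one node, keyed by number of genes analysed.
# In a real analysis each value would come from a separate tree inference.
toy_support = {100: 60.0, 500: 100.0, 600: 100.0, 1000: 0.0}
best_size = peak_support(toy_support)
```

Each nested subset would be concatenated and analysed separately, and the resulting support values plotted against the number of genes. Because different nodes can peak at different subset sizes, a single matrix size need not be optimal for the whole tree.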

New directions in animal phylogenetics
Irrespective of some of these pitfalls, in part led by the rapid acquisition of genomic data (Dunn & Ryan 2015), great progress has been made towards understanding animal evolution (see a recent review in Dunn et al. 2014). As the general structure of the tree continues to be refined, including for some of the largest and most difficult clades such as Spiralia (Laumer et al. 2015), other clades lack valuable genomic data, especially for a comparable diversity of the smaller phyla in Ecdysozoa. Resolution within this latter clade requires novel data, and no attempt to include genomes and Illumina data for all animal phyla yet exists. Recent efforts have focused on the base of the animal tree (see references above), or on specific phyla, proliferating especially for arthropods, molluscs and annelids. These have served as perfect study cases to test orthology assignment, analytical methods and data set properties, principles that can now be extrapolated to the larger and more complex metazoan-wide data sets. Now that data quantity is no longer a limiting factor, data quality is playing an increasing role in phylogenomic studies, with data sets being designed to resolve specific nodes (e.g. Dell'Ampio et al. 2014; Fernández et al. 2014, 2016; Sharma et al. 2014; Andrade et al. 2015). Similarly designed data sets should help us to better understand the most difficult nodes of the animal tree, specifically at its very root. Perhaps new discoveries or the yet undescribed disparity of placozoans may produce quality genomic data and emerge as the taxa able to arbitrate this paradigm.
and Letters and to the Royal Swedish Academy for organizing such a stimulating symposium, from which this review article derives.
Many of the ideas presented here come from years of discussions and collaboration with several colleagues, especially those working on phylogenomics: Sónia Andrade, Casey Dunn, Greg Edgecombe, Rosa Fernández, Vanessa González, Chris Laumer, Sarah Lemer, Prashant Sharma and Katrine Worsaae. Greg Edgecombe read a version of this manuscript and provided, as always, helpful insights. This work was supported by a 2016 John Simon Guggenheim Memorial Foundation Fellowship.