Systematics of tardigrada: A reanalysis of tardigrade taxonomy with specific reference to Guil et al. (2019)

The Tardigrada are a clade with a disputed and complex taxonomy. The three traditional tardigrade classes are the Heterotardigrada, Eutardigrada and the dubious, monotypic Mesotardigrada. A recent molecular study by Guil et al (Zoologica Scripta, 48, 2019, 120) suggested that the Apochela, previously considered an order of Eutardigrada, should become a new class, Apotardigrada, with the Parachela becoming THE new Eutardigrada. This new diagnosis was presented alongside compelling morphological evidence. Here, we test the assumptions of the molecular evidence for Apotardigrada by analysing tardigrade 18S and 28S data under four separate conditions, along with new BUSCO data, across both Bayesian‐ and maximum‐likelihood phylogenetic approaches. We found variable conclusions regarding the status of Apotardigrada could be drawn by comparing the ribosomal RNA and BUSCO results, and caution against using branch length data to determine taxonomic hierarchy.


| INTRODUCTION
The Tardigrada are a charismatic phylum of ecdysozoans that can be found in marine, freshwater and terrestrial environments (Nelson et al., 2009). Potential fossil stem group Tardigrada have been identified in the Cambrian (Maas & Waloszek, 2001), and modern genera of the phylum have been identified as far back as the Cretaceous (Bertolani & Grimaldi, 2000). In recent years, their miniature size and their extreme resilience have made them the focus of a concerted and expansive research effort (Barnes, 1982;Møbjerg et al., 2011;Yamaguchi et al., 2012;Hashimoto et al., 2016).
However, despite recent interest in the Tardigrada, the diversity of tardigrades is not reflected in the genomic data available for the clade. This paucity of information has led to unclear taxonomic groupings (Bertolani et al., 2014;Jørgensen et al., 2018;Guil et al., 2019). At the highest level of tardigrade taxonomy, the existence of the class Mesotardigrada is disputed and dubious, as it exists only from a single specimen, now lost Suzuki et al., 2017).
Traditionally, tardigrade taxonomy has divided the clade into three classes. The Heterotardigrada are united by the presence of a lateral cirrus, whilst the Eutardigrada are distinguished by the morphology of their buccopharyngeal appendages (Rahm, 1937;Schuster et al., 1980;Bertolani et al., 2014;Gąsiorek et al., 2017). The third class, Mesotardigrada, possesses heterotardigrade-type claws and a | 377 FLEMING aNd aRaKaWa eutardigrade-type buccopharyngeal apparatus, and so is often considered to be intermediate between the two extant classes (Rahm, 1937).
Of the two major classes, each is divided into two orders: the Arthrotardigrada and Echiniscoidea for the Heterotardigrada, and the Parachela and Apochela for the Eutardigrada (Schuster et al., 1980;Guidetti et al., 2009;Guil & Giribet, 2012;Guil et al., 2013;Jørgensen et al., 2018). The Heterotardigrada are internally more morphologically diverse and classified according to their cuticular extensions, dorsal plate patternings and cephalic appendages (Nichols et al., 2006). In the case of the Eutardigrada, claw morphology becomes a significant defining feature at the order level (Schuster et al., 1980, Pilato andBinda, 2010). Members of the Parachela possess claws with fused primary and secondary branches, whilst the Apochela possess claws with separate primary and secondary branches (Thulin, 1928;Ramazzotti, 1983;Pilato and Binda, 2010;Dabert et al., 2014).
Recently, within the Heterotardigrada, paraphyletic Arthrotardigrada has been recovered from both morphological and molecular data (Nichols et al., 2006;Jørgensen et al., 2018). This change to classical nomenclature through the use of new phylogenetic and systematic techniques has prompted a re-evaluation of the taxonomic nomenclature of the entire Tardigrada.
Recently, Guil et al. (2019) proposed a new systematic taxonomy of Tardigrada. This study elevated Apochela, forming a new class of tardigrade known as the Apotardigrada, making the Parachela the new Eutardigrada. The justification for this systematic change was to unite tardigrade morphological systematics through claw type definition and was supported by a branch length phylogenetic analysis of new 18S and 28S genes, providing the most comprehensive taxon phylogeny of Tardigrada to date.
However, the new nomenclature has proven controversial. The morphological reasoning provided for Apotardigrada has been questioned in light of the description of a new species of Milnesium (Morek and Michalczyk, 2020). In this study, the authors propose revision of the genus description in addition to the suppression of Apotardigrada (and reinstatement of Apochela and Parachela), arguing that the distinguishing morphological features of the clade are already accounted for by the description of the order Apochela (Morek and Michalczyk, 2020).
In addition, whether branch length data can be used in this way to ascertain the taxonomic level of clades is also unclear (Corneli & Ward, 2000;Philippe et al., 2011;Yang & Rannala, 2014). Branch lengths are a useful phylogenetic indicator of relative mutation distance between taxa in a data set-however, they are prone to variation based on data set sampling and phylogenetic modelling (Leaché andFujita, 2010, Olave et al., 2014). This raises concerns as to whether comparing branch lengths on a tree without a likelihood estimate is a fit proxy to assess the identity of monophyletic clades (Yang & Rannala, 2014). In addition, branch lengths in phylogenetics are implicated in many of its most disastrous artefacts, such as long branch attraction-determining taxonomic level by comparing branch lengths might well compound the problems already present within phylogenetic modelling (Philippe et al., 2011;Olave et al., 2014).
Here, we present a reanalysis of 18S and 28S genes from the Tardigrada, alongside a new analysis of highly conserved tardigrade BUSCO sequences from a smaller sample of high coverage tardigrade genomes. Across multiple reasonable models and new alignments, which recover topologies more consistent with present understanding of tardigrade family level taxonomy, we show that relative branch lengths are variable across analyses and sampling procedures. In conclusion, we support the use of the consistent claw-morphology-as-class schema, but question the reliability and appropriateness of phylogenetic branch length data as a tool to determine taxonomic hierarchy and caution against the use of this method in the future.

| Tardigrade sequencing
Tardigrade genomes were sequenced and assembled following the protocols outlined in Arakawa et al. (2016). The taxa analysed can be found in Supplementary Table S1, and the relevant sequences used in this analysis are included in the Appendix S1 alongside their NCBI accession numbers.

| BUSCO analysis
BUSCO was used to identify highly conserved genes present within our tardigrade genomes (Simão et al., 2015). We undertook BUSCO analysis of 32 tardigrade species, using Drosophila as the model comparison organism. Sequences that were present in at least 31 of the 32 tardigrade samples were then used in subsequent analyses.
BUSCO analysis of 10 out-group species was undertaken, under the same conditions as the tardigrade BUSCO analyses. The sequences recovered from our Tardigrada were then found in the out-group species and the previously published transcriptome of Echiniscoides sigismundi from Kamilari et al. (2019) to form the BUSCO phylogeny data set.

| 18S/28S sampling
Our analysis data sets utilised the Guil et al. (2019) data set as a basis, dividing the sequences provided in this paper into four distinct alignments: 18S sequences, 28S sequences, a concatenated alignment of all 18S and 28S sequences, and a concatenated alignment only of taxa that possessed both 18S and 28S sequences, in addition to the aforementioned BUSCO data set. Compared to the Guil et al. (2019) data sets, our 18S/28S out-group selection was then increased with more sequences from across the Ecdysozoa and Lophotrochozoa. We then removed sequences from the Guil et al. (2019) data set that possessed more than 30 'N' (undetermined) characters in a single continuous phrase, deeming them to be low quality.
Following this removal, we additionally removed the sequence 'GQ849040_1_Tanarctus_dendriticus_28S_ rRNA': this short 28S sequence resolved a paraphyletic Arthrotardigrada (consisting of itself as a monospecific clade sister to all other Heterotardigrada) under maximum likelihood and Bayesian under the 28S data set, but its inclusion within the 18S + 28S combined data set resulted in non-convergence after 10,000 generations and the resolution of a basal polytomy at the order level of Tardigrada that were not recovered when it was removed from the data set (Data S1, Figures S11-S14). Whilst we accept the real possibility of paraphyletic Arthrotardigrada (Jørgensen et al., 2018), and make no commentary on the plausibility of monophyletic Arthrotardigrada, the removal of this sequence allowed for clearer comparative assessment of the branch length to the earliest diverging heterotardigrade across methodologies.
In the absence of the Guil et al. (2019) alignment, all data sets were aligned in MUSCLE (Edgar, 2004). 18S/28S data sets were then trimmed using trimAL (Capella-Gutiérrez et al., 2009) under a gap threshold of 0.1, removing all sites in the alignment not possessed by more than 10% of the sample taxa.
For IQTree analyses, bootstrap support values over 1,000 bootstraps were generated. Nodes with a bootstrap support value of less than 50% were collapsed to polytomies. All nodes signifying order-level divergences or higher were recovered with greater than 90% support.
For Phylobayes analyses, convergence was assessed by comparing the maximum discrepancies observed over the bipartitions and effective sample size in bpcomp and tracecomp. For all analyses, two independent chains were run for more than 10,000 generations each. A burn-in of 50% of the sample size was used for all analyses, sampling every fiftieth tree following the burn-in period. Posterior probability node support values were calculated for each node, and nodes with less than 50% support were collapsed to polytomies. All nodes signifying order-level divergences or higher were recovered with greater than 90% support.
All phylogenies, including bootstrap and posterior probability node support levels, and their respective alignments can be found in this paper's Appendix S1 at: https://bitbu cket. org/James FVFle ming/tardi grade syste matic s/downl oads/

| Topology analysis
For each condition, a 'branch length ratio' was taken. This was a simple ratio of the lengths of the two branches deemed significant in the Guil et al. (2019) analysis-the branch leading from the common tardigrade ancestor to the Apochela/ Parachela divergence, and the branch leading from the common tardigrade ancestor to the common heterotardigrade ancestor. Guil et al. (2019) gave no indication of how high this branch length ratio must be in order to distinguish between taxonomic hierarchies, and so this measure lacks a threshold, but instead is used as a quantification of the reliability of the Apotardigrada hypothesis across the analysis.

| 18S Analysis
In our maximum-likelihood 18S analysis (Figure 1 Panel (a), Figure S1), the distance from the tardigrade common ancestral node to the heterotardigrade ancestral node is 0.0135. In comparison, the distance from the tardigrade common ancestral node to the Apochela/Parachela ancestral node is 0.0057.
Our Bayesian analysis ( Figure 1 Panel (b), Figure S2) shows a similar proportional relationship between branches to the maximum-likelihood analysis, with the distances to the Apochela/Parachela ancestral node and the distance to the heterotardigrade ancestral node being 0.0058 and 0.0122, respectively. This consistency in proportion between maximum-likelihood and Bayesian methods (ML branch length ratio: 2.368, Bayesian branch length ratio: 2.103) supports the suggestion of a class-level relationship. In addition, however, unlike the Guil et al. (2019) analysis, we were able to obtain a monophyletic Arthrotardigrada in this and all later analyses, which allows us to make comparative branch length assessments across more consistent topologies.
When the branch length ratio here is calculated and compared between ML and Bayesian analyses, the values are highly comparable (ML ratio: 1.812, Bayesian ratio: 1.724), evidencing further consistency within the analyses in this study set.

| Combined 18S/28S analysis
Our combined 18S/28S data set does not have an equivalent in the Guil et al. (2019) analysis, and so we compared our results to the 18S/28S results presented in that paper (as in our later BUSCO analysis and Strict Combined 18S/28S analysis). It utilised all data from both independent data sets, regardless of whether the samples lacked one of an 18S or 28S sequence.
Our maximum-likelihood analysis places the divergence of the earliest diverging Arthrotardigrada and the Heterotardigrada prior to the divergence of Apochela from Parachela (branch lengths of 0.0671 and 0.0416, respectively) (Figure 1 Panel (a), Figure S5). This result is corroborated by the Bayesian analysis, with branch lengths of 0.0668 and 0.0427, respectively (Figure 1 Panel (b), Figure  S6). Again, the comparative branch length ratios of 1.613 for the maximum-likelihood analysis and 1.564 for the Bayesian analysis suggest that the data are informative irrespective of modelling.

| Strict combined 18S/28S analysis
The strict combined 18S/28S analysis used only taxa that possessed both a valid 18S sequence and a valid 28S sequence in the prior data sets. This makes the analysis analogous to the Guil et al. (2019) 18S/28S analysis. Within both the maximum-likelihood and Bayesian analyses, the Apotardigrada/ Apochela diverge from the Parachela (ML BL: 0.0349, B BL: 0.0375) prior to the divergence of Echiniscoidea (here represented by Echiniscidae and Echiniscoididae) from the other Heterotardigrada (ML BL: 0.0638, B BL: 0.0696) (Figure 1, Figures S7 & S8). These branch lengths produce ratios of 1.828 and 1.856 under ML and Bayesian conditions, respectively.
Both our maximum-likelihood and Bayesian combined 18S/28S analyses broadly agree with the topology recovered by Guil et al. (2019), with the exception of slightly improved resolution of some low-quality sequences ( Figures S7 & S8), and the changes that we made to the data set by the removal of low-quality arthrotardigrade sequences that allowed us to recover a monophyletic Arthrotardigrada ( Figure S7 &  S8). This lends much more confident support to the Guil et al. (2019) hypothesis regarding branch length taxonomic hierarchy determination.

| BUSCO
Our BUSCO analysis used all BUSCO sequences that occurred in 31 of the 32 tardigrade study species that we explored. Under both maximum-likelihood and Bayesian analyses, the earliest diverging Arthrotardigrada diverge from the Echiniscoidea (here represented by the Echiniscidae) (ML BL: 0.3207, Bayesian BL: 0.3619) prior to the divergence of Apochela from Parachela (ML BL: 0.4683, Bayesian BL: 0.5185), as opposed to the results gathered from the 18S and 28S analyses ( Figures S9 and S10, Figure 1 Panels (a and b)). Between these two analyses, thereby, the branch length ratios were 0.685 under maximum likelihood, and 0.698 under Bayesian, respectively.
This result is not unexpected-branch lengths measure the number of substitutions over the data set over time, and whilst the relationships between species will have some bearing on this result, different sequences will have evolved at different rates and so will produce different branch lengths. Variant selection pressures and variations in evolutionary rates over different parts of the tree will produce different relative branch lengths across a topology (Yang, 1994;Bickel, 2000;Phillips, 2009).

| Branch lengths and proportionality
Across four of the five analyses, under both maximumlikelihood and Bayesian conditions, Apochela diverged from Parachela prior to the earliest divergences within Heterotardigrada. However, the proportional lengths of these branches relative to one another were variable: in our maximum-likelihood analyses, across the 4 different 18S and 28S conditions, the Arthrotardigrada/Echiniscoidea branch varied from 1.613x to 2.368x longer than the Apochela/Parachela branch. In the Bayesian 18S and 28S analyses, this variation was only slightly smaller, from 1.564x to 2.103x.
In the BUSCO analysis, this trend was reversed, and the Arthrotardigrada/Echiniscoidea branch was 0.685x the length of the Apochela/Parachela branch under maximum likelihood, and 0.698x under Bayesian conditions.

| Sampling concerns
All known extant Apochela belong to the family Milnesiidae, and the sampling of the clade belongs only to the genus Milnesium, which sits at the end of a long branch, with little diversity present across the samples (Figures S1-S10). Further sampling may well shorten the length of this branch internally. This may affect the length of internal branches within the phylogeny, drawing the entire family closer to, or further away from, the root. In this respect, we recommend that sampling from Bergtrollus, Limmenius and Milnesioides may be necessary to reliably use phylogenetics to interpret class-level relationships in Tardigrada.
We interpret Arthrotardigrada as monophyletic across all analyses, but it is possible that this is a sampling artefact. We used only 2 fewer 28S sequences than Guil et al. (2019). LC103160 Stygarctus ayatori was removed due to a large uninterrupted string of undetermined 'N' nucleotide bases, whilst GQ849040 Tanarctus dendriticus was removed for reasons detailed more clearly in the Methods section (Figures S11-S14, Data S1). We additionally used only 2 fewer 18S sequences than Guil et al. (2019). We removed LC103165 Stygarctus ayatori and LC103163 Batillipes sp., for the same reason as LC103160, from our 28S data set. The resolution change here is notable with comparison to studies focused on arthrotardigrade phylogeny (Fujimoto et al., 2017, Jørgensen et al., 2018. In particular, our results do not support the existence of paraphyletic Arthrotardigrada, as proposed by Fujimoto et al. (2017). This may be due to a lack of sampling within Arthrotardigrada-our sampling regime limited the diversity of arthrotardigrades within the data set. However, as our concern within this study was the nature of the relationship between Apochela and Parachela, or Apotardigrada and Eutardigrada, we do not consider this to have greater ramifications on the focus of our study and are reluctant to draw firm conclusions from our results regarding the monophyly or paraphyly of Arthrotardigrada.

| CONCLUSIONS
In four of our five test conditions, across both Bayesian and maximum-likelihood resolution methods, the earliest Arthrotardigrada diverged from the other Heterotardigrada prior to the divergence of Apochela/Apotardigrada from Parachela. This might lead to the suggestion that the branch length data presented in this study corroborate that of Guil et al. (2019). However, the similarly conserved BUSCO genes do not agree with the Guil et al. (2019) hypothesis. The authors believe that this variation calls in to question the reliability of the molecular evidence for Apotardigrada.
More generally, the practise of determining taxonomic levels through strict branch length comparisons is an ineffective and inappropriate use of branch length data. Considering relative branch lengths are highly dependent on models, alignment and sampling, this could lead to a proliferation of new names for synonymous groups based on modelling choices, which would serve only to make the field more difficult to grasp, whilst providing little to no additional clarity as to the taxonomic affiliations of these clades.
However, this study does not address the morphological reassessment presented within Guil et al. (2019). This diagnosis has proved controversial, moreso in light of the