Phylogenomics of novel ploeotid taxa contribute to the backbone of the euglenid tree

Euglenids are a diverse group of flagellates that inhabit most environments and exhibit many different nutritional modes. The most prominent euglenids are phototrophs, but phagotrophs constitute the majority of phylogenetic diversity of euglenids. They are pivotal to our understanding of euglenid evolution, yet we are only starting to understand relationships amongst phagotrophs, with the backbone of the tree being most elusive. Ploeotids make up most of this backbone diversity—yet despite their morphological similarities, SSU rDNA analyses and multigene analyses show that they are non‐monophyletic. As more ploeotid diversity is sampled, known taxa have coalesced into some subgroups (e.g. Alistosa), but the relationships amongst these are not always supported and some taxa remain unsampled for multigene phylogenetics. Here, we used light microscopy and single‐cell transcriptomics to characterize five ploeotid euglenids and place them into a multigene phylogenetic framework. Our analyses place Decastava in Alistosa; while Hemiolia branches with Liburna, establishing the novel clade Karavia. We describe Hemiolia limna, a freshwater‐dwelling species in an otherwise marine clade. Intriguingly, two undescribed ploeotids are found to occupy pivotal positions in the tree: Chelandium granulatum nov. gen. nov. sp. branches as sister to Olkasia, and Gaulosia striata nov. gen. nov. sp. remains an orphan taxon.


I N T RODUC T ION
EUGLENIDS are a large and diverse group of Euglenozoa with phototrophic, osmotrophic, and phagotrophic species (Kostygov et al., 2021;Leander et al., 2017). Most known euglenids have one or two flagella and usually either glide on substrate or swim (Lax & Simpson, 2020;Leander et al., 2017). One key characteristic is the euglenid pellicle-proteinaceous parallel strips underlying the cell membrane that are backed by a system of microtubules (Cavalier-Smith, 2017;Leander, 2004). The number of pellicle strips can vary drastically in euglenids, some phototrophic euglenids have upwards of 100 pellicle strips, whereas some phagotrophic petalomonads have only four pellicle strips (Leander et al., 2017). Species with over 20 or so pellicle strips can exhibit 'euglenid metaboly', rendering the cells highly flexible (Leander et al., 2007), in contrast to taxa with fewer pellicle strips, which are all rigid.
Both the phototrophic Euglenophyceae and osmotrophic Aphagea arose from phagotrophic euglenids, yet until recently relatively few molecular sequences of phagotrophs were available (Lee & Simpson, 2014a, 2014b. Even the selection of SSU rDNA sequences was not representative of phagotrophic euglenid diversity, with many genera completely unsampled, and most only having 1-2 sequences available (Leander et al., 2017). Recent work has greatly expanded the taxon-sampling of

Phylogenomics of novel ploeotid taxa contribute to the backbone of the euglenid tree Gordon Lax
| Anna Cho | Patrick J. Keeling SSU rDNA from phagotrophs, leading to many insights into the relationships between these species, but the inherent limitations of single-gene trees, together with the fact that euglenoid SSU rDNA is divergent in nature, has complicated these analyses (Busse et al., 2003;Cavalier-Smith et al., 2016;Lax et al., 2019;Lax & Simpson, 2020;Paerschke et al., 2017). To fully clarify these relationships, large multigene analyses are required, but such data has only recently been acquired . Currently known phagotrophic euglenids can be divided into three main groups: generally flexible Spirocuta with 20+ pellicle strips, rigid petalomonads with ≤10 pellicle strips, and rigid ploeotids with 10-12 pellicle strips (Leander et al., 2007(Leander et al., , 2017. Spirocuta also contains phototrophic and primary osmotrophic euglenids and their closest living phagotrophic relatives, and petalomonads have been identified as the deepest branch of euglenids to date . Meanwhile ploeotids are a phylogenetically diverse group of phagotrophs that share certain morphological characters. They are all rigid cells with 10-12 pellicle strips that can feed on either eukaryotes and/or bacteria  and glide on their posterior flagellum (Lax & Simpson, 2020;Leander et al., 2017). Despite these similarities, questions have been raised as to whether these taxa are indeed monophyletic. Recent SSU rDNA analyses with a focused taxon sampling on ploeotids have shown that their evolutionary relationships appear to be more complicated Lax & Simpson, 2020;Lee & Simpson, 2014a;Paerschke et al., 2017). Moreover, a recent multigeneanalysis showed similar results: rather than falling into a single monophyletic clade, ploeotids were divided into up to four discrete groups, indeed mostly each known genus forming its own group . The sole exception to date is the well-supported clade Alistosa, which contains the genera Keelungia, Lentomonas, Ploeotia, and Serpenomonas. The other ploeotid clades for which multigene-data are currently available are Entosiphon, Liburna, and Olkasia.
Several ploeotid taxa which-from SSU rDNA sequences-appear to fall in key phylogenetic positions  are still missing from multigene analyses. A single SSU sequence of soil-dwelling Decastava exists, but its relationship to other ploeotids is ambiguous  and no multigene data exists for this taxon. Hemiolia is morphologically very similar to Liburna: Both are only known to be marine, and both exhibit a characteristic fast gliding and sudden-stop motion (Ekebom et al., 1995;Larsen, 1987). SSU rDNA analyses do not decisively show their evolutionary placement amongst euglenids, and multigene analyses so far only included Liburna . Moreover, two seemingly important ploeotid strains are only known from single-cell sampled SSU rDNA sequences: 'SMS7' and 'CARR5' . Again, their placement in SSU-analyses is highly unstable and no additional molecular data exists.
To further our understanding of phagotrophic euglenid phylogenetics, we generated five single-cell transcriptomes of ploeotid taxa (one Decastava, two Hemiolia, two previously 'Unidentified ploeotids' closely related to single-cells SMS7 and CARR5 from a previous study). We use the transcriptomic data to update the available multigene phylogenetic data for euglenids. Based on this phylogeny, we evaluate the relationships with other euglenid groups, propose a new clade Karavia that encompasses Liburna and Hemiolia, and propose the novel taxa Hemiolia limna nov. sp. (the first freshwater member of this clade), Gaulosia striata nov. gen. nov. sp. (related to CARR5), and Chelandium granulatum nov. gen. nov. sp. (related to SMS7).

Sample collection and preparation
Marine intertidal samples were collected from various sites in British Columbia, Canada: Centennial Beach Park at Boundary Bay (Vancouver), and Powell River. A single soil sample was collected from Galliano Island in British Columbia, Canada, and a single freshwater sample was collected from Quadra Island, British Columbia, Canada. See Table S1 for additional information.
Marine samples were treated as described previously (Larsen & Patterson, 1990). Briefly, samples were placed in trays 1-2 cm high, covered with a Kimwipe tissue, and #1 coverslips added onto them. After 24-72 h in a natural day-night cycle, the coverslips were examined with the downside facing upwards on a Leica DLIM inverted microscope and imaged with a Sony a7RIII camera. For the Galliano sample (SPO2), minute amounts of soil were placed in a petri dish or on a coverslip, and a couple of drops of 0.2 μm-filtered tapwater was added. These samples were examined and imaged as above.
To lyse the cells, tubes were subjected to 3-5 freezethaw cycles (alternating between −80°C and room temperature). The Smart-seq2 protocol was followed to generate cDNA using poly-A selection, with 24 PCR cycles for final amplification (Picelli et al., 2014).
The resulting cDNA was then quantified using a Qubit HS DNA assay (Invitrogen), and libraries were prepared using Illumina Nextera XT or Illumina DNA Library Preparation kits (Table S2). Sequencing for all samples was done on an Illumina NextSeq instrument with 2 × 150-bp paired end reads, whereas sample BB14 was additionally resequenced on an Illumina MiSeq with 2 × 250-bp reads.
Maximum likelihood (ML) phylogenies were estimated using RAxML-NG 1.1.0 (Kozlov et al., 2019) under the GTR+GAMMA model and 1000 non-parametric bootstrap replicates. We also conducted a Bayesian analysis in MrBayes 3.2.7a (Ronquist et al., 2012) under the GTR+GAMMA model running four parallel chains with default heating parameters and 50,000,000 generations each (run on the CIPRES webserver: https://www. phylo.org/index.php). Trees were sampled every 10,000 generations, and 25% were discarded as burn-in. Chain convergence was assessed by PSRF values (Potential Scale Reduction Factor) approaching 1.0. Both analyses are summarized in Figure S1.

Phylogenomic analyses
We base our dataset on a recently published study that investigated euglenid phylogenetics with a 20-gene dataset . We also added two recently published members of the Prokinetoplastina (Tikhonenkov et al., 2021).
Coding regions of all transcriptomes were determined with Transdecoder 5.5.0 (Grabherr et al., 2011). Relevant marker genes were extracted with blastp from these translated peptide assemblies as queries against the 20 single-gene seeds. Hits were subsequently checked with blastp against Swiss-Prot. The surviving top five hits for each transcriptome were then appended to the existing 20 single-gene datasets, aligned with MAFFT L-INS-I, and trimmed with BMGE 1.12 (-m BLOSUM30 -h 0.5 -g 0.2; Criscuolo & Gribaldo, 2010). A phylogeny was estimated for each single-gene alignment using the LG4X model in IQ-TREE 1.6.12 (Nguyen et al., 2015) with 1000 ultrafast bootstraps (UFB). Each single-gene tree was then carefully checked, and contaminant, paralogous, or otherwise long-branching sequences removed from the untrimmed alignments. After re-aligning, re-trimming, and re-estimating the trees, this process was repeated one more time.
We noticed some deep paralogies amongst euglenids in PABPC4 trees, so we subsequently excluded this protein from our analyses. The cleaned, aligned, and trimmed single-genes were then concatenated, producing a dataset with 19 genes, 62 taxa, and 6194 positions. For this '19-base' dataset, a phylogeny was reconstructed under the LG + C60 + F + G model in IQ-TREE 2.2.0 (Minh et al., 2020), with 1000 UFB (Minh et al., 2013). All taxa included in this and the following analyses are listed in Table S4. We also generated 500 non-parametric bootstraps for the same dataset under the posterior mean site frequency model (PMSF) in IQ-TREE, using the previous LG + C60 + F + G output as a guide tree (Wang et al., 2018). The '19-base' dataset was run in PhyloBayes-MPI v1.8 (Lartillot et al., 2009) under the CAT+GTR model with four independent cold chains, each running for 30,000 cycles with a burn-in of 25%, sampled every 10 trees (all four chains converged). Both analyses are depicted in Figure 2.
To investigate the impact of fast-evolving sites on our phylogeny, a '19-FSR' dataset was created, where fastevolving sites were removed in 4%-increments as described previously . For each increment, a LG + C20 + F + G phylogenetic tree with 1000 UFBoot replicates was constructed. Figure S2 plots the bootstrap values for select relevant nodes for each of the incremental fast-site removal steps.

Gaulosia striata (BB9)
A single cell was isolated from oxygenated marine intertidal sediment, oblong-pyriform in shape with a rounded posterior end ( Figure 1A). The cell measured 29.3 × 14.5 μm in size (Table 1) and was gliding on its thickened posterior flagellum (2.2× cell length) with its anterior flagellum being 'cast out' in front of the cell (like a fishing line; 0.8× cell length; Movie S1). Five straight, broad pellicle strips were found on the dorsal and five on the ventral side, on a slight angle compared to the cell outline ( Figure 1B). The cell changed direction after slowly moving back on its posterior flagellum. The posterior half of the cell was smooth and presumbably housed the nucleus. The small, slim flagellar pocket was hard to see but was situated along the midline in the anterior quarter of the cell ( Figure 1A). We did not observe any feeding apparatus. The BUSCO completeness score was 43.2% complete single and duplicated, 14.9% fragmented, and 41.9% missing. The SSU rDNA sequence of this cell is deposited under GenBank accession number OQ331014.

Chelandium granulatum (BB14)
From oxygenated marine intertidal sediment, we isolated a single cell, oblong in shape and heavily granulated ( Figure 1C). It measured 27.1 × 14.6 μm in size and was gliding on its thickened posterior flagellum (4.4×  Table 1), while the anterior flagellum was slowly repeatedly being 'cast out' similar to Gaulosia (1× cell length; Movie S2). Neither pellicle strips nor feeding apparatus could be seen, likely due to the presence of refractile granules in the interior of the cell ( Figure 1D). For this cell, 3.5% complete single-copy and duplicate, and 2.4% fragmented BUSCOs were recovered, with 94.1% missing. The corresponding SSU rDNA sequence for this cell can be found under GenBank accession number OQ331015.

Hemiolia trepidum (PRC4)
One cell was observed and isolated from oxygenated marine intertidal sediment. The cell was oblong, with a slightly tapered posterior, slightly dorso-ventrally flattened, and measured 17.3 × 8.8 μm ( Figure 1E; Table 1). The thickened posterior flagellum (3.5× cell length) forming a hook-shape, was used for a fast gliding motion, which was stopped intermittently, at which time both flagella largely stopped moving. The anterior flagellum (1.3× cell length) was usually held in front of the cell, towards the right, and was trembling along its distal half (Movie S5). The cell occasionally stopped and jerked back on its posterior flagellum after which it resumed gliding in a different direction. 3-4 faint pellicle strips could be seen on its dorsal side ( Figure 1E). No feeding apparatus was visible at 630× magnification. We recovered 6.7% complete single-copy and duplicate, and 3.9% fragmented BUSCOs, with 89.4% of BUSCOs missing. The corresponding SSU rDNA sequence for this cell is deposited under GenBank accession number OQ331010.

Decastava sp. (SPO2)
We observed and isolated one cell from soil and forest litter. The cell was irregularly lentil-shaped, dorso ventrally flattened and measured 21.5 × 17.3 μm ( Figure 1F; Table 1). It was gliding on its posterior flagellum of ~1.4× cell length, while the anterior flagellum (~0.8× cell length) was flailing in front of the cell (Movie S6). A prominent hook-shaped feeding apparatus extends almost completely down the whole length of the cell, along its midline ( Figure 1F). No pellicle strips were observed with light microscopy at 630× magnfication. For this assembly, 31.4% complete single-copy and duplicate BUSCOs were recovered, 10.2% fragmented, and the remaining 58.4% of BUSCOs missing. Its corresponding SSU rDNA sequence is deposited under GenBank accession number OQ331009.

Hemiolia limna (ML5 and ML7)
We isolated two cells from the same freshwater sediment. Cells ranged from 24.4/24.4 × 9.8/12 μm in size were oblong with a pointed posterior (ML5 only) and slightly ventrally flattened (Figure 1G-I; Table 1). Cell ML5 had its anterior flagellum truncated and posterior flagellum sheared prior to measurement ( Figure 1G), whereas cell ML7 lost its anterior flagellum completely and likely had its posterior flagellum truncated. We observed a flagellar pocket in both cells (1/3× cell length; Figure 1H,I), and 3-4 faint dorsal pellicle strips could be seen on both cells (ML7: Figure 1H; ML5: Movie S3). Both ML5 and ML7 had some internal yellow-green material, presumably ingested algal material (Figure 1G-I; Movies S3 and S4). This co-assembly had 14.9% singlecopy and duplicate, and 20% fragmented BUSCOs recovered, with 65.1% missing. The SSU rDNA sequence for ML5 is deposited under GenBank accession number OQ331011, ML7 under OQ331012, and the sequence from co-assembly ML5-ML7 under OQ331013 (all three SSU rDNAs are identical).

Phylogenetics
We recovered euglenids as a monophyletic clade in all of our multigene trees, with high support (85%-99% BS, 1 pp; Figure 2; Figure S3; also see graph in Figure S2). Major groups of euglenids that were recovered are Petalomonads as the deepest branch (100% BS, 1 pp), Spirocuta as a large clade encompassing Euglenophyceae, Aphagea, and many phagotrophs (100% BS, 1 pp), and Alistosa as a clade containing various ploeotid taxa (100% T A B L E 1 Cell measurements of single cells isolated in this study.

Cell
Length ( BS, 1 pp). Some of our analyses also recovered Olkaspira, which includes spirocute taxa and Olkasia, albeit with very low support (54%-55% BS, 0.99 pp, not recovered in main PMSF analysis). The ploeotid taxa Hemiolia and Liburna were always recovered as a highly to fullysupported clade, for which we propose the name Karavia (99%-100% BS, 1 pp), with Gaulosia sometimes falling sister to this clade, but always with low support (52%-53% BS, 0.99 pp, not recovered in main PMSF analysis). The SSU rDNA analysis was, not surprisingly, less well resolved. We recovered euglenids with low support ( Figure S1), with symbiontids branching inside euglenids.
An additional analysis excluding all symbiontid sequences did not change the major topology of the tree (data not shown). Spirocuta is recovered with low to moderate support, while maximally supported Petalomonadida branches amongst ploeotids. Ploeotids fall into several clades, as in previous SSU rDNA analyses Lax & Simpson, 2020;Schoenle et al., 2019). Clade Alistosa breaks apart into 3 main branches (Lentomonas + Decastava; Keelungia; Ploeotiidae), rendering the group polyphyletic. Karavia is non-monophyletic, with long-branching Hemiolia forming the sister clade to Lentomonas + Decastava clade, whereas Liburna branches as part of a poorly supported clade of Petalomonadida, Gaulosia, and Ploeotiidae. Chelandium falls sister to Spirocuta with low support, whereas Olkasia is recovered as the deepest branch of euglenids with low support. Reference phylogeny. Figure 2 in this paper.

Taxonomic descriptions
Comments. This is a zoological name above the level of family and falls outside the zoological (and botanical) codes of nomenclature.
Etymology. From 'gaulos' (Greek, singular), a Phoenician trading ship. Refers to the rounded, nonflattened appearance and visible pellicle strips resembling wooden ship laps, and relative slow gliding speed.
Transfer of existing taxa. Transfer 'Unidentified Ploeotid CARR5'  to genus Gaulosia as Gaulosia sp. based on morphological similarities and phylogenetic affinity.
Etymology. From Latin 'striata' for 'striped', owing to the visible pellicle strips in light microscopy.
Gene sequence. The SSU-rDNA sequence extracted from the single-cell transcriptome assembly of cell BB9 is deposited under GenBank Accession number OQ331014.
Diagnosis. Free-living, rigid, biflagellated hetertrophic euglenid with an oblong outline, slightly flattened ventrally, cell is full of refractile granules. Anterior flagellum ~1× cell length, considerably thickened posterior flagellum ~4.4× cell length. Cell glides relatively fast on the posterior flagellum and in a straight line.
Transfer of existing taxa. Transfer 'Unidentified Ploeotid SMS7'  to genus Chelandium as Chelandium sp. based on morphological similarities and phylogenetic affinity.
Etymology. From Greek 'limne' for 'lake', since it was found in a freshwater environment.
Gene sequence. The SSU-rDNA sequence extracted from the single-cell transcriptome assembly of cell ML7 is deposited under GenBank Accession number OQ331012.

General phylogenetic relationships
Most phylogenetic relationships recovered in our analyses confirm those of the most recent multigene phylogeny of euglenids . We recover euglenids as a monophyletic group with high support in all of our analyses, which is in agreement with several recent multigene analyses that included euglenids Wideman et al., 2019;Záhonová et al., 2021). Spirocuta, a large clade that includes many phagotrophic euglenids, but also the diverse clade of Euglenophyceae (phototrophic euglenids) and Aphagea (primary osmotrophic euglenids), is fully supported in our analyses.
The middle section of the euglenid tree backbonenestled between Spirocuta and Petalomonadida-is occupied by a variety of ploeotid taxa and groups (Figure 2). While the phylogenetic placement of some ploeotid taxa like Gaulosia remain unclear, the majority of sampled ploeotids to date have now coalesced into two main groups in multigene studies: Alistosa and Karavia.

Decastava is an alistosan ploeotid
Our Decastava sp. SPO2 transcriptome is the first multigene-grade data for this genus, enabling its placement in a phylogenomic context. Decastava branches sister to Lentomonas with full support, as part of Alistosa in our multigene analyses (Figure 2). When included in previous SSU rDNA analyses Decastava branched sister to Lentomonas, but with low to moderate support only Lax & Simpson, 2020) or to Keelungia when Lentomonas was not included (Schoenle et al., 2019).
Alistosa was formally proposed as molecularly defined clade in 2021 and was inferred to include the genera Ploeotia, Serpenomonas, Lentomonas, Keelungia, and Decastava . Our analysis here confirms this inferred composition. While these genera all share certain morphological characters-such as 10 pellicle strips (often prominent), an anterior flagellum presumably used for prey capture Leander et al., 2017), and a posterior flagellum used for gliding-it is unclear what, if any morphological characters might be unique to them, since those listed above apply equally well to other ploeotid taxa (e.g. Olkasia). Additional ultrastructural studies might be needed to further our understanding of any potential shared characters in this group like the structure of the feeding apparatus or pellicle strips. Past studies have shown that Ploeotia, Serpenomonas, and Lentomonas all appear to have a similar 'type II' feeding apparatus (Farmer & Triemer, 1994;Linton & Triemer, 1999). This might be observable with light microscopy-all of the taxa examined have a prominent hook-shaped feeding apparatus ( Figure 1F). They all have bifurcations on their abutting pellicle joints, although the non-alistosan Entosiphon also has the same character Chan et al., 2013;Farmer & Triemer, 1988, 1994Triemer, 1986).
Our new Decastava cell has an SSU sequence that is 97.1% similar to the two existing Decastava SSU sequences from culture CCAP 1265/2. While this new cell might represent a new species of Decastava, we think it is impossible to tell from sequence similarity alone, and our light microscopy imagery is not sufficient to see any potential morphological differences between our cell and the culture. Additionally, while all existing Decastava have been found in soil samples only, the sampling location for our Decastava was different than the other sequences (Canada vs. Scotland/Europe).

Hemiolia and Liburna form novel group Karavia
While a previous multigene analysis has established Liburna as a ploeotid taxon fitting somewhere in the backbone of the euglenid tree, its sister taxon Hemiolia has been absent from these analyses . SSU rRNA analyses show both taxa sometimes forming a clade Lax & Simpson, 2020), which is in agreement with their shared morphology. Both Hemiolia and Liburna are fast gliders that use their posterior flagellum, and exhibit a characteristic sudden stop-and-resume motion . Now that single cell transcriptomic data for Hemiolia is available (cell PRC4), our multigene phylogeny perhaps unsurprisingly shows them both forming a strongly to fully supported clade in all our analyses (Figure 2; Figures S2 and S3).
As the Hemiolia described here appears to be unlike other described species from this genus, we propose a novel species H. limna to reflect the position of cells ML5 and ML7 in our analyses (cells ML5 and ML7 are co-assembled in our dataset). The new species branches with H. trepidum sequences in our SSU phylogenies ( Figure S1), but H. limna is clearly distinct from any sequence attributed to H. trepidum and merits a new species of Hemiolia based on this, and the fact that H. limna is currently the only freshwater member of this group (see below).
Indeed, the tree now also shows the SSU rDNA sequences from H. trepidum are not monophyletic: H. limna ML5 and ML7 branch specifically with a maximally supported clade of several H. trepidum ( Figure S1), to the exclusion of 'H. trepidum' STS7. We propose that the currently identified H. trepidum represent multiple distinct species and further revisions will be required. Since Hemiolia and Liburna were both established mainly on their distinct phylogenetic position , morphological markers other than a distinct mode of motility are crucial in furthering our understanding of these genera-e.g. the fine structure of the pellicle strips, or feeding apparatus structure (Triemer & Farmer, 1991). Unfortunately, these electron microscopy-based studies are predicated on culturing the organism in question.
To reflect the highly supported branching of Liburna and Hemiolia in our multigene trees, and their morphological similarities, we propose a new name for this clade, Karavia.

Freshwater-and marine-dwelling Karavia
Hemiolia limna falls into the Karavia clade and is the first known member of this group that has been found in freshwater. Initial descriptions of the only other currently known members of this group, Hemiolia trepidum and Liburna glaciale, are from marine intertidal sediments (Larsen, 1987;Larsen & Patterson, 1990). The delineation between freshwater and marine boundaries and their transition rates amongst euglenids has received little attention to date. Transition rates between marine and non-marine environments vary between major eukaryotic groups appears to be quite variable: For example, cercozoans and fungi seem to have a higher number of transitions from marine to non-marine, whereas the opposite is true for dinoflagellates (Jamy et al., 2022). The same study unfortunately excluded any sequence data of Discoba due to poor phylogenetic resolution and ambiguous rate estimates (Jamy et al., 2022), so the following discussion is based on unquantified observations. Curiously the phototrophic euglenid genera Eutreptiella and Eutreptia (order Eutreptiales) are predominantly marine, whereas the remainder (order Euglenales) have long been assumed to be predominantly freshwater (Kostygov et al., 2021;Marin et al., 2003). Using an environmental sequencing and metagenomics approach, a recent study has found indications of several low-abundance marine or brackish clades within Euglenales (Lukešová et al., 2020). This suggests that in Euglenophyceae, the transition from marine to freshwater seems to have occurred only rarely, the main event being from Eutreptiales to Euglenales. This is in contrast to phagotrophic euglenids where this event likely happened several times, if we assume the predecessor of euglenids to be marine. Most phagotrophic genera are in fact found in both freshwater and marine environments, like Anisonema, Dinema, Jenningsia, Urceolus, Heteronema, and Peranema (all part of Spirocuta). Entosiphon is currently the only phagotrophic clade that has been observed solely in freshwater or brackish environments (Kolisko et al., 2020;Triemer & Fritz, 1987). The distribution of freshwater taxa seems to be focussed more on spirocute taxa, with only Entosiphon and Hemiolia limna amongst ploeotids, and a large variety of petalomonads having been found in freshwater (Lee, 2022;Triemer & Fritz, 1987). Since ploeotids appear to harbor much of the phylogenetic diversity of euglenids, it is possible that additional freshwater-only clades might be found. Though it is noteworthy that the largest effort to examine freshwater euglenid diversity lists only five Entosiphon species and no other ploeotids as freshwater-dwelling (Huber-Pestalozzi, 1955). While we cannot know for sure without additional studies, it seems likely Petalomonadida is the richest in freshwater taxa amongst euglenids, but despite this, no clear delineations of freshwater-only or marineonly clades have emerged in this group either (Huber-Pestalozzi, 1955;Leander et al., 2017), likely due to insufficient taxon sampling.

Gaulosia and Chelandium
A previous study has generated SSU rRNA data for 'Unidentified Ploeotid' cells SMS7 and CARR5, but did not describe any new formal taxa, since their placement in SSU phylogenies is unclear and no real affiliation could be determined . Cells Gaulosia BB9 and Chelandium BB14 are morphologically very similar and branch sister to SMS7 and CARR5, respectively, in our SSU rDNA phylogeny ( Figure S1). Furthermore, our single cell transcriptomes enabled multigene phylogenetic analyses of these taxa.
Chelandium BB14 branches solidly as sister to the ploeotid Olkasia, and as part of Olkaspira (Figure 2). Chelandium and Olkasia seem to share few morphological characteristics, other than both appearing to glide on their posterior flagellum and waving the anterior flagellum around (Figure 1; Movie S2), both of which are traits of ploeotids in general . We could not observe any pellicle strips on our Chelandium isolate, potentially due to the limitations of the microscope used to isolate single cells (e.g. neither DIC optics nor 1000× magnification). Olkasia has large, well-defined pellicle strips . The refractive granules inside Chelandium are not found in Olkasia (Figure 1), and its posterior flagellum is much longer than that of Olkasia. Despite their strong phylogenetic grouping, we at this point refrain from formally describing a novel clade that includes Olkasia and Chelandium, chiefly because of their apparent morphological differences. We think Chelandium needs some further investigation with higher resolution light microscopy and possibly electron microscopy to warrant the establishment of a new group.
The previously reported 'Unidentified Ploeotid SMS7' cell is branching with full support with Chelandium BB14 in our SSU tree ( Figure S1), and generally shares the same morphology. While SMS7 is half the size (16 × 8 μm) compared to BB14 (27 × 14.5 μm), both show the same gliding pattern on a very long thickened posterior flagellum and are heavily granulated. Owing to the size difference, difference in sampling locality (SMS7 was isolated from marine subtidal sediment in Curaçao), and their branch lengths, we consider SMS7 as Chelandium, but likely a different species. We thus propose to transfer 'Unidentified Ploeotid SMS7' to 'Chelandium sp. SMS7'.
The phylogenetic position of Gaulosia on the other hand unfortunately remains unclear-its position as sister to Karavia is tentative at best in any of our analyses and does not always get recovered (Figure 2; Figure S2). Neither are there many morphological similarities between BB9 and Karavia. Both glide on their posterior flagellum, but the typical fast gliding and stop-and-go pattern of both Hemiolia and Liburna are missing in Gaulosia (Larsen, 1987;Lax et al., 2019), it instead has a slower gliding pace. The protein sampling for our Gaulosia BB9 isolate is high (18 out of 19 proteins, with 87% of sites), so either increased taxon sampling for this part of the tree or additional and/or different phylogenetic marker genes are likely needed to fully resolve the position of this novel taxon.
The SSU rDNA sequence of cell 'Unidentified Ploeotid CARR5' could not be placed confidently amongst ploeotids in previous SSU analyses and was treated as an orphan ploeotid taxon without any known close relatives . Our Gaulosia BB9 SSU sequence forms a maximally supported clade with CARR5 and a short environmental sequence from Antarctic lake sediment ( Figure S1). The branch lengths separating Gaulosia BB9 and CARR5 are not particularly large. In terms of morphology, both cells are more or less the same size (BB9 is 29 × 14.5 μm vs. CARR5 is 32 × 20 μm; Table 1), but more importantly they show the same number of easily visible pellicle strips and share the same pyriform shape. The posterior flagellum in CARR5 was reported to be 4.5× cell length, while BB9 only has a 2.2× cell length long flagellum . Like with Chelandium, based on the morphological similarities and phylogenetic affinities, we propose to transfer 'Unidentified Ploeotid CARR5' to Gaulosia and treat it as an undetermined species 'Gaulosia sp. CARR5'.
BLASTing our Gaulosia BB9 sequence against GenBank's nr/nt database, we recovered a short environmental sequence AB510392. This sequence and our BB9 form clade in our SSU rDNA analysis with Gaulosia CARR5 falling sister to them both, together forming a maximally supported clade ( Figure S1). This environmental sequence was generated from sediment from hypersaline lake Suribati in Antarctica and was classified as an embryophyte by SILVA. This demonstrates the need for a more comprehensive sampling of phagotrophic euglenids, considering there might be additional 'hidden' phagotrophic euglenid sequences in public databases that currently are not being classified correctly because the reference databases used underrepresent euglenids and other taxa with highly divergent SSU sequences (Forster et al., 2016).

Ploeotids harbor much of the phylogenetic diversity of phagotrophic euglenids
While the sampling efforts in a recent paper  and in this study have significantly furthered our understanding of the phylogenetic history of euglenids, some of euglenid diversity likely remains to be sampled, as the discovery of Chelandium and Gaulosia emphasizes. The backbone of the tree consists of Petalomonadida (currently the deepest branching euglenids) and several ploeotid subgroups. We now are only starting to discover all the different taxa in this part of the tree, but are still some time away from understanding how they are related to each other.
Considering the apparent species richness of ploeotids , we can assume at least many of them should fall into some of the currently established clades, Alistosa (contains Ploeotia vitrea), Karavia (Hemiolia and Liburna), and the Olkasia + Chelandium clade. Members of Alistosa likely share certain morphological characteristics, like the hook-shaped feeding apparatus Linton & Triemer, 1999). As such, we expect taxa with a similar hook-shaped feeding apparatus and 10 pellicle strips-like Ploeotia adherens, P. discoides, and P. plana-to be part of Alistosa, if molecular data will ever be collected for these species (Larsen & Patterson, 1990;Lee, 2012;Schroeckh et al., 2003). However, given the phylogenetic diversity of ploeotids, it is not unlikely that some undescribed taxa might not fall into established clades. A prominent example is 'Unidentified Ploeotid WF2_3' , for which SSU data exists but its placement in the phylogenies is unresolved, other than it likely being a ploeotid. Multigene phylogenetics would likely help determine its placement and help to further our understanding of the backbone of the euglenid tree. It is also not unlikely that new taxa will be found to fall into unexpected parts of the tree, like we now know Chelandium does. Its phylogenetic position to its closest relative Olkasia might seem surprising when examining cells with light microscopy only, but based on the tree, we now suspect scanning and transmission electron microscopy studies of both organisms might reveal shared morphological characters (e.g. structure of pellicle joints, of feeding apparatus structure). This underscores the need for deeper investigations of the ultrastructure of the euglenid groups that are currently being established.
One hypothesis as to why the backbone of the euglenid tree remains unresolved is that some key taxa/groups are missing in our current sampling, e.g. 'Unidentified Ploeotid WF2_3' or potentially other undescribed species that remain to be sampled. The hope is that continuous sampling of euglenids in general, and ploeotids in particular, will eventually resolve the tree. Fortunately, single-cell transcriptomics is well-suited for generating multigene-grade data of phagotrophic euglenids. Even our worst transcriptome reported here (Chelandium granulatum BB14 with 94.1% BUSCOs missing, compared to the Euglena gracilis GEFR01 transcriptome assembly with only 12.6% BUSCOs missing) still yielded 43% of sites in our multigene dataset (Table S4). By contrast, Gaulosia striata BB9 had 87% of multigene sites and 41.9% BUSCOs missing, yet could not be confidently placed in our trees (e.g. Figure 2), again hinting at the need to sample additional taxa. Another potential way forward is increasing the number of phylogenetic marker genes. Our study uses 19 markers, which resolves most of the tree. However, most recent analyses of deeper branching relationships use upwards of 150, which could negate some systematic errors in the data if used with appropriately complex models (Gawryluk et al., 2019;Rodríguez-Ezpeleta et al., 2007;Sierra et al., 2022;Tice et al., 2021).

AC K NOW L E DGM E N T S
We thank the UBC Sequencing and Bioinformatics Consortium and Sunita Sinha for library prep and sequencing, Noriko Okamoto for assistance with sampling, Eric Peterson and Christina Munck of the Hakai Institute for their generous support of our research, as well as the Hakai Biomarathon team for sample collection and field support. This work was supported by grants to PJK from the Natural Sciences and Engineering Research Council of Canada (RGPIN-2014-03994) and GenomeBC (R02MSE). GL was supported by GenomeBC, and AC was supported by NSERC-PGSD and UBC 4YF.

DATA AVA I L A BI L I T Y STAT E M E N T
Raw transcriptome reads are deposited under GenBank BioProject PRJNA929644, SSU rDNA sequences under GenBank NCBI accessions OQ331009-OQ331015. Assembled transcriptomes and their predicted protein sequences, alignment files, tree files, additional micrographs, and multiQC summary can be found under DataDryad accession doi: 10.5061/dryad.w3r2280vv.