Barcodes Reveal 48 New Species of Tetrahymena, Dexiostoma, and Glaucoma: Phylogeny, Ecology, and Biogeography of New and Established Species

ABSTRACT
Tetrahymena mitochondrial cox1 barcodes and nuclear SSUrRNA sequences are particularly effective at distinguishing among its many cryptic species. In a project to learn more about Tetrahymena natural history, the majority of >1,000 Tetrahymena-like freshwater isolates were assigned to established Tetrahymena species, with the remainder assigned to 37 new species of Tetrahymena, nine new species of Dexiostoma, and 12 new species of Glaucoma. Phylogenetically, all but three Tetrahymena species belong to the well-established "australis" or "borealis" clades; the minority forms a divergent "paravorax" clade. Most Tetrahymena species are micronucleate, but others are exclusively amicronucleate. The self-splicing intron of the LSUrRNA precursor is absent in Dexiostoma and Glaucoma and was likely acquired subsequent to the "australis/borealis" split; in some instances, its sequence is diagnostic of species. Tetrahymena americanis, T. elliotti, T. gruchyi n. sp., and T. borealis together accounted for >50% of isolates, consistent with previous findings for established species. The biogeographic range of species found previously in Austria, China, and Pakistan was extended to the Nearctic; some species show evidence of population structure consistent with endemism. Most species were most frequently collected from ponds or lakes, while others, particularly Dexiostoma species, were collected most often from streams or rivers. The results suggest that perhaps hundreds of species remain to be discovered, particularly if collecting is global and includes hosts of parasitic forms.
"We suspect that the named species are a small fraction of the total existing in nature, and that a substantial fraction of the whole are facultative parasites in a wide array of invertebrate hosts." (Preparata et al. 1989) WHEN the above was written there were approximately 30 named species of Tetrahymena, a large number compared to other ciliate genera. The optimism was based on the observation that nearly every field collection, including the writers' own, found new species, though many were unnamed. Most were biological species separated by mating incompatibility, while others displayed particular morphological or life history traits. Some species were described based solely on genetic distance. Presently, there are 44 recognized species (Lynn and Doerder 2012;Pitsch et al. 2016;Quintela-Alonso et al. 2013;Zahid et al. 2014), some of which are parasitic, including the most recently described species . This paper describes the collection of another 37 species of Tetrahymena as well as many new species in related genera Dexiostoma and Glaucoma. The results imply that there are dozens, perhaps hundreds more, sustaining the 1989 prediction.
Since 1996, neither morphology nor mating specificity, the criteria that traditionally defined Tetrahymena species, has played a decisive role in describing new species. Rather, the decision to declare new species has been based on differences in SSUrRNA and/or cox1 sequences. Examples include T. empidokyrea (Jerome et al. 1996), T. farleyi (Lynn et al. 2000), (amicronucleate) T. aquasubterranea (Quintela-Alonso et al. 2013), T. farahensis (Zahid et al. 2014), T. utriculariae (Pitsch et al. 2016), and, most recently, T. glochidiophila. There are three reasons why molecular criteria prevailed. First, most Tetrahymena species are cryptic, indistinguishable by microscopy. Failure of morphology to distinguish among most Tetrahymena species was recognized long ago (Corliss 1953) and is accepted as requiring alternatives (e.g., DNA sequences) for species description (Warren et al. 2017). Second, mating tests not only require investigators to maintain, without error, a panel of what are now dozens of mating type testers, but also fail to identify the numerous isolates that are immature, selfing or asexual. Third, species are readily distinguished by molecular criteria, first by isozyme profiles (Nanney and McCoy 1976), and now primarily by cox1 barcodes and secondarily by SSUrRNA sequences (Chantangsi et al. 2007; Kher et al. 2011). Molecular description of species is increasingly common (Renner 2016).
The cox1 barcodes successfully identify Tetrahymena species, including those with identical SSUrRNA sequences, because the typical 0-2% intraspecific variation is considerably less than the average 10.5% interspecific variation (Chantangsi et al. 2007; Kher et al. 2011). The barcodes essentially confirmed past species assignments, including questionable ones. Both papers designated type strains and provided diagnostic cox1 barcodes and SSUrRNA sequences for most species. In addition to their pivotal role in decisions to name the species listed above, the sequences were used by Doerder (2014) to associate numerous wild amicronucleates with known (mostly micronucleate) species and to assign others to new species as further described here. The barcodes also were used to identify wild isolates as T. australis in a re-description of this species (Liu et al. 2016). The few exceptions to barcoding success appear to be due to mislabeling or misidentification of certain stock strains, a situation partially remedied here and discussed in more detail below. There are, however, six named Tetrahymena species for which living strains appear not to exist and which therefore have little prospect of being barcoded.
Despite their morphological similarity, tetrahymenas are biologically diverse. Many possess a micronucleus and are sexual, but others, averaging 25% of wild isolates (Doerder 2014), are amicronucleate and asexual. Most species were isolated as free-living bacterivores, but some were found as parasites, primarily of aquatic invertebrates. Among the bacterivores are several species capable of transforming into large-mouthed predators which eat other tetrahymenas. Some species are globally distributed, appearing in two or more biogeographic zones, while others give evidence of endemism. All of these features cut across the "australis", "borealis" and "paravorax" clades revealed by molecular data (Brunk et al. 1990; Chantangsi et al. 2007; Lynn and Struder-Kypke 2006; Nanney et al. 1998; Preparata et al. 1989; Williams et al. 1984). The "borealis" clade is larger and more diverse while the "australis" clade is somewhat smaller and contains a cluster of closely related species; only one traditional species belongs to the "paravorax" clade.
The new species of Tetrahymena, Dexiostoma and Glaucoma formally named here are easily diagnosed by their unique cox1 barcodes and SSUrRNA sequences. Species descriptions also include information on the micronucleus, the self-splicing intron (SSI) of the large ribosomal subunit rRNA (LSUrRNA) and habitat data. The micronucleus by itself has no diagnostic value, but it is an important life cycle trait because only cells with a micronucleus are sexual, usually in the form of conjugation but also autogamy. The loss of the micronucleus happens frequently across a wide range of species; some amicronucleates, for which collecting has not identified their micronuclear counterparts, may be ancient and evolutionarily stable as species (Doerder 2014).
The presence of the SSI likewise is not diagnostic, as species may be SSI+, SSI−, or mixed; however, in some cases its sequence is diagnostic. The SSI is a Group I intron that removes itself from the LSUrRNA precursor and joins the two exons (Zaug and Cech 1980). Its presence is easily detected by PCR. From the lack of congruence of SSI and SSUrRNA sequence differences, Sogin et al. (1986) concluded that SSIs were independently acquired subsequent to splitting of the Tetrahymena major clades. In this survey, the presence/absence of the SSI was determined for most isolates, and new information on SSI population biology and evolution is presented.
An advantage of cox1 and other sequences is that they can be used to make inferences regarding population biology, including population structure. This is relevant to the often-cited "everything is everywhere" debate regarding the biogeography of microorganisms (reviewed in Caron 2009). Some investigators (Fenchel et al. 1997; Finlay 2002) have argued that species of microorganisms, including ciliates, are globally distributed with large populations that exchange genes through frequent migration. Others (including Foissner 1999; Foissner et al. 2008) have argued for endemism. For Tetrahymena, some species are indeed global, found in 2-3 biogeographic zones (though the Ethiopian zone has rarely been sampled) (Simon et al. 2008), but they have not been studied regarding population structure. Other species, however, have more restricted distributions, and one, T. thermophila, gives evidence of population structure consistent with "moderate" endemism (Zufall et al. 2013). In this report, species with sufficient sample sizes are examined for evidence of population structure and endemism.

MATERIALS AND METHODS
Water samples were collected from freshwater sources primarily in the northeast quadrant of the United States; most were collected since 2006 with the goal of finding species in addition to T. thermophila. Samples were collected and processed as described by Doerder and Brunk (2012). Following isolation as clonal populations, cells were cultured in Cerophyll inoculated with Klebsiella pneumoniae or in axenic PPY (1% proteose peptone, 0.15% yeast extract, 0.001 M FeCl3; autoclaved). Some isolates failing to grow in PPY grew successfully in axenic LP (per liter: 15 g proteose peptone; 1 g yeast extract; 2.5 g bactotryptone; 2.5 g liver fraction L; 5 g glucose; 1 g KH2PO4; 1 g Na2HPO4; autoclaved). Of isolates that could be grown only in Cerophyll, many could not be so maintained for long periods. A complete list of isolates, their collecting sites, Tetrahymena Stock Center accession numbers and associated GenBank accession numbers is contained in Table S1; this table includes previously barcoded amicronucleates (Doerder 2014). Initials FPD and KD (Kristen Dimond) are used to identify primary collectors. GenBank accession numbers for the previously sequenced wild isolates and ATCC (American Type Culture Collection) strains (Chantangsi et al. 2007; Kher et al. 2011) are contained in those publications and are not repeated here.
Prior to any molecular work, most FPD isolates were challenged with a panel of all seven T. thermophila mating type testers to determine if they were mature T. thermophila and, if so, their mating type. If conjugants were observed in the unmated control, the isolate was marked as a "selfer". Starved cells of the control well were vitally stained with acridine orange and examined for the presence/absence of the micronucleus by fluorescence microscopy (250×, 400×).
DNA from Cerophyll (typically 15 ml) or PPY (or LP) (8-12 ml) grown cells was purified with a modified microwave procedure (Goodwin and Lee 1993) as previously described (Zufall et al. 2013). Standard PCR with primers listed in Table S2 was used to amplify DNA for the cox1 barcode region, nuclear SSUrRNA, nuclear D2LSUrRNA, mtSSUrRNA, and a portion of the LSUrRNA containing the SSI. For the latter, the primers bind to highly conserved flanking sequences, resulting in products that are SSI+ (~900 nt) or SSI− (~500 nt) as revealed by agarose gel electrophoresis. For sequencing, all PCR products were prepared with shrimp alkaline phosphatase and exonuclease III, and, in most instances, were sequenced in both directions with the same or additional primers. Sequences were assembled, edited and aligned with Geneious Versions 5 and 6 created by Biomatters at http://www.geneious.com/. Trees were drawn and edited with Mega 6.0 (Tamura et al. 2013). Networks were drawn by Network and Network Publisher (fluxus-engineering.com) (Bandelt et al. 1999; Forster et al. 2001). Population parameters were calculated with DnaSP5.10 (Librado and Rozas 2009) and Arlequin 3.5.1.2 (Excoffier and Lischer 2010). The Automatic Barcode Gap Detection (ABGD) program (http://wwwabi.snv.jussieu.fr/public/abgd/) was used to bin barcoded isolates into hypothetical species. The cox1 alignment for this purpose contained 778 sequences of 588 nucleotides (85% of barcode length, to allow the inclusion of more isolates with shorter sequences). The alignment included sequences of wild isolates, unnamed ATCC strains, and type strains of named species; it has been deposited at the Tetrahymena Stock Center. The upper value of P was set to 3%, and the distance value was set to "Distance K80 Kimura" with the transition/transversion ratio set to 1.65 as calculated by Mega.
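As a rough illustration of the "Distance K80 Kimura" setting named above, the following Python sketch computes the textbook Kimura two-parameter distance for one pair of aligned sequences. The function name `k80_distance` is hypothetical, and the sketch estimates transition and transversion proportions directly from each pair instead of applying a preset ratio such as 1.65, so it should not be read as a reimplementation of the ABGD or Mega calculations.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k80_distance(seq1, seq2):
    """Textbook Kimura two-parameter distance for two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transition differences
    q = transversions / compared  # proportion of transversion differences
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); math.log raises ValueError
    # when the sequences are too divergent for the correction to be defined
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two 10-nt sequences differing by a single transition, the function returns about 0.112, slightly above the uncorrected difference of 0.10, which is the expected effect of the correction for multiple substitutions.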
Most isolates were identified to species by 98-100% identity of their cox1 sequence to the cox1 barcode of one of the type strains (Table 1) as designated by Chantangsi et al. (2007) and Kher et al. (2011); type strain GenBank accession numbers are found in Lynn and Doerder (2012). Identification was usually corroborated by 100% identity to the type SSUrRNA sequence, though in some instances species have identical SSUrRNA sequences. Barcodes are not available for the following recognized species: T. chironomi, T. dimorpha, T. edaphoni, T. rotunda (amicronucleate), T. sialidos, and T. stegomyiae. It is possible, but likely unverifiable, that one or more of the new species reported here is one of these species. For six named Tetrahymena species, identification by cox1 barcode was verified by comparison of D2LSUrRNA sequences (Nanney et al. 1998) (Table 1). For T. elliotti, initial identification was by D2LSUrRNA. This is because, as explained in Text S1, the strain originally chosen for barcoding (Chantangsi et al. 2007) is not T. elliotti. The D2LSUrRNA sequences of 12 isolates (collected at seven different sites and including micronucleates, amicronucleates and selfers) were identical to those of 13/15 T. elliotti strains sequenced by Nanney et al. (1998). Following identification by D2LSUrRNA, T. elliotti isolates were identified using the cox1 barcode of the designated type. As discussed in Text S1 and Text S2, four "australis" species with identical SSUrRNA sequences, T. cosmopolitanis, T. nipissingi, T. nanneyi, and T. sonneborni, do not have reliable cox1 barcodes, and thus isolates suspected as belonging to them were grouped as "cosnipnanson". Identification of T. americanis required the designation of a type strain because, as mentioned in Kher et al. (2011) and explained in Text S1, strain FL71 (ATCC 205052) chosen to represent T. americanis (Chantangsi et al. 2007) is not that species. Therefore, two of the original T. americanis ("T. pyriformis" syngen 2) strains collected by Gruchy (1955) were obtained from ATCC, cox1 barcodes were determined, and a type strain was designated. New Tetrahymena, Dexiostoma and Glaucoma species were designated based on cox1 sequence differences of >4%, as described in Results.
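The 98-100% identity criterion can be illustrated with a short Python sketch. It assumes the query and every type barcode have already been aligned to the same length; the names `percent_identity`, `assign_species` and `type_barcodes` are hypothetical, and the sketch leaves out the SSUrRNA and D2LSUrRNA corroboration steps described above.

```python
def percent_identity(query, reference):
    """Percent identity over sites where both aligned sequences have A, C, G or T."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(query.upper(), reference.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            matches += a == b
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * matches / compared


def assign_species(query, type_barcodes, min_identity=98.0):
    """Return (species, identity) for the best-matching type barcode.

    The species is None when the best match falls below min_identity,
    which would flag the isolate as a candidate for further examination.
    """
    best_name, best_identity = None, 0.0
    for name, barcode in type_barcodes.items():
        identity = percent_identity(query, barcode)
        if identity > best_identity:
            best_name, best_identity = name, identity
    if best_identity >= min_identity:
        return best_name, best_identity
    return None, best_identity
```

An isolate whose best match is, say, 92% would come back with species `None`, whereas one at 99% to a type barcode would be assigned to that species.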
This paper makes use of a database (the NS database) of strain and biogeographical information collected by David Nanney and Ellen Simon and described in Text S1. This unpublished database formed the basis for assigning species to biogeographical regions in Simon et al. (2008) and is the basis for much of the biogeographical information at the website http://www.life.illinois.edu/nanney/tetrahymena/biogeography.html. The NS database was reconstructed from printouts gifted by David Nanney and is here made available in Excel spreadsheet form as Table S3 with caveats discussed in Text S1. Problematic strains highlighted in the spreadsheet were not used either for species identification or in population calculations. The NS database has also been deposited at the Tetrahymena Stock Center.

Species identification
Water samples were collected at a total of 939 sites (Fig. 1)  were examined for cox1 barcodes and/or nuclear SSUrRNA sequence and assigned to existing or new species by matches to type strains (see Materials and Methods). Cox1 barcodes identified 651 isolates as belonging to 15 (possibly 17) Tetrahymena species named prior to 2017 and to one previously barcoded unnamed strain as listed in Table 1 and discussed species-by-species in Text S2. Of these, 252 isolates belonged to T. thermophila, many of which were confirmed by mating tests. This number should not be interpreted as indicating that T. thermophila is a common species; it is in fact a relatively rare species, and the above number is inflated for reasons associated with collecting (see below). Another 275 isolates were assigned to 32 new species of Tetrahymena (Table 2, Text S2), using criteria discussed below; after it was independently discovered during the course of this project, one of these new species, T. glochidiophila, was named in a separate publication. Among Colpidium/Dexiostoma, three isolates were assigned to C. striatum, one to D. campylum and 29 to nine new Dexiostoma species (Table 3). Among Glaucoma, eight belonged to G. scintillans, 22 belonged to G. chattoni and 47 belonged to 12 new species (Table 3, Text S2).

New species of Tetrahymena, Dexiostoma, and Glaucoma
Isolates whose cox1 and SSUrRNA sequences found no match to those of type strains were candidates for new species (Tables 2 and 3). A cox1 sequence difference of >4% was used to assign isolates to new species, the same as previously used to declare amicronucleates as putative new species (Doerder 2014). This difference was chosen based on three considerations. The first, as cited in the Introduction, was that interspecific differences average 10.5% whereas intraspecific differences typically are 0-2%. The second consideration was that a plot of 222,778 pairwise differences among individual isolates and type strains (Figure S1A) shows a break at 4%; with exceptions described below, pairwise differences <4% are intraspecific. The third consideration is that in most instances species with >4% cox1 barcode differences also have divergent SSUrRNA sequences, which, in the near absence of SSUrRNA intraspecific variation, is indicative of species level differences. There are, however, in the "australis" clade two groups of biological species that not only have cox1 differences of <4% but also have identical SSUrRNA sequences (see below). These are nevertheless distinguished by their unique cox1 haplotypes. In any event, it is recognized that the value of 4% is arbitrary and a compromise between false positives and false negatives. As with past species criteria, future studies may alter the conclusions presented here.
The validity of diagnosis of Dexiostoma/Glaucoma/Tetrahymena species by cox1 barcodes was further investigated with the distance program ABGD (see Materials and Methods). ABGD uses the barcode "gap" that occurs when intraspecific differences are smaller than interspecific differences. By recursively inferring intraspecific limits and detecting barcode gaps, the program bins barcode sequences into hypothetical species until the number of groups is stable. For the present analysis, the alignment (Materials and Methods) of 788 sequences included wild isolates assigned by distance to both known and new species, designated type strains of named species, and miscellaneous strains for which cox1 sequences are available.
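To make the 4% rule concrete, the sketch below groups aligned barcodes so that any two sequences differing by no more than 4% end up in the same group. This single-linkage grouping is an assumption made for illustration only; it is neither the assignment procedure used in this study, which relied on matches to type strains, nor the ABGD algorithm, and the names `p_distance` and `group_by_threshold` are hypothetical.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1.upper(), seq2.upper()))
    return differences / len(seq1)


def group_by_threshold(sequences, threshold=0.04):
    """Single-linkage grouping of named sequences at a distance threshold.

    sequences maps an isolate name to its aligned barcode. Two isolates
    share a group when a chain of pairwise distances <= threshold links them.
    """
    names = list(sequences)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if p_distance(sequences[a], sequences[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())
```

Because the grouping is single-linkage, a chain of intermediate haplotypes can join isolates that differ from each other by more than 4%, which is one reason a fixed threshold remains a compromise between false positives and false negatives.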
The inclusion of named biological species and the wild isolates assigned to them provides a test of the ability of barcodes to delineate species. In particular, should ABGD fuse species, the use of cox1 barcodes to diagnose new species would be compromised. ABGD runs typically produced ~114 hypothetical species. Based on the number of named species and new species using the 4% rule, 104 species were expected. ABGD correctly grouped named species in all genera and fused only the problematic T. nanneyi and T. nipissingi ("cosnipnanson", see Materials and Methods). ABGD split T. americanis, T. borealis, T. thermophila, and T. tropicalis into 2-4 hypothetical species each, because in each species there is a barcode "gap" between clusters of closely related haplotypes. Tetrahymena americanis was split into three hypotheticals, one consisting of 89 strains including type strain UM616, one consisting of 16 strains including strain UM351, and one consisting solely of isolate 20866-1. Strains UM616 and UM351 differ by 2.4% and are bona fide mating strains of T. americanis (Text S1). Notably, ABGD did not fuse any T. americanis group with closely related T. hegewischi; both species have identical SSUrRNA sequences. Tetrahymena borealis was split into two hypothetical species differing by 1.5%, one containing 52 isolates (<0.3% intragroup difference) and the other containing five isolates (<0.5% intragroup difference). Tetrahymena thermophila was split into two hypotheticals, with 93 and 17 isolates, respectively. These divergent haplotype clusters have been described before (see figure 2 of Zufall et al. 2013). The most extreme haplotypes differ by 3.8%, but the average pairwise sequence difference is 1.3% (mode 0.9%). Last, T. tropicalis was split into four hypothetical species. For this and reasons discussed in Text S2, this problematic species is in need of reinvestigation.
ABGD did not fuse any new species defined by the 4% rule but did split four into two hypothetical species each (see also Text S2). Again, the splits occurred because distinct haplotype clusters created a barcode gap. Amicronucleate T. alphathermophila n. sp. (nsp15) contains two sets of identical haplotypes collected at two separate locations and differing by 2%. For T. williamsi n. sp. (nsp21), one hypothetical species consisted of five isolates (two nucleotide differences) and the other consisted of two identical isolates, the two hypotheticals differing by 3.1%. SSUrRNA sequences of both hypotheticals are identical. The species T. conneri n. sp. (nsp41) was formed from two wild micronucleate isolates and amicronucleate strain ATCC50413. The ATCC50413 cox1 sequence is 99.1% identical to one wild isolate, but is missing the 5′ region, where many of the differences between the two wild isolates occur (96.2% identical). The SSUrRNA sequences of the two wild isolates are identical, but there is no SSUrRNA sequence for the ATCC strain. Of the new species split into two groups by ABGD, T. conneri is the most likely to consist of two species. Mating tests were not performed. ABGD results indicate that cox1 barcodes readily distinguish among species and suggest that the 4% threshold is a conservative approach, less likely to split species. For Dexiostoma and Glaucoma, average intrageneric pairwise differences were 7.8% and 11.6%, respectively. Among the new Tetrahymena species the average pairwise cox1 difference was 11.8% (Figure S1A; for phylogenetic tree see Fig. 2). For all Tetrahymena species, the average intragroup pairwise differences in the "australis", "borealis", and "paravorax" clades were 8.4%, 10.7%, and 13.5%, respectively, with the "australis" and "borealis" clades showing considerable overlap (Figure S1B).
The relatively few differences <4% were mostly among previously named species, none of which were fused by ABGD. For the "australis" clade, these include T. americanis and T. hegewischi (2.1-2.5% difference), and these two species and ATCC strain FL71 (~4% difference) (see Text S1). Also <4% are the differences among members of the "cosnipnanson" group (ranging from 0.3% to 3.4%) and between that group and T. pigmentosa (3.1-3.6% difference). Significantly, the only new species at <4% is T. mcdonaldae n. sp. (nsp30), which differed from T. pigmentosa by 3.4% (differences between T. mcdonaldae and "cosnipnanson" are 4.3-5.8%). In this case, T. mcdonaldae was declared a new species based on cox1 phylogeny as supported by its unique SSUrRNA sequence. In the "borealis" clade, T. borealis and T. canadensis differ by 3.9% and are the only two historical "borealis" species below the 4% threshold. The other "borealis" instance involves strains CO/NI/RA9/bSSU and their relationship with T. furgasoni, one of the originally named amicronucleate species (Nanney and McCoy 1976). As explained in Text S2, though the cox1 difference is 2.7% (NI vs. T. furgasoni), uncertainties with respect to SSUrRNA sequences (Fig. 2) and SSI divergence led to restraint in declaring these four strains and similar wild isolates as belonging to T. furgasoni.
The separation of species by differences in cox1 barcodes is consistent with low per site nucleotide diversity (π) of most species despite their often high haplotype diversity (Tables 1-3). The high π values of T. tropicalis and T. conneri n. sp. (nsp41) may indicate subspecies or even cryptic species (see ABGD results above; see also Text S2). The separation of species by cox1 barcodes is also consistent, in most cases, with differences in SSUrRNA sequences (see Phylogeny below). In some cases, the converse is true, as some species have identical SSUrRNA sequences but are distinguished by cox1 differences. For named species, these include members of the "australis" clade as mentioned above and, in the "borealis" clade, species T. borealis and T. canadensis. Other species pairs with identical SSUrRNA include T. alphapyriformis n. sp. (nsp46) and T. pyriformis; T. newhampshirensis n. sp. (nsp18) and T. colerunensis n. sp. (nsp23); T. holzi n. sp. (nsp14) and T. zeutheni n. sp. (nsp53); and several new species of Dexiostoma and Glaucoma.
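The contrast between per site nucleotide diversity and haplotype diversity can be illustrated with the standard sample estimators: the mean proportion of differing sites over all pairs of sequences, and Nei's haplotype diversity n/(n − 1) × (1 − Σ f_i²). The Python sketch below uses hypothetical function names and assumes gap-free aligned sequences of equal length; programs such as DnaSP treat gaps and missing data in their own ways, so values may not match exactly.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    total = 0.0
    for s1, s2 in pairs:
        if len(s1) != len(s2):
            raise ValueError("sequences must be aligned to the same length")
        total += sum(a != b for a, b in zip(s1, s2)) / len(s1)
    return total / len(pairs)


def haplotype_diversity(sequences):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)  # identical sequences count as one haplotype
    sum_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_squared)
```

A sample in which every isolate carries a distinct haplotype, each differing from the others at only one or two sites, gives a haplotype diversity of 1 while nucleotide diversity stays small, which is the pattern described for most species here.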

Phylogeny
The SSUrRNA NJ phylogenetic tree shown in Fig. 2 places Dexiostoma, Glaucoma and Tetrahymena in separate clades and supports the "australis" and "borealis" clades of Tetrahymena. It also places T. paravorax and two relatives in a separate "paravorax" clade. Except for the inclusion of many more species, this tree strongly resembles the SSUrRNA NJ tree of Chantangsi and Lynn (their Supplementary figure 3, 2008). It is also similar to Chantangsi and Lynn's (their figure 2, 2008) SSUrRNA Bayesian tree in which T. paravorax is in its own clade but within monophyletic Tetrahymena. Trees with full length cox1 sequences (Chantangsi and Lynn 2008), regardless of method, generally agree with SSUrRNA trees but often differ in detail. For instance, such cox1 trees often include Dexiostoma and Colpidium within Tetrahymena and variably place T. paravorax in either the "australis" or "borealis" clade. NJ trees (Figure S2; also ML and MP trees, not shown) with the cox1 barcode (representing less than half of the gene) generally lack deep branch bootstrap support and therefore meaningful phylogeny. For instance, some "australis" members are grouped with "borealis" members, and T. malaccensis and T. thermophila, close relatives in SSUrRNA and full-length cox1 trees, are not closely related in this tree. The cox1 barcode tree of Chantangsi et al. (2007) also groups species in unexpected combinations with low bootstrap values.
To explore the possibility that cox1 sequences are indicative of different trajectories between mitochondrial and nuclear genomes, and to explore their possible use in species identification, two nonoverlapping regions of the mitochondrial small subunit rRNA (mtSSUrRNA) of selected isolates were sequenced (Fig. 3). MtSSUrRNA sequences also were extracted from the complete mitochondrial genomes of T. malaccensis (Strain MP75, GenBank DQ927303.1), T. paravorax (Strain RP, GenBank DQ927304.1), T. pigmentosa (Strain UM1060, DQ927305.1), and T. pyriformis (Strain GL, GenBank L28677). Although the R (right) region is represented by more sequences, the L (left) region yielded similar results. Both R and L regions were highly variable, with R region average pairwise difference of ~11% for the "borealis" clade and ~3% for the "australis" clade. For the R region, unlike for nuclear SSUrRNA sequences, intraspecific variability was found among 5/16 species for which two or more isolates were examined; these include T. americanis (two transition sites), T. borealis (two transition sites), T. canadensis (one transition site), and T. pyriformis (one transition site). The more complex situation with T. tropicalis (three transitions, one transversion) contributes to the need to reinvestigate this species as possibly consisting of subspecies or more anciently diverged asexuals. The relationship between T. americanis and T. hegewischi also is complex. Among six T. americanis isolates were three variants (n = 3, n = 2, n = 1); the sole T. hegewischi was identical to the n = 2 variant of T. americanis. For T. thermophila, the two identical wild isolates differed from the GenBank sequence of the Strain B type strain at one transversion site, a difference not only likely due to the older sequencing method used to produce the type strain sequence, but also possibly due to mutation in separately maintained laboratory strains. Among polymorphic T. pyriformis, one wild isolate was identical to its type strain. At the L region, with fewer sequences, 5/6 species with multiple isolates were polymorphic, including T. glochidiophila (nsp10) and G. chattoni, which were not polymorphic at the R end. Full length sequences likely will reveal more polymorphisms and polymorphic species.
The NJ phylogenetic tree of the R region of mtSSUrRNA (Fig. 3) is in substantial agreement with SSUrRNA trees (e.g., Fig. 2). The genera Glaucoma, Colpidium/Dexiostoma, and Tetrahymena form separate clades, and T. paravorax and T. glochidiophila (nsp10) together also form a separate clade. The "australis" clade is separated with strong bootstrap support, though it is included in a moderately supported clade with two "borealis" species. For the remaining members of the "borealis" clade, their positions are similar to those in SSUrRNA trees (Fig. 2). Unlike in the cox1 barcode tree (Figure S2), T. thermophila, T. malaccensis, and T. elliotti form a well-supported clade that also contains T. gruchyi n. sp. (nsp7) and T. alphathermophila (nsp15), as in nuclear SSUrRNA trees. Although sequences from additional species are desirable, the present mtSSUrRNA data do not indicate substantially different nuclear and mitochondrial histories as suggested by some cox1 phylogenies. The variability in cox1 that makes it useful as a barcode simultaneously renders it of limited use in phylogeny except for recently separated species (Chantangsi and Lynn 2008).
Among species which have identical nuclear SSUrRNA sequences, mtSSUrRNA sequences resolved T. borealis and T. canadensis (2.3% difference) and T. thermophila and T. alphathermophila (3% difference). They did not, however, resolve T. "cosnipnanson" where all five isolates were identical and are expected by cox1 sequences to contain three species.

The self-splicing intron (SSI)
Most FPD isolates were examined for their SSI phenotype, and SSIs of SSI+ species were sequenced. As shown in Tables 1 and 2 and Fig. 2, most Tetrahymena species lack the SSI, as do all members of Dexiostoma and Glaucoma. Among the SSI+ species, four contained both SSI+ and SSI− isolates. SSIs were found in Tetrahymena species throughout the collecting area (Figure S3). Among the newly sequenced SSIs, those of 11 new species (Table 9) and T. malaccensis are unique and therefore potentially diagnostic of their species.
As suggested by mapping onto the SSUrRNA phylogenetic tree (Fig. 2), SSIs likely were acquired subsequent to the "australis/borealis" split, as previously concluded (Sogin et al. 1986). This is further supported by SSI phylogeny that does not cleanly break along "australis" and "borealis" clades (Figure S4B) and by the related observation that SSIs of close relatives such as T. thermophila and T. malaccensis are among the most divergent, differing by 13%. Conversely, among certain lineages SSIs appear to be vertically inherited from a common acquisition, for example, "cosnipnanson" and T. mcdonaldae, and T. betathermophila and T. thermophila. A fuller discussion of SSI acquisition and phylogeny is provided in supplemental Text S3.

Population biology, ecology and biogeography
Four species, T. americanis, T. elliotti, T. gruchyi, and T. borealis, together account for over half of all Tetrahymena isolates (Table 4). Among the 15 most abundant species, six are new species, and nine belong to previously named species, including the unresolved "cosnipnanson". T. thermophila and T. vorax are omitted from Table 4 because the present collection was not random with respect to them. T. thermophila was sometimes purposely collected by revisiting ponds where it is resident. For T. vorax, all instances in which an isolate ate the T. thermophila mating type testers (see Materials & Methods) were further examined as there are several tetrahymenas that are predatory macrostomes; with one exception, all such isolates had the typical T. vorax shape, and cox1 sequences showed that they were T. vorax. Estimates place T. thermophila just below T. glochidiophila and T. vorax just below T. pyriformis in relative abundance. The relative abundance of previously named Tetrahymena species in the present collection is compared to North American entries in the NS database in Table 5, again with the exception of T. thermophila and T. vorax. In both collections, the most frequent species was T. americanis followed by T. elliotti and T. borealis, collectively accounting for 68% of previously named species in the present collection and 48% of previously named species in the NS database; these three species account for ~40% of all isolates (Table 4). Rank order of other species is similar in both databases; the major exception is T. australis, which was more frequent in the NS database (found frequently in FL and New England) than in the present collection (found once in NH). The third place rank of T. gruchyi (Table 4) is skewed by its abundance in KD's southern collection, in which it was the most frequent isolate. In FPD's collection, T. gruchyi ranks between T. kentuckyensis n. sp. (nsp9) and T. glochidiophila in relative abundance. These results hint at geographical variation in species occurrence.
Of the pre-2017 named species found in this collection all except three were previously reported as present in North America (Simon et al. 2008). The new Nearctic species are: T. shanghaiensis previously known from Shanghai, China (Feng et al. 1988), here represented by a single amicronucleate isolate collected in OH; T. mobilis originally from sludge in Salzburg, Austria (Schiftner and Foissner 1998) here collected in KY, OH, and VA; T. farahensis previously collected from industrial wastewater in Pakistan (Zahid et al. 2014) here represented by two GA isolates. Representatives of strains designated bSSU/NI/RA/CO (n = 4, "borealis") and SIN (n = 1, "australis") were also found. Guppy parasites, NI and SIN were isolated in Singapore, and RA and CO were isolated in Israel (Leibowitz and Zilberg 2009). The new species T. alphapoecilia includes strain SIN. More detail is presented in Text S2; see also below for ecological preferences.
The richness of Tetrahymena species was observed both regionally (Table 6) and locally (Table 7). The heavily sampled ANF (209 hectares) yielded as many species as the lesser sampled (with more sites) broadly defined Midwest states. As few as 15 eastern PA sites yielded 10 species. Among 351 collecting sites (FPD) yielding one or more Tetrahymena-like isolates, ~42% yielded two or more species, and one site, the most sampled CRWP, yielded eight species (Table 7), including three new species. One or more of the new species in Tables 2 and 3 was found in 17/18 collecting sites yielding four or more isolates; among the 33 sites yielding three species, seven yielded no new species and one consisted entirely of them. Table 7 indicates that relatively few samples per site are needed to potentially yield both named and new species.
Water bodies were not sampled so as to allow reliable quantitative estimates of proportions of species in them. However, finding two or more species in bodies with relatively few samples suggests that species abundance in these instances may be similar. Some ponds, however, appear to have dominant species. For example, Lake Gaston on the NC/VA border was sampled nine times at four locations, and all six Tetrahymena-positive samples yielded only T. gruchyi. This species also was the sole species at many of KD's collecting sites. For some ANF, western PA, eastern PA and New England ponds T. thermophila is the dominant species, present in the majority of Tetrahymena-positive water samples. This species has been found without fail since 1987 in CRWP, even though the pond has been (incompletely) drained for repairs at least twice; a more or less permanent beaver pond just downstream of the drainpipe likely is a refuge. The frequency of minority species in CRWP is small. Of 44 isolates which were either refractory to T. thermophila mating type testers or consumed the testers, nine (~20%) belonged to other species as determined molecularly; the others were (immature) T. thermophila. Since approximately 20% of all CRWP isolates were refractory to the mating testers, this sample suggests that for CRWP, about 4% of isolates belong to a species other than T. thermophila. T. elliotti was found twice; T. vorax, an easily noticed species because of its carnivory and shape, was found only once in CRWP (in 1991). It is possible that minority species are transient. The best studied tetrahymena, T. thermophila, has restricted distribution (Zufall et al. 2013), low effective population size (Ne) (Katz et al. 2006; Zufall et al. 2013), and significant population structure (FST) (Zufall et al. 2013), all of which are consistent with what has been called "moderate endemism". Species in the present survey with sufficient cox1 sequences were examined for both Ne and FST. Neμ values (Ne proxy) are of the same order of magnitude as for T. thermophila or smaller (Table S4). There is a strong tendency for amicronucleate species to have the smaller values; the exceptional T. alphathermophila is an artifact of the fusion of two homogeneous populations (see Text S2). FST values (Table 8) indicate population structure in 6/9 species. The exceptions are T. borealis, T. canadensis, and T. glochidiophila. The latter has a large number of haplotypes distributed throughout the collecting region (see Text S2). Surprisingly, globally distributed species such as T. americanis and T. elliotti also gave evidence of population structure. For more discussion of population structure along with appropriate maps and haplotype networks for specific species, see Text S2.
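The CRWP estimate of about 4% given above is the product of two proportions quoted in the text: the fraction of tester-refractory (or tester-consuming) isolates that proved to be other species, and the fraction of all CRWP isolates that were refractory. As a worked check using only those quoted figures (the function name is hypothetical, and the 20% refractory fraction is the approximate value stated above):

```python
def minority_fraction(other_species, screened, refractory_fraction):
    """Estimate the share of all isolates belonging to non-dominant species.

    other_species / screened is the share of refractory isolates that were
    molecularly assigned to another species; multiplying by the refractory
    fraction scales that share to the whole set of isolates.
    """
    return (other_species / screened) * refractory_fraction

# Figures quoted in the text: 9 of 44 screened isolates, ~20% refractory overall.
estimate = minority_fraction(9, 44, 0.20)
print(f"{estimate:.1%}")
```

The calculation treats the 44 screened isolates as representative of all refractory CRWP isolates, which is an assumption of this sketch rather than a statement from the study.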
In addition to evidence of population structure in some species, there also appear to be ecological preferences. For instance, T. thermophila was typically found in smaller ponds and lakes, rarely in streams, whereas T. canadensis was collected more often from streams. T. aquafluente (nsp22) was found in streams, and of the seven Tetrahymena singletons, six were found in streams. Dexiostoma was isolated primarily from flowing water, as were many of the new species of Glaucoma. T. alphapoecilia, previously reported as guppy parasite SIN (Leibowitz and Zilberg 2009), was here found in a stream, as were representatives of unnamed strains guppy parasites NI/RA/CO; strain bSSU previously found in soil (Brandl et al. 2005) was also found in a stream. These species may have diverse ecological preferences. Aquatic invertebrates, larval and adult, that have in the past yielded species of Tetrahymena (see list in Lynn and Doerder 2012) were not sampled here. Such invertebrates have been sampled even less than streams, which, in general, are undersampled (see Discussion).

The number of species and biogeography
The biogeography of Tetrahymena reviewed in 2008 (Simon et al. 2008) integrated the results of collecting by Nanney's group with the earlier collecting by Elliott and his students. Though both groups collected mostly in North America, between them they also sampled parts of South and Central America, Europe, Australia, Asia, and several Pacific islands. However, vast portions of these areas and Africa were not sampled, and in North America, most collecting was in the northeast quadrant of the United States, as in this study. The Simon et al. review added to the list of named species present in two or more biogeographic zones, and listed 12 as found only in North America. There are now over twice as many described species, more species now have been found in two or more biogeographic zones, and several species, including ones with global distribution, give evidence of population structure predicted by endemicity.
The finding of 32 new Tetrahymena species in the FPD/KD collection is in accord with the prediction (see Introduction) that there are many more species and is consistent with the apparent ease with which past investigators (e.g., Batson 1983, 1985; Cho 1971b; Elliott 1973; Gruchy 1955; Jerome et al. 1996; Lynn et al. 2000; Nyberg 1981b; Pitsch et al. 2016; Quintela-Alonso et al. 2013; Simon et al. 1985; Zahid et al. 2014) discovered new species. Not only does the finding of additional new species in each successive collecting season of this study suggest that the sampled area has more species yet to be discovered, it follows that collecting on a larger, global, scale would identify dozens, more likely hundreds, of additional species, especially if collecting includes hosts for parasitic forms and moist soils (see below). In both the present collection and NS database, T. americanis, T. elliotti, and T. borealis were among the most abundant species, suggesting rather stable populations. By contrast, given its abundance (third rank) and range, it is perhaps surprising that new species T. gruchyi was not discovered in previous collections, particularly because it gives evidence of endemism.
The large number of Tetrahymena species and the implication of the same for Dexiostoma and Glaucoma are consistent with observations on other ciliates. The morphospecies Paramecium "aurelia" contains 15 cryptic species, and the genus itself contains many more named species, some of which also consist of cryptic species, for example, P. putrinum (Tarcz et al. 2014) and P. bursaria (Rautian et al. 2015); no new species have recently been added to the P. "aurelia" complex. This study suggests that Tetrahymena may be exceptionally species rich, even if limited to micronucleate (sexual) forms.
The amicronucleate condition is rare in ciliates but is particularly common in Tetrahymena, constituting about 25% of isolates (Doerder 2014). In most instances, cox1 haplotypes of amicronucleates are either identical to or sufficiently similar to those of micronucleate isolates to indicate recent origin (see Text S2). In other instances, however, there are no close haplotypes, for example, T. thermophila and its relatives T. alphathermophila and T. betathermophila, or a species is represented solely by amicronucleate isolates. For reasons discussed by Doerder (2014), some of these could be ancient, their micronucleate progenitors having become extinct. T. pyriformis, for example, is on an SSUrRNA branch (Fig. 2) that includes many amicronucleates, and neither the NS database nor this study reports its micronucleate counterpart. Similarly, amicronucleate T. aquasubterranea (Quintela-Alonso et al. 2013) from South Africa, though most closely related to micronucleate T. eriensis n. sp. (6.3% cox1 difference), places in a clade that also contains many amicronucleates. Further collecting, especially globally, could possibly locate close micronucleate relatives of these and other amicronucleates.
This survey extended the biogeographic range of three previously named species into the Nearctic: T. shanghaiensis first reported from China (Feng et al. 1988), T. mobilis first reported from Austria (Schiftner and Foissner 1998), and T. farahensis, first reported from Pakistan (Zahid et al. 2014). It also extended the habitat and range of guppy parasites found in Singapore (NI, SIN) and Israel (RA, CO) (Leibowitz and Zilberg 2009). All of the above were rare, isolated just 1-4 times. As this paper was being prepared, the new species T. gruchyi with evidence of endemism was found on the Pacific island of Guam (R. Zufall, personal communication). Whether other new species described here have ranges that extend beyond North America can only be determined by more extensive collecting.
The distribution of Tetrahymena species is relevant to the question of whether microorganism species are cosmopolitan or endemic (see Introduction). The former are found in multiple biogeographic regions, whereas the latter have limited, local distribution with little or no migration. Endemism predicts significant population structure whereas global distribution maintained by migration among large populations does not. While many Tetrahymena species are found in 2-4 regions and those above were added by this study, the matter is more nuanced. For instance, global species T. americanis and T. elliotti gave evidence of (North American) population structure, while T. canadensis did not (see below for haplotype distribution). Similar conflicting results were obtained with species found (so far) only in North America. T. thermophila, a species presently confined to eastern United States (Zufall et al. 2013), gives evidence of moderate endemism, having limited migration from New England and significant population structure. The present FST and Ne results suggest similar endemism for G. chattoni, T. hegewischi, T. gruchyi, and T. kentuckyensis. Other new species with insufficient sample sizes also may be endemic. However, T. borealis, confined to North America in the NS database, does not have significant population structure. In this context, the issue of population size is relevant to the question of geographical distribution as it affects the ability to disperse. There are many variables to consider, such as seasonal fluctuation in population density and, in most species, the absence of cysts. Based on collection volumes, the average density of T. thermophila is 6-7 cells/liter in summer months when sampling efficiency is highest, a seemingly low density. Density is highest in the benthos, lowest in open water free of vegetation (Doerder and Brunk 2012). Effective population sizes (Table S4) of Tetrahymena species are similar to that of T. thermophila, and to the extent that Ne is a proxy for census size, results suggest that population density is unlikely to contribute to dispersal. Both broadly distributed T. americanis and T. elliotti have Ne proxies similar to endemism candidates listed above. The conclusion that Tetrahymena species have small population sizes that militate against ready dispersal should, however, be considered provisional. The present analysis does not include isolates from throughout the range of globally dispersed species. Paramecium species with global distributions are reported to have large effective population sizes (Snoke et al. 2006). In sum, what the present results indicate, more than any resolution of the question of cosmopolitanism versus endemism, is the need for extensive, global sampling. In this respect, Tetrahymena and related species are behind Paramecium (Przybos and Surmacz 2010).
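The contrast drawn above, in which endemism predicts significant population structure and migration among large populations does not, can be illustrated with a simple fixation index computed from haplotype labels. The following Python sketch uses a basic estimator, (H_T − H_S) / H_T, with no sample-size correction; it is an assumption chosen for illustration, since the estimator and software behind the FST values in Table 8 are not stated in this section. All function names and the toy data are hypothetical.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Return 1 - sum(p_i^2) for the haplotype frequencies in one population."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def fst_from_haplotypes(populations):
    """Return (H_T - H_S) / H_T for a list of populations of haplotype labels."""
    if len(populations) < 2 or any(len(p) == 0 for p in populations):
        raise ValueError("need at least two non-empty populations")
    k = len(populations)
    # H_S: mean within-population diversity.
    h_s = sum(haplotype_diversity(p) for p in populations) / k
    # H_T: diversity of the pooled set, using unweighted mean frequencies.
    labels = set().union(*populations)
    mean_freq = {h: sum(p.count(h) / len(p) for p in populations) / k for h in labels}
    h_t = 1.0 - sum(f ** 2 for f in mean_freq.values())
    if h_t == 0:
        return 0.0  # a single haplotype everywhere: no variation to partition
    return (h_t - h_s) / h_t

# Toy data (hypothetical): two sites fixed for different haplotypes,
# and two sites sharing the same haplotypes at equal frequency.
print(fst_from_haplotypes([["h1", "h1"], ["h2", "h2"]]))
print(fst_from_haplotypes([["h1", "h2"], ["h1", "h2"]]))
```

Values near 1 indicate that sites share few haplotypes, as expected under limited migration, and values near 0 indicate that haplotypes are spread evenly among sites. Assessing significance, as reported for the species in this survey, would additionally require a permutation or comparable test that is not sketched here.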
Dispersal of cystless species has always raised questions as to mechanism, especially when long distances are concerned. For instance, what accounts for the occurrence of the same haplotype of T. canadensis in CO and Malaysia, a distance of over 10,000 km? Or, what is the explanation for the observation that T. thermophila and its closest relative, T. malaccensis, have so far only been found in separate life zones on opposite sides of the globe? Over shorter distances and perhaps timescales, there is the question as to how T. thermophila present in natural ponds in New England and eastern PA came to be present in artificial ponds further west in the ANF and western PA or further south in FL. These questions likely have different answers, but a facile answer is undersampling; despite new data presented here, the biogeography of Tetrahymena is woefully incomplete. Evidence of population structure suggests dispersal is rare, but population intrafertility, though rarely studied (Nyberg 1981a), suggests it is sufficiently common to maintain species integrity, as also observed for widely dispersed and intrafertile species of Paramecium (Przybos and Surmacz 2010). As for dispersal itself, if some species are indeed tens of millions of years old (Doerder 2014; Wright and Lynn 1997), continental drift may be important. On a more recent time scale, dispersal may have been facilitated by an abundance of wetlands following glaciation. Species such as T. glochidiophila might be dispersed when infected glochidia are attached to fish gills; the ranges of T. glochidiophila and its host are roughly congruent. Other natural means of dispersal might include storm-borne water droplets or facultative parasitism in which infected hosts are carried long distances by storms. In the case of T. thermophila, however, the apparent migration west and south from New England is against prevailing weather patterns. In this case and perhaps others, human activity should be considered. People have transported water across long distances in ships either as drinking water or as ballast, a possible explanation for T. gruchyi appearing in Guam. On land, ciliates may have occupied water barrels of wagon trains or been carried in water tanks of the early steam railroads. The close connection between New England and ANF populations of both T. thermophila and T. vorax might be due to the spread of steam driven logging railroads in the late 19th and early 20th centuries. Fish stocking might also have contributed to the spread of some species. It could be interesting to look at Tetrahymena distribution in more anciently settled areas of Africa, Europe, and Asia.

Phylogeny
Tetrahymena phylogeny as historically based on morphological, life history, and ecological traits (reviewed by Lynn and Doerder 2012) has long been abandoned in favor of phylogeny based on molecules (Brunk et al. 1990; Chantangsi and Lynn 2008; Preparata et al. 1989; Sadler and Brunk 1992). Rather than various species complexes ("pyriformis" for bacterivores, "rostrata" for parasites, "patula" for macrostomes), molecules clearly separate Tetrahymena into "australis" and "borealis" clades of mostly cryptic species, with each clade including bacterivores, parasites, and macrostomes. The SSUrRNA phylogeny (Fig. 2), with double the number of species of previous studies, also supports these clades. Similarly, the mtSSUrRNA tree (Fig. 3), though consisting of fewer species, clearly defines an "australis" clade and suggests that T. eicheli and T. williamsi are its closest "borealis" relatives; low mtSSUrRNA interspecific variability in the "australis" clade suggests more recent radiation. The SSUrRNA and mtSSUrRNA trees also identify a third, smaller "paravorax" clade of more distantly related species. Tetrahymena paravorax was an outlier in studies cited above and has the largest cox1 difference compared to other Tetrahymena species. Although this suggests that the "paravorax" clade should be reexamined for possible elevation to the generic level, morphologically both T. paravorax and T. glochidiophila resemble other Tetrahymena species. The mtSSUrRNA sequences obtained here also corroborate nuclear SSUrRNA phylogeny that clearly separates Tetrahymena, Dexiostoma/Colpidium and Glaucoma into separate clades, indicating that nuclear and mitochondrial genomes are not on separate trajectories as suggested by some cox1 trees that place Dexiostoma/Colpidium within Tetrahymena.
The present results indicate that mtSSUrRNA sequences, even with their polymorphisms, may be useful in diagnosing most Tetrahymena, Dexiostoma, and Glaucoma species, including some with identical nuclear SSUrRNA. However, a more extensive survey with the complete mtSSUrRNA sequence is needed to properly assess its utility. The intraspecific polymorphisms may be of use in analyzing migration and population structure.
The SSUrRNA phylogeny (Fig. 2) is not congruent with phylogeny of the SSI (Figure S4B). SSIs are absent in Dexiostoma, Glaucoma and the "paravorax" clade and, as concluded by Sogin et al. (1986), appear to have been acquired subsequent to the split between the "borealis" and "australis" clades, with multiple instances of independent acquisition. Though likely acquired from bacterial food sources, no similar bacterial Group I intron is detected in BLAST searches of GenBank; the closest related intron is that of a slime mold. No intraspecific sequence variation was observed, and no SSI gave evidence of retaining the homing endonuclease gene, suggesting that SSIs sweep through a species upon acquisition. Mixed SSI+/SSI− species are likely in the process of losing, rather than acquiring, the intron (Text S3). Sequences of SSIs from globally collected Tetrahymena may help resolve their origin. SSI sequences may in some instances be diagnostic of species.
Another reason for further SSI study is their possible relevance to the question of Tetrahymena speciation. Two main drivers of speciation are partitioning of food sources and avoidance of predators. As envisioned by Nanney (1999), the evolution of the basic tetrahymena form was followed by species radiation facilitated by an abundance of bacterial food. Unfortunately, nothing is known about food preferences in wild populations; in the lab, most species can be successfully maintained on Klebsiella pneumoniae, though the exceptions suggest that specialization exists. Another hint that there may be differences in Tetrahymena food preferences is if the different SSIs were acquired from different, albeit related, bacterial food sources. A test of this hypothesis requires the rather unlikely finding, particularly if millions of years are involved, of similar bacterial SSIs. In this context, it would be useful to know whether Tetrahymena species indeed have food preferences, as such preferences might also help explain species distribution.

Parasitism and the edaphic environment
Insect hosts of Tetrahymena species include larvae and sometimes adults of black flies (Simuliidae), mosquitos (Culicidae), midges (Chironomidae), and alder flies (Sialidae); other hosts include slugs (Limacidae), glochidia (Unionidae), guppies (Poeciliidae), and (!) dogs (Canidae) (for references see Lynn and Doerder 2012). Additional hosts infected with tetrahymena-like ciliates are reported in the older literature (for summary see Corliss 1979). More recent host examples include planaria (Planariidae) (Wright 1981) and amphibians (Leiopelmatidae) (Shaw et al. 2011), and doubtless there are many others. Infection rates typically are low, usually just a few percent or less. An unknown species of Tetrahymena was found in dead mosquito larvae (Takahashi et al. 2004), and I have found Tetrahymena in casts of dragonfly larvae and in remains of dead invertebrates and fish, perhaps attracted by bacteria. From the edaphic environment, Tetrahymena species have been isolated from mosses (Kahl 1926; Kozloff 1946, 1957; Roque et al. 1970) and soil (Brandl et al. 2005; Foissner 1987). Were this survey to have included aquatic invertebrates, mosses, and soils, more species likely would have been discovered, as predicted (see Introduction). There is a long history of sampling from these sources, but relatively little study in most cases beyond initial species descriptions. For parasites, available evidence suggests that there is host species specificity in some instances (Batson 1983, 1985; Jerome et al. 1996; Lynn et al. 1981), but other species are known to infect multiple host species (Van As and Basson 2004). Conversely, there is evidence that some hosts may be infected by multiple Tetrahymena species, for example, guppies infected by T. alphapoecilia (SIN), strain NI and possibly T. corlissi (based on morphology; Hoffman et al. 1975).
There are currently 12 Tetrahymena species listed as parasites (aka histophages) (Lynn and Doerder 2012;Lynn et al. 2017) and characterized as either facultative or obligate, the latter meaning that free-living cultures could not be established. Though some may disperse and be acquired as cysts, most others (because cysts were not observed) likely disperse and are acquired by their hosts as free-living forms. For instance, T. rostrata and T. limacis, better known as gastropod parasites, have been found in moss where they could be easily acquired by feeding snails. Transmission of other parasitic tetrahymenas is less obvious, but free-living forms of T. farleyi (dog parasite) have been reported (Quintela-Alonso et al. 2013), and this study found free-living forms of the guppy parasites NI/SIN and mussel parasite T. glochidiophila. Similarly, this study found aquatic forms of the soil strain bSSU.
The parasitic species of Tetrahymena do not cluster on SSUrRNA phylogenetic trees, and the placement of histophage T. glochidiophila in the "paravorax" clade (see below) strengthens the suggestion that histophagy/parasitism is an ancestral trait (Struder-Kypke et al. 2001). As with many parasites and pointed out by Struder-Kypke et al. (2001), tetrahymenas have complex nutritional requirements, requiring 10 amino acids, sterols, and two nitrogenous bases, among others (Hill 1972), consistent with their having originated as parasites in which nutrient rich environments such as hemocoels or renal glands made metabolic losses tolerable. Tetrahymenas commonly regarded as free-living bacterivores are capable of histophagy, evidenced by their successful long-term storage with axenic rat gut (Williams et al. 1980), and laboratory infection of invertebrate hosts using standard laboratory strains has been known for some time (e.g., Grassmick and Rowley 1973; Thompson 1958). The presence of variant surface antigens, with structure similar to variant surface antigens of other protist parasites (Simon and Schmidt 2007), might also indicate an ancient disposition toward parasitism. In short, using barcodes to unambiguously identify tetrahymenas, many ecological and life-cycle hypotheses can now be tested.
Finally, a deliberate search for parasites likely would find many more parasitic species but also might find parasitic forms of species currently known only as free-living. With barcodes, isolates can now be unambiguously identified even if the parasitic and free-living forms are morphologically different as was found, for example, in T. dimorpha (Batson 1983). It is unlikely, however, that named parasitic species without barcodes can be unambiguously rediscovered. Several of these are reported only once, decades ago, and may not be recoverable from original collecting sites, and even then species identity is not certain.

Criteria for diagnosing species
Species are artificial constructs with no single all-encompassing definition. Species diagnosis by DNA sequences is increasingly common and is especially useful for cryptic species in which, by definition, morphology fails to distinguish among them (Jörger and Schrödl 2013; Renner 2016). Sonneborn (1957) long ago proposed that obligate inbreeders (e.g., selfers) and asexuals, neither of which satisfies biological species criteria, could be declared species based on some minimum threshold of (genetic) divergence. Modern experts have supported this suggestion (Schlegel and Meisterfeld 2003), and others, noting the difficulties associated with protist alpha-taxonomy, recommend that species definitions be made "on a case-by-case basis" (Boenigk et al. 2012). Abandoning morphology in favor of genetics is not, however, so straightforward. For instance, in a recent paper on best practices, Warren et al. (2017) place considerable emphasis on morphology as part of a species description. While acknowledging that cryptic species present special challenges, they nevertheless encourage (though explicitly do not require) investigators to search for new morphological traits suitable for multivariate analysis as a way of distinguishing among species. While this might reveal interesting evolutionary detail, the analysis involved is beyond the expertise (and perhaps resources) of most investigators and is clearly impractical for field studies such as this one. A more detailed response to Warren et al. and references to ICZN rules with respect to molecular description appear in the paper formally naming T. glochidiophila. It is, however, perhaps worthwhile in the present context to cite two examples in which expert morphological description has failed. The first concerns the naming of the amicronucleate T. aquasubterranea (Quintela-Alonso et al. 2013). Despite elegant, detailed (light, electron) microscopic description these authors did not find distinguishing morphological characters, though to be fair, few such detailed descriptions exist for Tetrahymena species. Other traits such as its "biphasic life cycle" consisting of feeding trophonts and starvation-induced theronts are found in T. mobilis (Quintela-Alonso et al. 2013), and forms similar to theronts have been seen in other species (Nelsen 1978; Nelsen and DeBault 1978). Moreover, the description of highly starved theronts as "very small, broadly lenticular or ovate lying motionless on the bottom" (Quintela-Alonso et al. 2013) applies to many wild isolates observed in this collection; indeed, the phenomenon was sufficiently common that instances were not recorded. Ultimately, the naming of T. aquasubterranea was, like other recent examples, "Based on the considerable genetic distance. . ." of its SSUrRNA and cox1 sequences (Quintela-Alonso et al. 2013). The second example concerns a discrepancy regarding G. chattoni discovered in this study. In this case, the original strain and the independently identified reference strain appear to be different species. As detailed in Text S2, the D2LSUrRNA sequence (Nanney et al. 1998; Preparata et al. 1989) of the original strain described by Corliss (1959) differs from that of present isolates identified as G. chattoni by cox1 similarity to the designated type strain (Chantangsi et al. 2007). The latter was identified as G. chattoni by an expert, and assuming no error or strain mix-up it likely is a separate, cryptic species, possibly (and unknowably) one of the syngens of G. chattoni morphospecies studied by Cho (1971a). Since the original Corliss isolate exists in ATCC, its D2LSUrRNA, cox1, and SSUrRNA sequences should be determined.
In lieu of morphology, this paper names new species of Tetrahymena, Dexiostoma, and Glaucoma using as species diagnostics cox1 barcodes and SSUrRNA sequences as supplemented by SSI and mtSSUrRNA sequences. As the equivalent of unique morphological traits, these sequences permit the unambiguous identification (with the few exceptions that have not been adequately barcoded) of isolates independent of actual morphology or sexual status. The sequences can be obtained from isolates that are difficult to culture in sufficient quantities for morphological and breeding studies, and they are currently the only practical way to identify isolates in field studies where identification restricted to morphospecies severely underestimates biodiversity. Although obtaining barcodes requires some molecular expertise, they are relatively easily obtained by standard PCR and sequencing techniques; indeed, with support, the procedures are suitable for high school students (https://tetrahymenaasset.vet.cornell.edu/science-modules/by-name/field-research/). Sequence data have the additional advantage of providing insight into evolutionary relationships and population biology. Perhaps most importantly, sequences are archivable in public databases, where they remain useful in the absence of living or preserved material. Based on these arguments, it is unfortunate that in applying barcoding to Paramecium (Krenek et al. 2015) the authors refrained from formally naming three cryptic species, although both SSUrRNA and cox1 sequence differences clearly indicated their status as separate species. Rather, Krenek et al. assigned provisional names with the prefix "Eucandidatus" to indicate their tentative status as species, a solution that compromises their scientific utility. Przyboś and Tarcz (2016) recently named three morphologically indistinguishable species of "P. jenningsi" based on both molecular divergence and mating incompatibility.

New species of Tetrahymena, Dexiostoma, and Glaucoma
Forty-seven new species are described in Table 9 (Tetrahymena) and Table 10 (Dexiostoma and Glaucoma). The same information in conventional paragraph form is presented in Supplemental Information Text S4. The 48th new species found in this survey is T. glochidiophila, reported by Doerder (2014) as unnamed species nsp10; it was formally named in a separate publication.
Type strains (holotypes) were designated based on a combination of one or more of the following: (i) successful accession of live cells and/or DNA at the Tetrahymena Stock Center at Cornell University (Ithaca, NY); (ii) sufficient cox1 sequence; (iii) availability of SSUrRNA sequence; (iv) presence of a micronucleus; (v) presence of the SSI. For some type specimens, SSUrRNA sequence was not available, in which case Tables 9 and 10 indicate the GenBank accession number of a secondary type, so indicated by parentheses. Unfortunately, some isolates, particularly of Dexiostoma and Glaucoma but also of Tetrahymena, were lost because they were difficult or impossible to maintain in the laboratory. While they reproduced for a time in bacterized media, they resisted transfer to axenic medium (e.g., PPY) or grew slowly. Some fastidious isolates were successfully grown axenically in LP medium. Other investigators have also reported difficulty growing certain isolates under laboratory conditions (e.g., Corliss 1960; Lynn et al. 1981, 2000). A few isolates could not be successfully frozen in liquid nitrogen. The cox1 barcodes and SSUrRNA and other sequences provided here will permit investigators to unambiguously assign new isolates to these species. Though GPS coordinates of type strain collecting sites are provided, there is no guarantee that the species can be collected again from that water source. In addition, it should not be assumed that the type locality (e.g., stream or pond) is typical.
Of the Tetrahymena species, 32 are the result of the collecting reported here, and five are strains deposited at ATCC by David Nanney and Ellen Simon (see Acknowledgements). The cox1 and SSUrRNA sequences of the latter were reported by Kher et al. (2011).
Only two species of Glaucoma (Table 10) are named here. The other 10 are not named for two related reasons. First, though all isolates belong to a well-supported clade (Fig. 2), this clade contains not only Glaucoma but also the genera Glaucomides and Bromeliophrya. Only Gnsp1 and Gnsp2 are considered congeneric with G. chattoni and G. scintillans. Based on genetic distance, the remaining 10 species likely constitute two new genera. Second, the Catalog of Life (catalogueoflife.org) lists 33 species of Glaucoma, many described decades ago. Although several names are clearly synonymous (e.g., variant spellings) and others are likely invalid, five of these species bear some resemblance to Gl. bromelicola (Foissner 2013). Until barcodes are provided for these species (if indeed any still exist in culture or are unambiguously identified among new isolates), naming of new Glaucoma species here risks misclassification and duplication. Unfortunately, isolates reported here did not survive to be archived as cryopreserved specimens; DNA samples, however, were deposited. The barcodes should allow any new isolate to be assigned to one of Gnsp1-Gnsp12, or, as may be more likely, yet another new species.
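As an illustration of how a barcode might be used to assign a new isolate, the following Python sketch compares a pre-aligned query sequence with a set of reference barcodes by uncorrected p-distance and reports the closest reference only if it falls within a cutoff. It is a minimal sketch, not the procedure used in this study: the function names, the toy sequences standing in for Gnsp1 and Gnsp2, and the cutoff value are all hypothetical, and a real analysis would use full-length aligned cox1 barcodes from GenBank and a divergence threshold justified for the taxon in question.

```python
from typing import Dict, Optional, Tuple


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance between two pre-aligned sequences.

    Sites where either sequence has a gap ('-') or ambiguous base ('N')
    are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def assign_species(
    query: str, references: Dict[str, str], threshold: float
) -> Tuple[Optional[str], float]:
    """Return (closest reference name, distance), or (None, distance)
    when the closest reference is farther than the threshold, which
    would flag the isolate as a candidate new species."""
    best_name, best_d = min(
        ((name, p_distance(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )
    if best_d <= threshold:
        return best_name, best_d
    return None, best_d


# Toy 20-bp placeholders, NOT real barcodes (assumption for illustration).
TOY_REFERENCES = {
    "Gnsp1": "ATGGCATTAGGCTTAACCGT",
    "Gnsp2": "ATGGCTTTAGGATTAACAGT",
}
# Hypothetical cutoff; an appropriate value must be chosen empirically.
TOY_THRESHOLD = 0.10

if __name__ == "__main__":
    for query in ("ATGGCATTAGGCTTAACCGA", "TTGGCATTCGGCTTTACCGA"):
        print(query, assign_species(query, TOY_REFERENCES, TOY_THRESHOLD))
```

Returning the distance together with the name (or with None) keeps borderline cases visible, which matters because an isolate lying beyond the cutoff from every reference may represent yet another new species rather than a failed match.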

ACKNOWLEDGMENTS
I am especially grateful to Kristen Dimond (University of Houston) for permission to include the results of her southern collections in this paper. She is responsible for naming nsp44 and nsp45. Also contributing to this paper are former CSU students Shannon Valentine who determined SSI presence/absence for most of FPD's isolates, Ryan Phillips who sequenced portions of the mitochondrial SSUrRNA, and Scott Fulton who sorted out the relationship between T. americanis and T. hegewischi and designated the type strain for the former. Renee Heberle meticulously transcribed old computer printouts to create the NS database. I also acknowledge the continued