Distribution and biogeography of Sanguina snow algae: Fine‐scale sequence analyses reveal previously unknown population structure

Abstract It has been previously suggested that snow algal species within the genus Sanguina (S. nivaloides and S. aurantia) show no population structure despite being found globally (S. nivaloides) or throughout the Northern Hemisphere (S. aurantia). However, systematic biogeographic research into global distributions is lacking due to few genetic and no genomic resources for these snow algae. Here, using all publicly available and previously unpublished Sanguina sequences of the Internal Transcribed Spacer 2 region, we investigated whether this purported lack of population structure within Sanguina species is supported by additional evidence. Using a minimum entropy decomposition (MED) approach to examine fine‐scale genetic population structure, we find that these snow algae populations are largely distinct regionally and have some interesting biogeographic structuring. This is in opposition to the currently accepted idea that Sanguina species lack any observable population structure across their vast ranges and highlights the utility of fine‐scale (sub‐OTU) analytical tools to delineate geographic and genetic population structure. This work extends the known range of S. aurantia and emphasizes the need for development of genetic and genomic tools for additional studies on snow algae biogeography.

While snow algae are diverse, perhaps the most well known are the algae that form red blooms in late-season open-field snows. These blooms are caused by the recently established genus Sanguina (Procházková et al., 2019) [previously assigned to Chlamydomonas cf. nivalis], which currently comprises two delineated species, S. nivaloides and S. aurantia.
We know relatively little about these Sanguina spp. because we are currently unable to culture Sanguina, which would facilitate controlled experimentation and genomic characterization, but they are presumed to consist of green haploid vegetative cells. These cells take advantage of a dynamic layer of water that fluctuates between solid and liquid phases, with peak levels of liquid water occurring during spring and summer. During that time, abiotic and biotic deposits are more readily solubilized and made accessible to the vegetative cells (Jones, 1999). These flagellated haploid cells are thought to reproduce asexually, and when nutrients, primarily nitrogen and phosphorus, become limited, haploid gametes fuse and produce a diploid hypnozygote with thick cell walls that produces vast quantities of the secondary carotenoid astaxanthin and its fatty ester derivatives, leading to red snow coloration (Gorton, Williams, & Vogelmann, 2001; Müller et al., 1998). These hypnozygotes are resting cysts that allow overwintering, and upon snow ablation, meiosis occurs. As snows continue to melt, vegetative cells will also encyst to protect the organisms and facilitate oversummering. S. nivaloides is globally distributed and has been found on every continent (Brown & Jumpponen, 2019; Hoham & Remias, 2020; Novis, 2002; Procházková et al., 2019; Segawa et al., 2018), but much about the basic biology, metabolism, and reproduction strategies of this genus remains unresolved. Further, despite numerous morphological examinations (Kol, 1968; Leya, 2013; Procházková et al., 2019; Remias, 2012; Remias, Lütz-Meindl, & Lütz, 2005; Weiss, 1983) and molecular characterizations (Brown & Jumpponen, 2019; Brown et al., 2016; Krug et al., 2020; Lutz et al., 2016; Procházková et al., 2019; Segawa et al., 2018), we know very little about the global dispersal capabilities of these algae, which are likely to be an important factor in structuring landscape population assemblies.
Given that Sanguina nivaloides is cosmopolitan and has been found across the globe where perennial snows are present, and S. aurantia has an apparent circumpolar and alpine distribution across the Northern Hemisphere (Procházková et al., 2019), it is surprising that no population structure, even across intercontinental distances, has been observed (Brown & Jumpponen, 2019; Procházková et al., 2019). Recent work has suggested very little sequence variation within targeted loci (Brown & Jumpponen, 2019; Brown et al., 2016; Procházková et al., 2019; Segawa et al., 2018), and no local isolation by distance has been found (Brown et al., 2016). Further, Procházková et al. (2019) conducted the most detailed investigations to date into population and genetic structure using a haplotype approach for the ITS2 region and failed to find distinct population structure across the ranges of these Sanguina species, although they did identify several S. nivaloides haplotypes, suggesting some genetic differences. Two recent publications expand on the apparent lack of population structure of Sanguina nivaloides. Procházková et al. (2019) wrote the following:

Our data showed a cosmopolitan distribution of S. nivaloides in alpine and polar snowfields in both hemispheres, which supports the theory of a trans-equatorial dispersal of microbes (Hodač et al., 2016). No population structure was detected when analysing the ITS2 rDNA data, as there was no phylogeographic signal. Metagenomic analyses have shown red pigmented snow algae to be cosmopolitans based on the analysis of partial sequences of the 18S rRNA gene (Lutz et al., 2016) as well as of the ITS2 rDNA (Segawa et al., 2018).
Further, Brown and Jumpponen (2019) came to a similar conclusion:

These core algae … highlight two main points of discussion: (1) common snow algae are extremely conserved globally with nearly identical ITS2 sequences found across vast distances and across years, and (2) we know very little about the global genetic diversity or dispersal patterns of these snow algae.
Taken together, this suggests that genetic variation within populations may be indistinguishable, or even near-identical, globally, and all accessions in the global sequence repositories seemingly support this assertion (Brown & Jumpponen, 2019; Brown et al., 2016; Lutz et al., 2018). This raises important but currently unanswered questions: how are these algae dispersed, and how do they colonize snows? It seems unlikely that global populations of Sanguina have active gene flow over vast distances or trans-equatorial dispersal capabilities, given that long-distance aerial transport is presumed unlikely outside extreme weather events (Brown & Jumpponen, 2019). This lack of global population structure may be an artifact of sequence representation, as most investigations into Sanguina molecular ecology target the 18S ribosomal RNA gene (SSU) or the Internal Transcribed Spacer region 2 (ITS2) of the rRNA gene operon. The 18S is generally highly conserved, which may preclude fine-scale demarcation of algal taxa (Lutz et al., 2018), but ITS regions are hypervariable and can be readily used to demarcate algal species (An, Friedl, & Hegewald, 1999; Brown et al., 2016). It is surprising, given the hypervariable nature of ITS regions, that we would observe near-identical sequences globally, but similar observations from the fungal literature suggest that, on rare occasions, some species have extreme conservation in ITS sequences across vast geographic separations (Hughes, Morris, & Segovia, 2015; Hughes, Tulloss, & Petersen, 2018). This potential hypersimilarity of rRNA regions may explain these observed patterns in Sanguina, but we do not yet have enough non-rRNA sequence data to determine whether this is an aberration or whether these species are in fact globally hypersimilar. Here, we harvest all publicly available ITS2 sequences for both S. nivaloides and S. aurantia to investigate global population structures using a minimum entropy decomposition approach and examine whether Sanguina spp. do in fact have homogeneous population structure across their ranges.

| MATERIALS AND METHODS
To investigate global population structure of Sanguina, we use a minimum entropy decomposition (MED; Eren et al., 2015) framework to create sensitive unsupervised sequence partitions (MED nodes) based on local base pair entropy. MED iteratively partitions gene marker data into homogeneous nodes (MED nodes) based only on information-rich nucleotide positions, thereby omitting stochastic variation from the obtained nodes. This has been demonstrated to provide sensitive, but informative, separation of closely related sequences and strains. To do so, we analyzed all Sanguina ITS2 sequences that were available and verified at the time of analysis from GenBank, SRA, and supplemental information from associated publications. We chose to analyze the ITS2 region as opposed to 18S or other gene targets because ITS2 has the most available data, and ITS regions have great potential for species-level population analysis for algae. We gathered the following Sanger  (Katoh & Standley, 2013)  and remaining sequences were determined to not belong to Sanguina and were discarded. Discarded sequences were mainly assigned to the Trebouxiophyceae or other non-Sanguina Chlorophyceae, or were poorly matched to any reference taxa. A few errant sequences not belonging to either target Sanguina species may have been included as part of the OTU clustering, but we have no evidence that casts doubt on the veracity of these sequences. These retained OTUs will hereafter be referred to as S. aurantia or S. nivaloides. All associated retained sequences were collected (Table 1; Appendix S1) and coded by location for Sanguina species-specific MED analyses (S. nivaloides and S. aurantia were analyzed separately).
Some locations were binned to increase sequence representation or based on geographic proximity: Colorado and Wyoming sequences were combined as Rocky Mountains; Finland, Sweden, and Norway (including Svalbard) were combined as Fennoscandia (+ Svalbard); and all European samples apart from the Nordic countries were binned as Europe.
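The entropy-based partitioning described above can be illustrated with a minimal sketch. It computes the Shannon entropy of each alignment column and performs a single split on the most information-rich position. This is a simplified illustration, not the MED software of Eren et al. (2015), which iterates the decomposition and applies additional thresholds; the function names and the toy alignment are hypothetical and are not Sanguina data.

```python
from collections import Counter, defaultdict
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the characters in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def split_on_max_entropy(seqs):
    """Perform one decomposition step on equal-length aligned sequences.

    Returns the index of the highest-entropy column, the per-column
    entropies, and the sequences grouped by their base at that column.
    """
    entropies = [column_entropy(col) for col in zip(*seqs)]
    pos = max(range(len(entropies)), key=entropies.__getitem__)
    groups = defaultdict(list)
    for seq in seqs:
        groups[seq[pos]].append(seq)
    return pos, entropies, dict(groups)


# Hypothetical toy alignment (assumption: illustrative only).
toy = ["ACGTA", "ACGTA", "ACCTA", "ACCTA", "ACGTT"]
pos, entropies, groups = split_on_max_entropy(toy)
print(pos, [round(e, 3) for e in entropies])
for base, members in sorted(groups.items()):
    print(base, members)
```

In this toy case, the third column carries the most entropy, so the sequences are split there into two candidate nodes; columns with no variation contribute nothing to the partition.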
All aligned sequences for S. aurantia and S. nivaloides separately underwent minimum entropy decomposition (Eren et al., 2015) to demarcate ecologically relevant operational units (MED nodes) for each species. This yielded 36 MED nodes for S. aurantia and 25 MED nodes for S. nivaloides (Table S1; Appendix A1). MED node distribution networks were visualized using the program Gephi (v.0.9.2; Bastian, Heymann, & Jacomy, 2009), and cluster analysis (as implemented in the program MED), along with associated visualizations, was conducted using Canberra distances on MED Node × Location matrices for S. nivaloides and S. aurantia separately. Canberra distance (Lance & Williams, 1967) maximizes the effect of differences between samples with many low or zero values, which some of our locations have (Table 1), and was calculated using the program MED.
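As a rough illustration of why Canberra distance is sensitive to low and zero counts, the following sketch computes it between rows of a small node-by-location count matrix. The analysis described above used the implementation in the program MED; this standalone function, the location labels, and the counts are assumptions made for illustration only. Terms where both counts are zero are skipped, a common convention.

```python
from itertools import combinations


def canberra(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over paired values.

    Pairs where both values are zero are skipped (treated as contributing 0).
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


# Hypothetical counts of four MED nodes at three locations (assumption).
counts = {
    "Location A": [10, 0, 2, 0],
    "Location B": [8, 1, 0, 0],
    "Location C": [0, 5, 0, 7],
}

pairwise = {
    (p, q): canberra(counts[p], counts[q])
    for p, q in combinations(sorted(counts), 2)
}
for pair, dist in pairwise.items():
    print(pair, round(dist, 4))
```

Each node contributes at most 1 to the total, and a node present at one location but absent at the other contributes exactly 1 regardless of its abundance, so rare nodes weigh as heavily as abundant ones.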

| RESULTS
Our collected sequences (Table 1) (Table 1), so associated inferences about these distributions should be treated with reasonable skepticism. Given the geographic proximity of the Antarctic and Argentinian samples, we think this is likely a true pattern, but more data are needed to confirm it.

| DISCUSSION
Here, we present an in-depth investigation into Sanguina spp. snow algae biogeography. We utilize all public Sanguina ITS2 rDNA

FIGURE 2 Results of clustering analysis using Canberra distances for Sanguina aurantia (top) and S. nivaloides (bottom) depicting regional similarity of snow algae populations. Where population structure is indistinguishable, and of sufficient sample size (Table 2), it is indicated with dashed connecting lines and denoted with "NS."

Here, we confirm previously suspected Northern Hemispheric endemism of S. aurantia, as we found no sequences in any of the combined genetic repositories that belong to S. aurantia south of Colorado (USA) and expand the currently known S. aurantia range to include Fennoscandia, Alaska, and the Cascade Mountains.
Additional sampling will likely extend this range, but it is uncertain if the true range will be extended south of the equator. Further, we detect distinct structure (different MED node distributions) of S. aurantia between all sampling locations (  (Buonomo et al., 2017). However, one of the tenets of metapopulation theory is that subpopulations have a reasonable probability of movement across the metapopulation landscape (Keymer, Marquet, Velasco-Hernández, & Levin, 2000), and it is uncertain if Sanguina are capable of this movement.
Future studies must examine additional loci and have expanded sampling ranges to confirm that these two locations are in fact genetically similar.
In contrast, S. nivaloides appears to have a global and bipolar  Table 2] but this is only based on a single Argentinian sequence, so caution must be exercised when making inferences about these populations). This suggests long-distance isolation-by-distance patterns in these snow algae (similar to Schmidt et al., 2011), but such isolation by distance is not seen on local or regional scales (Brown et al., 2016). This is in direct opposition to previous studies that suggest a lack of global population structure of this snow alga (Brown & Jumpponen, 2019; Procházková et al., 2019; Segawa et al., 2018).
Again, as with S. aurantia, this is likely due to the limited resolution afforded by traditional distance-based OTU clustering methods to distinguish between populations, a limitation that an entropy-based analysis appears not to suffer from, or due to analyses based on limited sequence representation (Procházková et al., 2019). Further, while several other comparisons were indistinguishable in our post hoc analysis (Table 2), Antarctica and Argentina consisted of so few sequences (two and one, respectively) that these population similarities should be treated with caution; in the interest of transparency, these data are retained here. The question remains: why are populations similar between the Cascade and Rocky Mountains (~1,500 km distant) but dissimilar elsewhere? These two mountain ranges are not continuous but do have several substantial ranges between them, including the Teton, Sawtooth, and Wallowa Mountains, suggesting the potential for metapopulation maintenance via the stepping-stone hypothesis (Yang et al., 2016). Alternatively, the similarity of these populations could result from similar physicochemical conditions found in these two ranges. More research is needed to disentangle snow physical and chemical properties between these two sites. It remains unresolved why these two Sanguina species have contrasting metapopulation dynamics; S. aurantia populations were genetically distinct between the Cascade and Rocky Mountains, whereas S. nivaloides populations were not.

TABLE 2 Results of k-Sample-based Anderson-Darling (AD) tests of Sanguina species MED node distributions with post hoc comparisons between groups
Here, we used ITS2 rDNA locus-targeted sequencing data to examine global population structure of Sanguina spp., which had previously been assumed to be largely nonexistent. It should be noted that the ITS2 is not a common gene target for phycological examinations but has been demonstrated to be invaluable in delineating species descriptions (Procházková et al., 2019) and for snow algae community ecological examinations (Brown & Jumpponen, 2019; Brown et al., 2015, 2016; Segawa et al., 2018).
However, this seems unlikely for long-distance transport, as insects generally do not move on the scale of intercontinental transport and most long-distance migratory birds exhibit atrophy of the digestive system during migrations, which limits aerial waste release (McWilliams & Karasov, 2005). Additionally, snow algae could rely on aerial transport. However, long-distance transport may be unlikely outside of extreme weather events (Brown & Jumpponen, 2019) due to the relatively large size of their hypnozygotes (ca. 20 μm in diameter); propagating units of this size are only modeled to be capable of atmospheric transport for around 12 hr (Wilkinson, Koumoutsaris, Mitchell, & Bey, 2012), and the limited published aerial sampling for snow algae has revealed no appreciable algae cysts (Novis, 2001).
However, there is evidence of mid-range aerial dispersal of Antarctic algae (Marshall & Chalmers, 1997). Sanguina hypnozygote morphology may provide the answer; when hypnozygotes are slightly desiccated, Sanguina may have raised veined ridges (see figure 5c in Procházková et al., 2019), which may assist aerial transport.
This is reminiscent of echinolophate pollen morphology in some Compositae plant species that is hypothesized to aid in long-distance transport (Bolick, 1978; Keeley & Jones, 1979). Of course, any apparent morphological similarity may be inconsequential, but together, this suggests that aerial dispersal may be a viable mode of organismal transport, though likely not on an intercontinental scale, and additional work is needed to confirm this capability.
Populations might also be the result of legacy effects from previous global snow and glacier algal colonization. Snow communities, and algae in particular, may have played a large historic role during the Cryogenian (720-635 MYA), a period marked by extreme cold and near-global glaciation. Geologic records and modeling to this end suggest cold-tolerant algae were the dominant primary producers during the Cryogenian, sequestering massive stores of organic carbon that were subsequently released when the climate warmed (Hoffman, 2016). Extant populations may be remnants of this historic radiation, but dated phylogeographic analyses to confirm this are not feasible with current data. Based on models of snow cover during that period (Hoffman et al., 2017), snow algae potentially covered the majority of Earth's habitable surface area and may have influenced snow melt rates and movements of organic carbon pools (Ganey, Loso, Burgess, & Dial, 2017; Hood, Battin, Fellman, O'Neel, & Spencer, 2015). However, it remains unresolved if contemporary snow algae are as influential to global or local nutrient cycling dynamics (but see Hamilton & Havig, 2020).
Here, we suggest that snow algae within Sanguina do not have homogeneous population structure across their respective ranges as has previously been suggested. The discrepancy between our results and those previously reported is likely due to the hypersimilarity of the ITS2 region of Sanguina species that traditional distance-based OTU clustering analyses are unable to resolve.
Instead, we see that S. aurantia has a circumpolar and temperate alpine distribution in the Northern Hemisphere with largely distinct population structures and S. nivaloides exhibits bipolar and alpine distributions that are broadly genetically distinct. This represents a novel understanding of Sanguina distributions and highlights that there is a dearth of information about these snow algae,