Diatom (Bacillariophyceae) assemblages in tidal environments of Vancouver Island, British Columbia, Canada

To understand distributions of coastal diatoms along Vancouver Island, British Columbia, Canada, this paper describes diatom assemblages observed in 47 surface sediment samples from intertidal environments. One hundred and eighty‐four diatom taxa were identified from five transects crossing tidal flats, salt marshes, and freshwater forests in Tofino, Ucluelet, and Port Alberni. Distributions of the diatom assemblages were consistent with those reported elsewhere in the Pacific Northwest, but a few diatom taxa show different trends in their distributions. For example, one benthic species Denticula subtilis shows widespread distributions along the transect in Tofino. An ordination shown by Detrended Correspondence Analysis (DCA) using a combined dataset indicated overlapped scatter plots of diatom assemblages, suggesting that assemblages with similar species compositions are observed in more than one location. Hierarchical and k‐means clustering analyses using Euclidean distance recognized unique small groups along each transect. Rank abundance curves show different trends for richness and evenness of diatom assemblages among the five transects.


INTRODUCTION
In coastal areas, different diatom species are known to have different tolerance and optima to certain environmental variables, such as salinity, substratum, and tidal exposure time (e.g. Sullivan, 1975Sullivan, , 1978Hemphill-Haley, 1993, 1995Sawai et al. 2016). To understand the relations of coastal diatom assemblages with environmental variables, distributions of coastal diatoms and their relationship with their proximate environmental variables have been reported worldwide (e.g. Aleem 1950;Patrick & Reimer, 1966, 1975McIntire & Overton 1971;McIntire, 1973McIntire, , 1978Riznyk 1973;Main & McIntire 1974;Sullivan, 1975Sullivan, , 1978Moore & McIntire 1977;Admiraal & Peletier 1979, 1980Admiraal et al. 1984;Hemphill-Haley, 1993, 1995. Some of the studies of coastal diatoms have used multivariate analysis to evaluate relationships among the assemblages objectively (Sullivan, 1975(Sullivan, , 1982Sherrod 1999;Zong & Horton 1999;Busse & Snoeijs 2003;Haubois et al. 2005;Horton et al. 2006;Ulanova & Snoeijs 2006;Roe et al., 2009;Rovira et al. 2012;Sawai et al., 2016Sawai et al., , 2017Hocking et al. 2017;Hong et al. 2021). However, most papers provide a similar distribution of the same diatom species from previous studies (Cunningham & McMinn 2004) mainly because they only investigated a narrow spatial environment (e.g. within a single site, or similar type of depositional environment). To address this problem, the number of studies on diatom assemblages from various types of environments should be increased so that the complete range of ecological tolerance of diatom assemblages can be better understood (Sawai et al., 2016;Watcham et al. 2013).
Paleoecologists have previously used coastal diatoms as environmental indicators (e.g. Friele & Hutchinson 1993;Hutchinson et al., 1997Hutchinson et al., , 2004Clague et al. 1999;Patterson et al. 2005;Letham et al. 2016;Fedje et al. 2018), but only a limited number of papers on modern coastal diatoms have been published for western Canada (Roe et al. 2009). In this paper, we provide new data on modern diatom assemblages from coastal environments along Vancouver Island, Canada. The dataset includes 47 control samples from five transects across a freshwater forest and swamp, salt marsh, and tidal flat. The results of this study will help to reinforce understanding regional distributions and ecology of coastal diatoms in Vancouver Island, Canada and elsewhere. salinity, and vegetation of vascular plants at the study sites are described herein.
The West Coast of Vancouver Island is primarily characterized by rocky coasts with numerous bays and narrow inlets. Salt marshes are often restricted to the fjord-head areas. According to astronomical tidal prediction (see Materials and Methods), the maximum tidal range in these areas is about 4.0 m. In Tofino, the measured salinity in tidal flat and salt marshes ranges between 25 and 30. Whereas in Ucluelet and Port Alberni, salinity is measurable only in low elevation locations (tidal flat and low marsh) and was approximately 30 and 10, respectively. The freshwater forests are characterized mainly by coniferous trees (e.g. Western Red cedar, Thuja plicata and Sitka Spruce, Picea sitchensis). Dominant vascular plants in the salt marshes were Carex spp., Triglochin maritimum Linnaeus, Salicornia virginica, Cyperaceae sp., and Argentina sp. Tidal flats in Tofino were dominated by Zostera spp., but there was no coverage of vegetation on tidal flats in other areas. Five transects were selected near Tofino, Ucluelet, and Port Alberni to represent various types of environments from a freshwater forest, salt marshes (high, highmiddle, middle, low-middle, and low marshes), and tidal flats.

Sediment samples
Surface sediments (less than 2-3 mm thick) were collected from 47 locations on Vancouver Island in August 2015 (transects A, B, D, and E) and June 2016 (transect C) ( Fig. 1). At each study site, the vegetation zones of sampling locations were divided based on differences in composition of dominant taxa of vascular plants. For example, the low marsh is characterized mainly by Carex lyngbyei Hornemann and Salicornia virginica, Triglochin maritimum, and the middle marsh is dominated by Triglochin maritimum. In cases where mixed assemblages of dominant taxa of vascular plants were recognized, we described them as highmiddle and low-middle marshes. The sampling locations were positioned along leveled transects. In all cases, samples were collected using a small spatula during low tide. The samples were brought back to the laboratory for immediate processing.
Elevations of the sampling locations were surveyed using a total station. For each sampling location, the highest tides of the day were measured by leveling the sea level every 6 min near each transect. The highest tides and elevations of the sampling locations were then calibrated to the elevation relative to the mean tidal level based on the nearest tide gauge data of Canadian Hydrographic Service and astronomical tide predictions.
The salinity of the sampling locations was measured using a water quality meter (U-52; HORIBA, Ltd) at the same time as sampling surface sediments. The mud content of the sediment samples was calculated by difference in weight before and after gentle washing through sieves in the laboratory during grain-size analysis.

Converting from elevation to duration of tidal inundation
To facilitate comparison among study sites with a different tidal range, a number of studies since Horton et al. (2006) calculated standardized water-level index (SWLI) to normalize absolute elevation by applying the trait of the tidal curve with respect to a tidal datum (mean higher high water, MHHW; highest astronomical tide, HAT, etc.). However, SWLI calculations result in a poor vertical alignment at higher elevations because of non-linear relationships between elevation and tidal inundation (Wright et al., 2011;Kemp & Telford, 2015) (Fig. 2).
This study converted elevations into a tidal exposure index (TEI) following Sawai et al. (2016). Duration time of tidal inundation is a direct variable for species distributions and enables one to compare samples in different sites with different tidal ranges, without any standardization. Tidal levels were calculated every 6 min over a year based on astronomical tidal predictions using a computer program WXTide32 release 4.7 (Hopper, 2006). The tidal exposure index (TEI) was then represented by percentage of duration time in 1 year.

Data analysis
Cluster analyses have been applied for grouping of microfauna and algae (e.g. Servant-Vildary & Roux 1990;Kemp et al. 2012;Dalu et al. 2016;Sawai et al. 2016;Pilarczyk et al. 2020). Hierarchical clustering, one of the most common methods of cluster analyses, enable researchers to make dendrograms without a fixed number of clusters. In the dendrogram, the nodes represent clusters, and the lengths of the stems (e.g. distance, similarity index) addresses the similarity that clusters are joined. If a horizontal line were drawn across a given level of the distance, the group (cluster) would be given (e.g. Sneath & Sokal 1962;Everitt et al. 2011), although giving the level of the distance in the dendrogram needs to be diagnosed by researchers (e.g. number of the clusters). Another clustering model, k-means clustering (centroid-based clustering), segregates samples more objectively than the hierarchical approach. However, it requires users to decide the specific number of clusters to partition the assemblages before running the cluster analysis (e.g. Kaufman & Rousseeuw, 1990;Everitt et al. 2011). Although the number of clusters can be determined by maximum average silhouette widths (Kaufman & Rousseeuw, 1990), it is sometimes difficult to resolve this because the silhouette method does not always give a clear optimal number of clusters. In this study, these two cluster analyses were used for robust identification of groups of diatom assemblages.
Both hierarchical and k-means analyses were carried out using Euclidean distance among the assemblages along each transect. Agglomerative hierarchical cluster analysis employed Ward's Minimum. K-means clustering produces silhouette score plots with widths ranging from À1 to +1 © 2021 The Authors. Phycological Research published by John Wiley & Sons Australia, Ltd on behalf of Japanese Society of Phycology. (Rousseeuw 1987;Kaufman & Rousseeuw, 1990). Silhouettes are used for evaluating a sample's classification; values close to +1 indicate that the sample is within the appropriate cluster, whereas values close to À1 suggest an incorrect cluster (Rousseeuw 1987). The maximum average silhouette widths were used to determine how many clusters should be recognized for k-means clustering. These analyses were executed using the NbClust package in R version 3.6.3 (Charrad et al., 2014). In the cluster analyses, rare species (those with a maximum abundance of <3% in any samples) were excluded from the dataset of individual sites. Detrended Correspondence Analysis (DCA) was used to understand the relationships among the diatom assemblages from each transect. DCA, an ordination technique, represents assemblage samples as points in multi-dimensional space. In the ordination diagram, similar assemblages are located close together and dissimilar assemblages farther apart. The DCAs were carried out for diatom assemblages both from local and all samples. The combined data of diatom assemblages have a gradient length of 5.29. This indicates that the linear methods (Principal Component Analysis; PCA, Redundancy Analysis; RDA) are not appropriate for the dataset because the data are too heterogeneous and there are too many species for linear response models (ter Braak & Šmilauer 2012). Also, the gradient lengths of local datasets are approximately over 3.0 (transect A: 2.94, transect B: 5.23, transect C: 3.74, transect D: 3.01, transect E: 3.27). Therefore, the DCA as a unimodal model of the diatom abundance data is reasonable for ordination of the assemblages. DCA was executed using CANOCO release 5.0 (ter Braak & Šmilauer 2012). As with the two cluster analyses, rare species (<3% in any samples) were removed from the dataset for DCA.
The rank of dominant species versus their relative abundance curve (rank abundance curve, abundance curve, or Whittaker plot) is used to understand the richness and evenness of diatom assemblages identified in this study. The rank abundance curve is drawn as a simple two-dimensional plot with relative abundance (%) on the Y axis and the abundance rank (the first dominant species is given rank 1, the second dominant species is given rank 2, etc.) on the X axis. The steeper gradient in the curve shows lower evenness and richness; higher rank taxa have higher abundance than lower rank taxa at the location. The plateau curve represents high richness and evenness with similar abundances among different taxa (Whittaker 1965).

Diatom assemblages
One hundred and eighty-four diatom taxa were identified from Vancouver Island (Table S1). Among them, 82 taxa had abundances greater than 3% in one or more samples, and 34 appear at a maximum abundance greater than 10%. Figures 3-7 display how two cluster analyses were grouped and relationships among samples along each transects, and Figures 8 and 9 shows result of DCA using local and combined dataset, respectively. Detailed explanation of the results of cluster analyses and local DCA are described below;

Transect A in Tofino
Sixty-one diatom taxa were identified from nine samples along transect A across high marsh, middle marsh, low marsh, and the tidal flat (Fig. 3a,b). One benthic species, Luticola mutica

Transect B in Tofino
This transect crosses freshwater forest, high-middle marsh, low marsh, and a tidal flat. Here 97 diatom taxa were identified from 12 samples (Fig. 4a,b). Cosmioneis pusilla, Nitzschia palustris Hustedt, and Pinnularia lagerstedtii dominate in the freshwater forest. These species also show high abundance in high-middle marsh locations, but decrease in abundance in low marsh and tidal flat locations. Four diatom species Luticola mutica, Navicula cryptotenella Lange-Bertalot, Gyrosigma eximium, and Fallacia pygmaea (Kützing) Stickle & D.G.Mann dominate in low marsh locations. Tryblionella debilis Arnott ex O'Meara and Denticula subtilis are dominant in the transition zone between high-middle and low marshes. Navicula salinicola Hustedt, N. salinarum Grunow, and N. gregaria are widespread in both the low marsh and tidal flat. Some of the diatom species show samplespecific distributions. For example, the benthic Nitzschia palustris is dominant in assemblages of location TB-11 and TB-12, but decreases in other samples (Fig. 4a). Hierarchical cluster analysis reflected these sample-specific assemblages and indicated small clusters that consist of two samples (samples TB-11 and 12, TB-1 and 2, TB-3 and 4, TB-6 and 7, TB-5 and 8, TB-9 and 10) (Fig. 4e). In contrast, k-means clustering proposed two clusters which is well consistent with vascular plant zonation (Fig. 4c,d). DCA ordination placed the assemblages from low (left) to high (right) elevation samples along axis one (Fig. 8), and this was consistent with the two cluster analyses.

Transect C in Tofino
Transect C covers high marsh, high-middle marsh, middle marsh, and tidal flat. On this short transect near transect B, 92 diatom taxa were identified from 10 samples. Changes in composition of diatom assemblages on transect C were clearer than those in transects A and B (Fig. 5a,b). Three species Tryblionella debilis, Luticola mutica, and Denticula subtilis dominated high and high-middle marshes, whereas brackish species Nitzschia scalpelliformis Grunow, Navicula cryptotenella, N. gregaria, Gyrosigma eximium, and Fallacia pygmaea were common in the middle marsh. Planothidium delicatulum was abundant in the tidal flat. Navicula tenelloides, N. salinicola, N. salinarum were widespread throughout the transect. K-means and hierarchical cluster analyses provided similar results. These two tests suggest five clusters of the assemblages (Fig. 5c-e). Samples from this transect were characterized by a high abundance of a few species (Denticula subtilis, Tryblionella debilis, Navicula cryptotenella, and Planothidium delicatulum) (Fig. 5a,d,e). Such distributions of diatom assemblages are probably a cause for producing many small clusters. In DCA ordination, samples generally occupy the ordination from low (right) to high (left) along axis one, though TC-8 is shown as an outlier sample on the ordination (Fig. 8). Samples from high and high-middle marshes and from middle marsh were placed on the lower left and lower right, respectively. This ordination shows results more similar to the hierarchical cluster analysis than k-means clustering.

Transect D in Ucluelet
Sixty-two diatom taxa were identified from nine samples along a transect across high-middle, middle, and low marshes (Fig. 6a,b). Two benthic diatoms, Nitzschia scalpelliformis and N. palustris, dominated at the highest location on this transect. The two species decreased in abundance in sampling locations from lower elevations. The dominant species was instead replaced with Denticula subtilis, Luticola mutica, Navicula salinarum, N. gregaria, and Melosira nummuloides C. Agardh. Two cluster analyses provided consistent results (Fig. 6c-e). In the results of both k-means and hierarchical clustering, low and middle marsh locations formed the largest cluster (samples UC-1-5; cluster UC-4 in Fig. 6d), whereas high-middle marsh locations characterized cluster UC-b (samples UC-6-8 in Fig. 6d). The highest cluster UC-c reflected peaks of two species, Nitzschia scalpelliformis and N. palustris. DCA ordination well represented vertical relationships of samples along the transect. The assemblages were placed from low samples (left) to high (right) along axis one (Fig. 8). The DCA ordination represents similar results both to k-means and hierarchical clustering, but more similar to k-means clustering than hierarchical cluster analysis.

Transect E in Port Alberni
At the fjord-head site of Alberni Inlet, 67 diatom taxa were identified from seven samples along the transect. Transect E covers high marsh, low-middle marsh, and a tidal flat (Fig. 7a,b). The high marsh locations are characterized by Nitzschia sigma (Kützing) W. Smith, Navicula tenelloides, N. gregaria, and Gyrosigma eximium, whereas low-middle marsh and tidal flat samples are characterized by Planothidium delicatulum, Paralia sulcata (Ehrenberg) Cleve, Biremis lucens (Hustedt) Sabbe, Witkowski & Vyverman, and Amphora salina W. Smith. Two clusters were proposed by k-means and hierarchical clustering. Samples from the low-middle marsh and tidal flat contained one cluster (samples PA-1-4; cluster PA-b in Fig. 7d), whereas the other three samples characterized another cluster (samples PA-5-7; cluster PA-a in Fig. 7d). DCA ordination reflects the results of the two clustering. The assemblages were placed from low samples (left) to high (middle to right) along axis one (Fig. 8).
Result of DCA for all diatom assemblages Figure 9 shows the results of DCA using a combined dataset to show a relationship of compositions in diatom assemblages among our study sites. Forest locations (low salinity, short duration of tidal inundation) from Tofino and high-middle marsh from Ucluelet lie to the left along axis one, whereas tidal flat and low-middle marsh (high salinity, long duration of tidal inundation) from Tofino and Port Alberni are to the right. In the ordination diagram, samples from multiple sites on Vancouver Island are widespread along axis one but also overlap each other. The overlapped scatter plots suggest that similar assemblages are observed in more than one location. However, three samples (TA-1, TB-12, and TC-8) are outliers of the other plots on the ordination diagram, which implies that species compositions of diatom assemblages of the three samples differ markedly. Sample TB-12 is the only sample from a freshwater forest, hence it locates as an outlier position in samples from intertidal environments. The other two samples (TA-1 and TC-8) are probably related to high abundance of a few species (Melosira moniliformis and Nitzschia Hassall sp.-15 in TA-1, Mastogloia exigua F.W. Lewis in TC-8).
Diversity of diatom assemblages Figure 10 displays cross-plots of ranked dominant species (top 20 species) and their relative abundance (%) as rank abundance curves of each samples. The rank abundance curves show different trends according to transects. In transects A, B, and D, the rank abundance curves have different shapes within each transect. For example, in transect B, marshy samples show higher richness and evenness with gentler curves than tidal flat samples. In contrast, tidal flat samples indicate higher richness and evenness than marshy samples in transect D (Fig. 10). In transect C, all curves represent steep curves as low richness and evenness. The curves in transect E overall show high richness and evenness relative to the other transects.
Difference in diversity of diatom assemblages along the transects is affected by numerous factors. As mentioned in the previous paragraph, rank abundance curves of transects B and C show different shapes, although the two transects are very close (Fig. 1). This difference may simply represent differences in compositions of microflora according to microtopography and patchy vegetation cover. Or, because the samples were taken in different years, the difference may mean seasonal or/and annual changes in abundance of the 1st to 3rd dominant taxa. Furthermore, the differences in rank abundance curves among the transects A-C (Tofino), D (Ucluelet), and E (Port Alberni) probably show site-specific distributions of diatom assemblages. At this moment, this study has insufficient data to formally address these issues.

Response to environmental variables
Scatter plots display relationships between the relative abundances of 12 diatom taxa and certain environmental variables (salinity and tidal exposure index; TEI) (Fig. 11). The taxa shown in the scatter plots had abundances greater than 5% in five or more samples. Paralia sulcata was also displayed to compare with their distributions of Washington and Oregon states, although their abundances of greater than 5% appear only in three samples. In the plots, Navicula salinarum show a unimodal distribution along TEI, whereas Navicula gregaria and Luticola mutica distribute as a bimodal and exponential response, respectively. Three species, Denticula subtilis, Navicula tenelloides, and Planothidium delicatulum show widespread distribution along TEI. All scatter plots except for Denticula subtilis are consistent with those previously reported on the coast of USA (e.g. Sullivan 1975Sullivan , 1978Whiting & McIntire 1985;Nelson & Kashima 1993;Hemphill-Haley 1995;Sherrod 1999;Sawai et al. 2016), Canada (Roe et al. 2009), and other regions (Zong & Horton 1999). For example, three common salt marsh species, Cosmioneis pusilla, Luticola mutica, and Tryblionella debilis, appear on high marsh environments both in this study and other reports from Washington and Oregon states (Hemphill-Haley 1995;Atwater & Hemphill-Haley 1997;Sawai et al. 2016). Another diatom species common in brackish environments, Navicula gregaria (e.g. Puget Sound; Sherrod 1999, Oregon coast;Sawai et al. 2016, Baltic Sea;Snoeijs & Vilbaste 1994), is also widespread in intertidal environments in other regions. In contrast, there is a significant difference in the distribution of diatom species compared with previous studies in Washington and Oregon coasts. One brackish diatom, Denticula subtilis, is known for its narrow vertical zonation in tidal environments (e.g. Atwater & Hemphill-Haley 1997;Sawai et al. 2016). However, the distribution of this species is widespread along TEI (ranging from about 0 to 40% in Fig. 11), appearing along the entire transect A (Fig. 3). The high abundance of Denticula subtilis on transect A may be explained simply by frustules/ valves transported by daily tides like other taxa discussed in the other regions (e.g. Nelson & Kashima 1993;Sawai 2001). Showing local ecological tolerance and optima of the species is not possible because this study does not have sufficient data to address this possibility. Additional sampling and analysis of the ecological data for species is required. Paralia sulcata, a planktonic or tychoplanktonic species, is well known as its widespread distributions in samples from salt marsh surface sediments in Washington and Oregon, USA (e.g. Nelson & Kashima 1993; Hemphill-Haley 1995; Sawai et al. 2016) and Hokkaido, Japan (Sawai et al. 2001). In these regions, straight celled chains of Paralia are considered to float easily in daily tides and are distributed across all intertidal environments (Sawai et al. 2001). However, the appearance of Paralia is very limited in these study sites. Paralia specimens are distributed only on tidal flat and low-middle marsh locations in Tofino and Port Alberni, respectively (Figs 4 and 7). Their limited distributions suggest that daily tidal ebbs and flows have little effect on distributions of Paralia in our study sites and their distributions may simply represent a difference in their local ecological optima compared to other regions.
Future systematic sampling, measuring environmental variables, and counting diatoms will be needed to reveal species response to environmental variables along the Pacific coast of British Columbia, Canada. Statistical analyses have been carried out in other coastal areas of Canada and USA by others (e.g. Sullivan, 1975Sullivan, , 1982Wilderman 1987;Sherrod et al. 1999;Roe et al. 2009;Sawai et al. 2016). Roe et al. (2009) applied Canonical Corresponding Analysis (CCA) to study relationships between diatom assemblages and environmental variables in salt marshes of northern Vancouver Island, and showed that diatom species distributions are related to elevation and grain size/organic content. Sherrod et al. (1999) used CCA and concluded that diatom distributions in Puget Sound, Washington, were controlled primarily by salinity and secondarily by elevations. Sawai et al. (2016) also used CCA and showed that grain size distributions are one of the most important variables to control diatom distributions in salt marshes of Washington and Oregon. Hong et al. (2021) employed RDA for diatom assemblages in Willapa Bay, Washington, and concluded that elevation, salinity, and substrata affect distributions of diatom assemblages. These reports noted that the majority of the total variation in diatom assemblages was unexplained by measured environmental variables (e.g. more than about 50% in Roe et al. 2009, and over 89% in Sawai et al. 2016). Their unexplained variance is regarded to be to the result of overlooked variables, or to stochastic variation (Sawai et al. 2016). Sawai et al. (2016) also noted that other environmental variables, such as light intensity (Admiraal & Peletier 1980), nutrient concentrations (Underwood et al. 1998), adhesion capacity and sediment sorting (Jonge 1985), as well as grazing by insects and fish (Hill & Knight, 1987Abe et al., 2001Abe et al., , 2006Abe et al., , 2009Pillet et al. 2011;Pillet & Pawlowski 2013), may affect distributions of coastal diatoms. This study is unable to proceed along a similar discussion using our dataset because some environmental variables were not measurable due to surface conditions (e.g. too dry to measure water conditions; Figs 3b, 4b, 5b, 6b and 7b).

CONCLUSIONS
To reveal distributions of coastal diatom assemblages, samples from five transects across tidal environments were investigated along Vancouver Island, British Columbia, Canada. A majority of diatom assemblages in this study are consistent with those previously reported along other coasts. However, there are significant differences in the distributions of diatom species. A benthic diatom Denticula subtilis, shows widespread distributions compared with previous studies. Also, Paralia sulcata, of which widespread distributions have been reported in other regions, are only distributed on tidal flat and low-middle marsh locations in Tofino and Port Alberni, respectively. Rank abundance curves, which represent richness and evenness of diatom assemblages, show different trends among studied transects. The difference of the rank abundance curves may represent local blooming of dominant species. Further study requires a larger number of samples from a greater range of modern tidal environments including collection during different seasons, to fully capture the ecological features (structures and diversities) of coastal diatoms in Vancouver Island.

ACKNOWLEDGMENTS
Funding was generously provided by Geological Survey of Japan, AIST and the Geological Survey of Canada. Generic Mapping Tools (GMT, Wessel et al. 2013) was used to draw index maps in Fig. 1a. Yuichi Namegaya discussed in calculating tidal predictions. The manuscript was improved by comments from two anonymous reviewers.