Hit or miss? Impact of time series resolution on resolving phytoplankton dynamics at hourly, weekly, and satellite remote sensing frequencies

Characterizing marine phytoplankton community variability is crucial to designing sampling strategies and interpreting time series. Satellite remote sensing, microscopy sampling, and flow through imaging systems have widely different resolutions: from weekly or monthly with microscopy sampling to daily when no cloud cover or glint is present with polar‐orbiting satellites, and hourly for autonomous imaging instruments. To improve our understanding of data robustness against sampling resolution at different taxonomic levels, we analyze 2 yr of data from an Imaging FlowCytobot with hourly resolution and resample it to daily, satellite‐temporal, and weekly microscopy sampling resolution. We show that weekly and satellite‐temporal resolutions are sufficient to resolve general community composition but that the randomness of satellite‐temporal resolution can result in overrepresenting or underrepresenting certain categories. While the yearly phytoplankton biomass bloom is detected in late winter by all four resolutions, category‐specific yearly blooms are generally consistent in timing but often underestimated or missed by the weekly and satellite‐temporal resolutions, introducing a bias in year‐to‐year comparisons. A minimum of biweekly sampling, particularly during known bloom periods, would lower the bias in such categories. Similarly, sampling time should be considered as daily variations are category‐specific. Overall, morning and low tide sampling tended to have higher biomass. We provide tables for categories detected by the IFCB in Narragansett Bay with their major bloom characteristics and recorded daily variability to inform future sampling designs. These results provide tools to interpret past and future time series, including possible detection of specific taxonomic groups with targeted satellite algorithms.

Who is here? Phytoplankton are microscopic organisms generally too small to be detected by the human eye, yet their environmental importance has prompted a wide variety of sampling techniques and the establishment of time series with different resolutions over time. These methods include discrete light microscopy, satellite remote sensing, and particle imaging, which complement one another through their own strengths and limitations, and whose resolutions may be detailed enough for some species but not for others.
Historically, one of the most common approaches was discrete weekly or monthly sampling at ocean time-series sites (e.g., Hawaiian Ocean Time Series, Bermuda Atlantic Time Series, Narragansett Bay Plankton Survey [NBPTS]). For example, the Hawaiian Ocean Time Series has highlighted primary production changes coupled with climate oscillations associated with shifts in plankton assemblage composition (Corno et al. 2007; Karl et al. 2021). Microscopic counts have also allowed the identification of emblematic taxa for certain regions and the monitoring of toxins and toxin-producing phytoplankton (e.g., Belin et al. 2021). By increasing the temporal sampling resolution, particle imaging instruments (e.g., Imaging FlowCytobot [IFCB], Cytobuoy; Lombard et al. 2019) deployed over the last couple of decades have served as early warning systems in the detection of harmful algal blooms (HABs), helped in understanding their environmental forcings (Kraft et al. 2021; Kenitz et al. 2023; Carney et al. in press), and supported forecasting them (Agarwal et al. 2023). They have recorded multiyear blooms related to climatic variables (Campbell et al. 2017) and novel interactions (e.g., temperature-dependent Guinardia delicatula and parasite infection; Peacock et al. 2014; Catlett et al. 2023). On larger time and spatial scales, ocean color satellite remote sensing time series have been used to characterize changes in phytoplankton bloom timing and size (Friedland et al. 2018), and new algorithms can go beyond bulk biomass and retrieve a small number of phytoplankton types and size classes (Mouw et al. 2017), with increasing possibilities to detect more precise groups with the expansion of hyperspectral remote sensing (Wolanin et al. 2016; Vandermeulen et al. 2017) and the launch of the new NASA PACE mission (Werdell et al. 2019).
While a coarse resolution might be sufficient to describe large-scale processes, dynamic coastal regions require higher-resolution sampling (Mouw et al. 2015). Daily sampling, for instance, allows the retrieval of species succession in complex spring blooms (He et al. 2022). Zhang et al. (2022) also used simulated time series of chlorophyll from MODIS with 2- to 30-d resolution to show that responses of the phytoplankton blooms to climatic factors in inland waters varied based on resolution. This is also the case in coastal regions where the impact of tides, resuspension, and river plume dispersion makes satellite hourly observations suitable and desirable for monitoring surface processes associated with biological activity (Mouw et al. 2015; Arnone et al. 2017). In a coastal system, chlorophyll can vary on an hourly timescale with tidal cycles, both during the day-night and spring-neap tidal cycles (Blauw et al. 2012). The wide range of available temporal scales opens the possibility of adapting sampling strategies more closely to research questions, targeted species, funding, or available personnel.
In situ autonomous sampling instruments require careful maintenance but provide very high data resolution (hourly), making them suitable to capture dynamic subdaily processes. Samples are collected as images, providing a permanent record of raw data and allowing the possibility to go back, if needed, to improve analysis and minimize phytoplankton counting errors and taxonomic misidentification. However, the sample volume is limited, and long chain-forming species can either be excluded or broken up by built-in prefilters installed to prevent clogging, generally around a 150 μm size. The large volume of collected images also requires automated image analysis through machine learning techniques (Orenstein et al. 2022). These can have a high computational load (e.g., neural network/deep learning) or need an extensive library of preidentified species to ensure accuracy (e.g., random forest). Species that are rare in the dataset can be missed or misidentified. Light microscopy, although the historical method, also presents some limitations as it is time-consuming, often uses fixatives, and provides a lower time resolution (weekly or monthly). Human factors, such as personnel changes, fatigue, inexperience, and expert inconsistency, can also affect the results (Culverhouse et al. 2003, 2014). This can reduce the number of species considered to ensure consistency across the dataset (Peperzak 2010). However, light microscopy provides us with intricate morphological details from different angles and is a trade-off between temporal resolution, taxonomical resolution, and available funding. Finally, a third widely used source of data is satellite imagery. Being freely available, satellite data covers the largest spatial scale and is most accessible, but to the detriment of taxonomic details. Multispectral satellite remote sensing has focused on global chlorophyll a and a few phytoplankton types (Mouw et al. 2017). Algorithms and models can also be specifically tuned to detect particular taxa that are known to occur in a local area and have distinguishable optical properties, such as Trichodesmium sp. (McKinna 2015) or Karenia brevis (Soto et al. 2015). Determining specific phytoplankton species from hyperspectral optical data is complex, but some studies have shown that hyperspectral optics can allow retrieval of 20 different phytoplankton species (Zhu et al. 2019), and by providing global hyperspectral data, the NASA PACE mission will enhance the possibility to monitor phytoplankton diversity from space (Cetinić et al. 2024). Data acquisition is also tightly linked to the satellite's orbit timing and the absence of cloud cover and glint, providing a less regular sampling than light microscopy or imaging. Polar-orbiting satellites generally provide data every 2-3 d in the absence of clouds, and products are often aggregated on a weekly or monthly scale. Coastal products also tend to have a sparser temporal resolution due to the additional optical complexity of the in-water constituents, land adjacency effects, the potential for more complex atmospheric correction needs, and the possibility of uncertainty from bottom reflectance.
It is thus important to understand how the resolution of these different sampling strategies impacts the retrieval of phytoplankton dynamics at different taxonomic precisions to interpret past and future time series and design well-adapted sampling protocols. Here, we resample a 2-yr hourly time series from an IFCB deployed in Narragansett Bay, Rhode Island (USA) to a daily, satellite, and weekly microscopy resolution. By comparing the community composition from broad groups to categories, as well as the characteristics of the main phytoplankton bloom and the category-specific blooms across temporal resolutions, we aim to highlight which sampling patterns require a careful interpretation and which research questions might gain from a higher temporal sampling resolution, throughout or at specific times of the year, or when a lower temporal resolution may be sufficient.

Data acquisition
The IFCB is a phytoplankton imager that uses the principle of flow cytometry to orient cells in a stream and image them one by one (Olson and Sosik 2007). At the University of Rhode Island, Graduate School of Oceanography (GSO) pier (2 m depth, 41.6220, −71.3527), the IFCB is located inside a pumphouse where seawater is continuously provided at a rate of 10 L min⁻¹ with a diaphragm pump. A 400 μm cartridge filter was installed in-line to remove any large particulates such as macro-algae. The IFCB intake is also fitted with a 150 μm filter to prevent clogging of the internal fluidics of the instrument. Approximately every 20 min, the IFCB takes a ~5 mL sample from the water flow, and imaging is triggered by chlorophyll fluorescence for particles in a size range of ~5-150 μm. The volume analyzed is recorded by the instrument and varies depending on the volume of particles. It is used to calculate the biovolume concentration during data processing. Samples for this analysis were recorded between 09 November 2017 and 01 November 2019, and visually quality controlled for bubbles or blurry samples. This quality control was done manually, using the ifcb-annotate software (https://ifcb-annotate.whoi.edu) to visualize a subset of images for 10 samples at a time and identify problematic ones. This amounts to 552 sampling days, including 26,440 samples with 104,505,923 images. The dataset is hosted at https://ifcbdashboard.gso.uri.edu/timeline?dataset=GSO_Dock. IFCB data are processed with open-source workflows and scripts available on GitHub (https://github.com/hsosik/ifcb-analysis/wiki; Sosik and Olson 2007; Sosik et al. 2016). The ifcb-analysis MATLAB (The MathWorks Inc. 2022) package processes the images into 244 features that serve as input to a classification algorithm.
For training purposes, 57,924 images from our dataset were manually identified into 83 categories using a combination of ifcb-annotate and EcoTaxa (Picheral et al. 2017) tools. Some categories with the same genus or morphology were grouped to obtain enough images for both training and validation, and 70 categories, each with more than 50 images, were included in the identification algorithm. Each category's taxonomic broad group (~class) and subgroup (~order) were retrieved from the World Register of Marine Species (https://www.marinespecies.org/). Between 50 and 300 images per category were used to train a random forest algorithm with 100 trees (Breiman 2001). When applied to unknown images, the random forest algorithm assigns a probability that the image belongs to each category; the category with the highest probability is retained unless that probability is too low, in which case the image is labeled as "unclassified." All images labeled "unclassified" (~23%) or falling into a non-planktonic category (e.g., detritus, bubbles) were removed for this analysis. We evaluated the classifier's performance for each category by calculating the F1-score on the validation set, that is, manually annotated images not used for training (Table 1). The F1-score summarizes the precision and recall of the algorithm into a single metric from 0 to 1, with values closer to 1 indicating better performance of the algorithm (see Orenstein et al. 2022). For the considered classified images, the average F1-score is 0.91, ranging from 0.73 (Odontella aurita) to 0.99 (Odontella mobiliensis).
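The labeling rule and the F1-score described above can be sketched as follows. This is a minimal illustration and not the ifcb-analysis implementation: the function names and the `min_probability` cutoff of 0.5 are assumptions, since the text does not state the probability below which an image is labeled "unclassified."

```python
def assign_category(probabilities, min_probability=0.5):
    """Return the most probable category for one image.

    probabilities: dict mapping category name -> classifier probability.
    min_probability is a hypothetical cutoff (not given in the text);
    below it the image is labeled "unclassified".
    """
    best = max(probabilities, key=probabilities.get)
    if probabilities[best] < min_probability:
        return "unclassified"
    return best


def f1_score(true_labels, predicted_labels, category):
    """F1-score of one category: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    if tp == 0:
        return 0.0  # no correct detection: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Applied to a validation set, `f1_score` would be called once per category with the manual annotations as `true_labels` and the classifier output as `predicted_labels`.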
To reproduce satellite-temporal and weekly microscopy time series, we resampled our IFCB time series to match local dataset resolutions (Fig. 1). The NBPTS is one of the longest phytoplankton time series, with data collected in the bay since 1959, generally on Mondays, with occasional variations based on weather, making it a weekly time series. We downloaded the sampling dates from the weekly NBPTS (https://web.uri.edu/gso/research/plankton/) as well as the Sentinel-3 Ocean and Land Color Instrument (OLCI) images corresponding to our time frame from the NASA Ocean Color website (https://oceancolor.gsfc.nasa.gov). To build the coastal satellite-temporal resolution, given the adjacency of land to the pier, we created a 3 × 4 seaward pixel area off the GSO pier location on each image and set to null any pixels in this area flagged for land (LAND), probable cloud or ice contamination (CLDICE), product failure (PRODFAIL), or probable stray light contamination (STRAYLIGHT). For the resampling, we only included dates for which at least one pixel within these 12 pixels had a valid chlorophyll value. These two time series (NBPTS and OLCI) were used as filters over the IFCB time series to generate realistic lower-resolution time series.
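The date-filtering step can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that flagged pixels have already been set to `None` and that the hourly series is a dictionary keyed by timestamp; it does not read OLCI files or the NBPTS records themselves.

```python
from datetime import date, datetime


def has_valid_pixel(window):
    """True if at least one pixel of the window has a chlorophyll value.

    window: rows of pixel values (e.g., the 3 x 4 seaward area), where
    pixels flagged LAND, CLDICE, PRODFAIL, or STRAYLIGHT are assumed to
    have been set to None beforehand.
    """
    return any(value is not None for row in window for value in row)


def resample_to_dates(hourly_series, sampling_dates):
    """Keep only the hourly records that fall on one of the sampling dates.

    hourly_series: dict mapping datetime -> biovolume concentration.
    sampling_dates: iterable of date objects (e.g., NBPTS sampling days
    or days with a valid satellite pixel).
    """
    keep = set(sampling_dates)
    return {t: v for t, v in hourly_series.items() if t.date() in keep}
```

The same `resample_to_dates` filter would serve for both lower resolutions; only the list of dates changes.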

Data processing and analysis
Time series were created by using the biovolume calculated for each IFCB image with the method of Moberg and Sosik (2012). For each IFCB category, we summed the biovolume over the hour and divided it by the total volume sampled to obtain biovolume concentration in μm³ mL⁻¹. Biovolume was chosen instead of image counts to both measure biomass and better represent the contribution of long chains, which would normally be counted as a single image. Data were linearly interpolated when fewer than 3 h were missing to account for the weekly cleaning and calibration of optical instruments deployed alongside the IFCB. To mitigate the effect of missing data from the IFCB, average biovolume concentration per day was taken for the daily, satellite-temporal, and weekly NBPTS resolutions (Fig. 2). This procedure was used to retrieve both total sample biomass and per-category sample biomass. When retrieving total sample biomass, non-phytoplankton classes (zooplankton and ciliates) were removed (~6.2% of the biovolume concentration). A moving weighted average of 9 h was also applied to smooth the time series without removing the tidal cycle and to focus on consistent signals by mitigating the effect of possible single-hour outliers.
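The aggregation steps in this paragraph can be sketched as below, under stated assumptions: samples are given as (timestamp, biovolume, volume analyzed) tuples, missing hours are represented by `None` in an evenly spaced hourly list, and "fewer than 3 h" is read as gaps of at most two consecutive hours. All function names are illustrative, and the 9-h weighted smoothing is not shown because its weights are not specified in the text.

```python
from collections import defaultdict
from datetime import datetime


def hourly_concentration(samples):
    """Biovolume concentration (um^3 per mL) per hour for one category.

    samples: iterable of (timestamp, biovolume_um3, volume_ml) tuples.
    Biovolume is summed over the hour and divided by the summed volume.
    """
    biovolume = defaultdict(float)
    volume = defaultdict(float)
    for timestamp, biovolume_um3, volume_ml in samples:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        biovolume[hour] += biovolume_um3
        volume[hour] += volume_ml
    return {h: biovolume[h] / volume[h] for h in biovolume if volume[h] > 0}


def fill_short_gaps(values, max_gap=2):
    """Linearly interpolate runs of at most max_gap missing (None) values
    that are bounded by observations on both sides."""
    filled = list(values)
    i = 0
    while i < len(filled):
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < len(filled) and filled[i] is None:
            i += 1
        gap = i - start
        if start > 0 and i < len(filled) and gap <= max_gap:
            left, right = filled[start - 1], filled[i]
            for k in range(gap):
                filled[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return filled


def daily_mean(hourly):
    """Average the available hourly concentrations of each calendar day."""
    by_day = defaultdict(list)
    for hour, value in hourly.items():
        by_day[hour.date()].append(value)
    return {day: sum(v) / len(v) for day, v in by_day.items()}
```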
To characterize the general community at the four temporal resolutions, we looked at the community composition at several taxonomic levels, along with the timing of the major yearly blooms over the 2 yr. To assess the ability of each sampling regime to detect similar community composition, we compared results grouped into three taxonomic precisions: (1) a broad group level, grouping categories by Class; (2) the subgroup level, combining categories by Order; and (3) the categories level from the IFCB classification (Table 1). The category level was further divided into seasons to show the seasonal influence on the resolutions: December/January/February as winter, March/April/May as spring, June/July/August as summer, and September/October/November as fall. We ran Pearson's chi-squared tests to compare the community composition, with the null hypothesis being that there are no significant differences among the different resolutions at α = 5%. At the category level, individual categories that make up more than 3.5% of the biomass were considered separately, and those with less biomass were grouped as "others" during Pearson's chi-squared test.
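As an illustration of the comparison, the sketch below groups minor categories as "others" and computes Pearson's chi-squared statistic for a resolution-by-category table. It is an assumption-laden sketch: the text does not state which quantity (percent contribution or biomass) was entered in the table, the function names are hypothetical, and the p-value (which needs the chi-squared distribution) is left to a statistics package.

```python
def group_minor(shares, cutoff=3.5):
    """Group categories contributing at most `cutoff` percent as "others".

    shares: dict mapping category name -> percent of biomass.
    """
    grouped = {}
    for name, share in shares.items():
        key = name if share > cutoff else "others"
        grouped[key] = grouped.get(key, 0.0) + share
    return grouped


def pearson_chi_squared(table):
    """Pearson's chi-squared statistic and degrees of freedom.

    table: list of rows (one per resolution), each a list of observed
    values (one per category). Every row and column total is assumed
    to be greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

Identical rows give a statistic of 0, which corresponds to the case where all resolutions recover the same composition.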
To compare the timing of the major yearly bloom in the total biomass time series, we defined the main bloom as the peak with the maximum biomass within a year. For simplification in notation, we labeled the dates from 09 November 2017 to 31 October 2018 as year 1 and the dates from 01 November 2018 to 01 November 2019 as year 2. For each of the four sampling resolutions, we defined the bloom threshold as the median biomass over the time series plus 5% to determine bloom timing (Siegel et al. 2002). The start and end dates of the major peak are detected when the biomass crosses that threshold. To mitigate instrument and daily variability, we allow up to 72 h to be missing and 12 consecutive hours below the threshold within the peak. To detect the major bloom for each individual IFCB category, a similar bloom threshold was defined at each of the four time series resolutions but using the mean plus 5%, because the large number of zeros when considering one category at a time often drives the median to 0.
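The threshold rule can be sketched as follows, assuming that "plus 5%" means 1.05 times the median (or mean) and that the series is an evenly spaced list without missing values; the tolerance for missing hours described above is not handled. Function and parameter names are illustrative.

```python
from statistics import mean, median


def bloom_threshold(values, baseline=median, margin=0.05):
    """Baseline of the series (median, or mean for single categories)
    plus 5%, read here as baseline * 1.05."""
    return baseline(values) * (1 + margin)


def main_bloom(values, threshold, max_below=0):
    """Return (start, peak, end) indices of the bloom around the maximum.

    The bloom extends on each side of the maximum while values stay above
    the threshold, tolerating up to max_below consecutive values at or
    below it (e.g., 12 for an hourly series).
    """
    peak = max(range(len(values)), key=values.__getitem__)

    def extend(step):
        edge, below, i = peak, 0, peak + step
        while 0 <= i < len(values):
            if values[i] > threshold:
                edge, below = i, 0
            else:
                below += 1
                if below > max_below:
                    break
            i += step
        return edge

    return extend(-1), peak, extend(+1)
```

For a single category, `bloom_threshold(values, baseline=mean)` would replace the median with the mean, as described in the text.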
When considering the total biomass, in addition to the largest bloom of the year, smaller blooms were detected at each of the four sampling resolutions by applying the same median plus 5% threshold. To qualify as a small bloom, values had to stay above the threshold during that time. At daily and hourly resolutions, two peaks are considered different if they are more than 3 d apart. The satellite resolution is 5 d on average, so we allow for up to five missing days and consider two peaks as distinct events if they are more than 5 d apart. For the weekly NBPTS resolution, one value is enough to make a peak, and two peaks are different events if they are more than 10 d apart. This means that the satellite-temporal and weekly NBPTS resolutions may have peaks of 1-d length because only one value was above the threshold. However, the actual length is unknown and could be up to 13 d (6 d before and 6 d after).
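The rule for separating peaks can be illustrated with a small sketch. It only covers the merging step: it assumes the dates with values above the threshold have already been extracted, and `group_peaks` is a hypothetical name. The separation would be 3 d for the daily and hourly series, 5 d for the satellite-temporal series, and 10 d for the weekly NBPTS series.

```python
from datetime import date


def group_peaks(dates_above, min_gap_days):
    """Group dates above the threshold into distinct bloom events.

    A date joins the current event unless it falls more than
    min_gap_days after the last date of that event.
    Returns a list of (start, end, length_in_days) tuples.
    """
    events = []
    for day in sorted(dates_above):
        if events and (day - events[-1][-1]).days <= min_gap_days:
            events[-1].append(day)
        else:
            events.append([day])
    return [(e[0], e[-1], (e[-1] - e[0]).days + 1) for e in events]
```

An event made of a single date has a length of 1 d, which is how the 1-d peaks at the satellite-temporal and weekly resolutions arise.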
Manually collected time series are often carried out at the same time each week, as in the case of the NBPTS. Satellite datasets are, by nature, restricted to daytime hours. However, many phytoplankton have diel cycles, which may or may not coincide with these sampling regimes. To assess the potential impact of sampling time-of-day on results, we assessed variability in each category over diel and tidal cycles utilizing the high-resolution IFCB time series. We characterized each category's overall daily variability as the average ratio between the maximum and minimum values over the day, only considering samples where the given category is detected. To assess daily cycles, we averaged biomass for each hour of the day for a given category. Category biomass was also averaged over each tidal cycle, starting a new cycle at the maximum height within a 12-h period.
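A minimal sketch of the two diel metrics is given below. It assumes that "detected" means a biovolume concentration greater than zero and that the hourly series is a dictionary keyed by timestamp; the function names are illustrative, and the tidal-cycle averaging is not shown because it requires water-level data.

```python
from collections import defaultdict
from datetime import datetime


def daily_variability(hourly):
    """Average over days of the max/min ratio of hourly values.

    Only hours where the category is detected (value > 0, an assumption)
    are used. Returns None if the category is never detected.
    """
    by_day = defaultdict(list)
    for timestamp, value in hourly.items():
        if value > 0:
            by_day[timestamp.date()].append(value)
    ratios = [max(v) / min(v) for v in by_day.values()]
    return sum(ratios) / len(ratios) if ratios else None


def hour_of_day_mean(hourly):
    """Mean value for each hour of the day (0-23) across the series."""
    by_hour = defaultdict(list)
    for timestamp, value in hourly.items():
        by_hour[timestamp.hour].append(value)
    return {h: sum(v) / len(v) for h, v in sorted(by_hour.items())}
```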

Assessment
When considering all the plankton categories detected, the richness follows the species rarefaction theory, meaning that the more you sample, the more categories you detect (Cermeño et al. 2014). All 70 image categories are detected throughout the 2-yr time series at daily and hourly resolutions. Most are also detected at satellite-temporal and weekly NBPTS resolutions when considering the 2 yr together, with the exception of a couple of zooplankton categories (Rotifera, Arthropoda), which are, by instrument design, already very low since most are filtered out to prevent instrument clogging.

Community dynamics
Overall, community composition biomass was not significantly different among the four sampling resolutions (Pearson's χ² = 3.01, p = 1; Fig. 3a) at the broad group level. Across seasons, diatoms dominated, followed by nanophytoplankton and dinoflagellates (Fig. 3a). When considering the time series as a whole, there were also no statistically significant differences in composition detected at the subgroup (Pearson's χ² = 6.02, p = 1; Fig. 3b) or category level (Pearson's χ² = 35.62, p = 0.84, data not shown). However, differences become apparent when the category level is separated by season (Fig. 3c). Weekly NBPTS resolution remains consistent enough to yield a community similar to that of the daily and hourly resolutions. In contrast, satellite sampling days are more randomly spaced and lead to the community that differs most from the other resolutions (Fig. 3c). The differences are significant in summer (Pearson's χ² = 50.9, p = 0.0036) due to the absence of satellite sampling days during a short but very important bloom of Margalefidinium polykrikoides in August 2018. Although differences in the fall are not significant (Pearson's χ² = 9.02, p = 0.44), the satellite-temporal resolution is slightly different here, too, with 57% diatoms instead of 40% for the other resolutions. This overrepresentation of diatoms is explained by the fact that the satellite sampling days spanned 16 d in September 2019, when both a Skeletonema spp. and a Guinardia sp. bloom occurred, while the other fall months were comparatively undersampled with 3-10 sampling days.
Fig. 3. Community composition at the hourly (Hourly) and daily (Daily) scale and for the same time series resampled on days when the satellite had observations (Satellite) and when the weekly phytoplankton sampling took place (NBPTS). The community composition is generally similar across sampling regimes at the three levels of taxonomic precision: (a) broad group level, (b) subgroup level, (c) category level. The latter is further separated by season to show that under- or oversampling during blooms at the satellite resolution can modify the apparent community composition. At the category level, individual categories that make up more than 3.5% of the biomass were considered separately and those with less biomass grouped as "others" during the analysis. For the purposes of illustration, in (a) and (b), percentages for groups with more than 5% of the total biomass are shown. For (c), percentages of categories containing more than 10% of total biomass are shown.
All four resolutions detect the main annual bloom in the late winter for both years (Fig. 4). For the satellite-temporal time series, the bloom in year 1 only encompasses two sampling days, while year 2 has a higher resolution, highlighting the importance of considering the sampling effort when comparing 2 yr, as one bloom could be missed or underestimated. Similarly, in year 2, the weekly NBPTS resolution underestimates the magnitude of the peak by almost 50% compared to the hourly maximum. Interestingly, the year 2 bloom has a second bloom at the beginning of March with a slightly lower magnitude (see Supporting Information Fig. S1), and the weekly NBPTS sampling day occurs just a little after the height of that peak. Thus, if either of the February or March sampling days had been 1 or 2 d off, the height of the peak could have moved by 1 month. In terms of bloom phenology, the lower resolution for the satellite and weekly sampling also leads to smaller bloom length in year 1, respectively, 8 and 7 d, compared to 16 d for the hourly and daily resolution. While the start date is mostly within 2 or 3 d, the end date shows more variation for the satellite and weekly resolution, ending a week before the hourly resolution and explaining the smaller bloom length in year 1. During those 2 yr, the bloom maximum also varies by no more than 5 d.
For smaller blooms throughout the year, the number of peaks detected within the time series is similar for the weekly NBPTS, satellite-temporal, and daily resolutions. At the hourly resolution, 10 more peaks are detected (Fig. 5a). The NBPTS and satellite resolution have several bloom lengths of 1 d due to the presence of only one value above the threshold for those peaks (Fig. 5b). At both daily and hourly resolution, we can see that the means are just below 2 weeks, but the hourly median is around 7 d: 50% of the hourly peaks detected are less than 7 d and 30% are also less than 1 d. Many shorter blooms, often category-specific blooms, may thus be detected or not depending on when the sampling falls within a week or even within the day.

Species dynamics
Category-specific major blooms are detected at the same time of the year across resolutions but, depending on the category, can look different from year to year and among resolutions. Categories such as Skeletonema spp. or Leptocylindrus minimus with major blooms that last longer than 2 weeks are generally well detected at all resolutions, but the bloom magnitude can be underestimated at lower resolutions due to variability within the bloom (Fig. 6a,e). For instance, for L. minimus, the satellite-temporal resolution misses both the first and highest peaks of year 1 and only samples after the height of the year 2 bloom, underestimating the magnitude of both years. Although the bloom is detected, for some species, the underestimation can be amplified when the whole height of the peak is missed. The weekly NBPTS and satellite-temporal resolutions both miss the height of the Dactyliosolen blavyanus peak in year 1, only resolving the beginning of the bloom and largely underestimating the bloom magnitude difference between the 2 yr (Fig. 6b). Similarly, the satellite-temporal resolution misses the height of the Thalassiosira spp. bloom in year 1 and the weekly NBPTS misses it in year 2 (Fig. 6f). The satellite and weekly resolutions can also happen to sample within the bloom but on both edges of the height of the bloom. This is the case for the Cerataulina pelagica bloom in year 1 for both resolutions (Fig. 6c) and for the year 1 Chaetoceros spp. single bloom at the satellite-temporal resolution (Fig. 6g). Finally, some categories have a major bloom shorter than 7 d and bloomed both years around the same time (e.g., Dinobryon sp.; Fig. 6h) or only during one of the years (e.g., M. polykrikoides; Fig. 6d). These short-bloom categories are especially hard for lower resolution approaches to detect or to capture without underestimation. The satellite-temporal resolution resolves neither the M. polykrikoides year 1 bloom nor any of the Dinobryon sp. blooms, and while the weekly NBPTS resolution samples the year 1 Dinobryon sp. bloom at the height, it samples the year 2 bloom closer to the bottom. A comparison of those 2 yr at the weekly resolution thus shows a higher year 1 bloom when the year 2 bloom was twice as high.
Fig. 4. Time series of the major bloom for each year. The major bloom is plotted with round dots of different colors for each resolution. The star-shaped data point indicates the maximum detected at each resolution. Note that the maximum for the daily and satellite occurs at the same time both years and thus the star-shaped maximums overlap. Only for visualization, the dark gray line shows the weighted moving average applied 30 times over a 7-h window to smooth the time series without increasing the moving window. The black dashed line is the threshold of 5% above the median used to determine the bloom start and end dates.
The time of sampling for a time series in an estuary or a bay can have a different level of importance depending on the species and the strength of the daily or tidal cycle. Some species can have a small ratio between the maximum and minimum hourly biovolume concentration values during a day and, as such, be relatively consistent in biovolume concentration throughout the day (e.g., Akashiwo sanguinea, Skeletonema spp., ratio < 5), whereas others show a high (e.g., D. blavyanus, Thalassiosira spp., 5 < ratio < 25) or very high variability (e.g., Eucampia sp., Asterionella glacialis, ratio > 25) (Fig. 7a). While most categories in the IFCB do not show a specific daily cycle, certain categories seem to peak twice a day, in the morning and in the late afternoon (e.g., Guinardia sp.) or at midday and in the night (e.g., Chaetoceros sp. single), or only during midday (e.g., A. sanguinea) (Fig. 7b). Similarly, most categories have higher biovolume concentration around low tide (e.g., L. minimus) or ebb tide (e.g., Eucampia sp.), although some (e.g., A. glacialis) are slightly higher closer to high tide (Fig. 7c). Although these are average cycles, they show that time of sampling can be important, especially when targeting a specific category. Summary figures with the time series, major blooms, and daily variability for each category are included in Supporting Information Materials.

Discussion
In this analysis, we evaluated the influence of different temporal resolutions on the possibility of retrieving a specific level of taxonomical phytoplankton information. Although some periods of the IFCB time series are missing due to instrument maintenance, we show that a weekly or a variable resolution corresponding to a polar-orbiting satellite sampling is enough to retrieve the general community composition down to the category level. However, at the satellite-temporal resolution, differences occur based on seasonal-dependent coverage: undersampling or oversampling of a very important bloom can introduce deviations in the representation of the detected general community composition. Interannual variability in temporal coverage, with inconsistent sampling intervals due to gaps, at satellite-temporal sampling resolutions is thus important to consider when comparing changes in magnitude in phytoplankton blooms across a time series. Sampling resolution also impacts fine measures like start date, maximum, or end date for the annual main phytoplankton bloom. In regions where the major bloom occurs in winter or spring, satellite data can also have a very different resolution from 1 yr to another due to missing data from cloud cover. Based on our findings, we recommend using caution in interpreting phytoplankton phenology metrics derived from high-resolution satellite time series with gaps exceeding 1 week. A weekly resolution is sufficient to resolve the general annual cycle, but when precisely comparing interannual bloom height, start and end date, at least twice weekly, if not a higher temporal resolution, is advisable (Muller-Karger et al. 2018). Further phenology analysis based on high-frequency time series that cover decadal time frames, such as WHOI's Martha's Vineyard IFCB time series, would expand our findings here and help refine potential phenology errors due to sampling timescales.
Many peaks in biovolume concentration last less than 1 week and correspond to category-specific blooms; weekly and polar-orbiting satellite-temporal resolutions might miss those short-lived phytoplankton blooms, or might miss them in 1 yr but not the other. In our location, major blooms of categories including the HAB M. polykrikoides and the golden alga Dinobryon sp. are missed by the coarser resolutions. In contrast, categories like L. minimus, Skeletonema costatum, and single-cell Chaetoceros sp. have major blooms long enough to be resolved by all sampling resolutions, but some of the sampling timings lead to underestimation of the bloom magnitude. He et al. (2022) similarly showed that a major summer phytoplankton bloom in the Qinhuangdao Coastal Area, China, is a succession of diatoms, with Chaetoceros tortissimus displaying a short 3-d bloom, S. costatum a week-long bloom, and Thalassiosira pacifica a 2-week-long bloom. These results indicate that when targeting a specific event or species, a higher resolution, with at least twice-weekly sampling, is necessary to reduce the possibility of missing or underestimating the bloom. In Narragansett Bay specifically, Thalassiosira spp. is considered the major player of the winter-spring bloom and is important to local fisheries, as it has been shown to provide food to zooplankton and juvenile fish (Paul et al. 1990). The strength of its bloom has driven studies on the influence of temperature and light on bloom timing and importance (Hitchcock and Smayda 1977). Our analysis shows that if such a comparison were run between years 1 and 2 with weekly sampling, the bloom in year 2 would be considered 35% of the one in year 1 when, in reality, it is only slightly lower (92% of the year 1 bloom). However, a higher resolution also generally requires the use of machine learning algorithms that can misclassify some images; one of the smaller L. minimus peaks in winter, for instance, is part of a bigger Skeletonema spp. bloom. This is balanced by the storage of images, which makes it possible to confirm or update the observed signals at a later time, and by regular manual reassessment as the time series continues.
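The way sampling timing can distort a year-to-year comparison of bloom magnitude can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the two "years" are synthetic Gaussian-shaped blooms, not the IFCB data, and the function names (`gaussian_bloom`, `subsample`, `bloom_ratio`) are hypothetical helpers introduced here.

```python
import math

def gaussian_bloom(n_days, peak_day, width, height):
    """Synthetic daily biovolume series with one Gaussian-shaped bloom."""
    return [height * math.exp(-0.5 * ((d - peak_day) / width) ** 2)
            for d in range(n_days)]

def subsample(series, start, step):
    """Keep every `step`-th day starting at `start` (step=7 mimics weekly sampling)."""
    return series[start::step]

def bloom_ratio(year_a, year_b):
    """Maximum of year_b expressed as a fraction of the maximum of year_a."""
    return max(year_b) / max(year_a)

# Two synthetic years: similar bloom height, peak shifted by 4 days.
year1 = gaussian_bloom(n_days=60, peak_day=28, width=2.0, height=100.0)
year2 = gaussian_bloom(n_days=60, peak_day=32, width=2.0, height=92.0)

# Ratio seen with daily data versus with weekly sampling on days 0, 7, 14, ...
daily_ratio = bloom_ratio(year1, year2)
weekly_ratio = bloom_ratio(subsample(year1, 0, 7), subsample(year2, 0, 7))

print(f"daily ratio:  {daily_ratio:.2f}")
print(f"weekly ratio: {weekly_ratio:.2f}")
```

In this synthetic case the weekly schedule happens to hit the year 1 maximum but only the flank of the year 2 bloom, so the weekly ratio is much lower than the daily one. The numbers depend entirely on the assumed bloom shape and sampling offset.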
We summarize recommended sampling frequencies for categories shown in Table 1 to accurately capture both the timing and magnitude of detected blooms (Fig. 8). Recommendations are based on the total number of peaks detected in the time series and the percentage of these blooms that are less than 1 week in duration. With the design of our peak threshold (mean + 5%), when the time series of a given category is dominated by one important annual bloom, very few peaks are detected (left panel in Fig. 8). In these cases, if fewer than 30% of total detected blooms are less than 7 d in duration, then a weekly sampling regime would most likely be enough to resolve the main features of the time series for these categories. For categories with a higher percentage of short-duration blooms, twice-weekly sampling, especially during known bloom periods, is recommended.
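The two quantities used for these recommendations, the number of peaks and the percentage of peaks shorter than 7 d, can be sketched as follows. This is a minimal sketch, not the procedure used in the study: it assumes that "mean + 5%" means 1.05 times the series mean, that a peak is a contiguous run of daily values above that threshold, and that peak length is the run length in days. The function names are hypothetical.

```python
def peak_threshold(series, margin=0.05):
    """Threshold assumed here to be the series mean raised by `margin` (5%)."""
    mean = sum(series) / len(series)
    return mean * (1.0 + margin)

def detect_peaks(series, threshold):
    """Return (start_index, duration_in_days) for each run above the threshold."""
    peaks = []
    start = None
    for i, value in enumerate(series):
        if value > threshold and start is None:
            start = i                          # a new run begins
        elif value <= threshold and start is not None:
            peaks.append((start, i - start))   # the run ended on the previous day
            start = None
    if start is not None:                      # run still open at the end of the series
        peaks.append((start, len(series) - start))
    return peaks

def percent_short(peaks, max_days=7):
    """Percentage of peaks lasting fewer than `max_days` days."""
    if not peaks:
        return 0.0
    short = sum(1 for _, duration in peaks if duration < max_days)
    return 100.0 * short / len(peaks)

# Synthetic daily series: one 3-day peak and one 9-day peak over a low background.
daily = [1.0] * 10 + [10.0] * 3 + [1.0] * 10 + [10.0] * 9 + [1.0] * 10
peaks = detect_peaks(daily, peak_threshold(daily))
print(peaks, percent_short(peaks))
```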
Conversely, other categories have numerous peaks throughout the year and are not dominated by a single bloom (middle panel in Fig. 8). These categories are detected throughout the IFCB time series without a consistent blooming period, and a weekly sampling schedule would give a broad overview of their dynamics. However, when more than 50% of the peaks are less than 7 d, twice-weekly sampling, particularly during known bloom periods, would lead to a more accurate resolution. The category ceratiaceae is a special case, as it includes a small summer species that can be resolved with weekly sampling and a large winter species that would require a high sampling resolution to be detected. Our study highlights that many phytoplankton categories peak throughout the year for shorter durations beyond the main peak, which supports the recommendations from Muller-Karger et al. (2018) for satellite remote sensing sensors with not only higher spatial and spectral resolution but also temporal resolutions of hours to days.
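The decision rules stated above for the two bloom patterns can be written as a short function. This is a sketch under assumptions: the text does not give the number of peaks that separates a category dominated by one annual bloom from one with numerous peaks, so `few_peaks_max` is a hypothetical parameter, and special cases such as ceratiaceae are not handled.

```python
def recommend_sampling(n_peaks, pct_short, few_peaks_max=5):
    """Suggest a minimum sampling frequency from peak statistics.

    n_peaks       : total number of peaks detected for the category
    pct_short     : percentage of those peaks lasting less than 7 d
    few_peaks_max : hypothetical cutoff for "very few peaks" (not from the text)
    """
    if n_peaks <= few_peaks_max:
        # Dominated by one annual bloom: weekly is enough when fewer
        # than 30% of the detected blooms are short.
        return "weekly" if pct_short < 30.0 else "twice weekly"
    # Numerous peaks through the year: twice weekly when more
    # than 50% of the peaks are short.
    return "twice weekly" if pct_short > 50.0 else "weekly"

for n_peaks, pct_short in [(2, 20.0), (2, 40.0), (20, 40.0), (20, 60.0)]:
    print(n_peaks, pct_short, recommend_sampling(n_peaks, pct_short))
```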
Fig. 8. Recommended sampling strategies for annual variations of categories in Table 1 based on the total number of peaks above the threshold and the percentage of these blooms lasting less than 7 d. Categories are colored by the minimum resolution recommended to accurately capture bloom duration and magnitude and are grouped (vertical dashed lines) by observed bloom patterns. Category-specific graphs of major bloom pattern and of the number and length of peaks detected can be found in the Supporting Information Material.
Table 2. Recommended sampling resolution depending on the bloom pattern and general length of the blooms (< 7 d = short, > 7 d = long). An example of a category that would benefit from such sampling resolution is indicated in italic. A more detailed table for each category mentioned in this study can be found in the Supporting Information Material (Supporting Information Table S1).
While our analysis combines hourly measurements throughout an entire day into a single representation for weekly and satellite sampling regimes, actual data from such time series will be derived from a single measurement. Previous analyses have already shown that the tidal cycle needs to be accounted for in correcting for quenching in satellite measurements of chlorophyll fluorometry in coastal areas due to the spatial nonhomogeneity between unquenched nighttime measurements and quenched daytime measurements (Carberry et al. 2019). Our analysis of the hourly IFCB time series demonstrates that the choice of sampling time during the day can also impact the retrieval of specific categories, as some present a higher daily variability than others (Fig. 7). Overall, most species showed higher biomass around the time of low tide and, to a smaller extent, in the morning. Both the tidal cycle and the fixed sampling depth may explain such a pattern. For positively buoyant organisms, numbers would be higher at low tide when the fixed sampling pipe is closer to the surface. Although such daily fluctuations may vary throughout the seasons, the pattern found here in Narragansett Bay for A. sanguinea, for instance, is consistent with the early afternoon increase also reported off the Southern California Bight by Kenitz et al. (2023) and in the Ariake Sea by Katano et al. (2011). Drawing samples only at the surface or only at a specific time each week might then introduce bias into the dynamics of species like A. sanguinea, well-known for diel vertical migration (Katano et al. 2011). In addition to diel or tidal cycles, some blooms present a high spatial patchiness, and hourly data are crucial for early warning detection systems, especially for important HAB species known for producing toxins such as K. brevis (Campbell et al. 2017) or M. polykrikoides (Carney et al., in press). Some of the variation observed here may be induced by instrument variability and possible maintenance issues, but the average daily and tidal cycles indicate that when targeting a specific category, sampling time should also be considered. We compiled our findings into a summary table highlighting the ideal sampling resolution depending on bloom pattern and general bloom length (Table 2). We also expand this table and provide, in the Supporting Information Material, a table that includes the bloom characteristics and daily pattern over the 2 yr for each category retrieved by our IFCB (Supporting Information Table S1). These patterns were recorded at a specific coastal location but can serve as an indicator and baseline to help inform sampling design depending on the taxonomical and temporal resolution targeted, and to help interpret both in situ and satellite data, past and future. New algorithms and improved hyperspectral satellite capabilities should help us monitor phytoplankton community changes that may be important for the food web but might not influence the total biomass. Although the summary table we provide represents one coastal location, it gives a general idea of the sampling resolution and sampling periods desirable for important diatom and dinoflagellate species.
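The daily-variability measures discussed above (within-day maximum-to-minimum ratio and average biomass per hour of day, Fig. 7a,b) can be sketched from an hourly series as follows. This is an illustrative sketch, not the study's code: the input format (a list of timestamp and value pairs), the function names, and the choice to include zero values in the hourly average are assumptions, and the example data are synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_max_min_ratio(records):
    """Max/min biovolume ratio per day, using only samples where the
    category was detected (value > 0)."""
    by_day = defaultdict(list)
    for timestamp, value in records:
        if value > 0:
            by_day[timestamp.date()].append(value)
    return {day: max(values) / min(values) for day, values in by_day.items()}

def mean_by_hour(records):
    """Average value for each hour of the day across the whole series
    (zeros included, which is an assumption of this sketch)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in records:
        sums[timestamp.hour] += value
        counts[timestamp.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}

# Synthetic two-day hourly series with a doubling at 09:00 each day.
start = datetime(2020, 1, 1)
records = [(start + timedelta(hours=h), 2.0 if h % 24 == 9 else 1.0)
           for h in range(48)]

ratios = daily_max_min_ratio(records)
hourly = mean_by_hour(records)
print(ratios)
print(hourly[9], hourly[10])
```

The same grouping could be applied to tidal hour instead of clock hour if each record carried the time since the last low or high tide.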

Comments and recommendations
We showed that weekly and satellite-temporal resolutions are sufficient to resolve general community composition but that the randomness of the satellite-temporal resolution can result in overrepresenting or underrepresenting certain classes. It is thus important to consider temporal satellite resolution when comparing year-to-year phytoplankton blooms. While the daily and hourly resolutions are the only ones capturing the whole variability of the time series, satellite-temporal and weekly resolutions can give a general idea of bloom timing and dynamics, especially for species with long-lasting blooms. On the other hand, when targeting specific species or comparing fine-scale bloom phenology metrics from 1 yr to another, we recommend using at least a twice-weekly sampling resolution during known bloom periods and, again, taking sampling resolution into account in the interpretation of observed dynamics. The detailed species dynamics presented here are specific to the Narragansett Bay area. They may, as for A. sanguinea, hold true in other regions, but may also be used as a starting point and further enriched with similar area-specific analyses.

Fig. 1. Schematic of the construction and analysis of the four time series. Starting with the Imaging FlowCytobot hourly and daily averaged biomass time series, weekly Narragansett Bay Plankton Survey microscopy sampling days and OLCI satellite days with at least one chlorophyll value are used as filters. The differences in community composition, phytoplankton blooms and category-specific blooms and daily variability are assessed using these four time series.

Fig. 2. Sampling days included in each of the four time resolutions, showing (a) a time series of data for each resolution, (b) the percentage of days with data per resolution.The top row is the original IFCB hourly/daily time series (IFCB hourly/daily).The other two rows in black show the IFCB time series resampled at the satellite-temporal (IFCB satellite) and weekly phytoplankton (IFCB NBPTS) sampling days.Rows in gray show the full OLCI (Satellite) and plankton survey (NBPTS) data resolution for comparison.

Fig. 3. Comparison of community composition across sampling resolutions. Each bar represents a different time resolution: the IFCB time series on an

Fig. 5. Number and length of peaks for each resolution. (a) Number of peaks detected for each resolution and (b) boxplot of peak length per resolution; the red dot corresponds to the mean (μ), while the center of the boxplot is the median. Each point represents a peak length. The horizontal dashed line represents a bloom length of 7 d, corresponding to a weekly sampling regime. Blooms falling below this line have the potential to be missed by less frequent sampling.

Fig. 6. Major bloom peak for eight categories. Example categories with a major peak more than 1 week long, well resolved by all resolutions: (a) Skeletonema spp. and (b) Leptocylindrus minimus; major peak underestimated by low sampling resolutions: (c) Dactyliosolen blavyanus and (d) Thalassiosira spp.; major peak missed due to sampling before and after: (e) Cerataulina pelagica and (f) Chaetoceros spp. single; major peak less than 7 d missed by lower resolutions: (g) Margalefidinium polykrikoides and (h) Dinobryon sp. Gray crosses are the hourly data for year 1, and gray circles are the hourly data for year 2. A moving average curve is also shown for visualization, dashed for year 1 and solid for year 2. Daily (yellow), satellite-temporal (blue), and weekly NBPTS (red) resolutions have the same color for both years.

Fig. 7. Daily variation of selected categories. (a) Boxplot of the maximum and minimum biovolume concentration ratio within a single day for six selected categories. The ratio is calculated only considering samples during which the category was detected. (b) Average biomass per hour over the time series for three selected species with different cycles: morning and late afternoon peaks, midday and night peaks, and midday peak. (c) Average biomass per tidal hour over the time series for three selected species with different cycles: low tide peak, ebb tide peak, and high tide peak.

Table 1. Taxonomic precision and F1 score for the 26 categories used in the category-specific bloom analysis.