Predicting river phytoplankton blooms and community succession using ecological niche modeling

Excessive phytoplankton concentrations in rivers can result in the loss of plant and invertebrate communities, and threaten drinking water supplies. Whilst the physicochemical controls on algal blooms have been identified previously, how these factors combine to control the initiation, size, and cessation of blooms in rivers is not well understood. We applied flow cytometry to quantify diatom, chlorophyte, and cyanobacterial group abundances in the River Thames (UK) at weekly intervals from 2011 to 2022, alongside physicochemical data. A niche modeling approach was used to identify thresholds in water temperature, flow, solar radiation, and soluble reactive phosphorus (SRP) concentrations required to produce periods of phytoplankton growth, with blooms only occurring when all thresholds were met. The thresholds derived from the 2011 to 2018 dataset were applied to a test dataset (2019–2022), and predicted the timing and duration of blooms at accuracies of > 80%. Diatom and nano-chlorophyte blooms were initiated by flow and water temperature, and usually terminated due to temperature and flow going out of the threshold range, or SRP and silicon (Si) becoming limiting. Cyanobacterial bloom dynamics were primarily controlled by water temperature and solar radiation. This simple methodology provides a key understanding of phytoplankton community succession and inter-annual variation and can be applied to any river with similar water quality and phytoplankton data. It provides early warnings of algal and cyanobacterial bloom timings, which support future catchment management decisions to safeguard water resources, and provides a basis for modeling changing phytoplankton bloom risk due to future climate change.

River phytoplankton plays a key role in supplying oxygen and organic carbon to the aquatic ecosystem and forms the base of the aquatic food web (Suggett et al. 2006; Guo et al. 2016). Human activities can greatly increase phytoplankton biomass through nutrient enrichment, river impoundment, and clearing of bankside vegetation, resulting in increased light and water temperatures (Smith 2003; Wurtsbaugh et al. 2019). Excessive phytoplankton and periphyton growth can result in the loss of aquatic plant and invertebrate communities, and the low dissolved oxygen concentrations that can occur when blooms terminate can result in fish kills (Carpenter et al. 1998; Hilton et al. 2006; Absalon et al. 2023). Higher algal biomasses and changes in phytoplankton community structure can also threaten drinking water supplies and greatly increase operating costs for water supply companies (Pretty et al. 2003). Shifts in the phytoplankton community can result in cyanobacterial dominance, which can cause taste and odor issues, and the presence of cyanotoxins that can cause health problems for animals and humans (Wurtsbaugh et al. 2019; Graham et al. 2020).
The characteristics of lake phytoplankton succession are well described (Sommer et al. 2012), but the understanding of phytoplankton growth, composition, and turnover in rivers is less developed (Reynolds 2000; Xia et al. 2019; Bruns et al. 2022), perhaps due to the biogeochemistry of lotic systems being more dynamic in response to hydraulic events. River chlorophyll concentrations, a common indicator of phytoplankton biomass, can vary greatly from year to year, and respond rapidly to changes in flow, even showing diurnal fluctuations during periods of high biomass (Bowes et al. 2016). In recent years, researchers have begun to understand how phytoplankton functional groups are impacted by nutrients and physical river conditions, as reviewed by Abonyi et al. (2021), but our current knowledge is not sufficient to predict the timing and duration of blooms and the succession of these functional groups.
Previous river phytoplankton studies have highlighted the impacts of suspected drivers, such as nutrient concentrations, flow, light, and water temperature (Chetelat et al. 2006; Larroude et al. 2013; Bruns et al. 2022). However, it is rare for all these parameters to be monitored within a single study, especially at a sampling frequency and duration that captures both rapid phytoplankton dynamics and inter-annual variations. Previous modeling studies have tried multiple approaches to predict chlorophyll dynamics in rivers, using process-based (Whitehead et al. 2015; Pathak et al. 2021), statistical (Kim et al. 2020), and machine-learning techniques (Savoy and Harvey 2023), but the non-linear relationships between phytoplankton groups and their multiple drivers make this difficult.
Perhaps the largest data gap is in the characterization of the phytoplankton community itself. This is usually achieved using traditional microscopy techniques, which can provide species-level identification. However, microscopy is relatively expensive because it is time-consuming and requires specialist taxonomic expertise, resulting in a lack of the long-term, high temporal resolution phytoplankton datasets that are needed to identify physicochemical controls on phytoplankton blooms (Dubelaar et al. 2004; Rolland et al. 2009). In recent years, flow cytometry techniques have been developed that are able to enumerate and phenotypically characterize the phytoplankton community at high throughput and low cost, allowing river phytoplankton concentrations to be monitored at high temporal resolutions (Dubelaar et al. 2004; Moorhouse et al. 2018; Mao et al. 2022).
The identification of key environmental thresholds required for river phytoplankton growth has been postulated as a vital tool for effective catchment management (Groffman et al. 2006), but these thresholds could not previously be derived because appropriate data were lacking. In this paper, we applied an ecological niche modeling approach (Bowes et al. 2016) to a 12-yr physicochemical dataset (Bowes et al. 2018) and flow cytometry-derived phytoplankton counts for the River Thames in southern England. Niche threshold modeling is traditionally used spatially, to identify specific habitat requirements and determine suitable habitat extents for particular species (Valencia-Rodriguez et al. 2021). Our study aimed to apply a temporal niche modeling approach, to identify the key physical and chemical parameters, alongside phytoplankton characterization at an appropriate monitoring frequency, to gain a new understanding of phytoplankton dynamics, and to predict the timing of bloom initiation and cessation.
Based on the observations from previous niche modeling of chlorophyll concentrations in the River Thames (Bowes et al. 2016), we hypothesize that phytoplankton cell concentrations will only increase when river conditions are within specific ranges of flow, water temperature, light, and nutrient concentrations, and therefore, the timing and duration of bloom periods for individual phytoplankton groups can be predicted. This will be possible without adding additional potential factors that are postulated within lake research, such as stratification, zooplankton densities, and grazing rates (Sommer et al. 2012). In this paper, we aim to (1) determine how broad phenotypic community composition and populations of four major phytoplankton groups (diatoms, nano-chlorophytes, pico-chlorophytes, and cyanobacteria) change seasonally, (2) identify the niche thresholds in flow, water temperature, light, and nutrient concentrations required to stimulate increasing cell concentrations of each phytoplankton group, thereby driving the seasonal changes and inter-annual variation, and (3) test whether these niche thresholds can predict the onset and duration of river phytoplankton blooms in the future.

Study catchment
The River Thames is the second longest river in the United Kingdom, with a length of 354 km to the tidal limit (in west London) and a freshwater catchment area of 9948 km² (Fig. 1). The headwaters in the Cotswold Hills and many of its tributaries are predominantly rural, but the river passes through a number of large towns and cities (including Swindon, Oxford, Reading, Maidenhead, and Slough) before flowing through the UK's capital city, London, and out to the North Sea. The River Thames is one of the most monitored and studied rivers in the United Kingdom, partly due to its key importance in supplying drinking water to the London region. This study is focused on the lower Thames at Runnymede. This stretch of the Thames is heavily abstracted to supply the many surrounding reservoirs to the west of London. The mean river flow of the study site in the lower Thames at Runnymede is 56.7 m³ s⁻¹. The catchment is underlain with Oolitic limestones in the Cotswold Hills and predominantly by porous Chalk bedrock in the remaining catchment, resulting in the River Thames being largely groundwater-fed, with a high base-flow index of 0.72 at Runnymede (Fig. 1) (Marsh and Hannaford 2008). The land use within the catchment upstream of Runnymede is predominantly arable (40.4%) and grassland (34%), but it also has 10.5% urban/semi-urban development (with a sewage population estimate of approximately 2.7 million people). This high human population pressure results in the lower River Thames being relatively nutrient-enriched, with average concentrations of soluble reactive phosphorus (SRP) and nitrate of 154 μg P L⁻¹ and 28.1 mg NO₃ L⁻¹, respectively (Bowes et al. 2018).

River sampling and chemical analysis
Water samples were taken from the River Thames at Runnymede at weekly intervals, from February 2011 to October 2022, as part of the UK Centre for Ecology & Hydrology's (UKCEH) Thames Initiative Research Platform (Bowes et al. 2018) (Fig. 1). River water temperature was recorded at the time of sampling.
Samples were taken from the main flow of the river, in a well-mixed location, to ensure that the sample was representative of the river as a whole. Unfiltered subsamples were taken for phytoplankton characterization by flow cytometry, chlorophyll a (Chl a) and total phosphorus (TP) analysis. Other subsamples were filtered (0.45 μm, WCN grade; Whatman) immediately in the field, for subsequent dissolved nutrient analyses. Samples were stored at 4 °C in the dark before analysis. Chl a concentrations were determined by filtering samples through a GF/C grade filter paper (Whatman), extracting the pigment overnight using 90% : 10% acetone : deionized water, and then quantifying it spectrophotometrically (Marker et al. 1980). TP and total dissolved phosphorus (TDP) concentrations were determined by digesting an unfiltered and filtered water sample, respectively, with acidified potassium persulfate in an autoclave at 121 °C for 40 min, then reacting with acid ammonium molybdate reagent to produce a molybdenum-phosphorus complex, which was then quantified spectrophotometrically at 880 nm (Eisenreich et al. 1975). SRP concentrations were determined on a filtered sample using the phosphomolybdenum-blue colorimetry method of Murphy and Riley (1962), as modified by Neal et al. (2000). Dissolved reactive silicon concentration was determined by reaction with acid ammonium molybdate, followed by reduction using acidified tin(II) chloride and quantified spectrophotometrically (Mullin and Riley 1955).
Nitrate-N concentration was analyzed by ion chromatography (Dionex DX500). All chemical samples were analyzed alongside reference quality control standards (Aquacheck; LGC Standards). The chemical data sets for the Thames Initiative are freely available through the UKCEH Environmental Information Data Centre portal at https://doi.org/10.5285/cf10ea9a-a249-4074-ac0c-e0c3079e5e45.

Flow cytometry
Phytoplankton analysis by flow cytometry was carried out on a Beckman Coulter Gallios flow cytometer (Beckman Coulter) equipped with blue (488 nm) and red (638 nm) solid-state diode lasers, as described in Read et al. (2014). In brief, a plot of yellow/green fluorescence (FL2-575 nm) against red fluorescence (FL4-695 nm), both excited by the 488 nm laser, representing phycoerythrin fluorescence vs. chlorophyll fluorescence, was used to count major chlorophyll-containing phytoplankton groups. A second plot of red fluorescence (FL4-695 nm) excited by the 488 nm laser against orange fluorescence (FL6-660 nm) excited by the 638 nm laser, representing chlorophyll vs. phycocyanin, was used to distinguish and count major cyanobacterial groups. Phytoplankton samples for flow cytometry analyses were stored in the dark at 4 °C for no longer than 24 h before analysis, and vortex-mixed immediately before analysis, to break up algal colonies. The phytoplankton groups identified by flow cytometry were diatoms (with some high-fluorescence chlorophytes) (12-20 μm size range) (referred to as diatoms for brevity), nano-chlorophytes, pico-chlorophytes (low chlorophyll fluorescence and predominantly 2-5 μm size range) and cyanobacteria. Previous studies provide further background information on the phytoplankton group classifications, and validation of these classifications was achieved by testing against standard cultures (Read et al. 2014) and HPLC pigment analyses (Moorhouse et al. 2018).

Additional datasets
Hourly global solar radiation data from 2013 to 2022 was obtained from the MIDAS database for the UK Land Surface Station at Heathrow, Greater London (code src_id 708) (Fig. 1), and accessed via the CEDA archive (Met Office 2020). Heathrow is approximately 7 km from the Thames water quality monitoring station at Runnymede. Additional sunshine duration data was obtained from the UKCEH meteorological station at Wallingford. Mean daily flow gauging data for the River Thames at Royal Windsor Park was collected by the Environment Agency, and accessed through UKCEH's National River Flow Archive (https://nrfa.ceh.ac.uk). The flow gauging station was 5 km upstream of the Thames at Runnymede.

Niche modeling approach and statistics
General relationships between phytoplankton cell concentrations and the physicochemical parameters within the dataset for the period 2011-2018 were investigated using Pearson correlation. Proxies for the growth rates of the four phytoplankton functional groups were derived by subtracting the previous week's cell count (cells mL⁻¹) from the current week's, thereby determining whether there was a net increase or decrease in phytoplankton cell concentration over the 7-d period, and the rate of that net change. A niche modeling approach was developed by plotting these net increases and decreases of each phytoplankton group against nutrient, flow, light, and water temperature conditions, to determine the ranges of each driver that produced increases in cell concentrations for the 2011-2018 training set data. These derived thresholds were then reapplied to the 2011-2018 dataset, to understand the role that each of these thresholds plays in explaining the observed phytoplankton dynamics.
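The growth-rate proxy described above is a first difference of the weekly series. A minimal sketch follows, assuming evenly spaced weekly samples with no gaps; the function name and the example counts are hypothetical and are not taken from the Thames dataset.

```python
def weekly_net_change(counts):
    """Return current-minus-previous differences for a weekly series
    of cell counts (cells per mL). Output has len(counts) - 1 entries."""
    return [curr - prev for prev, curr in zip(counts, counts[1:])]


# Hypothetical weekly counts (cells per mL), for illustration only.
example_counts = [500, 1200, 4800, 4100, 900]
changes = weekly_net_change(example_counts)

# Indices (into example_counts) of weeks that showed a net increase
# relative to the preceding week.
growth_weeks = [i + 1 for i, delta in enumerate(changes) if delta > 0]
```

A positive value marks a week of net increase; a negative value marks net loss, whatever its cause. Missing sampling weeks would need to be handled before differencing.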
Finally, the niche thresholds were applied to the 2019-2022 test dataset, to validate whether this niche modeling approach was able to predict the timing of phytoplankton bloom commencement and cessation. Periods of elevated cell concentrations greater than the 90th percentile of all the 2019-2022 observations for all phytoplankton groups were identified and compared with the predicted periods of phytoplankton growth from the niche model. The accuracy, precision, and recall of the phytoplankton models were then determined using a confusion matrix approach (Phillips et al. 2024).
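The validation step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the nearest-rank percentile definition, the function names, and the example series are all hypothetical, and the paper does not state which percentile convention was used.

```python
def percentile_threshold(values, pct=90):
    """Nearest-rank percentile of a list of numbers (pct: integer 1-100).
    The nearest-rank convention is an assumption made for this sketch."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling of pct * n / 100
    return ordered[rank - 1]


def confusion_metrics(observed, predicted):
    """Accuracy, precision and recall from paired weekly booleans:
    observed = bloom week, predicted = all niche thresholds met."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    tn = sum(1 for o, p in pairs if not o and not p)

    def ratio(num, den):
        return num / den if den else None  # undefined when denominator is 0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical weekly counts and model output, for illustration only.
counts = [200, 300, 250, 9000, 12000, 400, 350, 300, 280, 260]
threshold = percentile_threshold(counts, 90)
observed = [c > threshold for c in counts]
predicted = [False, False, False, True, True, False, False, False, False, False]
metrics = confusion_metrics(observed, predicted)
```

Precision penalizes weeks predicted as suitable when no bloom was observed, and recall penalizes observed bloom weeks that the thresholds missed, so reporting both alongside accuracy guards against a model that simply predicts "no bloom" most of the time.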

Results and discussion
There was a relatively regular pattern in the annual phytoplankton succession observed in the River Thames at Runnymede, with diatom concentrations increasing in spring, closely followed by nano-chlorophytes (Supporting Information Fig. S1). These were then replaced by pico-chlorophytes through the summer and autumn periods, with sporadic and rapid increases in cyanobacteria occurring in July and August. Despite the regular succession in phytoplankton groups, the timing and magnitude of the blooms varied greatly from year to year. A full description of the time series data for individual phytoplankton groups is presented in the Supporting Information.

Physicochemical relationships with phytoplankton dynamics
All phytoplankton group concentrations had a positive correlation with water temperature and solar radiation, and negative relationships with flow (Table 1), showing that increases in phytoplankton cell concentrations tended to occur during warm and sunny periods, when water residence time was sufficient to allow biomass to develop. There was a particularly strong positive correlation between water temperature and pico-chlorophyte concentration (correlation coefficient of 0.695; p ≤ 0.001), and also to a lesser extent, cyanobacteria (correlation coefficient of 0.44; p ≤ 0.001). Pico-chlorophyte concentrations began to slowly increase when river temperatures exceeded 10 °C, with the highest concentration (230,000 cells mL⁻¹) occurring on the day with the highest observed temperature of 24.4 °C (Supporting Information Fig. S2). From 10 °C to 19 °C, the pico-chlorophyte concentrations tended to increase, but for the majority of observations, cell concentrations remained very low within that temperature range. When water temperatures exceeded 19 °C, pico-chlorophyte concentrations were elevated, and always > 35,000 cells mL⁻¹. Cyanobacterial concentrations showed a similar pattern, being consistently very low (< 10,000 cells mL⁻¹) at water temperatures below 16 °C, with some intermittently high cyanobacterial concentrations (between 30,000 and 100,000 cells mL⁻¹) when water temperatures were above 16 °C. At temperatures below 20 °C, concentrations of cyanobacteria were commonly below 1000 cells mL⁻¹, and when above 21 °C, all observed cell concentrations were elevated, ranging from 3000 to 100,000 cells mL⁻¹.
Diatom and nano-chlorophyte concentrations both exhibited similar relationships with water temperature. Most observations were very low across all water temperatures, but elevated concentrations intermittently occurred in the range of 11-20.5 °C for diatoms and 11-22.6 °C for nano-chlorophytes (Supporting Information Fig. S2). Highest cell concentrations for both groups were observed when water temperatures ranged from 15 °C to 18 °C. When water temperatures exceeded 20.5 °C and 22.6 °C, cell concentrations were always low (< 7000 and < 3000 cells mL⁻¹ for diatoms and nano-chlorophytes, respectively), which implies that these larger phytoplankton groups were either temperature-inhibited, impacted by top-down pressures such as increased grazing or viral lysis under warmer conditions, or were outcompeted or replaced by other phytoplankton groups such as the cyanobacteria and pico-chlorophytes, which proliferate at higher water temperatures. Similar shifts in community structure from larger to smaller phytoplankton cells as water temperatures increase have been observed in previous studies (Paerl and Huisman 2008; Daufresne et al. 2009; Cha et al. 2017).
All phytoplankton groups had negative correlations with flow (Table 1), and only had elevated cell concentrations when river flows were low (Supporting Information Fig. S2b), when residence time within the river was sufficient to allow biomass to develop. Maximum pico-chlorophyte and cyanobacteria concentrations occurred during the periods of lowest flow, whereas the diatom and nano-chlorophyte concentrations peaked at low flow but declined rapidly when flows declined to < ca. 15 m³ s⁻¹. This suggests that the larger diatoms and nano-chlorophytes were potentially settling out during these periods of low flow and reduced current velocity, and the smaller pico-chlorophytes and cyanobacteria remained suspended in the water column, thereby allowing them to access light and continue to reproduce (Oliver and Walsby 1988).
All phytoplankton groups had a positive relationship with solar radiation (Table 1; Supporting Information Fig. S2). Elevated concentrations of diatoms and cyanobacteria only occurred when the average daily solar radiation on the preceding 3 d was > ca. 100 W m⁻², and nano-chlorophyte concentrations only increased when average daily solar radiation was above 140 W m⁻². Pico-chlorophytes were able to reproduce at much lower light intensities, and only appear to become light-limited when the daily solar radiation was < 40 W m⁻².
There was a negative correlation between all dissolved nutrient concentrations and the diatom and nano-chlorophyte concentrations (Table 1; Supporting Information Fig. S2d), with the highest peaks in diatom concentration occurring when SRP concentrations were < 20 μg L⁻¹. However, past within-river experimental studies have shown that SRP concentrations > 30 μg L⁻¹ are unlikely to limit algal growth in the Thames catchment (Bowes et al. 2012; McCall et al. 2017). It would be expected that high nutrient concentrations would promote increased phytoplankton biomass, rather than inhibit it, and previous studies of the Loire River in France have shown a reduction in phytoplankton biomass as a result of nutrient concentration reductions over recent decades (Minaudo et al. 2021). The negative relationship indicates that during periods of high diatom and nano-chlorophyte biomass, the blooms are depleting the soluble phosphorus, TDP, silicon, and to a lesser extent, nitrogen concentrations. Therefore, during spring bloom periods, these phytoplankton communities are controlling the river nutrient concentrations, rather than responding to the nutrient status of their environment. The major depletion in SRP and dissolved silicon concentrations during some of the major blooms probably limits the maximum diatom concentrations that are possible in the lower River Thames. In contrast, the pico-chlorophyte and cyanobacterial concentrations only increased above 30,000 cells mL⁻¹ when SRP concentrations exceeded 79 and 114 μg L⁻¹, respectively, which taken at face value would suggest that they are phosphorus-limited and only proliferate under nutrient-enriched conditions. Phosphorus limitation seems extremely unlikely at these high nutrient concentrations, and the pattern is probably due to these groups blooming during the mid to late summer, when the diatom bloom has ended. Despite the very large pico-chlorophyte and cyanobacterial cell concentrations, their biomass is likely to be low, and therefore their sequestering of phosphorus from the water column is not sufficient to significantly reduce the river nutrient concentrations.

Identifying niche thresholds for phytoplankton community change
Phytoplankton community composition in the River Thames is impacted by water temperature, flow and light intensity, and possibly by dissolved phosphorus and silicon concentrations (Supporting Information Fig. S2). There are certain physical conditions that are favorable to the net growth of each phytoplankton group, but even when these individual conditions are met, there is not an elevated cell concentration on most occasions. For example, elevated diatom cell concentrations only occurred when water temperature was within the range of 11 °C to 21 °C (Supporting Information Fig. S2a), but for the majority of the observations within that temperature range, the diatom cell concentrations were extremely low, and increases in cell concentrations only occurred intermittently. This implies that although the temperature requirements were met, other physicochemical conditions were not suitable for diatom growth and reproduction, and multiple environmental factors must be controlling phytoplankton community dynamics.
A further confounding issue in understanding phytoplankton dynamics is that the presence of high phytoplankton concentrations does not necessarily mean that conditions are currently favorable. For example, a phytoplankton group that is sampled while a bloom is crashing will still have elevated cell concentrations, even though the conditions are unsuitable to sustain increasing cell concentration and reproduction.
To better understand when conditions were favorable for phytoplankton growth, the weekly net change in cell concentration for each individual phytoplankton group (the difference between the current cell concentration and the cell concentration observed in the preceding week) was calculated. These weekly cell concentration changes were plotted against water temperature, flow, and solar radiation values (Supporting Information Fig. S3), to visually identify the range (or niche threshold) of each parameter that was potentially favorable for phytoplankton cell concentration increases (Table 2). These weekly cell concentration changes provided a more accurate identification of favorable condition ranges, compared to using cell concentration data alone. It is important to note that observed changes in cell concentration will be related to phytoplankton growth rate in conjunction with loss by mortality, grazing, viral lysis, settling out of the water column, and dilution due to increased flows. The calculated value therefore represents only an apparent, weekly net change in cell concentration, and integrates all of these processes that are occurring across the river network upstream of the monitoring site.
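The thresholds in this study were identified visually from plots. As a rough numerical stand-in, one could take the range of a driver over the weeks that showed a net increase. The sketch below assumes this simplification; the function name, the `min_increase` parameter, and the example values are hypothetical.

```python
def niche_range(driver_values, net_changes, min_increase=0.0):
    """Return (lower, upper) limits of a driver over the weeks in which
    the weekly net change in cell concentration exceeded min_increase.
    Returns None if no week qualifies."""
    favourable = [
        value
        for value, delta in zip(driver_values, net_changes)
        if delta > min_increase
    ]
    if not favourable:
        return None
    return min(favourable), max(favourable)


# Hypothetical paired observations, for illustration only.
water_temp_c = [8.0, 12.0, 16.5, 19.0, 22.0]
net_change = [-50, 400, 2500, 300, -800]  # cells per mL per week
temp_limits = niche_range(water_temp_c, net_change)
```

Raising `min_increase` above zero would exclude trivially small increases, which a visual inspection of the plots would also tend to discount. A strict min/max is sensitive to single outlying weeks, so it is a starting point rather than a replacement for inspecting the data.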
To investigate the multi-stressor controls on phytoplankton dynamics, all increases in cell concentrations for each phytoplankton group were plotted against combinations of flow, water temperature, and solar radiation (Fig. 2). These plots show how these physical factors interact, and demonstrate that phytoplankton growth can only occur when all three physical parameters are within the niche threshold conditions presented in Table 2. This explains why, on most occasions, weekly cell concentration increases did not occur when only one of flow, water temperature, or solar radiation was within its threshold range.
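The "all thresholds must be met" rule can be expressed as a simple conjunction of range checks. In the sketch below, the diatom temperature limits (11.1 and 19.4 °C) and flow limits (15.7 and 100 m³ s⁻¹) are the values quoted in the text; the solar radiation lower limit of 100 W m⁻² is a placeholder assumption, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Niche:
    temp_min: float   # water temperature lower limit, deg C
    temp_max: float   # water temperature upper limit, deg C
    flow_min: float   # river flow lower limit, m3 per s
    flow_max: float   # river flow upper limit, m3 per s
    solar_min: float  # solar radiation lower limit, W per m2

    def failing(self, temp, flow, solar):
        """Return the names of drivers that are outside their range."""
        out = []
        if not self.temp_min <= temp <= self.temp_max:
            out.append("temperature")
        if not self.flow_min <= flow <= self.flow_max:
            out.append("flow")
        if solar < self.solar_min:
            out.append("solar")
        return out

    def suitable(self, temp, flow, solar):
        """True only when every driver is within its threshold range."""
        return not self.failing(temp, flow, solar)


# Temperature and flow limits as quoted in the text for diatoms;
# solar_min is an assumed placeholder, not a value from Table 2.
diatom_niche = Niche(11.1, 19.4, 15.7, 100.0, solar_min=100.0)
```

Returning the list of failing drivers, rather than a bare boolean, makes it possible to see which parameter is holding back growth in any given week.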
It is important to note that this niche threshold approach is based on the environmental data at a single monitoring site: the River Thames at Runnymede. The phytoplankton community observed at Runnymede develops across the entire upstream river network and will be responding to different temperature, light, flow, and nutrient thresholds, with phytoplankton growth rates potentially varying across the different tributaries. However, all these complex signals from multiple phytoplankton communities and a range of catchment sources are integrated by the time they reach the lower monitoring site. The niche modeling approach presented here purely focuses on identifying the environmental conditions that are present in the Thames at Runnymede when the catchment as a whole generates phytoplankton blooms. This eliminates the need to understand the complex biogeochemical processes occurring in multiple locations and tributaries across the catchment, but, more importantly, provides the vital thresholds to predict the commencement and cessation of different phytoplankton group blooms that are transported to the lower River Thames.

Table 2. Niche thresholds in water temperature, mean daily flow, and average daily solar radiation that produce positive growth for the four monitored phytoplankton groups.
Columns: Phytoplankton group; Water temperature (°C), lower and upper limits; River flow (m³ s⁻¹), lower and upper limits; Solar radiation (W m⁻²), lower limit; Soluble reactive P (μg L⁻¹), lower limit.

Phytoplankton time series analysis
The physical thresholds in water temperature, flow, and solar radiation derived from Fig. 2 and listed in Table 2, along with weekly SRP, nitrate and Si concentration data, were reapplied to the time series data for all four phytoplankton groups from 2011 to 2018. This tested whether the derived niche thresholds were able to explain the complex phytoplankton dynamics observed in the lower River Thames at Runnymede.
The derived niche thresholds in water temperature, flow, and solar radiation, alongside nutrient concentrations, controlled almost all phases of diatom cell concentration increases and bloom collapses observed throughout the 2011-2018 monitoring period (Fig. 3). The timing of the annual diatom blooms was closely related to water temperature, with cell concentrations increasing as soon as temperatures increased above the 11.1 °C lower threshold in the years 2011-2016. This probably explains why the timing of the bloom is relatively consistent (early to mid-April in most years). In the final 2 yr of this study, the bloom was not triggered by the lower water temperature threshold, as other physical thresholds were not met at that time. In 2017, there was low solar radiation when water temperatures increased above the 11.1 °C threshold, and in the subsequent weeks, while the water temperature was within range, the flow was too low. In 2018, the diatom peak was delayed due to the flow being above the 100 m³ s⁻¹ upper flow threshold.
The largest and longest-duration diatom blooms occurred during dry periods of stable flow conditions, on the falling limb of the annual hydrograph in 2011, 2013, and 2015. Diatom cell concentrations rapidly reduced and blooms began to collapse in response to even relatively small rainfall events. These unstable flows and the related low light conditions often associated with rainfall resulted in low diatom biomass in 2012, 2014, and 2016. The cessation of the diatom blooms was in every case associated with either water temperature rising above the upper threshold of 19.4 °C (2013, 2014, 2015, 2017, and 2018) or flow dropping below the 15.7 m³ s⁻¹ lower threshold (2011, 2012, 2013, 2015, and 2017). A similar upper-temperature threshold for diatom growth (20 °C) was identified in a study of the Seine River, France (Garnier et al. 1995). The absence of late summer and autumn diatom blooms when temperatures become favorable again is due to flows being too low for cells to maintain their buoyancy. Nutrient limitation also potentially played an important role in limiting the magnitude of diatom blooms and the timing of bloom collapses. All diatom peaks between 2011 and 2016 resulted in major depletions of dissolved reactive silicon and often SRP to potentially limiting concentrations of 0.5 mg Si L⁻¹ (Lund 1950) and 30 μg P L⁻¹ (Dodds et al. 2002; Bowes et al. 2012).

The output for the other three phytoplankton functional groups, plus a full discussion of the causes of bloom development and cessation, are provided in the Supporting Information (Supporting Information Figs. S4-S6).

Causes of bloom development and cessation
This niche modeling approach has demonstrated that River Thames blooms only commence when the final one or two physicochemical parameters come within the threshold, and these final parameters thereby control the timing of the bloom commencement. To investigate this further, every period when all parameters were within the threshold for at least two successive weeks was examined for the period 2011-2018. To determine the main controls on the timing of bloom development for each phytoplankton group, the final parameter (or parameters) to come into the threshold was identified. Conversely, to better understand the controls on bloom cessation, the parameter that first went out of the threshold was identified. Following the observation that blooms could also be terminated by small rainfall events, even though the flow remained within the flow threshold, an additional parameter (flow instability, set as an increase of ≥ 6 m³ s⁻¹ from the previous week) was included as a potential reason for bloom cessation.
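The attribution procedure above can be sketched as follows, assuming each week has already been reduced to a set of per-parameter in-range flags. The function names and the example data are hypothetical. Because a suitable window starts in the first week when every parameter is in range, the parameters that were still out of range in the week before are, by construction, the last to come into threshold; the same logic applies in reverse at the end of the window.

```python
def suitable_windows(in_range, min_weeks=2):
    """in_range: list of {parameter: bool} dicts, one per week.
    Return (start, end) inclusive index pairs for runs of at least
    min_weeks successive weeks with every parameter in range."""
    windows, start = [], None
    for i, week in enumerate(in_range):
        ok = all(week.values())
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_weeks:
                windows.append((start, i - 1))
            start = None
    if start is not None and len(in_range) - start >= min_weeks:
        windows.append((start, len(in_range) - 1))
    return windows


def out_of_range(week):
    """Names of the parameters that are outside their threshold."""
    return sorted(name for name, ok in week.items() if not ok)


def bloom_controls(in_range, min_weeks=2):
    """For each suitable window, list the parameters that were last to
    come into range and the parameters that first went out of range."""
    results = []
    for start, end in suitable_windows(in_range, min_weeks):
        before = in_range[start - 1] if start > 0 else {}
        after = in_range[end + 1] if end + 1 < len(in_range) else {}
        results.append({
            "start": start,
            "end": end,
            "initiated_by": out_of_range(before),
            "ended_by": out_of_range(after),
        })
    return results


def flow_instability(flows, step=6.0):
    """Flag weeks where flow rose by at least `step` (m3 per s)
    relative to the previous week; the first week is never flagged."""
    return [False] + [curr - prev >= step for prev, curr in zip(flows, flows[1:])]


# Hypothetical four-week example, for illustration only.
weeks = [
    {"temperature": False, "flow": True, "solar": True},
    {"temperature": True, "flow": True, "solar": True},
    {"temperature": True, "flow": True, "solar": True},
    {"temperature": True, "flow": False, "solar": True},
]
controls = bloom_controls(weeks)
unstable = flow_instability([20.0, 22.0, 30.0, 31.0])
```

Counting the `initiated_by` and `ended_by` entries across all windows for a group would give percentage breakdowns of the kind reported for each phytoplankton group.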
There were clear differences in the drivers of bloom dynamics for the four phytoplankton groups (Fig. 4). Diatom blooms were initiated by flow (41%), water temperature (35%), and sunlight (24%), and these parameters also controlled diatom bloom cessation, alongside phosphorus and silicon limitation (16%) and, occasionally, flow instability caused by storm events. The drivers for nano-chlorophyte blooms were very similar to the drivers for diatoms, but phosphorus limitation occasionally appeared to delay the onset of the nano-chlorophyte bloom during periods when a major diatom bloom had already become established. Major diatom blooms and their associated phosphorus depletion appeared to also be the main cause of delaying the pico-chlorophyte bloom. However, as discussed previously, this is probably due to the pico-chlorophytes being outcompeted or shaded out by large diatom biomass and their associated turbidity, rather than nutrient limitation directly. This assumption is further supported by the observation that sunlight was the most frequent control on pico-chlorophyte bloom commencement and cessation (34% and 42% of blooms, respectively). The causes of cyanobacterial blooms were very different from the other three phytoplankton groups. Their bloom dynamics were not influenced by nutrient depletion or flow instability, and were mainly controlled by water temperature (responsible for 75% of bloom commencements and 67% of bloom cessations). This clearly shows that cyanobacterial blooms are driven by high water temperature and associated high sunshine levels.

Modeled prediction of bloom periods (2019-2022)
To test how effective this approach could be in predicting future phytoplankton dynamics, the niche thresholds derived from the 2011 to 2018 dataset (Fig. 2; Table 2) were applied to the 2019-2022 data from the same River Thames study site. Periods when the flow, water temperature, light, and SRP concentration were all within the threshold were identified for each phytoplankton group and then applied to the cell concentration data for 2019-2022 (Fig. 5). The weeks of high phytoplankton cell concentration (above the 90th percentile) were identified and the effectiveness of the niche model at predicting these bloom periods was assessed using a confusion matrix approach (Fig. 6).
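The evaluation step can be sketched in a few lines of Python. This is an illustrative outline rather than the code used in this study: the function names and example values are hypothetical, the percentile is computed by linear interpolation (one of several common conventions), and which dataset the 90th percentile is taken from is left to the caller because it is an assumption here.

```python
import math
from typing import Dict, List, Sequence, Tuple

Thresholds = Dict[str, Tuple[float, float]]  # parameter -> (lower, upper)


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) using linear interpolation between ranks."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def predict_suitable(weeks: List[Dict[str, float]],
                     thresholds: Thresholds) -> List[bool]:
    """A week is predicted suitable only if every parameter is within range."""
    return [all(lo <= w[p] <= hi for p, (lo, hi) in thresholds.items())
            for w in weeks]


def confusion_matrix(predicted: Sequence[bool],
                     observed: Sequence[bool]) -> Dict[str, int]:
    """Count true/false positives and negatives, week by week."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, o in zip(predicted, observed):
        cm[("t" if p == o else "f") + ("p" if p else "n")] += 1
    return cm


def accuracy(cm: Dict[str, int]) -> float:
    """Fraction of weeks classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


# Hypothetical single-parameter example, for illustration only.
demo_thresholds = {"temp": (10.0, 20.0)}
demo_weeks = [{"temp": t} for t in (5, 6, 7, 8, 9, 9, 9, 12, 15, 16)]
demo_cells = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

demo_cutoff = percentile(demo_cells, 90)            # bloom cutoff
demo_observed = [c > demo_cutoff for c in demo_cells]
demo_predicted = predict_suitable(demo_weeks, demo_thresholds)
demo_cm = confusion_matrix(demo_predicted, demo_observed)
demo_accuracy = accuracy(demo_cm)
```

Because blooms are defined as weeks above the 90th percentile, bloom weeks are rare by construction, so accuracy is dominated by correctly classified non-bloom weeks; inspecting the individual cells of the matrix gives a fuller picture.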
The niche modeling approach was a robust and reliable method for identifying periods of phytoplankton growth (with accuracies for all phytoplankton groups of > 80%; Fig. 6). Almost every observed diatom, nano- and pico-chlorophyte bloom coincided with a period where thresholds were met (14 of the 16 observed diatom blooms, 21 of the 24 nano-chlorophyte blooms and 40 of the 41 pico-chlorophyte blooms) (Fig. 6).
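As a worked check of the counts quoted above, the short Python snippet below converts them to the fraction of observed blooms that fell within a predicted period. The variable names are illustrative, and these fractions are not the same quantity as the accuracies in Fig. 6, which also take correctly predicted non-bloom weeks into account.

```python
# (blooms coinciding with a period where thresholds were met, observed blooms)
bloom_counts = {
    "diatoms": (14, 16),
    "nano-chlorophytes": (21, 24),
    "pico-chlorophytes": (40, 41),
}

# Fraction of observed blooms that coincided with a predicted period.
hit_fraction = {group: hits / total
                for group, (hits, total) in bloom_counts.items()}

for group, frac in hit_fraction.items():
    print(f"{group}: {frac:.1%}")
```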
The niche threshold approach also correctly predicted the periods of cyanobacterial growth in 2019, 2020, and 2022, and the initial peak in 2021 (Fig. 5). The only bloom period that was not adequately predicted was the cyanobacteria peak in late August/early September 2021, with only 1 week of the 5-week period of elevated cyanobacterial cell concentrations being deemed suitable for growth, suggesting that the niche thresholds for cyanobacteria may need to be optimized to reflect these new data.
The niche model predicted that diatom and nano-chlorophyte blooms should have occurred in the autumn period of some years, as all environmental conditions were suitable. This occurred in October 2019 and September 2021 (Fig. 5), when diatom and nano-chlorophyte cell counts remained low. These over-predictions suggest that another parameter (such as zooplankton grazing) was suppressing the growth during this period, or that the late summer/autumn blooms are short-lived, lasting only a few days, as in August 2019 (Fig. 3; Supporting Information Fig. S4), and being missed by the weekly sampling regime used in this study. Another potential explanation is that the diatoms and nano-chlorophytes are not able to proliferate when other phytoplankton groups, such as pico-chlorophytes and cyanobacteria, are already established within the river.

Conclusions
The long-term application of flow cytometric characterization and enumeration of river phytoplankton has demonstrated that the seasonal succession of phytoplankton groups is relatively consistent, but the magnitude and timing of these blooms can vary greatly from year to year. Until now, the reasons for this inter-annual variation in river phytoplankton biomass have not been well understood, but the niche modeling approach developed and utilized in this study provides key system understanding of the timing and duration of these bloom dynamics for the River Thames. Phytoplankton abundances only increased when water temperature, light, flow, and nutrient concentrations were all within specific ranges, allowing bloom periods for individual phytoplankton groups and their annual succession to be predicted and understood. This confirms that these parameters are key to understanding river bloom dynamics in the River Thames, and additional parameters are not required.
The niche thresholds derived in this study imply that future climate change is likely to result in major changes in the timing and magnitude of phytoplankton succession in the River Thames. The forecasted wetter winters, and warmer spring and summer periods in southern England (Johnson et al. 2009) could affect diatom populations in particular. The period when flows are within the threshold for diatom net growth is likely to shift later in the year due to the time taken for the higher winter flows to subside. The period when water temperatures are within the threshold is likely to shift to earlier in the year, due to the projected increase in air temperatures in spring and summer. The resulting asynchrony between the flow and water temperature is likely to reduce the diatom growth period and, therefore, reduce the magnitude and duration of future blooms, which could have major impacts on aquatic food webs. Hotter, drier summers will result in higher water temperatures and lower flows, which are likely to shift the phytoplankton community toward pico-chlorophyte dominance and increase the periods of cyanobacterial blooms, which could have implications for human health and future water supply for London and the surrounding region.
Flow cytometry offers a simple and robust technique to quantify and characterize river phytoplankton communities at appropriate monitoring frequency. Similar applications to a range of rivers would increase our understanding of river phytoplankton dynamics and how niche thresholds for individual phytoplankton groups vary across river typologies and regions. In particular, the approach used in this study can provide the key data and system understanding to inform river algal modeling and prediction of climate change impacts. However, it is worth noting some limitations with our approach to characterize the phytoplankton community. Flow cytometry characterizes phytoplankton cells based on phenotypic measures of cell size, cell complexity, pigment type, and pigment concentration. The ability to distinguish and rapidly (and inexpensively) count phytoplankton cells means that flow cytometry has the potential to play an important role in generating the high-frequency and long-term datasets that are needed to identify niche thresholds. However, flow cytometry does not characterize phytoplankton at the species level, making it challenging to link these data with existing knowledge on the behavior and functional groupings of river phytoplankton (Abonyi et al. 2021). Recent developments in flow cytometry, including flow imaging and flow sorting, can potentially provide the link between conventional taxonomic identifications and flow cytometry-derived groupings, further improving this approach.
The use of niche thresholds can also lead to the development of early warning systems to predict the timing of major blooms and collapses (with potential associated dissolved oxygen sags), and periods of cyanobacterial blooms. This would greatly support the catchment management decisions needed to maintain aquatic ecosystems and safeguard drinking water supplies in response to population growth, increasing water demand, and future climate change.

Fig. 1. Topographic map of the River Thames basin, showing the location of the study site at Runnymede. Red star = Meteorological station at Heathrow. Red circle = UKCEH Meteorological station at Wallingford.

Fig. 2. Multiple parameter plots of weekly phytoplankton cell increases in the River Thames at Runnymede, and water temperature, mean daily river flow, and average daily solar radiation over the previous 3 d (2011-2018 calibration period).Circle size indicates the increase in cell concentrations over the preceding week.Only positive weekly growth rates are presented.Niche thresholds are indicated by dashed boxes.

Fig. 3. Diatom concentrations in the lower River Thames at Runnymede, alongside water temperature, flow, daily average solar radiation (over previous 3 d), and nutrient data.Gray columns indicate periods of elevated cell concentrations, with vertical dotted lines marking concentration peaks.Horizontal dashed lines indicate upper and lower niche thresholds for diatom growth for water temperature (red), river flow (blue), and solar radiation (yellow).

Fig. 5. Phytoplankton cell concentrations at the River Thames at Runnymede for the period 2019-2022. Shaded blue boxes and vertical lines indicate multiple and single weeks (respectively) where all niche thresholds for phytoplankton growth were met. Dotted lines indicate the 90th percentile for observed cell abundances.

Fig. 6. Confusion matrices to evaluate the performance of the niche model application to the 2019 to 2022 test dataset. Bloom periods are defined as cell abundances > 90th percentile.

Table 1. Correlation coefficients of physicochemical parameters and chlorophyll/phytoplankton groups for the River Thames at Runnymede. Red shading = negative correlation. Blue shading = positive correlation.