Macro‐invertebrate Community Response to Multi‐annual Hydrological Indicators

Flow is widely considered one of the primary drivers of instream ecological response. Increasingly, hydroecological models form the basis of integrated and sustainable approaches to river management, linking flow to ecological response. In doing so, the most ecologically relevant hydrological variables should be selected. Some studies have observed a delayed macro-invertebrate (ecological) response to these variables (i.e. a cumulative inter-annual effect, referred to as multi-annual) in groundwater-fed rivers. To date, only limited research has investigated this phenomenon. This paper examines the ecological response to multi-annual flow indicators for a groundwater-fed river. Relationships between instream ecology and flow were investigated by means of a novel methodological framework developed by integrating statistical data analysis and modelling techniques, such as principal component analysis and multistep regression approaches. Results demonstrated a strong multi-annual multi-seasonal effect. Inclusion of additional antecedent flow indicators appears to enhance overall model performance (in some cases, goodness of fit statistics such as the adjusted R-squared value exceeded 0.6). These results strongly suggest that, in order to understand potential changes to instream ecology arising from changing flow regimes, multi-annual and multi-seasonal relationships should be considered in hydroecological modelling. © 2017 The Authors. River Research and Applications published by John Wiley & Sons Ltd.


INTRODUCTION
The relationship between flow regime and instream ecological health has been the focus of significant recent research (e.g. Lytle and Poff 2004; Arthington et al. 2006; Dudgeon et al. 2006; Monk et al. 2008; Worrall et al. 2014). Freshwater aquatic systems support the provision of many key ecosystem services, including clean (drinking) water, flood protection, food, recreation, wild species habitat and support for interconnected systems (UK NEA 2011). Within the context of the provision of services, there is a clear conflict between the ecological and anthropogenic demands placed upon lotic ecosystems. Since the 1940s, efforts have been made to quantify the minimum flows required to protect freshwater fluvial ecosystems, leading to the more recent environmental flows research (e.g. Petts 2009). Environmental flows can be defined as the 'quantity, timing and quality of water flows required to sustain [freshwater ecosystems] and the human livelihoods and well-being that depend on these ecosystems' (Hirji and Davis 2009, pp. 13 and 14). It is understood that natural variability in flow is critical for the preservation of aquatic ecosystems (Dudgeon et al. 2006) and maintenance of this variability is critical to this research. In order to help balance conflicting requirements often placed on lotic ecosystems, and to further research in the field, accurate modelling is essential.
The use of numerical models (both process and data-driven models) that link flow and freshwater ecological response is a well-established technique for investigating instream response to flow changes (Dunbar et al. 2007). Hydrological descriptors and ecological data can serve as the basis for the development of such models (Richter et al. 1996; Arthington et al. 2006; Monk et al. 2008). Macro-invertebrates are particularly sensitive to change (in water chemistry/quality, physical habitat and flow regime) whilst exhibiting a clear response to environmental perturbations, making them ideal biological indicators (Acreman et al. 2008; EA 2013). As such, macro-invertebrates (e.g. through standard scoring techniques) commonly serve as a proxy for ecological response and can be linked to hydrological or hydraulic variables in order to test their response to a changing flow regime (e.g. Extence et al. 1987; Dunbar and Mould 2009). The Lotic-invertebrate Index for Flow Evaluation (LIFE) is a weighted index taking into account macro-invertebrate community flow velocity preferences (Extence et al. 1999), making it well suited for such applications. Hydroecological data sets are created by linking the ecological data (such as LIFE score) with hydrological indicators (e.g. mean flow, Q10 and Q95) from the period immediately preceding the sampling. This method has been employed in many studies over the past two decades; see, for example, Clausen and Biggs (1997), Monk et al. (2006), Exley (2006), Monk et al. (2008), Dunbar et al. (2010) and Worrall et al. (2014).
In models, the flow can be expressed as a continuous time series or discrete hydrological indicators representing inter-annual or intra-annual variation (defined as the between year and within year flow components, respectively). If discrete indicators are chosen, then the identified variables must be hydrologically, ecologically or biologically relevant. These indicators are frequently identified and refined through statistical approaches such as principal component analysis (PCA) redundancy (Olden and Poff 2003; Monk et al. 2008). Where intra-annual variation has been the focus, such as flooding, the conditions immediately preceding sampling tend to be at the exclusive centre of the research (e.g. Greenwood and Booker 2015). This may overlook the cumulative effects of antecedent flow conditions in the preceding seasons and years (Durance and Ormerod 2007), that is, the multi-year, or multi-annual, effect. This is particularly true for rivers with higher Base Flow Indices (BFIs) (groundwater-fed) where there may be a lag in macro-invertebrate community response following extreme hydrological events (i.e. floods and droughts) (Boulton 2003; EA 2005). This lag represents a delayed response of the community to antecedent flow conditions (over seasonal and/or annual timescales). Such lag has been seen to characterize strong ecological responses, specifically in the case of extreme flow disturbances (Wood and Armitage 2004; Wright et al. 2004). To date, limited work has been carried out to explore the effects of these lags on the hydroecological relationship (e.g. Clarke and Dunbar 2005). Rivers around the globe derive their streamflow from a variety of sources, including a significant contribution from groundwater/aquifers (although this contribution is highly variable both spatially and temporally). Lags in ecological response within groundwater dominated systems may therefore be of crucial interest.
In order to better model flow variability, and hence improve current understanding of hydroecological relationships for groundwater rivers, this paper aims to examine the presence of lag in the hydroecological relationship (using LIFE scores as a proxy). These relationships are assessed using a long-term (21-year, 1993-2014) paired hydrological and ecological data set for a groundwater dominated system (River Nar, Norfolk, UK). Multi-annual and multi-seasonal flow variables are intended to account for both the cumulative (inter-annual) and seasonal (intra-annual) flow effects.
The multi-annual aspect of the hydroecological relationship (lag) is systematically explored within the proposed statistical modelling framework through the addition of time-offset hydrological variables. Thus, the key objectives are the following: (1) To identify and develop a suitable statistical modelling framework exploring the multi-annual and multi-seasonal aspect of the hydroecological relationship (a lag in response); (2) To examine the influence of seasonal low/high flows within the relationship; and (3) To explore practical channels for wider implementation of the framework.

Study area
The groundwater-fed River Nar (Norfolk, UK; Figure 1), one of southern England's highly valued chalk streams, serves as the focus for this study. The high BFI of the river, the length of the hydroecological data set, and prior observations of lag in ecological response (Visser 2015; Garbe et al. 2016) make the Nar an ideal candidate for study. The River Nar rises in the Norfolk chalk hills 60 m above sea level, flowing west for 42 km, transitioning from a steep to a far gentler gradient at Narborough (Figure 1). This topography and underlying geology give rise to two very different ecosystems. Upstream of Narborough, the Nar flows as a (groundwater-fed) chalk river; thereafter, the chalk has been eroded, forming a fen basin (Figure 1). This distinctive change at the river's midpoint has led to its designation as a Site of Special Scientific Interest. Because of the presence of the two 'distinct river units', the chalk and fen river sections are considered distinct entities, with the focus of this paper falling on the chalk reach only. The Nar is subject to significant low flow stresses, further amplified by over-abstraction and extensive channel modification, thereby inhibiting the river's ecological potential (NRT 2012).
The River Nar has a BFI of 0.91 (CEH 2014). This dependence on groundwater results in a highly seasonal flow regime; aquifer recharge primarily occurs in autumn, resulting in a progressive rise in river flow until March/April. Chalk rivers are typified by their relatively low flows (Figure 2). For the available record, the average mean flow is 1.133 m³/s, whereas Q10 and Q95 (the daily streamflow values that are exceeded 10% and 95% of the time) are 2.046 m³/s and 0.387 m³/s, respectively (CEH 2014).

Data
Macro-invertebrate biomonitoring data were made available by the Environment Agency for 10 sites on the River Nar (six of which are situated in the chalk reach) (Figure 1) (EA 2015). For modelling purposes, these data were utilized in the form of LIFE scores. Following Dunbar et al. (2006), species level LIFE scores were utilized for both the spring (April-June) and autumn (October-December) seasons, when peaks in macro-invertebrate activity are observed (Lenz 1997). To effectively accommodate the different relationships expected for the spring and autumn macro-invertebrate life cycles, seasons were considered as two separate scenarios (Figure 3). In order to make the site data comparable, the seasonal biotic data were ratio standardized per site.
Daily mean flow data were extracted from the National River Flow Archive for the Marham gauge (TF723119) between 1958 and 2014 (CEH 2014). Typically, a multitude of flow variables is derived (Richter et al. 1997); however, in the first instance, this work focuses on basic flow exceedance variables (Q10 and Q95) in order to establish simple interpretation of the hypothesized relationship with multi-annual antecedent flows. Daily flows for the time period (1989-2014; Figure 2) are converted into seasonal (summer: April to September; winter: October to March) flow variables using flow duration analysis. Flow variables are statistically standardized (normalized).
The ecological and the four seasonal flow variables (summer/winter Q10 and Q95) are paired, as is normal (after Monk et al. 2008), and the data pooled to produce aggregated regression models. To account for the lag in response, these flow variables are time-offset by a year (t − 1) to a maximum of 3 years (t − 3) (Table I). Previous work (Visser 2015) trialled a time-offset up to t − 5 and found that the predictive power of the models plateaued at t − 3 years. Additionally, adding variables significantly increased computational demands because of an impractical number of variable combinations (Table I).
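The seasonal flow duration step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the linear interpolation between ranked flows, and the convention of labelling each winter (October to March) by the year in which it starts are all assumptions made here.

```python
from datetime import date

def exceedance_flow(flows, exceed_pct):
    """Flow exceeded `exceed_pct` % of the time (Q10 is a high flow, Q95 a low flow)."""
    ordered = sorted(flows)
    # A flow exceeded p% of the time is the (100 - p)th percentile of the record.
    rank = (100.0 - exceed_pct) / 100.0 * (len(ordered) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    # Linear interpolation between neighbouring ranked flows (an assumption).
    return ordered[lo] + (rank - lo) * (ordered[hi] - ordered[lo])

def season_of(day):
    """Return (season, season_year); summer is April-September, winter October-March."""
    if 4 <= day.month <= 9:
        return ("summer", day.year)
    # Assumed convention: a winter is labelled by the year in which it starts.
    return ("winter", day.year if day.month >= 10 else day.year - 1)

def seasonal_indicators(daily):
    """Group (date, flow) pairs by season and compute Q10 and Q95 for each group."""
    groups = {}
    for day, flow in daily:
        groups.setdefault(season_of(day), []).append(flow)
    return {
        key: {"Q10": exceedance_flow(values, 10), "Q95": exceedance_flow(values, 95)}
        for key, values in groups.items()
    }

# Hypothetical daily flows (m³/s) for part of one summer.
example = [(date(2000, 4, d), 0.5 + 0.01 * d) for d in range(1, 31)]
print(seasonal_indicators(example))
```

The resulting seasonal values would still need to be standardized before modelling, as the text notes.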
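The time-offsetting described above can be illustrated with a short Python sketch. It assumes that each seasonal indicator is stored as a mapping from year to value and that a sample is dropped when any of its antecedent years is missing; the variable labels (for example `QS10(t-1)`) and the function name are illustrative only, and the exact alignment of t with the sampling season is given in Table I rather than here.

```python
def lagged_predictors(indicators, sample_year, max_lag=3):
    """Build one row of predictors: every indicator at t, t-1, ..., t-max_lag.

    `indicators` maps an indicator name to a {year: value} dictionary.
    Returns None when the flow record does not reach back far enough.
    """
    row = {}
    for name, series in indicators.items():
        for lag in range(max_lag + 1):
            year = sample_year - lag
            if year not in series:
                return None  # incomplete antecedent history for this sample
            label = f"{name}(t)" if lag == 0 else f"{name}(t-{lag})"
            row[label] = series[year]
    return row

# Hypothetical standardized indicators for a handful of years.
indicators = {
    name: {year: 0.1 * (year - 1990) for year in range(1990, 1995)}
    for name in ("QS10", "QS95", "QW10", "QW95")
}
print(len(lagged_predictors(indicators, 1993)))
```

With four seasonal indicators and offsets from t to t − 3, each row holds 16 predictors, which matches the size of the full variable set used later in the paper.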

Data screening
Pressures, resulting in anomalous data points, are known to prevent the detection of relationships between antecedent flow and LIFE score (Clarke and Dunbar 2005). Therefore, those sites affected by issues such as low water quality, sediment ingress, sampling issues or other sources of variability were excluded from this work; a total of three sites were removed. Removal criteria were in accordance with Clarke and Dunbar (2005, p. 16), Dunbar et al. (2006, pp. 1 and 2) and Dunbar and Mould (2009, pp. 1-3).

Modelling and statistical analysis
The aim of multistep regression modelling is to assess a complete suite of candidate models (which can be obtained from different combinations of predictor variables in the modelling runs) and to identify the candidates that are both statistically significant and offer sufficient predictive power. Here, a model represents any candidate that achieves significance (p < 0.05). In the context of the present paper, models should encompass lag in the hydroecological relationship with LIFE. The potential for a multi-seasonal aspect to the relationship was assessed via various combinations of the seasonal flow variables (summer and winter). The putative multi-annual aspect was considered via the introduction of their associated time-offset flows.
To effectively integrate multi-seasonal and/or multi-annual aspects of antecedent flows into the hydroecological relationship, the proposed modelling framework integrated up to 16 variables, shown in Table I. The derivation of this framework is summarized in Figure 3. All analysis was performed using R, an open source software environment for statistical programming and graphical analysis (R Core Team 2016); where a pre-existing package was employed, it is referenced as appropriate.
Multistep (or multistage) regression modelling is a popular technique for reducing the number of predictor variables in large data sets (Wasserman and Roeder 2009). In each step, regression of different variable combinations is considered, resulting in a number of candidate models. The variable combinations are determined by the method applied: forward, backward or stepwise selection. Forward selection is the simplest of the three, where variables are added one at a time and the variable's contribution to the candidate model is assessed against a threshold or stopping point. When a variable has been added to the candidate model, it cannot be removed. In backwards selection, the variables are removed one at a time, but here, the variable with the smallest contribution is removed at each step. Stepwise selection is the most exhaustive of the three, where variables may be both added and removed at each step.
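To make the forward selection procedure concrete, the following Python sketch adds one variable at a time and never removes a variable once it has been added. It is a generic illustration under stated assumptions, not the code used in the study: ordinary least squares is solved with a small hand-written Gaussian elimination, and the stopping point is taken to be the first step at which the Bayesian information criterion (computed up to an additive constant for a Gaussian linear model) fails to improve.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on an intercept plus `columns`."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(b * v for b, v in zip(beta, row))) ** 2
               for i, row in enumerate(design))

def bic(columns, y):
    """BIC of a Gaussian linear model, up to an additive constant."""
    n = len(y)
    return n * math.log(rss(columns, y) / n) + (len(columns) + 1) * math.log(n)

def forward_select(predictors, y):
    """Add the variable that lowers BIC most; stop when no addition improves it."""
    chosen, best = [], bic([], y)
    while True:
        trials = [(bic([predictors[c] for c in chosen + [name]], y), name)
                  for name in predictors if name not in chosen]
        if not trials:
            break
        score, name = min(trials)
        if score >= best:  # assumed stopping rule: no further BIC improvement
            break
        chosen.append(name)
        best = score
    return chosen, best
```

The sketch assumes the predictors are not perfectly collinear and that no candidate fits the response exactly; either case would need explicit handling in practical use.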
Here, three different approaches were considered, using both forward (approach 1) and stepwise selection (approaches 2 and 3) on three subsets of the hydrological variables; these are summarized in Figure 3 and discussed next. The presence of any lag in the hydroecological relationship was first identified using the simplest and broadest statistical methods. The initial variable subset provides an overall view, consisting of the combinations of two seasonal variables (and their associated time-offsets), summarized in Table II. These candidates are considered through the application of forward selection. This was followed by two, more sophisticated, stepwise approaches (Figure 3), with the focus on optimizing the modelling. The stepwise selection was applied using the R package 'leaps' (Thomas Lumley, using Fortran code by Alan Miller, 2009), using the object 'leaps.exhaustive' to determine the best model variable combinations. One of the first tasks in hydroecological modelling is to reduce the level of hydrologic variable redundancy, thereby simplifying the analyses. To this end, PCA for variable selection (after Olden and Poff 2003) was applied, using broken-stick as the stopping rule (Jackson 1993). This PCA-reduced variable subset was modelled in the second approach (Figure 3). Monk et al. (2007, p. 113) cast doubt over the use of PCA for hydrological variable selection, stating that it is necessary to 'exercise caution when employing data reduction/index redundancy approaches, as they may reject variables of ecological significance'. Greater scepticism arises because the approaches taken here depart markedly from other work. Therefore, seeking completeness, the full set of 16 variables was considered for the final iteration (Figure 3, approach 3).
Multistep regression techniques are often criticized because of their (frequently) automatic nature and concerns over the robustness of the selection algorithm (Wasserman and Roeder 2009). For example, this lack of user interaction can lead to convergence on a poor model. Here, model selection was assessed semi-automatically via a custom algorithm requiring user input; this dialogue helps retain the awareness of the user during the multistep process. The 'best' models were then selected on the basis (in order of importance) of their Bayesian information criterion (BIC) score, the power of the model (adjusted R²) and the data input requirements. BIC is an assessment of the relative 'goodness' of models based upon log-likelihood and penalty terms (Raftery 1995), thereby allowing the selection of the simplest model whilst not sacrificing accuracy excessively. The criterion provides a measure of the weight of evidence in favour of particular models. The goodness of fit, R-squared (R²), is not presented because of a tendency for overfitting in multiple regression models (Yin and Fan 2001). To account for this, the adjusted R-squared (adjusted R²) [based upon the frequently used Wherry formula-1 (Yin and Fan 2001)] is quoted instead. The power or fit of models can potentially be improved through the removal of redundancy and/or noise using PCA, which allows the user to retain most of the variability in the data through the first few components. The stopping point was determined using the broken-stick method. Factor selection modelling, essentially regression models composed of the principal components, was then applied as before.
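The broken-stick stopping rule mentioned above can be sketched as follows. The sketch assumes the usual formulation, in which the expected share of variance for component k of p is (1/p) Σ 1/i for i from k to p, and a component is retained only while its observed share exceeds that expectation; the function names and the example eigenvalues are illustrative, and obtaining the eigenvalues from a PCA is left to whichever tool is in use.

```python
def broken_stick(p):
    """Expected share of total variance for each of p components under the broken-stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def components_to_keep(eigenvalues):
    """Count the leading components whose observed variance share exceeds the expectation."""
    total = sum(eigenvalues)
    observed = [ev / total for ev in sorted(eigenvalues, reverse=True)]
    kept = 0
    for obs, expected in zip(observed, broken_stick(len(eigenvalues))):
        if obs <= expected:
            break  # stop at the first component that does not beat the broken stick
        kept += 1
    return kept

# Hypothetical eigenvalues from a PCA of four standardized flow variables.
print(components_to_keep([2.2, 1.4, 0.3, 0.1]))
```

The retained components can then be inspected for their highest-loading variables, or used directly as regressors in a factor model.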
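For reference, the two selection statistics named above are commonly written as shown below. These are the standard textbook forms and are given here as an assumption about what was computed, since the source does not print them: n is the number of observations, k the number of estimated parameters, p the number of predictors, and L̂ the maximized likelihood of the model. The adjusted R-squared expression is the form usually attributed to Wherry formula-1.

```latex
\[
\mathrm{BIC} = -2\ln\hat{L} + k\ln n
\qquad\text{and}\qquad
\bar{R}^{2} = 1 - \left(1 - R^{2}\right)\frac{n-1}{n-p-1}
\]
```

Lower BIC values indicate the preferred model, and the penalty term k ln n is what discourages the addition of variables that contribute little to the fit.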

RESULTS
Three approaches to the modelling were considered. Each approach was applied to two distinct scenarios, spring and autumn; results from these scenarios should be considered as distinct. Because of the large numbers of models produced, only the five 'best' models are discussed (selected by the supporting weight of evidence, ΔBIC). The first approach is an exception, because of the reduced number of candidates, and only the three best models are presented. The model naming convention references the scenario, approach number and model ranking. For example, model S3.1 is a model from the spring scenario, derived using the third approach, and is the best model from that approach.
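The weight-of-evidence grading used to rank models can be expressed as a small lookup. The sketch below assumes the thresholds commonly quoted from Raftery (1995), namely a BIC difference of 0-2 as weak, 2-6 as positive, 6-10 as strong and above 10 as very strong; how boundary values are assigned, and the function name, are assumptions made for illustration.

```python
def raftery_grade(delta_bic):
    """Grade the evidence for one model over another from their BIC difference."""
    difference = abs(delta_bic)
    if difference < 2:
        return "weak"
    if difference < 6:
        return "positive"
    if difference < 10:
        return "strong"
    return "very strong"

# Hypothetical BIC differences between pairs of candidate models.
for value in (1.5, 4.0, 8.2, 14.7):
    print(value, raftery_grade(value))
```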
Approach 1-initial variable subset
The first approach was based upon a subset considering all of the hydrological variables. The number of candidates was limited to 112 (Table II) as the purpose of this first approach was to determine the presence of lag in the hydroecological relationship. The summary statistics associated with the three best models are summarized in Tables III and IV, for the spring and autumn scenarios, respectively. The model structures are summarized in Figure 4. Principal component analysis was applied to the best models from each scenario. Factor models were then produced from the most relevant principal components (determined using the broken-stick method). The associated summary statistics are also included in Tables III and IV.

Approach 2-principal component analysis-reduced variable subset
In this iteration, PCA for variable reduction was applied. The spring and autumn scenarios variable subsets were reduced from 16 to 6 as follows: Spring: QS10(t), QS95(t − 1), QW10(t − 1), QW10(t − 3), QW95(t − 1) and QW95(t − 3); Autumn: QS10(t), QS10(t − 3), QS95(t), QS95(t − 3), QW10(t − 1) and QW95(t − 1). The total number of candidate models was reduced to 63, this time considered through stepwise selection. The summary statistics associated with the five adjudged best models for the spring scenario are summarized in Table III; the model structures are summarized in Figure 4. No models were derived for the autumn scenario. The summary statistics for the reduced dimension factor models are also available in Table III.

Approach 3-all variables
This final approach considered all 16 hydrologic variables, for a total of 65 535 possible candidates. Stepwise selection reduced this to a manageable scale. The summary statistics associated with the five best models, from each scenario, are summarized in Tables III and IV. The model structures are summarized in Figure 4. The summary statistics for the reduced dimension factor models are available in Tables III and IV.
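The candidate counts quoted for approaches 2 and 3 are consistent with counting every non-empty subset of the available variables, that is, 2ⁿ − 1 for n variables. The short Python check below illustrates this arithmetic; the function name is illustrative, and the figure of 112 candidates for approach 1 follows a different construction (Table II) that is not reproduced here.

```python
from math import comb

def count_candidates(n_variables):
    """Number of non-empty variable subsets, summed over every subset size."""
    return sum(comb(n_variables, size) for size in range(1, n_variables + 1))

# Six PCA-selected variables (approach 2) and all 16 variables (approach 3).
print(count_candidates(6), count_candidates(16))
```

The steep growth of this count with n is the practical reason given in the paper for capping the time-offset at t − 3.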

DISCUSSION
Approach 1-initial variable subset
The primary aim of the first approach was to detect if lag in the hydroecological relationship for LIFE was present. Tables III and IV show this to be true. In fact, out of 224 combinations (for both scenarios), 147 of the candidates represented viable models (i.e. achieved significance).
[Table note (A. Visser et al.): The R² column is the adjusted R-squared, and the weight of evidence is Raftery's (1995) grading of model quality based on ΔBIC. The reduced dimension factor models for each of these are presented on the right; where models consist of only one variable, no factor model is possible. The second approach (principal component analysis-reduced variable subset; A2) featured no models.]
In the case of the spring scenario, the weight of evidence is relatively low for models S1.2 and S1.3, whereas the adjusted R-squared is similar for all three (Table III). This example clearly illustrates the important role of BIC in selecting the best models. Regarding the model structure (Figure 4), summer Q10 flows feature most strongly, whereas the presence of winter Q95 variables illustrates the critical nature of winter low flows.
The factor models composed of the principal components may or may not improve the interpretability of the data. In this case, the models were identical for S1.1 and S1.2 (Table III), whereas S1.3 showed improvement both in the weight of evidence and adjusted R-squared. This improvement suggests that it is the strongest model available for the spring scenario. By reducing redundancy, a more efficient model is produced. This is particularly encouraging as it is a purely procedural change with no further data requirements.
For the autumn scenario, the weight of evidence in favour of the three best models is considerably increased (Table IV). This represents the best possible outcome in terms of confirming the presence of lag in the hydroecological relationship. It should be noted that although model A1.1 achieves the highest BIC weighting, it does not feature the highest adjusted R-squared (as seen previously for the spring scenario). The autumn models show no relationship with winter flows; rather, they relate more strongly to summer flows (Figure 4). The factor models show no change, being identical to the best models (Table IV).

Approach 2-principal component analysis reduced variable subset
After confirmation of the presence of the hypothesized relationship, this first iteration sought to improve upon the methods and models via reduced redundancy. The redundant variables were removed through Olden and Poff's (2003) 'PCA redundancy approach', reducing the number of variables from 16 to 6 for both scenarios. For this approach, a broader range of candidates is considered through stepwise selection.
Here, the models for both scenarios are unsatisfactory (Tables III and IV). They exhibit no overall improvement over those produced using the more limited methodology of approach 1. Only the factor model for S2.5 shows an improvement in the weight of evidence. There is limited value in a lengthy consideration of these models because of their poor quality. Examination of the variable subsets for spring in approaches 1 and 2 (see Table II and section on Approach 2-Principal Component Analysis-reduced Variable Subset) suggests that the PCA did not capture the ecologically relevant variables, a concern cited by Monk et al. (2007) previously. (Monk concluded that subtle factors beyond the dominant sources of statistical variation may be more influential.) This argument is further bolstered by the fact that the autumn candidates were unable to present any significant combinations.

Approach 3-all variables
In light of the results from approach 2, and for the sake of completeness, a final iteration considering all 16 variables was applied. The weight of evidence in favour of the best models produced in this final iteration shows it to be the most successful in capturing the LIFE-correlated lagged hydroecological relationships (Tables III and IV). The models for the spring scenario are most notable, with the BIC weight of evidence exceeding Raftery's (1995) highest grading. The corresponding adjusted R-squared for each model is similarly positive. This is particularly interesting when compared with corresponding values presented in the literature: multiple catchment studies such as Clarke and Dunbar (2005), Dunbar et al. (2006) and Monk et al. (2007) achieved values of between 0.2 and 0.3; despite focussing on the River Itchen exclusively, Exley (2006) also achieved values of around 0.3. For this scenario, again, the factor models provided no improvement.
The approach 3 spring scenario models were best overall. In particular, they show considerable improvement over those from approach 1. In light of this, it is not surprising that variable inclusion in the models has evolved (Figure 4). However, the focus on summer high flows remains, featuring the largest coefficients (Figure 4). Given that spring and summer months tend to be a very active period for macro-invertebrates (Lenz 1997), it follows that summer flows have a strong influence over spring LIFE scores and, by extension, ecosystem health. The observed negative relationship with summer flows (beyond the immediately preceding antecedent flow, t) suggests that naturally occurring high and low flows exert a moderating effect on LIFE scores (Lytle and Poff 2004). As discussed previously, the emphasis on winter flows highlights their importance for aquifer recharge and, by extension, LIFE scores.
The best models for the autumn scenario also occur as a result of the third approach (Table IV). Despite a lower overall quality of models, the reduced number of variables (Figure 4), and hence data requirements, is appealing. Again, the models retain a very strongly positive dependence upon summer high flows, here some orders of magnitude greater than the others. The models also reveal a strongly inverse influence of winter low flows (t − 1) (Figure 4). This flow is that which occurs at the time of the autumn macro-invertebrate sampling.
The autumn factor models exhibit an improvement for one case, A3.3, resulting in the best overall model. This highlights that, although there are no guarantees that factor models will improve model quality, there is some value in their implementation, particularly as it requires no additional data.

Implications
This work illustrates clearly the significance of accounting for lag (in the form of multi-annual and multi-seasonal flow variables) in the LIFE hydroecological relationship (in the River Nar). Of the 26 best models identified, only three relied on the direct antecedent flow (i.e. utilized one previous year of flow data). Further, overall, these were some of the poorest models produced (in terms of the model quality, BIC). This suggests that, in this case of a groundwater-fed river, it could be presumed that a single year of antecedent flows overlooks critical information.
The principal difficulty in the use of the multi-annual and multi-seasonal flow variables could be the potential data requirements. This work suggests that, for effective modelling of the spring scenario, a consistent suite of 4 years of data is required (Table III). In contrast, the autumn scenario requires much less input with just 2 years (Table IV). However, it is made clear that, by accounting for the multi-seasonal multi-annual variation in flow in the modelling framework, the models can be significantly improved through better representation of the natural variability of the river system, an understanding of which is fundamental to active maintenance of any riverine system's ecological integrity (Petts 2009).
Incidentally, the methods employed also highlight the need to consider more comprehensive statistical approaches when embarking on modelling of this type. The failure of approach 2, where PCA was used to identify variable redundancy, further stresses the need to exercise caution. The authors would thus promote consideration of modelling both with and without this redundancy technique. This is not to say that PCA techniques have no application potential; the factor models did (on occasion) provide some improvement to the best models.
Considering the wider impact of the present work, the choice of ecological season to model plays an important role. This choice is typically made based upon the goals of the modelling. For example, brown trout is a key species in the River Nar, being valued highly by the local fishermen (Garbe et al. 2016). One of their primary food sources is the mayfly (Ephemeroptera: Baetidae), which hatches during the spring season. Therefore, in the Nar, if environmental flows were to be set to promote the brown trout population, the modelling efforts should focus on spring. It may also be possible that the consideration of additional, more ecologically relevant, hydrological variables [selected through the Indicators of Hydrologic Alteration method (Richter et al. 1996)] may reduce data requirements. (The application of the Indicators of Hydrologic Alteration forms part of the body of future work.) However, this may simply be dependent upon the type of river under consideration.
The outcomes of this work appear particularly pertinent to water resource planning and environmental flows research. Better understanding of longer-term hydroecological relationships allows for enhanced resilience. This is particularly relevant in the case of climate change where the outlook is uncertain. The simple application of the methods applied herein, easily replicated using R, or another programming language, makes it both accessible and replicable. It is hoped that this can be simplified further still in the future through a framework or package. However, before it can be considered for general use, there is need for further work considering other more ecologically relevant hydrologic variables as well as application to other rivers.

CONCLUSIONS
The variability of the natural flow regime, particularly floods and droughts, is known to be critical to ecological health (Lytle and Poff 2004). For rivers with a higher BFI (groundwater-fed), there may be lags in ecological response to this variability (Boulton 2003). Currently, the majority of research focuses on the inter-annual hydrologic variation that immediately precedes ecological sampling, and in neglecting a broader temporal view, may be failing to present a true picture of the reality. The research presented herein has taken a multi-annual (cumulative inter-annual) and multi-seasonal (direct intra-annual) approach, using a groundwater-fed river with a high BFI to explore these patterns (using simple hydrological variables as proof of concept).
The first aim of this study was to identify whether there was a multi-annual LIFE-correlated hydroecological relationship in evidence in the case study river, the River Nar. The dimensionality of the data set required the derivation of a new methodology, explored through three approaches. Two scenarios were considered in order to account for the different macro-invertebrate life cycles. The best and strongest relationships were seen to occur for the spring scenario, using the third approach. Relative to other studies (e.g. Clarke and Dunbar 2005; Dunbar et al. 2006; Exley 2006; and Monk et al. 2007), the strength of these relationships is strongly suggestive of a positive multi-annual hydroecological relationship.
The second priority was to examine which flows resulted in the strongest relationships. Throughout, the best models featured primarily high flows. This may be due to the relatively high variation of high flows when compared with low flows. Unexpectedly, the most critical high flows appeared to occur in summer as opposed to winter, when aquifer recharge occurs. However, these findings do not suggest that winter aquifer recharge is unimportant, as the magnitude of summer high flows is ultimately dependent upon this recharge. The importance of low flows is also evident.
Finally, the findings suggest that the additional hydroecological data requirements may vary. For the spring scenario, a total of 4 years of antecedent flow data would be required, whereas for autumn, only 2 years were required. This reduction in data input was, however, at the cost of model power. This study focussed on simple hydrological variables. Further work should broaden this data set and consider further river types, for which the ecological relevance of the lag may differ.
An incidental conclusion of the work was the role of PCA. PCA is frequently used to reduce hydrologic variable redundancy. Concerns regarding this approach have been raised in the past (Monk et al. 2007), with the findings here supporting this (approach 2). However, the use of factor models (principal component regression modelling; approach 3) showed positive results in some situations. This is of particular interest because it requires no additional data.
Overall, this research has demonstrated the presence of a positive multi-annual hydroecological relationship. These results suggest that current methods that focus on inter-annual and intra-annual relationships in their current format (immediately preceding ecological sampling) may underestimate the response. Consideration of a broader temporal scale, with a more comprehensive statistical approach, appears likely to result in a more complex understanding of ecological response.