Reply to Comment by W. Knoben and M. Clark on “The Treatment of Uncertainty in Hydrometric Observations: A Probabilistic Description of Streamflow Records”

In this Reply, we address the concerns of Knoben and Clark (2023, https://doi.org/10.1029/2022WR034294) or KC23 that “the assumptions needed to effectively use difference‐based variance estimation methods are not always met by hourly streamflow records.” There should be little doubt that the assumptions of our difference‐based estimator will sometimes be violated in hourly streamflow records but the results from de Oliveira and Vrugt (2022, https://doi.org/10.1029/2022wr032263) and confirmed by the findings in our Reply show that such violations are sporadic enough not to affect much the error variance estimates. Snowmelt as pointed out by KC23 (https://doi.org/10.1029/2022WR034294) may not have received sufficient attention in our paper, yet their 365‐day record is simply not long enough to demonstrate bias of our discharge error variance estimates (and their dependence on flow level). This would require analysis of a much longer, multi‐year, streamflow record of a snowmelt‐driven watershed. The snowmelt catchment analyzed in dOV22 (https://doi.org/10.1029/2022wr032263) did not demonstrate bias in the discharge error variance estimates. We also provide additional clarification on the interpretation of the variance estimates obtained with the nonparametric estimator, and discuss the main issues in the test case presented in Knoben and Clark (2023, https://doi.org/10.1029/2022WR034294)).


Introduction
expressed some concerns about the method proposed in Vrugt et al. (2005, hereafter V05) and de Oliveira and Vrugt (2022, hereafter dOV22) to estimate the magnitude of aleatory errors in discharge times series.KC23 stated that "the assumptions needed to effectively use the statistical approach of V05; dOV22 are not always met by hourly streamflow records."We certainly do not rule out the possibility that the assumptions of our difference-based estimator will be violated at times in hourly streamflow records but the results from de Oliveira and Vrugt (2022) and confirmed by the findings in this Reply, demonstrate that such anomalous events do not affect much the error variance estimates.The difference-based variance estimator will provide a robust characterization of the magnitude of aleatory errors and its dependence on flow level if the discharge record is long enough.The 1-year streamflow record of KC23 does not satisfy this length requirement.Furthermore, the suggestion by KC23 that their anomalous event (snowmelt) is expected to repeat every year does not demonstrate bias of our estimator as it disregards the premise of difference-based estimation.Differencing the discharge time series only once will eliminate (or reduce) trends and seasonality.Each additional differencing step removes more memory from the time series ultimately leaving us with a record of random perturbations provided that (a) the data generating process  () is sufficiently smooth and (b) the measurement frequency is high compared to the typical timescale of  () .Thus, even if an apparently anomalous event (snowmelt) repeats every year, annual discharge records will differ and yield different error variance estimates.Only a multi-year streamflow record of a snowmelt-driven watershed will tell us whether the rapidly rising hydrographs due to snowmelt systematically bias the discharge error variance estimates.The snowmelt catchment analyzed in dOV22 did not show such bias (see Figure S5 in dOV22), but, if as KC23 suggest this is the case, then one could follow the suggestion of dOV22 for intermittent catchments and difference the discharge time series additional times.
The main concern expressed in KC23 is further separated into two issues, associated with (a) the appropriateness of local variance estimates,    2 h , provided by the nonparametric estimator, and (b) the effectiveness of the smoothing procedure.We argue and demonstrate that these perceived issues originate from a misinterpretation and misuse of the nonparametric estimator presented in dOV22, specifically, KC23 inappropriate interpretation Abstract In this Reply, we address the concerns of Knoben and Clark (2023, https://doi.org/10.1029/2022WR034294)or KC23 that "the assumptions needed to effectively use difference-based variance estimation methods are not always met by hourly streamflow records."There should be little doubt that the assumptions of our difference-based estimator will sometimes be violated in hourly streamflow records but the results from de Oliveira and Vrugt (2022, https://doi.org/10.1029/2022wr032263)and confirmed by the findings in our Reply show that such violations are sporadic enough not to affect much the error variance estimates.Snowmelt as pointed out by KC23 (https://doi.org/10.1029/2022WR034294)may not have received sufficient attention in our paper, yet their 365-day record is simply not long enough to demonstrate bias of our discharge error variance estimates (and their dependence on flow level).This would require analysis of a much longer, multi-year, streamflow record of a snowmelt-driven watershed.The snowmelt catchment analyzed in dOV22 (https://doi.org/10.1029/2022wr032263)did not demonstrate bias in the discharge error variance estimates.We also provide additional clarification on the interpretation of the variance estimates obtained with the nonparametric estimator, and discuss the main issues in the test case presented in Knoben and Clark (2023, https://doi.org/10.1029/2022WR034294)).

10.1029/2023WR036550
2 of 5 of individual    2 h values as variance estimates and gross overestimation of the aleatory errors.KC23 also questioned whether the magnitude of aleatory errors could be effectively estimated from a streamflow time series without accounting simultaneously for other sources of error.This Reply is organized along the main comments of KC23.In Section 2 we discuss the robustness of the nonparametric estimator when applied to hourly discharge records.Then, in Section 3 we provide further clarification on the interpretation of the variance estimates provided by the nonparametric estimator.Lastly, Section 4 discusses the use of the proposed nonparametric estimator to estimate aleatory errors in the presence of other (and potentially larger) sources of uncertainty in discharge records.

Robustness of the Nonparametric Estimator With Hourly Discharge Records
Through a series of synthetic case studies involving catchments with contrasting hydrologic conditions (Figure S1 in dOV22) and varying degrees of homoscedastic and heteroscedastic errors, dOV22 showed that the nonparametric error estimator provides unbiased estimates of the magnitude of aleatory errors of hourly discharge records.The estimator distinguishes random errors from the underlying signal (Section 3.1.1 in dOV22), hence returns aleatory variance estimates of near zero when applied directly to error-free discharge data (not shown).This merely confirms that hourly discharge data satisfy assumptions (a) and (b) of the estimator listed above.
Despite our comprehensive testing of the nonparametric error estimator in dOV22, KC23 is not convinced by the results of our methodology and argue that there is a "lack of compelling evidence that these methods can be applied to hourly discharge observations."KC23 argue that "the uncorrelated random nature of the synthetic errors creates time series that are not as smooth as real streamflow records."KC23 illustrate this issue by corrupting the discharge record from the Bow River at Banff (Alberta, Canada) with aleatory errors whose standard deviation is more than one order of magnitude larger than its actual (estimated) value.Specifically, KC23 assumed a slope α = 0.01 of the error function (Equation 4 in dOV22), whereas the estimated slope,    , of this record is less than 0.001, more than 10 times smaller!(see Figure 1).The consequences of this gross overestimation of α are evident in Figure 1 of KC23.The aleatory errors are unrealistically large and the corrupted discharge time series is overly jagged.If    = 0.0008 of Figure 1 were used instead, the aleatory errors would be much smaller on average and the perturbed discharge time series would be in much better agreement with the streamflow record.Unlike what KC23 have done, we do not visualize the perturbed  ( α = 0.0008 ) and measured hydrographs as we do not expect the two records to be similar.We will address this issue in the next paragraph.
There is a second problem with the method used by KC23 to generate corrupted streamflow records.This issue is important but less evident.KC23 adds the aleatory errors directly to the discharge record,  ỹ , but this time series already contains errors.Then, additional perturbations are only expected to deteriorate further record smoothness (among others) even if the induced errors are consistent with the aleatory errors expressed by the streamflow records.Instead, we should only add the random errors to a discharge time series that is free of aleatory errors.
In dOV22, different magnitudes of homoscedastic and heteroscedastic errors were tested and used to corrupt a simulated time series (free of aleatory errors) of watersheds with contrasting hydrologic regimes.In Figure 2a, we present such an error-corrupted time series for the Leaf River catchment.
Random errors were added to the simulated discharge record by sampling from a normal distribution with zero mean and heteroscedastic variance    2 h expressed by the error function of this watershed (we set    = 0.004 -see Figure 2 in dOV22).In Figure 2b, we also present the hourly streamflow record.At first glance the two records appear similar.Discrepancies between the time series reflect the combined effect of measurement and modeling errors.When we zoom in on the largest storm event in early March 2001, we observe a strong resemblance between the peaks of the corrupted and original records.Both records express a similar smoothness.Thus, the 10.1029/2023WR036550 3 of 5 claim of KC23 that "the uncorrelated random nature of the synthetic errors creates time series that are not as smooth as real streamflow records" is not supported by our results.

Clarification on the Interpretation of the Variance Estimates Provided by the Nonparametric Estimator
In this section, we address the alleged drawbacks of our method, that is, (a) the appropriateness of local variance estimates,    2 h , provided by the nonparametric estimator, and (b) the effectiveness of the smoothing procedure, and argue that these issues originate from a misinterpretation and misuse of the nonparametric estimator.
First, KC23 seems to be looking at each individual value or a few values of    2 h as an estimate of the variance for a specific flow level.As discussed in dOV22, "each    2 h individually is rather meaningless, since its value is estimated using only k + 1 points."Text S1 (SI) of dOV22 presents the derivation of the nonparametric estimator equation for k = 1 and should help clarify that the variance estimate is obtained from all    2 h values.For homoscedastic errors, this can be accomplished by computing the arithmetic mean of all    2 h values (Hall et al., 1990).For heteroscedastic errors, we apply the nonparametric estimator locally in the time series after which we summarize the resulting  ( ŷℎ, σh ) data pairs in dOV22 by fitting a linear regression function to the point cloud (e.g., see Figure 1 of this Reply).Thus, the statement made by KC23 that "the estimated standard deviation of the error,    , tends to be particularly high when the hydrograph changes rapidly, and not, as the error model used by V05 and dOV22 suggests it should be (see Equation 3), when the largest flow magnitudes occur" is misleading, since it implies that we can actually get a robust variance estimate for each (or a few) discharge value(s) separately.We illustrate this point further in Figure 3 using the hourly discharge record of the Leaf River catchment.

While an individual 𝐴𝐴
data pair in the "medium" flow range (denoted with a red cross) corresponds to the largest value of   h , there is an overall tendency of   h to increase with flow level.While it may be true that we yield some of the largest   h -values when the hydrograph changes rapidly, the so-obtained  (  h , σh ) data pairs make up only a small fraction of the point cloud.As a result, these "outliers" have a negligible impact on the overall   h − σh relationship (see also Figure 1).data pairs at medium and high flows needed for a robust characterization of the   h − σh relationship.dOV22 used more than 10 years of streamflow data for each catchment (median of 28 years).This length of the hourly discharge records "was judged [long] enough to infer the error model of each watershed."To our surprise, KC23 Figure 4 illustrates the consequences of using a much too short discharge record.We present the results of the nonparametric estimator using only 1 year of hourly streamflow values from the Leaf River.The value of   h for "medium" flows is comparable to that of "high" flows.This contradicts our earlier findings in Figure 3, which supports a heteroscedastic   h − σh relationship.
We do not discount the possibility that for some specific discharge records, the individual   h values deviate systematically from the assumed linear   h − σh -relationship.But even if this is true, the method presented in dOV22 is still plausible as "heteroscedasticity of discharge errors is expected since a power function (or

Figure 1 .
Figure 1.Scatter plot of hourly discharge  h (mm/d) and its estimated error standard deviation   h for the Bow River catchment.Each gray dot signifies a different data pair.The pink squares portray the central moving average of   h using a window of 100 data pairs.The solid black line is the regression line.Note that we are still using a logarithmic scale in the Figures presented in this Reply given the distribution of the variables under analysis.

Figure 2 .
Figure 2. (a) Error corrupted time series and (b) original discharge record of the Leaf River near Collins, MS (USGS 02472000).The corrupted time series was generated by adding to the simulated hourly discharges y h aleatory errors sampled from a normal distribution with zero mean and variance equal to  (h + h) 2, where m h is the arithmetic mean of the y h 's.Coefficients α and β are derived from the discharge record using the nonparametric estimator.Further information is found in the Supporting Information (SI) of dOV22.

Figure 4 .
Figure 4. Application of the nonparametric estimator to a 1 year extract of the discharge record of the Leaf River: (a) distribution of   h for medium flows, (b) distribution of   h for high flows, (c) relationship between hourly discharge y h and the error standard deviation   h .Each dot signifies a different data pair.The red cross in panel (a) corresponds to the largest value of   h and the black crosses in panels (a) and (b) portray the mean error standard deviations of both flow levels.

Figure 3 .
Figure 3. Application of the nonparametric estimator to the entire available discharge record of the Leaf River near Collins, MS (USGS 02472000): (a) Distribution of   h for medium flows, (b) distribution of   h for high flows, (c) relationship between hourly discharge and the error standard deviation   h .Each dot signifies a different data pair.The red cross in panel (a) corresponds to the largest value of   h and the black crosses in panels (a) and (b) denote the mean error standard deviations of both flow levels.
The above focus on just a few members of the point cloud distracts from what is truly important in determining the error variance of streamflow records.Only a long data record guarantees a large enough sample of