Corresponding author: S. V. Weijs, School of Architecture, Civil and Environmental Engineering, École Polytechnique Fédérale de Lausanne, EPFL Station 2, Bat. GR, CH-1015 Lausanne, Switzerland. (firstname.lastname@example.org)
 Streamflow time series are important for inference and understanding of the hydrological processes in alpine watersheds. Because streamflow is expensive to continuously measure directly, it is usually derived from measured water levels, using a rating curve modeling the stage-discharge relationship. In alpine streams, this practice is complicated by the fact that the streambed constantly changes due to erosion and sedimentation by the turbulent mountain streams. This makes the stage-discharge relationship dynamic, requiring frequent discharge gaugings to have reliable streamflow estimates. During an ongoing field study in the Val Ferret watershed in the Swiss Alps, 93 streamflow values were measured in the period 2009–2011 using salt dilution gauging with the gulp injection method. The natural background electrical conductivity in the stream, which was measured as a by-product of these gaugings, was shown to be a strong predictor for the streamflow, even marginally outperforming water level. Analysis of the residuals of both predictive relations revealed errors in the gauged streamflows. These could be corrected by filtering disinformation from erroneous calibration coefficients. In total, extracting information from the auxiliary data made it possible to reduce the uncertainty in the rating curve, as measured by the root-mean-square error in log-transformed streamflow relative to that of the original stage-discharge relationship, by 43.7%.
 Streamflow, as a spatially integrated watershed response, is one of the most important inputs for hydrological modeling [Brutsaert and Nieber, 1977; Szilagyi et al., 1998]. Because of the difficulties of continuously measuring streamflow (Q) directly in an inexpensive and simple way, streamflow records are often based on permanent measurements of water levels (h), sometimes even obtained remotely [Liebe et al., 2009], in combination with a stage-discharge relation, Q(h). The stage-discharge relation is based on regular gaugings in different conditions. In the case of fixed structures, like flumes, Q(h) is well defined and can also be derived theoretically, while in more natural situations, the relationship needs to be calibrated and can be influenced by, for example, vegetation or sediment dynamics.
 The role of dynamic morphology is especially important in alpine watersheds, where the streams are never lying quietly in their beds, but constantly changing them under the influence of the steep gradients and turbulent erosion of the sediments. Together with the often hard to reach locations and challenging conditions, this makes it difficult to monitor streamflow in alpine watersheds. The resulting uncertainty in streamflow records has a negative impact on hydrological modeling, especially because it is often not explicitly accounted for.
 One remedy to reduce the impact of uncertainties on hydrological modeling is to quantify the uncertainties in the discharge signal, so that the model knows what it can learn from the data and what it cannot learn. Over the last decade, significant progress has been made to address uncertainties in the data (both input and output) and models [Ciach and Krajewski, 1999; Anagnostou et al., 1999; Szilagyi and Parlange, 1999; Vrugt et al., 2005; Kavetski et al., 2006; Thyer et al., 2009; Di Baldassarre and Montanari, 2009; Schoups and Vrugt, 2010; Kuczera et al., 2010; Kampf and Burges, 2010; Westerberg et al., 2011]. Other approaches focused on including observational uncertainties in information theoretical evaluation criteria for probabilistic forecasts [Weijs and Van de Giesen, 2011] and provided arguments why explicitly representing uncertainties in the model and data formulation and calibrating with information-theoretical measures has advantages from a philosophy of science viewpoint [Weijs et al., 2010]. Because using the same information twice can be logically inconsistent, it might be important to formulate models for the data uncertainty independently of the formulation of hydrological models. In the case of discharge measurements, this means replacing a simple curve representing the stage discharge relationship by a probabilistic model that may employ several sources of information for estimating streamflow and its uncertainty.
 In this paper, we investigate the potential for stream water natural electrical conductivity (EC), measured before each of the 93 discharge gaugings, as an auxiliary information source for improving streamflow estimates in morphologically dynamic alpine streams. For our study area, we found the natural EC to have a predictive power for the measured streamflow comparable to that of water level. When a single relation was sought for the entire 3 year measurement period, the stage-discharge relationship was even slightly outperformed by the EC-streamflow relationship.
 Although strong relations between EC and streamflow have been observed before [Collins, 1979; Collins and Young, 1981; Gurnell and Fenn, 1985; Evans and Davies, 1998; Duffy and Cusumano, 1998; Dzikowski and Jobard, 2011], its potential for improving streamflow records has, to our knowledge, not been discussed. Although these relations are often complex, with lags and hysteresis, we focus on simple models that can be derived without continuous data. An extensive review of possible mechanisms at work behind the relations is therefore outside the scope of this paper but can be found in the aforementioned references. Since methods for explicitly incorporating output (streamflow) uncertainty in hydrological models are advancing and probabilistic treatment of uncertainties in a Bayesian framework has the potential of optimally combining various sources of information, we believe that future more extensive continuous measurements of EC have the potential to improve streamflow records and advance understanding of alpine hydrology.
2. Site, Data, and Methods
2.1. Val Ferret Watershed
 Val Ferret, situated in the Swiss canton of Valais, is an alpine valley draining into the Dranse de Ferret, the Dranse, and eventually the Rhone. The area of study measures around 20 km² and ranges in elevation from 1775 m above sea level (asl) to 3206 m asl, with a mean of 2423 m asl. The slopes are moderate to steep (mean 31.6°, maximum 88.9°) and partly soil mantled. Vegetation is mainly grasses, while some patches of firs are found at lower elevations. The river is partly fed by the melt of the small glacier des Angroniettes in the upper part of the catchment. See Simoni et al. for a more detailed description of the study site. The streamflow shows a regular diurnal signal, which gradually decreases in amplitude going from spring to autumn. Also, the average flow itself shows this decreasing trend over the season, suggesting a snow melt-dominated flow regime in spring, gradually changing to a signal of groundwater outflow recession curves, a small diurnal signal from glacier melt, and response to rainfall events. The main stream has a bankfull width of around 4 m and an average bed slope of 5% around the measurement location. The bed material consists mostly of rocks and pebbles of 5–30 cm, with some finer sediments deposited in the stiller pools, especially after flood events. The morphology around the measurement point is braided, while more upstream, the stream is confined to a narrow gorge, with some small waterfalls occurring.
2.2. Water Level Measurements
 Water levels of the stream were monitored at the outlet of the studied catchment, at the bridge at l'Ars Dessus. They were recorded using a pressure sensor with logger, placed close to the bank, 1 m upstream from the bridge, where the water is fairly stagnant. During winter, reliable measurements are unavailable due to snow and ice blockage. The embankments under the bridge are stabilized, constraining the stream width, while the river bed consists of rocks and finer sediments that can be moved by the water. The time series have a temporal resolution of one sample per minute. For use in the rating curve, averages over a window of 60 min, centered around the salt peak, are used. The standard deviations within these windows are shown in Figure 1.
2.3. Streamflow Measurements Using Salt Dilution Gauging
 To measure streamflow, salt dilution gaugings, using the slug injection method [Day, 1976; Kite, 1993; Moore, 2005], were taken regularly under different flow conditions. In total, we took into account 93 gaugings from 2009 to 2011. By injecting a known mass of salt M, usually 5 or 10 kg, and measuring the concentration of injected salt 210 m downstream as a function of time, c(t), we obtained the streamflow, Q, using
$$ Q = \frac{M}{\int c(t)\,\mathrm{d}t} \tag{1} $$
The measurement is cut off by the time the injected salt concentration is within the measurement accuracy, i.e., when the total ionic concentration is indistinguishable from the natural ionic concentration. Because the duration of the measurement is relatively short (in our case, usually around 15 min) compared to the timescale of variation in the natural background concentration, we can assume the latter to be constant during the measurement. Another assumption behind the method is that the concentration we measure downstream at time t is representative of the ratio between the mass of salt and the volume of water passing the cross section. In other words, the salt should be well mixed within the cross section, or the parts of the cross section with a different concentration should not contribute significantly to the flow.
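The slug-injection calculation described above can be sketched numerically. This is a minimal illustration, not the procedure built into the gauging device: the function name, the trapezoidal integration, and the synthetic triangular salt wave are all assumptions made for the example.

```python
def slug_injection_discharge(mass_g, times_s, conc_g_per_m3):
    """Estimate Q (m3/s) as injected mass divided by the time integral
    of the injected-salt concentration (trapezoidal rule; an assumption)."""
    if len(times_s) != len(conc_g_per_m3) or len(times_s) < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    integral = 0.0  # g s / m3
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        integral += 0.5 * (conc_g_per_m3[i] + conc_g_per_m3[i - 1]) * dt
    if integral <= 0.0:
        raise ValueError("concentration integral must be positive")
    return mass_g / integral


# Synthetic salt wave (hypothetical values): a triangular pulse peaking at
# 20 g/m3 (equal to 20 mg/L) and lasting 500 s, after injecting 5 kg of salt.
times = [0.0, 250.0, 500.0]
conc = [0.0, 20.0, 0.0]
q = slug_injection_discharge(5000.0, times, conc)
print(q)  # discharge in m3/s
```

The concentration series here is already background-corrected; in practice it would be derived from the conductivity signal through the calibration.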
2.3.1. Calibration Procedure
 The time-varying signal of the injected salt concentration, c(t), cannot be measured directly. Instead, the EC σ(t) of the water is measured by means of an alternating current. In the range of measurement, the (temperature compensated) EC has a linear relation with the salt concentration, which is calibrated on site before the gauging by repeatedly pipetting 1 mL of calibration solution (10 g/L) into 500 mL of water from the stream. This results in linear relations of the form:
$$ c(t) = \kappa\,\bigl(\sigma(t) - \sigma_b(t)\bigr) $$
where σb(t) is the natural conductivity or base conductivity of the stream, which depends on the natural ionic concentrations in the stream water, and κ is the linear calibration coefficient, with units of salt concentration per unit of conductivity. The calibrations have coefficients of determination R2 close to one, with a minimum of 0.98 over all calibrations.
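As a rough sketch of the calibration step, the coefficient κ can be estimated as the least-squares slope of added salt concentration against measured conductivity. The function names, the assumed units ((mg/L) per (µS/cm)), and the synthetic readings below are illustrative assumptions, not values from the study.

```python
def added_concentration_mg_per_l(n_additions, pipette_ml=1.0,
                                 solution_g_per_l=10.0, sample_ml=500.0):
    """Salt concentration after n pipette additions, accounting for the
    small increase in sample volume (mL * g/L gives mg of salt)."""
    salt_mg = n_additions * pipette_ml * solution_g_per_l
    volume_l = (sample_ml + n_additions * pipette_ml) / 1000.0
    return salt_mg / volume_l


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic calibration (hypothetical): base conductivity 200 uS/cm and a
# true coefficient of 0.5 (mg/L) per (uS/cm).
conc = [added_concentration_mg_per_l(n) for n in range(5)]
sigma = [200.0 + c / 0.5 for c in conc]

kappa, intercept = fit_line(sigma, conc)  # regress concentration on EC
sigma_b_est = -intercept / kappa          # EC at zero added salt
print(kappa, sigma_b_est)
```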
 For the salt dilution gaugings, we used the MRS-4, by Sommer mess–systemtechnik, a device with a built-in capability of integrating the signal and calculating the streamflow. The device has two EC probes, which were placed at different points in one cross section to check the cross-sectional mixing. For each probe, the calibration and streamflow calculation are done independently, resulting in two streamflow values. The device stores the raw EC data at a 1 s sampling rate and some auxiliary data, including the individual calibration points, probe temperatures, and the natural EC of the stream σb at the time before the gauging. These data are used for the analysis in this paper.
2.4. Stage-Discharge Model
 Traditionally, the Q(h) relation is modeled as a power function, which is often fitted with least squares [Singh, 2010]:
$$ \min_{\alpha,\,\beta,\,h_0} \sum_i \bigl(Q_i - \alpha\,(h_i - h_0)^{\beta}\bigr)^2 \tag{4} $$
where α, β, and h0 are the parameters of the Q(h)-relation, and hi and Qi are the observed water level and calculated discharge during gauging number i at time ti. This assumption of least squares is equivalent to assuming independent and identically distributed (iid) Gaussian errors. In the case of streamflow measurements, heteroscedastic errors are usually a better description, as uncertainties in the gaugings tend to grow with discharge [Sorooshian and Dracup, 1980]. A heteroscedastic error model was achieved by finding a least-squares fit on the log-transformed discharge, implicitly assuming a Gaussian uncertainty in the log-transformed discharge, i.e., a lognormal error in discharge. Furthermore, this serves to avoid assigning probability to negative streamflow values, as is the case with the Gaussian error assumption underlying equation (4). Figure 1 shows the rating curves for fitting Q (lin) and log(Q) (log) with least squares, with the corresponding equations and R2 values listed in Table 1. The residuals of both lin and log fits were checked for normality using QQ-plots (not shown) and the Shapiro-Wilk test. The p value for that test gives the maximum significance level (accepted probability of falsely rejecting) at which one would not reject the null hypothesis that the residuals are normally distributed with unknown mean and variance. This indicated that the log-transformed Gaussian error model is satisfactory (p = 0.08), while the linear is less so (p < 0.001).
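A least-squares fit on log-transformed discharge can be sketched as follows. For a fixed h0, the power function becomes linear in log space, so the example scans a list of candidate h0 values and keeps the one with the smallest squared error. The grid search, the function names, and the synthetic gaugings are assumptions for illustration; the fitting routine actually used in the study is not specified in the text.

```python
import math


def fit_line(x, y):
    """Ordinary least squares; returns slope, intercept, and squared error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, sse


def fit_rating_curve(h, q, h0_candidates):
    """Fit log(Q) = log(alpha) + beta * log(h - h0) for each candidate h0."""
    best = None
    log_q = [math.log(qi) for qi in q]
    for h0 in h0_candidates:
        if h0 >= min(h):
            continue  # log(h - h0) undefined for this candidate
        x = [math.log(hi - h0) for hi in h]
        beta, log_alpha, sse = fit_line(x, log_q)
        if best is None or sse < best[3]:
            best = (math.exp(log_alpha), beta, h0, sse)
    if best is None:
        raise ValueError("no valid h0 candidate")
    return best


# Synthetic, noise-free gaugings (hypothetical): alpha=2, beta=1.5, h0=0.1
h = [0.2, 0.3, 0.45, 0.6, 0.8]
q = [2.0 * (hi - 0.1) ** 1.5 for hi in h]
alpha, beta, h0, sse = fit_rating_curve(h, q, [0.0, 0.05, 0.1, 0.15])
print(alpha, beta, h0)
```

Minimizing squared error in log(Q) weights relative rather than absolute deviations, which corresponds to the lognormal error model described above.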
Table 1. Parameters and Coefficients of Determination for Q(h)a
 The measurements of EC in the stream were obtained as a by-product of the salt dilution gaugings used to determine discharge. The EC measured in the stream before injecting the salt varied from gauging to gauging and reflects changes in the natural ionic composition in the stream. This resulted in 93 measured EC values at the outlet of the study catchment, spread over the years 2009–2011.
 To use EC as predictor, its relation to streamflow must be modeled. The behavior of EC as function of streamflow has been investigated previously, mainly in the field of glaciology, where EC measurements have been used to distinguish between subglacial and englacial contributions to flow at proglacial streams. Gurnell and Fenn  studied the relation between EC and streamflow in a Swiss alpine valley relatively close to our field site and considered different spatial and temporal sampling strategies, focusing mostly on a location relatively closer to the glacier snout. They reported an R2 value of 0.91 for a linear relation predicting EC from the logarithm of the streamflow for hourly values measured in June and July 1978. Collins  noted a similar inverse relation between the diurnal cycles of streamflow and those of EC for two alpine proglacial streams and tried to separate different flow components based on a mixing model. Recently, such mixing models were further investigated in relation to measurements taken in a French high-alpine valley [Dzikowski and Jobard, 2011].
 Since our interest is primarily in obtaining estimates of streamflow and not enough data are available to formulate mechanistic models, we focused on simple empirical relationships. For comparison, we also considered a simple conceptual two member mixing model and fitted the parameters empirically. The limited temporal resolution and spatial extent of the EC data precluded more detailed analysis of the true mixing processes, full ionic composition, or sources of solutes, which could possibly lead to more accurate models [see, e.g., Walter et al., 2007; Salmon et al., 2001; Botter et al., 2008, 2009; Duffy, 2010].
 For the fit on the entire data set, Figure 2 shows different alternative functional relations. The mixing law relation is a conceptual model in which a constant, high EC groundwater outflow is mixed with a varying flow of low EC water from, e.g., snow melt or rainfall fast runoff. The empirical logarithmic relationship and the power law have a significantly better fit than the two-reservoir mixing model. As more distributed measurements of streamflow and EC of the different contributing water sources will become available after continued field campaigns, it will be interesting to find a better conceptual model capturing the (dynamic) relationship, possibly enhancing generalization.
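To make the conceptual mixing model concrete, one possible formulation is sketched here. The symbols $Q_g$, $\sigma_g$ (groundwater flow and its EC), $Q_f$, and $\sigma_f$ (fast, low-EC flow and its EC) are introduced only for this illustration, and conservative (linear) mixing of EC is assumed; the exact parameterization fitted in Figure 2 may differ.

```latex
\begin{align}
Q &= Q_g + Q_f, \\
Q\,\sigma &= Q_g\,\sigma_g + Q_f\,\sigma_f
           = Q_g\,\sigma_g + (Q - Q_g)\,\sigma_f, \\
Q\,(\sigma - \sigma_f) &= Q_g\,(\sigma_g - \sigma_f)
\quad\Longrightarrow\quad
Q = \frac{Q_g\,(\sigma_g - \sigma_f)}{\sigma - \sigma_f}.
\end{align}
```

Under these assumptions, with $Q_g$, $\sigma_g$, and $\sigma_f$ constant, streamflow is a hyperbolic function of the measured conductivity, which decreases toward $\sigma_f$ as the fast flow component grows.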
 The quantile-quantile plots in Figure 3, which compare the distribution of errors around the curve to a normal distribution, show that the heteroscedastic error model for Q, which assumes Gaussian errors in log(Q), is more realistic than least squares on Q when checked a posteriori. This indicates that a least squares fit on log-transformed streamflow is the preferred method of inference on a relationship and that a heteroscedastic error model of this type can be used to describe the “measurement” uncertainty in a streamflow time series derived from continuous EC measurements. It also confirms that the R2 values calculated on the log-transformed streamflow are adequate measures of predictive power or mutual information.
 Figure 4 shows that the slow fluctuations and the higher frequency fluctuations between streamflow and EC follow each other (note that the negative log of streamflow is plotted to obtain equal signs). Figure 4b shows the residuals of the logarithmic Q(σ) relation next to the measured water level and is intended to reveal whether residuals are associated with certain hydrological events. Apart from the consistent underprediction during 2010, there is no obvious pattern in the data, although future analysis in conjunction with meteorological data might reveal further relationships. The residuals were also plotted against time of year, time of day, and water temperature (not shown), but this did not reveal any patterns or correlations to explain them.
 When looking at the residuals of the linear relation between log-transformed streamflow and σ on the one hand and of the relation between log-transformed streamflow and water level on the other, it appears that the two series of residuals are correlated with each other. Furthermore, a clustering is visible of the results of different years; see Figures 4 and 5. This is particularly interesting, since the information provided by water level and EC would be expected to be more or less independent. Possible explanations are discussed in the next section, which also proposes a correction to the measurements of Q that partly solves the problem of correlated residuals (right-hand side of Figure 5). An overview of the coefficients of determination for the relations is given in Table 2. Note that the Q(σ) relation outperforms Q(h) for the complete dataset, probably indicating that performance of Q(h) is affected by interannual morphological shifts in the river bed.
Table 2. R2 Coefficients of Determination of Several Linear Relationshipsa
Linear Relation (Response; Predictors)
R2 calculated on the log-transformed streamflow values. Predictor C stands for a constant, i.e., a linear relation with intercept. Parameter h0 is estimated for each column separately. Q⋆ are the corrected streamflow measurements, which are introduced in section 4. The last column shows the resulting R2 values for leave-one-out cross validation. This allows a fairer comparison between models of different complexity.
log(Q); σ, ,C
log(Q); σ, , , ,C
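The leave-one-out cross validation mentioned in the table note can be illustrated for a single-predictor linear relation. This is a sketch under assumptions: the cross-validated R2 is defined here as one minus the ratio of the summed squared out-of-sample errors to the total sum of squares, and the data are invented.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_in_sample(x, y):
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1.0 - sse / sst


def r2_leave_one_out(x, y):
    """Refit without point i, predict point i, and accumulate the error."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    sst = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / sst


# Hypothetical data: EC (uS/cm) against log-transformed streamflow
sigma = [150.0, 180.0, 210.0, 240.0, 270.0, 300.0]
log_q = [0.95, 0.52, 0.31, -0.12, -0.30, -0.81]
r2_fit = r2_in_sample(sigma, log_q)
r2_cv = r2_leave_one_out(sigma, log_q)
print(r2_fit, r2_cv)
```

Because each point is predicted by a model that never saw it, the cross-validated value does not exceed the in-sample value, and the gap widens as more parameters are fitted.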
 The general inverse relationship between Q and σ seems consistent with the conceptual idea of the functioning of the catchment, where a relatively constant, solute-enriched base flow is mixed with a fluctuating low ion content flow from snow melt or rainfall response. From Figure 4, it can be seen that the EC follows the seasonal pattern of streamflow, which is mainly caused by snow melt. While a hyperbolic relationship may be physically more plausible, we found the logarithmic Q(σ) relationship to have the most predictive power. This result is in accordance with a relation found by Gurnell and Fenn  in a similar watershed. To obtain more insight into the dynamics and to find possible explanations for the logarithmic relationship in Val Ferret, continuous EC measurements are currently ongoing.
4.1. Analysis of Residuals
 The relatively high correlation between the residuals, depicted in Figure 5 on the left, is quite surprising and somewhat suspicious. Since relations between deviations in h and σ from their respective estimates based on Q are a priori not very likely, the correlation may be the result of errors in the measurements of Q obtained from the salt dilution gaugings. This suspicion is supported by the fact that the R2 of a direct relation between water level and σ is sometimes stronger than that of both variables' relation to Q (Table 2), while, logically, the causality in the relation would indicate Q as a cause and σ and h as effects.
 Because the calibration data from the gaugings were available, we were able to further investigate possible sources of errors in the gaugings of Q. One of the stored variables was the set of calibration coefficients κ describing the slopes of the linear relations between σ and the concentration of the added NaCl salt in the water for each calibration preceding a gauging. Each calculation of Qi from the salt wave (equation (1)) uses the values of κi from the calibration preceding gauging i. The EC measurement is temperature compensated, and therefore the calibration coefficient would be expected to be constant. Differences in the coefficient could occur either (1) as a result of probe fouling, poor connections, or other electronic causes or (2) as a result of differences (errors) in the calibration procedure, such as incorrect concentration in the calibration fluid, volume in the pipette, or initial water volume in the calibration reservoir.
 The first type of errors would not influence the gaugings of Q, since they are present in both the calibration and the actual gauging conductivities, but they would influence the measurements of base conductivity. This would, however, not explain the correlation between the residuals depicted in Figure 5. The second type of errors, those in the calibration procedure, would influence the measurements of Q, through errors in calibration coefficient κ and therefore explain the correlated residuals. In that case, one would also expect the calibration coefficients to be correlated to the residuals of both the Q(h) and the Q(σ) relationship.
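The diagnostic suggested by this reasoning, checking whether the calibration coefficients explain part of the rating-curve residuals, can be sketched with a squared Pearson correlation. The data are synthetic and the choice to compare residuals against log(κ) is an assumption of this example: if gauged Q is inversely proportional to κ, an error in κ shifts log(Q) by minus the log of that error.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy * sxy / (sxx * syy)


# Hypothetical calibration coefficients, including one outlier
kappa = [0.40, 0.45, 0.50, 0.55, 0.87]
log_kappa = [math.log(k) for k in kappa]

# Idealized residuals in log(Q) caused purely by calibration errors,
# relative to an assumed reference coefficient of 0.5
residuals = [-math.log(k / 0.5) for k in kappa]

r2 = r_squared(log_kappa, residuals)
print(r2)
```

With real gaugings the residuals also contain other sources of scatter, so the coefficient of determination would be well below this idealized value.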
4.2. Correction for Discharge Measurements
 The results summarized in Table 3 support the second hypothesis and indicate that errors in the calibration procedure (whose exact sources are for the moment unknown) are a likely cause of errors in Q. Gauged values for Q are inversely proportional to the coefficients κ used in their calculation, which mostly varied in the range 0.37–0.55, with five outliers around 0.87. When we assume all variation in κ to be the result of calibration errors, the gauged values of Q can be corrected by undoing the calibration:
$$ Q^\star_i = Q_i\,\frac{\kappa_i}{\bar{\kappa}} $$
where Q⋆i is the corrected value for Qi at gauging i and κ̄ is the mean over all calibration coefficients. When Q⋆ is used as a variable in the rating curves, both the Q⋆(h) and Q⋆(σ) relationships are stronger than those of Q(h) and Q(σ); see Figures 6 and 7 and Table 4. The uncertainty as measured by the standard deviation of the residuals in log(Q) was reduced by 28.6% for Q(σ). The R2 for all relations using Q⋆ are shown in the last column of Table 2.
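Since the gauged Q is stated to be inversely proportional to κ, undoing the calibration amounts to rescaling each gauging by the ratio of its own coefficient to the mean coefficient. The sketch below illustrates that reading with invented numbers; the function name and the use of a plain arithmetic mean are assumptions.

```python
def correct_discharge(q_gauged, kappa):
    """Rescale each gauged Q as if the mean calibration coefficient had
    been used instead of the individual one (Q is proportional to 1/kappa)."""
    if len(q_gauged) != len(kappa) or not kappa:
        raise ValueError("need equally long, non-empty series")
    kappa_mean = sum(kappa) / len(kappa)
    return [q * k / kappa_mean for q, k in zip(q_gauged, kappa)]


# Hypothetical case: the true flow is 1.0 m3/s at every gauging, but the
# calibration coefficient varies around its mean of 0.5.
kappa = [0.4, 0.5, 0.6]
q_gauged = [1.0 * 0.5 / k for k in kappa]  # 1.25, 1.0, 0.833...
q_corrected = correct_discharge(q_gauged, kappa)
print(q_corrected)
```

The correction removes only the scatter attributable to κ; any systematic bias in the mean coefficient itself would remain.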
Table 3. Coefficients of Determination for Identification of Measurement Error Causes
“res” indicates residuals in log(Q) from the given relationship. All values are calculated on the entire data set.
Measurement errors in Q or common cause for deviations in σ and h
κ, res Q(h)
Scatter in Q(h) relationship partly explained by errors in Q induced by κ
κ, res Q(σ)
Scatter in Q(σ) relationship partly explained by errors in Q induced by κ
Variations in measured EC not explained by instrumentation errors
res Q⋆(h), res Q⋆(σ)
Correction of Q for variations in κ, reduces correlation between residuals
Much sharper relation in the rating curve for h when Q is corrected
Much sharper relation in the rating curve for σ when Q is corrected
Combining h and σ further improves, adjusted R2 = 0.954
Table 4. Parameters and Coefficients of Determination for Q⋆(h)a
Curves of the form Q⋆ = α(h − h0)^β. Updated version of Table 1 after correction of Q.
 It must be noted that we are dealing with reducing and quantifying observational uncertainty, without access to gold-standard observations. This precludes objective, assumption-free evaluation of the predictions (in fact, this is true for all science but is generally less evident). The better fits on the relationships can therefore only be interpreted as improvements when the interpretations and assumptions on causes of the scatter (Table 3) hold. In the absence of more likely explanations for the improvements in the predictions, we believe they are probably closer to the truth; see also the discussion in Weijs and Van de Giesen.
 Under these assumptions, the total reduction in uncertainty due to the use of auxiliary information on the calibration coefficients κ and base conductivities σb can be calculated. This is done by comparing the initial scatter around the Q(h) relation to the scatter around the rating curve Q⋆(h, σ). The latter curve uses both water level and EC and corrects the streamflow gaugings by undoing the calibration procedure, using information from the previously applied calibration coefficients κ. Since the uncertainty is still best described by a lognormal heteroscedastic model, the relative errors are of more interest than the absolute errors in terms of information gain. We, therefore, characterize the uncertainty by the root-mean-squared error (RMSE) in the log-transformed streamflows computed from the relations; see the scatter plots in Figure 8 and interpretation in Figure 9. The total percentage reduction in streamflow uncertainty around the rating curve can thus be characterized by
$$ 1 - \frac{\mathrm{RMSE}\bigl[\log Q^\star(h,\sigma)\bigr]}{\mathrm{RMSE}\bigl[\log Q(h)\bigr]} = 43.7\% $$
In terms of the untransformed Q, this means the RMS multiplicative error went down from a factor 1.36 to a factor 1.19.
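The link between an RMSE in log-transformed streamflow and a multiplicative error factor can be checked with a few lines of arithmetic. The sketch assumes natural logarithms and works backward from the rounded factors quoted above, so it reproduces the reported reduction only approximately (roughly 43%).

```python
import math


def multiplicative_factor(rmse_ln_q):
    """An RMSE of r in ln(Q) corresponds to a typical error factor exp(r)."""
    return math.exp(rmse_ln_q)


def reduction_percent(rmse_before, rmse_after):
    """Relative reduction of the RMSE, in percent."""
    return 100.0 * (1.0 - rmse_after / rmse_before)


# RMSE values in ln(Q) implied by the rounded factors 1.36 and 1.19
rmse_before = math.log(1.36)
rmse_after = math.log(1.19)

reduction = reduction_percent(rmse_before, rmse_after)
print(round(reduction, 1))
```

The ratio of the two RMSE values does not depend on the base of the logarithm, so the same percentage follows if base-10 logarithms are used throughout.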
5. Summary and Further Work
 The main finding of this work is that EC presents a major opportunity to improve continuous streamflow series for alpine streams. For the stream considered in this paper, the EC has a predictive power apparently comparable to that of water level. Detailed analysis of the residuals of both relations in conjunction with the calibration data revealed the calibration coefficients as a likely correctable source of error in the gaugings. Assuming it is correct, this correction, combined with both water level and EC as streamflow predictors, leads to an additional reduction of uncertainty in the stage-discharge relationship, bringing the total reduction in uncertainty to 43.7%.
 The results presented here can have significant practical value, since salt dilution gauging is a common method for determining discharge in alpine streams. The predictive power of natural EC can be readily checked for other streams where data from salt dilution gaugings is available. Whether this power is present depends on the dynamics of the catchment. If a strong relation is found, it is advisable to monitor EC continuously and try to use it in a predictive model for streamflow. It should be noted, however, that time lags and hysteresis in the EC signal can cause artifacts in the discharge series. It is probably best to use EC alongside water level as a predictor, rather than replacing it. Furthermore, using a more physically based dynamic and mechanistic model is preferred over simple regression to optimally combine information from both sources to track variations in streamflow on all timescales. Such models could make predictions that are more transferable to other catchments and perform better under changing conditions such as land use.
 When predictive power is found in the EC snapshots from the Q gaugings, various analyses of the residuals, like the ones presented here, may further help to identify errors and point to their sources, in our case, the calibration coefficients as a source of error for measured Q. This is of course equipment and procedure specific rather than catchment specific. Especially in morphologically dynamic streams, EC might be useful as an independent input to supplement standardized error checking procedures already in place at the agencies responsible for streamflow measurement [see, e.g., Kennedy, 1984; Sauer, 2002; World Meteorological Organization (WMO), 2010]. Although results may vary from case to case, we hope that this paper inspires ideas to take a closer look at the data underlying rating curves and discharge data, as this might significantly reduce uncertainties in the final streamflow series. This extraction of information from various data fits well into a more probabilistic view of streamflow measurements as being model forecasts with predictive uncertainties, which should not be hidden but rather explicitly presented to aid hydrological modeling.
 While for the current analysis only sparse EC measurements were available, collected during the streamflow gaugings, continuous EC measurements at higher temporal resolution have now been deployed. This will give insight into the daily patterns. Since previous research has shown that EC signals can lag behind the streamflow signals by a few hours [Gurnell and Fenn, 1985], a model taking this into account may further improve the streamflow estimation and will possibly give more insight into hyporheic exchanges caused by diurnal cycles [Loheide and Lundquist, 2009]. This also allows more detailed analysis of temporal patterns of rainfall response or snow melt events, especially when tributaries are monitored for both EC and stage [Lundquist et al., 2009]. Monitoring EC in different water sources, such as tributaries, groundwater wells, and glacier melt, may improve estimates and give more insight into the hydrological processes. In future research, we plan to advance this insight by using EC measurements in combination with analysis of isotopes, chemical analysis, and distributed modeling based, for instance, on travel time distributions [see e.g. Szilagyi and Parlange, 1999; Rinaldo et al., 2006; Kampf and Burges, 2007; Nicótina et al., 2008; Botter et al., 2008].
 Ongoing research focuses on how to incorporate various sources of information in a dynamic model of streamflow uncertainty. Once a continuous EC signal is available, this enables the use of the long-term stable Q(σ) relationship, while h can be used to track fast variations. The dynamic model can then combine both information sources to provide a probabilistic streamflow time series. These can subsequently be used to aid model inference while balancing maximum extraction of information and minimum extraction of misinformation. This is achieved by having sharp and reliable uncertainty estimates based on all available relevant information.
 The authors thank John Selker, Michael Gooseff, Jessica Lundquist, and two anonymous reviewers, who provided detailed and constructive comments that improved the paper. Steven Weijs is a beneficiary of a postdoctoral fellowship from the AXA research fund, which is gratefully acknowledged. Funding from the Swiss National Science Foundation, the NCCR-MICS, and CCES is also gratefully acknowledged.