Corresponding author: R. Krier, Public Research Center–Gabriel Lippmann, 41 Rue du Brill, LU-4422 Belvaux, Luxembourg. (email@example.com)
 The complexity of hydrological systems and the necessary simplification of models describing these systems remain major challenges in hydrological modeling. Kirchner's (2009) approach of inferring rainfall and evaporation from discharge fluctuations by “doing hydrology backward” is based on the assumption that catchment behavior can be conceptualized with a single storage-discharge relationship. Here we test Kirchner's approach using a densely instrumented hydrologic measurement network spanning 24 geologically diverse subbasins of the Alzette catchment in Luxembourg. We show that effective rainfall rates inferred from discharge fluctuations generally correlate well with catchment-averaged precipitation radar estimates in catchments ranging from less than 10 to more than 1000 km2 in size. The correlation between predicted and observed effective precipitation was 0.8 or better in 23 of our 24 catchments, and prediction skill did not vary systematically with catchment size or with the complexity of the underlying geology. Model performance improves systematically at higher soil moisture levels, indicating that our study catchments behave more like simple dynamical systems with unambiguous storage-discharge relationships during wet conditions. The overall mean correlation coefficient for all subbasins for the entire data set increases from 0.80 to 0.95, and the mean bias for all basins decreases from –0.61 to –0.35 mm d−1. We propose an extension of Kirchner's approach that uses in situ soil moisture measurements to distinguish wet and dry catchment conditions.
 Every hydrological modeler strives to overcome the inherent complexity and heterogeneity of hydrological processes by making appropriate modeling simplifications and generalizations. In one recent effort along these lines, Kirchner [2009] showed that if a catchment can be described by a single storage element in which discharge is a function of storage alone, this storage-discharge relation can be estimated from streamflow fluctuations. Kirchner [2009] demonstrated that in such simple dynamical systems, it is possible to “do hydrology backward,” that is, to infer rainfall and evaporation time series from discharge fluctuations alone. Although the effect of evaporation and plant transpiration on runoff recession has been studied previously in detail [e.g., Federer, 1973; Daniel, 1976; Brutsaert, 1982], the “hydrology backward” approach offers interesting new possibilities for catchment hydrology and merits further testing in catchments characterized by different climates and lithologies. Teuling et al. [2010] have recently investigated the “hydrology backward” approach in the Swiss Rietholzbach catchment, observing that the model performed better under wet conditions. Kirchner [2009] has also pointed out that the approach should begin to break down in catchments that are larger than individual storm systems, because a model that represents a basin as a single nonlinear storage will not easily encompass cases in which one part of the basin is completely saturated after a storm while another part remains dry. However, given that the hydrology backward approach has so far been applied only in relatively small catchments like Plynlimon (∼10 km2) in Wales [Kirchner, 2009] and Rietholzbach (∼3 km2) in Switzerland [Teuling et al., 2010], it has been unclear whether the approach can be used successfully in much larger basins.
 Here we investigate the advantages and limitations of the “hydrology backward” approach in a densely instrumented network of geologically diverse catchments, inferring catchment average rainfall rates from streamflow fluctuations and testing them against weather radar rainfall data that provide more representative catchment average precipitation measurements than interpolated point measurements do. Although Kirchner [2009] already presented a virtual experiment that applied the approach to a hypothetical storm with different antecedent moisture levels (i.e., storage levels) to demonstrate the concept that peak runoff can be estimated from the sensitivity function, we go a step further by using in situ soil moisture measurements as a proxy for the catchment storage status, in order to investigate the impact that antecedent soil moisture conditions have on model performance.
 In summary, this paper has two main objectives: (1) to understand the limitations and advantages of Kirchner's [2009] approach with respect to different catchment characteristics (e.g., basin size, lithology) and soil wetness conditions and (2) to assess the potential of the approach for diagnosing the functioning of hydrological systems and to compare model behavior under various catchment conditions.
2.1. Kirchner's Methodology
 The methodology outlined by Kirchner [2009] combines the conservation of mass equation (1) with the assumption that the stream discharge Q depends solely on the amount of water S stored in the catchment, quantified by a deterministic storage-discharge function (2):
$$\frac{dS}{dt} = P - E - Q \qquad (1)$$
$$Q = f(S) \qquad (2)$$
where S is the volume of water stored in the catchment, in units of depth [L] and P, E, and Q are the rates of precipitation, evaporation, and discharge, respectively, in units of depth per time [L/T]. Assuming that Q is an increasing function of S, the storage-discharge function can be inverted:
$$S = f^{-1}(Q) \qquad (3)$$
 Because f(S) is invertible, equation (2) can also be differentiated as
$$\frac{dQ}{dt} = \frac{dQ}{dS}\,\frac{dS}{dt} = \frac{dQ}{dS}\,(P - E - Q) \qquad (4)$$
$$\frac{dQ}{dS} = f'\!\left(f^{-1}(Q)\right) = g(Q) \qquad (5)$$
 where g(Q) expresses the sensitivity of discharge to changes in storage. Equation (5) can be combined with equation (4) and rearranged as follows:
$$g(Q) = \frac{dQ}{dS} = \frac{dQ/dt}{dS/dt} = \frac{dQ/dt}{P - E - Q} \qquad (6)$$
 For periods when P and E can be neglected, this becomes
$$g(Q) = -\frac{dQ/dt}{Q} \qquad (7)$$
which would allow one to estimate g(Q) from hydrograph data alone. It follows that estimating g(Q) requires time series subsets when precipitation and evaporation fluxes are small in order to simplify equation (6) to equation (7). To determine periods where P and E can be neglected we chose to apply the second of the two methods presented by Kirchner [2009]: we selected the hourly records for nighttime during which there was a total recorded weather radar rainfall amount of less than 0.1 mm within the preceding 6 h and the following 2 h. We defined nighttime as the period from 1 h after sunset to 1 h before sunrise. These rainless nighttime hours are used to establish the sensitivity function g(Q). Once g(Q) has been calibrated, the model can be applied to all other periods (including daytime and periods with rainfall).
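The selection rule described above can be sketched as follows. This is a minimal illustration in Python rather than the code used in the study; the function name, the inclusion of the current hour in the rainfall window, and the exclusion of hours whose window runs past the ends of the record are all assumptions.

```python
def rainless_night_mask(rain_mm, is_night, before=6, after=2, limit=0.1):
    """Flag the hours that may be used to estimate g(Q).

    rain_mm  : hourly radar rainfall totals (mm), one value per hour
    is_night : booleans, True from 1 h after sunset to 1 h before sunrise
    An hour qualifies if it is a night hour and the rainfall summed over the
    preceding `before` hours, the hour itself (assumption), and the following
    `after` hours stays below `limit`.
    """
    n = len(rain_mm)
    mask = []
    for t in range(n):
        lo, hi = t - before, t + after
        if lo < 0 or hi >= n:
            # Window is incomplete at the edges of the record: skip (assumption).
            mask.append(False)
            continue
        window_total = sum(rain_mm[lo:hi + 1])
        mask.append(bool(is_night[t]) and window_total < limit)
    return mask
```

With a single wet hour at index 10 of a 20 h record, the hours 8 to 16 are rejected because the wet hour falls inside their window, while hours 6, 7, and 17 remain usable.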
 The precipitation that can be inferred with the original method is at best “effective” precipitation, which remains after replenishment of the interception storages. What we consider as “effective rainfall” is rainfall minus interception. To be methodologically consistent and to compare calculated effective rainfall with inferred effective rainfall, we therefore preprocessed the observed precipitation by subtracting the replenishment of the interception storage compartment Ss. In doing this, we defined the interception storage with a seasonally varying saturation threshold of 2 mm in winter, 2 mm in spring, 6 mm in summer and 3 mm in autumn. These values are based on previous research in the study area [Gerrits et al., 2010]. The storage capacity is calculated for each daily time step by taking into account the input (radar rainfall time series) and the evaporation “loss,” based on a Penman-Monteith potential evaporation time series from a weather station. Although errors in the measured data affect the evaluation of the model output, we note that similar operations are not uncommon in hydrological modeling. For example, Young preprocessed the rainfall in similar ways to filter out potential nonlinearities in the rainfall-runoff relationship. We carried out a sensitivity analysis of the threshold described by Gerrits et al. [2010] to assess the impact of our interception model on the Kirchner model results in the Mess basin at Pontpierre (8). The results showed that the average bias decreases, and the correlation coefficient does not change significantly, as the maximum storage capacity increases from no interception to the values previously determined by Gerrits et al. [2010]. A further increase to 4, 4, 8, and 5 mm in the winter, spring, summer, and fall, respectively, has no significant influence on model performance.
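One way to picture this preprocessing is a simple daily bucket. The sketch below is an assumption about the bookkeeping, not the scheme actually implemented: rain first fills the interception store up to the seasonal threshold, any overflow counts as effective rainfall, and potential evaporation then empties the store. The function and argument names are hypothetical.

```python
def effective_rainfall(rain, pot_evap, capacity):
    """Daily interception bucket (illustrative sketch).

    rain     : daily rainfall (mm)
    pot_evap : daily potential evaporation (mm), e.g. Penman-Monteith
    capacity : daily interception threshold (mm), e.g. 2, 2, 6, 3 mm for
               winter, spring, summer, autumn
    Returns the daily effective rainfall (mm), i.e. rainfall minus the part
    used to replenish the interception store.
    """
    store = 0.0
    effective = []
    for p, e, s_max in zip(rain, pot_evap, capacity):
        store += p                           # rain enters the store first
        overflow = max(0.0, store - s_max)   # excess over the threshold
        store -= overflow
        store = max(0.0, store - e)          # evaporation empties the store
        effective.append(overflow)
    return effective
```

For example, with a constant 2 mm threshold, a 5 mm day on an empty store yields 3 mm of effective rainfall, and a following 1 mm day yields none if the store has not been fully emptied.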
2.2. Implementation of Soil Moisture Conditions
 Threshold behavior has been frequently discussed as influencing surface and subsurface runoff generation processes at different spatial scales [Uhlenbrook, 2003; Uchida et al., 2006; Zehe et al., 2007; Zehe and Sivapalan, 2009; Graham et al., 2010; Matgen et al., 2012]. Two cardinal hydrological functions in a hillslope have been described by Savenije [2010]: moisture retention and preferential subsurface drainage. Rapid subsurface flow, also named storage excess subsurface flow [Savenije, 2010], is the dominant stormflow generation mechanism on hill slopes if Horton-type overland flow is negligible, while the dominant stormflow generation mechanism in the riparian zone is saturation excess overland flow. Both processes involve clear threshold mechanisms. Rainfall intensity and antecedent moisture conditions also play an important role, and define which runoff processes occur at a specific location during an event [Uhlenbrook, 2003; Zehe et al., 2007]. Our fundamental working hypothesis is that we expect catchments to behave as simple dynamical systems with unambiguous storage-discharge relationships only if critical soil moisture thresholds are exceeded. Soil moisture measurements can help to reflect this threshold behavior [Detty and McGuire, 2010] and can be available as in situ measurements or global grid-based remote sensing measurements [Brocca et al., 2010a, 2010b]. Our objective is to compare the performance of the Kirchner model during periods when different threshold values of soil moisture are exceeded.
 We applied the approach in separate time periods by extracting those hourly time steps where the soil wetness index (defined in section 4) exceeded thresholds ranging from 0% to 80% in steps of 10%. To estimate g(Q) from hourly discharge data and following Kirchner [2009] (who in turn followed Brutsaert and Nieber), we calculated the rate of flow recession as the difference in discharge between two successive hours, $-dQ/dt = (Q_{t-\Delta t} - Q_t)/\Delta t$, and plotted it as a function of the average discharge over the 2 h, $(Q_{t-\Delta t} + Q_t)/2$. The discharge values Q correspond to time steps where low precipitation and low evaporation occurred. A third condition is added by selecting Q values for time steps where a certain soil moisture threshold is exceeded. For the rest of this analysis, whenever the method is applied with a 0% soil moisture threshold it can be considered the original method as Kirchner [2009] applied it. As soon as a soil moisture threshold greater than 0 is considered, the g(Q) function is recalculated using only time periods above the threshold.
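The construction of the recession points can be sketched as below. This is a minimal Python illustration under stated assumptions: `usable` is a hypothetical boolean series marking the rainless nighttime hours, the SWI is expressed as a fraction between 0 and 1, and a time step passes the filter when its SWI is at or above the threshold, so that a threshold of 0 keeps every point and reproduces the original method.

```python
def recession_points(q, usable, swi, swi_threshold=0.0, dt=1.0):
    """Return (mean discharge, recession rate) pairs for the recession plot.

    q             : hourly discharge (e.g. mm/h)
    usable        : booleans, True for rainless nighttime hours
    swi           : soil wetness index per hour, as a fraction in [0, 1]
    swi_threshold : keep only hours with swi >= threshold (0 keeps all)
    """
    points = []
    for t in range(1, len(q)):
        if not (usable[t] and usable[t - 1]):
            continue  # both hours of the pair must be rainless night hours
        if swi[t] < swi_threshold:
            continue  # third condition: soil wet enough
        rate = (q[t - 1] - q[t]) / dt      # -dQ/dt
        q_mean = (q[t - 1] + q[t]) / 2.0   # average discharge over the 2 h
        points.append((q_mean, rate))
    return points
```

Raising `swi_threshold` from 0 to 0.8 simply removes the drier hours from the recession plot before g(Q) is fitted.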
 The functional relationship between –dQ/dt and Q is estimated by grouping the hourly data points into ranges of Q, and then calculating the mean and standard error for −dQ/dt and Q within each group. We fitted the grouped means using a quadratic curve, which leads to the following expression (8) for the sensitivity function g(Q):
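A minimal sketch of the fitting step follows. It assumes that the quadratic is fitted in log-log space, ln g(Q) = c1 + c2 ln Q + c3 (ln Q)^2 with g(Q) = (−dQ/dt)/Q as in equation (7); the text above states only that a quadratic curve was used, so the log-log form, the unweighted least squares, and the function names are assumptions. The inputs are taken to be the grouped means.

```python
import math

def fit_quadratic(x, y):
    """Unweighted least-squares coefficients (a, b, c) of y = a + b*x + c*x**2."""
    # Build the 3x3 normal equations for the design matrix [1, x, x^2].
    s = [sum(xi ** k for xi in x) for k in range(5)]
    rhs = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def make_sensitivity(q_means, recession_rates):
    """Return a callable g(Q) fitted to binned (Q, -dQ/dt) means."""
    ln_q = [math.log(q) for q in q_means]
    ln_g = [math.log(r / q) for q, r in zip(q_means, recession_rates)]
    c1, c2, c3 = fit_quadratic(ln_q, ln_g)

    def g(q):
        lq = math.log(q)
        return math.exp(c1 + c2 * lq + c3 * lq * lq)

    return g
```

Fitting in log space keeps g(Q) positive for every positive Q, which matters because g(Q) appears as a divisor when rainfall is inferred from discharge.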
 If the sensitivity function g(Q) of a catchment is known, one can infer precipitation directly from measured discharge fluctuations, taking into account the time lag between changes in storage and changes in streamflow measurements at the gauging station. The method for inferring precipitation rates from streamflow fluctuations is derived by rearranging equation (6) to yield
$$P - E = \frac{dQ/dt}{g(Q)} + Q \qquad (9)$$
 According to Kirchner [2009], whenever it is raining, we can assume that any evaporation beyond interception losses is relatively small, so that P − E ≈ P.
 This leads to the following rainfall estimation equation:
Pt is the effective precipitation rate inferred from the discharge time series Q following a time lag ℓ for changes in storage to be reflected in streamflow at the gauging station. We defined the lag time for each basin by calculating the correlation coefficient between the inferred and measured effective rainfall with lag times of 1, 2, 3, 4, 5, 6, 12, 24 and 48 h and by determining the lag time that provides the best performance level. The calculated lag times ranged from 2 h in the 10 smallest basins to 12 h in the biggest basin. Precipitation rates are only calculated for the time steps with soil wetness conditions above the same thresholds that were used to calculate g(Q).
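The inversion and the lag selection can be sketched as follows. The sketch assumes a centered-difference discretization of P ≈ (dQ/dt)/g(Q) + Q evaluated at time t + ℓ, with negative values set to zero; this particular discretization, the function names, and the skipping of time steps without a complete pair are assumptions made for illustration.

```python
import math

def infer_precip(q, g, lag):
    """Infer effective precipitation P_t from discharge q with a lag (in steps)."""
    n = len(q)
    p = [None] * n
    for t in range(n):
        k = t + lag
        if k - 1 < 0 or k + 1 >= n:
            continue  # centered difference needs both neighbours
        dq_dt = (q[k + 1] - q[k - 1]) / 2.0
        g_mean = (g(q[k + 1]) + g(q[k - 1])) / 2.0
        q_mean = (q[k + 1] + q[k - 1]) / 2.0
        p[t] = max(0.0, dq_dt / g_mean + q_mean)  # rainfall cannot be negative
    return p

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(q, g, observed, lags=(1, 2, 3, 4, 5, 6, 12, 24, 48)):
    """Pick the lag whose inferred rainfall correlates best with observations."""
    scores = {}
    for lag in lags:
        pairs = [(p, o) for p, o in zip(infer_precip(q, g, lag), observed)
                 if p is not None and o is not None]
        if len(pairs) > 2:
            scores[lag] = pearson([p for p, _ in pairs], [o for _, o in pairs])
    return max(scores, key=scores.get), scores
```

In this form the candidate lags are simply tried one after another, which mirrors the procedure described above of testing lag times of 1 to 48 h and keeping the best performing one.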
3. Study Area
3.1. Alzette River Basin
 The Alzette catchment (1092 km2) in the Grand Duchy of Luxembourg (Figure 1) is an ideal test area for the purposes of this work. Dense networks of rain gauges (18 stations) and discharge measurements (24 stations) cover the Alzette catchment; moreover the entire region is also covered by weather radar. We applied the Kirchner approach to 24 subcatchments to generate a catchment precipitation time series for each of them. To avoid any misunderstanding in the text about the different subbasins we indicate the name of the subbasins by giving the name of the river, the name of the discharge gauging station and the ID number of the subbasins (see Figure 1) in parentheses.
 Luxembourg is divided into two major geomorphological regions. The Oesling represents the northern third of Luxembourg and belongs to the Ardennes Massif. The relief consists of a plateau that averages 450 m in elevation. Rocks are mainly composed of schist, phyllads, sandstone and quartzite of Devonian age. This plateau, which supports cropland and forest, is incised by deep, forested, V-shaped valleys. The southern part of the country, called Gutland, lies at lower altitudes than the Oesling. The Gutland belongs geologically to the Paris basin and its relief consists of alternating plateaus and gentle hills. The monotony of the landscape is only interrupted by deep valleys cut into the Luxembourg sandstone and large valleys in the Keuper marls [Pfister et al., 2004]. Main lithologies occurring in the Gutland are marls, sandstones and limestones. The mesoscale Alzette catchment in southwestern Luxembourg belongs mainly to the Gutland region but also touches schist areas in the north. Figure 1 gives an overview of the study area of the Alzette catchment and the widely varying geological conditions in the different subbasins, which account for their varied hydrological responses to rainfall.
3.2. Hydrology of the Alzette River Basin
 Rainfall-runoff processes in the Alzette basin are as variable as the geology is diverse. There are three different dominant lithologies occurring in the Alzette basin.
 1. Marls. Marly substrata are well developed in the Gutland. They are impermeable at depth, and generally highly responsive to rainfall events. They have little storage capacity, making their runoff regime extremely variable, with high discharges during winter. Discharge promptly follows rainfall events, and is characterized by high and steep peaks. Streamflow is low or absent during prolonged dry weather periods. When dry-weather streamflow does occur, it is sustained by saturated throughflow occurring at the interface between the soil and the underlying bedrock layer.
 2. Sandstone. Mainly represented by the Grès de Luxembourg, this geological formation is highly permeable and therefore deep percolation occurs. In sandstone areas, streamflow is mainly sustained by groundwater flow. Discharge is produced and sustained during long time spans, and is characterized by smooth peaks, long recession periods, and a delayed response to rainfall [Juilleret et al., 2005].
 3. Schist. These geological formations occur in the Oesling and are characterized by an upper weathered zone on top of a relatively impermeable layer. The top layer has a relatively small storage capacity, becoming quickly saturated during wet periods. Similar to the marl units, discharge is high during the wet season and low or absent during prolonged dry periods. A complex system of cracks and channels in the rock mass allows deep percolation and a long-term base flow discharge component [Fenicia et al., 2006; Krein et al., 2007].
 Further characteristics of the Alzette catchment are as follows.
 1. Temperate climatic conditions predominate, influenced by westerly atmospheric circulation patterns and a strong precipitation gradient, with rainfall decreasing from west to east. Rainfall totals are higher in the north of Luxembourg than in the south. This is due to a strong topographic influence on the spatial distribution of rainfall, with the Mosel cuesta (ridge) in the southeast and the Ardennes massif in the north of the country.
 2. The seasonal variability of rainfall intensities is small, but the seasonal variability in temperature and evaporation is pronounced, with a maximum in summer and a minimum in winter. The average annual temperature is about 9°C and the catchment-averaged rainfall totals roughly 740 mm yr−1.
 3. The prevalence of impermeable substrata varies between 30% and 100% among the Alzette's various subcatchments. The degree of urbanization also varies among the subcatchments, ranging from 5% to 27%, and the fraction of agricultural lands varies from 37% to 80% [Pfister et al., 2004].
 We compiled a complete set of all the necessary data for our study at hourly frequency for the calendar year 2007. The discharge data of the 24 subbasins used in this study were provided by the Luxembourg water resources administration (Administration de la Gestion de l'Eau Luxembourg) and the Public Research Centre–Gabriel Lippmann in Luxembourg. The data are aggregated to hourly intervals in order to generate the sensitivity function g(Q).
 The weather radar data set used here is the RADOLAN composite data set, with hourly cumulated rainfall data provided by the German Weather Service (DWD) at a grid cell resolution of 1 km2. The radar online adjustment procedure of the DWD merges the different radar-based precipitation analyses and in situ based precipitation observations located in Germany and its border regions to guarantee high quantitative and qualitative rainfall estimation performance [Bartels et al., 2004]. The closest radar device to the Alzette catchment is located near the German village of Neuheilenbach (Figure 1). The data set has been evaluated and corrected by the authors for the study area and time period using a radar gauge merging method based on the conditional merging methodology first developed by Sinclair and Pegram and positively evaluated by Goudenhoofdt and Delobbe [2009]. The in situ measurements used for this purpose were provided by 18 automatic rain gauges of the Luxembourg water resources administration (Administration de la Gestion de l'Eau Luxembourg), ASTA (Administration des Services Techniques de l'Agriculture) and the Public Research Centre–Gabriel Lippmann in Luxembourg. Because the aim of this work is to estimate areal catchment rainfall rates from discharge observations, weather radar plays a significant role as it provides precipitation estimation at high spatial and temporal resolution over a large area. A network of rain gauges can provide more accurate pointwise measurements but the spatial representativity is limited. The two observation systems are generally seen as complementary and are used here as a composite benchmark rainfall data set [Goudenhoofdt and Delobbe, 2009].
 Our soil moisture measurements come from the Bibeschbach catchment above Livange (12), located in the southern part of the Alzette River basin (Figure 1). The CRP–Gabriel Lippmann uses this subcatchment as a test basin, equipped with a set of 16 ECH2O Decagon soil moisture sensors at 8 sites. These sensors measure the permittivity of the topsoil layer, which is strongly dependent on its water content at a depth of approximately 5 cm. The measurements are expressed as soil wetness indices (SWI) calculated by normalizing the soil moisture data between the long-term minimum and maximum, such that the values range between 0 and 1 (see Matgen et al. and Heitz et al. for further details). Although it would have been preferable to have soil moisture measurements in each subbasin, we have based our approach on the assumption that these in situ soil moisture measurements in one of our study catchments can be used as a rough proxy for the soil moisture status in the entire study area, and hence as a proxy for the occupied storage capacity of the unsaturated subsurface storage compartment. The use of soil moisture observations for different types of rainfall-runoff modeling has already been analyzed in detail [Aubert et al., 2003; Anctil et al., 2008; Brocca et al., 2009; Tramblay et al., 2010; Zehe et al., 2010]. In these studies the authors used local soil moisture information (even for the surface layer only) as a proxy of soil moisture at catchment scale. Studies on soil moisture temporal stability [e.g., Vachaud et al., 1985; Brocca et al., 2010c; Loew and Schlenz, 2011; Matgen et al., 2012] indirectly obtained the same results, i.e., that point measurements can be effectively used to estimate temporal patterns of soil moisture for larger areas.
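The normalization behind the soil wetness index can be written in a few lines. The sketch below assumes a plain min-max scaling; the function name and the option of passing externally determined long-term extremes are hypothetical.

```python
def soil_wetness_index(theta, theta_min=None, theta_max=None):
    """Scale soil moisture readings to a soil wetness index in [0, 1].

    theta     : soil moisture time series (any consistent unit)
    theta_min : long-term minimum; defaults to the minimum of `theta`
    theta_max : long-term maximum; defaults to the maximum of `theta`
    """
    lo = min(theta) if theta_min is None else theta_min
    hi = max(theta) if theta_max is None else theta_max
    return [(v - lo) / (hi - lo) for v in theta]
```

A threshold quoted as 80% then corresponds to an index value of 0.8 on this scale.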
 Following Kirchner's methodology, we inferred effective precipitation directly from measured streamflow fluctuations with a time lag between changes in the storage and changes in streamflow measured at the gauging station. Figure 2 illustrates the rainfall-runoff threshold behavior in three different subbasins of the study area, in relation to a soil wetness index (SWI) time series measured by soil moisture probes in one of the three catchments (see section 4 for further details). These three subbasins are located in different parts of the study area with different lithologies (marl, schist, and sandstone). The SWI/discharge relationships in the first and third basin indicate threshold rainfall-runoff behavior, whereas the second basin shows a smoother rainfall-runoff relation. Based on these observations, soil wetness index (SWI) thresholds from 0% to 80% in steps of 10% were defined, and g(Q) was estimated from the recession time steps with SWI values above these thresholds. Because we only have a complete data set for one year, there are not enough data points to reliably calculate g(Q) above the SWI threshold of 80%. It is important to note that we used discharge data points above the SWI thresholds, both in constructing the recession plots and also in applying the inversion formula (equation (10)) to infer rainfall rates.
 We evaluated the impact of different soil moisture thresholds on model performance, as measured by the correlation between predicted and observed daily average effective precipitation. A problem can emerge here from comparing correlation coefficients between different sample sizes. Therefore we made the following statistical test: we calculated the 95% confidence intervals for the correlation coefficients measured for 0% SWI and 80% SWI using the Student's t distribution. If those error bars do not overlap, one can be highly confident that the two correlation coefficients are statistically different. Out of the 24 basins we found only 3, Alzette at Hesperange (4), Dudelingerbach at Bettembourg (11), and Alzette at Schifflange (23), where the error bars overlap. The correlation coefficient and mean bias performance of the model runs at different SWI thresholds can be found in Figures 3 and 4. Additionally, Table 1 contains the correlation coefficient, the RMSE, the bias, standard deviation of prediction errors (SDPE), and the slope of the relationship between predicted and observed effective precipitation for all basins, using a SWI threshold of 80%. When comparing model performance across different SWI thresholds, it is important to keep in mind that the number of modeling time steps varies according to the SWI threshold. As the SWI threshold is raised from 0% to 80%, the overall mean correlation coefficient for all subbasins for the entire data set increases from 0.80 to 0.906 and the mean bias for all basins is reduced from –0.61 to –0.35 mm d−1. In 22 out of 24 catchments we observe a stepwise increase in the correlation coefficient as the SWI threshold rises from 0% to 80% (apart from the Dudelingerbach (11) and the Alzette basin at gauge Schifflange (23)). For all basins the correlation coefficient lies above 0.75 with a SWI threshold between 60 and 80%.
The maximum bias was observed in the Kaylbach at gauge Kayl (10) with –3.28 mm d−1 at 80% SWI, and the minimum bias was observed in the Roudbach basin at gauge Platen (19) with 0.01 mm d−1. In fifteen basins (ID numbers 1, 3, 4, 5, 7, 8, 9, 13, 15, 16, 19, 20, 22, 23, and 24) we can observe a bias decrease or stabilized bias evolution as the SWI threshold increases, with an optimum value in the range of 60% to 80% soil wetness. In nine basins (ID numbers 2, 6, 10, 11, 12, 14, 17, 18, and 21) we see a bias increase. By calculating the linear regression of the measured versus inferred rainfall scatterplots (Figures 7 and 8) we determined the slope of the trend line. The average slope for all 24 basins at a SWI of 0% is 0.53, while the average slope at 80% SWI is 0.74.
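The overlap test for two correlation coefficients can be illustrated as follows. Note that this sketch swaps in the Fisher z transformation with a normal quantile in place of the Student's t based interval used above, because it needs only the Python standard library; the function names and the sample sizes in the usage note are hypothetical.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation coefficient.

    Uses the Fisher z transformation (a substitute for the t-based interval):
    z = atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]
```

For instance, `correlation_ci(0.80, 365)` and `correlation_ci(0.95, 100)` give intervals that do not overlap, so with these hypothetical sample sizes the two coefficients would be judged different even though the second is based on far fewer time steps.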
Table 1. Performance Measurements of Simulated Versus Measured Daily Rainfall Rates for All 24 Subbasins at a SWI Threshold of 80%
Basin known to be disturbed by extensive subsurface tunneling.
 In order to test whether the implementation of the lag time has any significant effect on g(Q), we carried out a sensitivity analysis of the model in the Attert basin at gauge Reichlange (6) (∼160 km2). This basin has an optimized lag time of 5 h. We implemented lag times of 1 h and 5 h to select the nighttime hours to generate the g(Q) function. We could not find any significant difference in the results. At a SWI of 0% (original method) we calculated a correlation coefficient of 0.77 and a bias of 1.18 mm d−1 at a lag time of 1 h. The correlation coefficient at a 5 h lag time is still 0.77 and the bias is 1.22 mm d−1. At a SWI of 80% we get a correlation coefficient of 0.93 and a bias of 0.82 mm d−1 with a 1 h lag time and a correlation coefficient of 0.95 and a bias of 0.80 mm d−1 at a 5 h lag time. Similar results were obtained for the Colpach (17) (∼20 km2), the Alzette at Lintgen (24) (∼425 km2) and the Eisch at Hunnebuer (5) (∼160 km2) basins.
 Two examples of basins with good performance are the Mess at gauge Pontpierre (8) with a correlation coefficient of 0.92, a RMSE of 2.3 mm d−1, a bias of –0.34 mm d−1, an SDPE of 2.3 mm d−1 and a slope of 0.81 at 80% SWI, and the Wollefsbach at gauge Useldange (20) with a correlation coefficient of 0.93, a bias of –0.58 mm d−1, a RMSE of 1.9 mm d−1 (the lowest of all the sites), an SDPE of 1.8 mm d−1 (also the lowest of all the sites) and a slope of 0.78 at 80% SWI. The largest catchment, the entire Alzette at gauge Ettelbrueck (2) with more than 1000 km2, also showed good overall model performance with a correlation coefficient of 0.95, a bias of –0.89 mm d−1, a RMSE of 2.27 mm d−1, an SDPE of 2.10 mm d−1 and a slope of 0.72 at 80% SWI.
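The performance measures quoted above can be computed as in the following sketch. The sign convention (predicted minus observed, so that a negative bias means underestimation), the use of the population standard deviation for the SDPE, and the regression of predicted on observed values for the slope are assumptions; the function name is hypothetical.

```python
import math

def performance(pred, obs):
    """Correlation, bias, RMSE, SDPE and regression slope of pred versus obs."""
    n = len(pred)
    err = [p - o for p, o in zip(pred, obs)]   # prediction errors
    bias = sum(err) / n
    rmse = math.sqrt(sum(e * e for e in err) / n)
    sdpe = math.sqrt(sum((e - bias) ** 2 for e in err) / n)
    mp, mo = sum(pred) / n, sum(obs) / n
    sxy = sum((o - mo) * (p - mp) for p, o in zip(pred, obs))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pred)
    return {
        "r": sxy / math.sqrt(sxx * syy),
        "bias": bias,
        "rmse": rmse,
        "sdpe": sdpe,
        "slope": sxy / sxx,   # least-squares slope of pred regressed on obs
    }
```

With these definitions the three error measures are linked by RMSE² = bias² + SDPE². The Wollefsbach values are consistent with this relation: sqrt(0.58² + 1.8²) ≈ 1.89, which rounds to the reported RMSE of 1.9 mm d−1.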
Figures 5 and 6 provide a closer look at sensitivity function estimation in two specific subbasins: the Mess River at Pontpierre (8) and the Schwebich at Useldange (21). Figures 5a, 5c, and 5e and 6a, 6c, and 6e show the relationship between discharge and flow recession with SWI thresholds of 0%, 50% and 80%. Figures 5b, 5d, and 5f and 6b, 6d, and 6f show the curves that were fitted to the grouped means using least squares regression. The implementation of SWI thresholds leads to a reduction of the scattering in Figures 5b and 5c and 6e and 6f, especially in low-flow periods, where a clearer linear functional relationship between −dQ/dt and Q is obtained.
 Plots in Figures 7 and 8 with SWI thresholds of 0%, 50% and 80% show the measured daily radar catchment means compared to the daily effective precipitation rates inferred from streamflow fluctuations for six example catchments. These basins span the three dominant lithologies of the study area: schist, marl, and sandstone. The Wark basin at gauge Ettelbrueck (3, Figure 7) represents a typical schist basin. The Mierbech basin at gauge Huncherange (9, Figure 7), the Mess basin at gauge Pontpierre (8, Figure 8), and the Wollefsbach basin at gauge Useldange (20, Figure 8) are typical marl basins. The Mamer basin at gauge Schoenfels (7, Figure 8) represents a sandstone-dominated catchment. The detailed geological classification of these basins can be found in the pie charts of Figure 1. The Kaylbach at gauge Kayl (10) in Figure 7 exhibits poorer performance, but this catchment is extensively disturbed by anthropogenic influences that are explained in more detail in section 6. Only 7% of precipitation leaves the catchment as Kaylbach streamflow, with the rest leaking into adjacent catchments through a network of tunnels. The model underestimates precipitation at Kaylbach because the Kaylbach catchment substantially violates the mass conservation assumptions on which the model is based. Figure 9 shows box plots of the prediction errors for all 24 subbasins with median, lower, and upper quartiles, data range, and outliers. Figures 10 and 11 show the time series of the simulated versus measured daily rainfall rates during a rainfall event in the second half of February. It is one of the most intense events during 2007 in the entire study area. The different plots show model results in the six basins already presented in Figures 7 and 8.
 Under both wet and dry conditions, model estimates of basin-averaged effective precipitation rates agree with weather radar measurements (mean correlation coefficient of 0.80 and mean bias of −0.61 mm d−1, averaged over the 24 Alzette basins). Model performance is even better under wetter conditions, with the correlation coefficient rising to 0.906 and the mean bias decreasing to −0.35 mm d−1 above a soil wetness index threshold of 80%. The average slope of the relationship between predicted and observed effective precipitation, averaged over all basins, likewise rises from 0.53 to 0.74.
 We argue that the model works better during wetter periods (such as wintertime in the Alzette catchment) because the unsaturated subsurface storage becomes filled close to its capacity and the catchment becomes better approximated by a single storage with a single storage-discharge relationship. This result is consistent with the findings of Teuling et al. [2010] in the Swiss Rietholzbach catchment. They also argue that the concept of a simple dynamical system is more appropriate under wet conditions. The model performance results in Figures 3 and 4 show that when time steps with a SWI below a certain threshold were excluded, thereby focusing on time steps where the catchment storage was highly saturated, the performance of the model in most of the subbasins increased. In 22 out of 24 subcatchments of the Alzette basin we could improve the correlation coefficient performance of the model by implementing SWI thresholds.
 Generally, we found better model performance in basins with predominantly marly lithologies (e.g., the Mierbech basin at gauge Huncherange (9) in Figure 7, the Mess at gauge Pontpierre (8) and the Wollefsbach at gauge Useldange (20) in Figure 8). Marl is a sedimentary rock composed primarily of clay and calcium carbonate, and tends to be impermeable. During periods of extended rainfall, soils become saturated and connectivity between different soil layers is established, generating saturated overland flow and rapid subsurface flow. The Mess River (8) is located in the southern part of the Alzette catchment in a region noted for clay soils, which inhibit deep infiltration. Pfister et al. showed that the maximum runoff coefficients (the percentage of precipitation that appears as runoff), obtained via the double-mass curves of rainfall and runoff for 18 subcatchments of the Alzette catchment, range between 10% and 66%. They defined a runoff coefficient of 54% for the Mess River at gauge Pontpierre (8), which is close to the maximum measured in the entire Alzette catchment. We argue that this partly explains the good results obtained in that subbasin using both the original Kirchner model and the model with higher SWI thresholds.
Figures 10 and 11 show that during the rainfall event after 23 February the timing of the predicted rainfall peaks in the presented basins is very accurate. The Wollefsbach basin at gauge Useldange (20) in Figure 11c showed particularly good performance. Although the model greatly underestimated the rainfall intensities in the Kaylbach basin (in Figure 10c; see further explanations below), the timing in all examples of the event was exact.
In Figure 9 we can observe error outliers in the box plots with values up to 20 mm d−1. Analysis of the input data showed that the errors around –20 mm d−1 for the basins with the IDs 7, 11, 15, and 23 all occurred on the same day, 17 February. On that day the rain gauges recorded the most intense rainfall event in the year of our study, followed by the highest discharge peaks in all streams. The weather radar measured rainfall intensities up to 40 mm d−1 in the Pall basin (18). Because similar errors occur in different basins at the same time, we argue that a problem with the radar rainfall images on that day is more likely than a problem with the discharge measurements in these basins. Indeed, comparison of the weather radar with rain gauge measurements for that day shows that the rain gauges measured about 5 to 10 mm less rainfall than the weather radar. The radar appears to have had such problems only on that specific day and not during the rest of the year. However, we could also observe that in some basins the model reproduced this rainfall event very well. One example is the Wollefsbach basin at gauge Useldange (20), with 27.6 mm d−1 simulated rainfall and 29.0 mm d−1 rainfall measured by the radar. Generally, the longer the period of discharge data that is available to fit the sensitivity function, the more robust and reliable the model is during extreme events. As these examples illustrate, the Kirchner model can be used to diagnose measurement errors by checking for inconsistencies between rainfall and discharge time series, and to identify basins that are not well characterized by this specific model structure.
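The diagnostic reasoning above (large errors that coincide across several basins point to the shared rainfall input rather than to individual discharge records) can be sketched as a simple screening step. The function name, the error threshold, and the minimum number of basins are assumptions chosen for illustration.

```python
def flag_shared_outliers(errors_by_basin, threshold, min_basins):
    """Find time steps on which several basins show large errors at once.

    errors_by_basin: dict mapping a basin ID to a list of errors
        (inferred minus measured rainfall, e.g. in mm per day); all lists
        must have the same length and share the same time axis.
    threshold: absolute error at or above which a value counts as an outlier.
    min_basins: number of basins that must exceed the threshold on the same
        time step before that step is flagged.

    Returns a dict mapping the time-step index to the sorted basin IDs.
    """
    lengths = {len(errors) for errors in errors_by_basin.values()}
    if len(lengths) != 1:
        raise ValueError("all basins need error series of equal length")
    n_steps = lengths.pop()

    flagged = {}
    for step in range(n_steps):
        basins = [
            basin
            for basin, errors in errors_by_basin.items()
            if abs(errors[step]) >= threshold
        ]
        if len(basins) >= min_basins:
            flagged[step] = sorted(basins)
    return flagged
```

Time steps flagged in this way are candidates for a check of the rainfall input against independent rain gauge data. An outlier confined to a single basin would instead suggest a closer look at that basin's discharge record or at the suitability of the model structure there.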
The basin with the most obvious bias is the Kaylbach (10). The southern part of the study area, in which the Kaylbach and also the Dudelingerbach at gauge Bettembourg (11) are located, was heavily industrialized in the last 100 years due to ore mining. The subsurface boundaries of these basins are heavily disturbed by mines and tunnels that drain water out of the catchment or lead to unknown subsurface water exchange [Pfister et al., 2002]. In the case of the Kaylbach, only 7% of precipitation leaves the catchment as streamflow, with the rest either evaporating or draining away through tunnels. This explains the significant underestimation of rainfall (see the Kaylbach (10) plots in Figures 7g–7i). We argue that the lower model performance in this case is due not to limitations of the model but rather to the disturbed water balance in that catchment.
As already hypothesized by Kirchner, the theoretical foundations of the method are challenged by catchments with heterogeneous and complex geology, such as those that have multiple unconnected subsurface reservoirs with dry patches in between. In such configurations discharge may not be a single-valued function of storage, but instead may depend on how that storage is distributed among various subsurface reservoirs, and thus streamflow fluctuations may not be unambiguously related to storage fluctuations. However, our results do not show a clear relationship between model performance and the geological complexity of the individual catchments. For example, the Pall catchment at gauge Niederpallen (18) is dominated partly by schist and partly by different types of marls, which would seem to make it difficult to characterize the basin as one simple dynamical system. Nonetheless, the performance of the model is not markedly better or worse in the Pall catchment than in the other catchments of the Alzette basin. Our results demonstrate rather good performance of the model even in the most challenging catchment configurations. Likewise, the model performance in larger basins, with drainage areas between 200 and 1000 km2, is at least as good as, and possibly even better than, that in basins with drainage areas of 20 km2 or less. Considered together, these results suggest that once a critical soil moisture threshold is exceeded and the connectivity between multiple reservoirs is established, the model performs reasonably well even in a complex environment.
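To make the single-valued storage-discharge assumption concrete, the following sketch inverts a discharge series under that assumption. In Kirchner's framework, if discharge Q depends only on storage, then dQ/dt = g(Q)(P − E − Q), so that P − E = (dQ/dt)/g(Q) + Q. The quadratic form of ln g(Q) in ln Q, the coefficient names c1 to c3, and the central-difference scheme are simplifying assumptions made here for illustration; this is not the exact procedure or parameter set used in this study.

```python
from math import exp, log


def g_of_q(q, c1, c2, c3):
    """Sensitivity function g(Q), assumed quadratic in log space:
    ln g = c1 + c2 * ln Q + c3 * (ln Q)**2. Requires Q > 0."""
    if q <= 0:
        raise ValueError("discharge must be positive")
    log_q = log(q)
    return exp(c1 + c2 * log_q + c3 * log_q * log_q)


def infer_effective_rainfall(q, dt, c1, c2, c3):
    """Infer P - E from discharge alone, clipped at zero.

    q: discharge series in depth per time (e.g. mm per day).
    dt: time step in the same time unit.
    Uses a central difference for dQ/dt, so the first and last time steps
    are dropped and the result has len(q) - 2 values.
    """
    inferred = []
    for t in range(1, len(q) - 1):
        dq_dt = (q[t + 1] - q[t - 1]) / (2.0 * dt)
        rate = dq_dt / g_of_q(q[t], c1, c2, c3) + q[t]
        # Negative values would imply net evaporation; rainfall inference
        # keeps only the non-negative part.
        inferred.append(max(0.0, rate))
    return inferred
```

If storage is split among unconnected reservoirs, no single set of coefficients describes g(Q), and the inferred series will deviate systematically from measured rainfall. That is the failure mode the paragraph above anticipates for geologically complex basins under dry conditions.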
Kirchner has pointed out that snowfall and melting could potentially affect model results. One can expect to obtain false pulses of inferred effective precipitation during periods of snowmelt and rain on snow with subsequent melt of the snowpack; what is inferred in such cases is not precipitation per se (and particularly not frozen precipitation), but rather wet precipitation (net of interception losses) plus snowmelt. However, in the area under investigation snowmelt-induced runoff does not occur very frequently, so this aspect of the model could not be investigated.
At low rainfall rates some scatter can still be observed close to the x axis in Figures 7a, 7d, 8d, 8g, and 8h, especially with a SWI threshold of 0%, even after implementing the interception storage. This scatter partly results from very low measured radar rainfall rates that did not cause any reaction in the stream. The “variable source area” concept of Hewlett and Hibbert states that water in a catchment has to be “connected” in order to generate runoff; as long as water is stored in disconnected water pockets, runoff generation is inhibited. Furthermore, not only canopy but also forest floor interception plays a significant role in the hydrological cycle. As only the effective rainfall is transformed into runoff, the inferred effective rainfall will normally be lower than the measured rainfall because of interception losses. Although we preprocessed the measured radar data by implementing seasonally variable interception thresholds, we cannot explain the observed bias by the process of interception alone. The scatter is significantly reduced after implementing thresholds of soil wetness conditions. We attribute this reduced scatter to significant reductions in the residual capacity of unsaturated storage compartments as soil layers in the subsurface become connected.
Comparisons with radar rainfall data show that Kirchner's “doing hydrology backward” approach could be successfully applied in the densely instrumented Alzette catchment of Luxembourg. The average correlation between predicted and observed effective daily precipitation was 0.8 when Kirchner's original method was applied to 24 subbasins in the Alzette catchment. However, implementing a soil moisture threshold led to a stepwise improvement of the results, with the average correlation rising from 0.8 to 0.91 under wet conditions (as indicated by a soil moisture index of 80% or higher). Model results for the 24 Alzette basins mostly confirm what has been observed in Luxembourg regarding the hydrological behavior and functioning of the different test basins; thus the results seem to corroborate Kirchner's assertion that the approach can be a valuable tool for diagnosing catchment behavior. If the model fails systematically in one specific catchment, one can infer that the catchment cannot be characterized by a single nonlinear storage-discharge relation, or, as in the cases of the Kaylbach and Dudelingerbach catchments, that there are significant sources or sinks that have not been accounted for. Similarly, event-specific outliers (e.g., Figures 7–9) can indicate that something is wrong with either the discharge or the rainfall data for those specific events. Reliable rating curves and high-precision water level monitoring systems are therefore a prerequisite for doing hydrology backward. The more discharge data are available, the more robust the fitted g(Q) function should be, especially during extreme events. The excellent results obtained for the Alzette basin at Ettelbrueck (2), at a drainage area of more than 1000 km2, and in several other geologically complex basins of more than 200 km2, suggest that Kirchner's approach can be successfully applied in mesoscale basins with heterogeneous lithology.
 We thank Fabrizio Fenicia, Jérôme Juilleret, and Sebastian Wrede for helpful discussions and suggestions concerning this work. The authors would like to thank the Ministry of Culture, Higher Education and Research of Luxembourg, and the National Research Fund (FNR) for their support of this work.