Weather radar-based quantitative precipitation estimation (QPE) is in principle superior to areal precipitation estimated from rain gauge data alone, and has therefore become increasingly popular in applications such as hydrological modeling. The present study investigates the potential of using multiannual radar QPE data in coupled surface water-groundwater modeling, with emphasis on the groundwater component. Since the radar QPE is partly dependent on the rain gauge observations, it is necessary to evaluate the impact of rain gauge network density on the quality of the estimated rainfall and subsequently on the simulated hydrological responses. A headwater catchment located in western Denmark is chosen as the study site. Two hydrological models are built using the MIKE SHE code; they have identical model structures except for the rainfall forcing: one model is based on rain gauge interpolated rainfall, while the other is based on radar QPE, which combines radar and rain gauge information. The two hydrological models are inversely calibrated and then validated against field observations. The model results show that the improvement introduced by using radar QPE data is in fact more pronounced for groundwater than for surface water at the daily scale. Moreover, a substantial negative impact on the simulated hydrological responses is observed due to the reduction of the operational rain gauge network between 2006 and 2010. The radar QPE based model demonstrates the added value of the extra information from radar when rain gauge density decreases; however, it is not able to sustain the level of model performance that preceded the reduction in the number of rain gauges.
 Precipitation is the main driving force for hydrological processes in most terrestrial water cycles such as in Denmark. From a water resources management perspective, accurate information on the temporal and spatial variation in precipitation plays a critical role in determining the overall water budget at catchment and subcatchment scale [Watts and Calver, 1991; Kuczera and Williams, 1992; Boyle et al., 2001; Beven, 2002; Younger et al., 2009]. The possibility of using precipitation estimated by weather radar for hydrological purposes was initially recognized in the 1970s, when the concept of radar-based quantitative precipitation estimation (QPE) was introduced to practical hydrological operations such as flood forecasting [Clark et al., 1972; Brandes, 1975; Wilson and Brandes, 1979]. Since then, an increasing number of research activities have explored the potentials of using radar QPE in hydrological applications [Carpenter et al., 1999; Tilford et al., 2002; Cole and Moore, 2008, 2009; Gourley et al., 2010; He et al., 2011b].
 The primary advantage of using radar for rainfall estimation is that it provides higher spatial and temporal resolution products and has larger areal coverage compared to traditional rain gauge measured rainfall. This makes radar-based QPE products very suitable for distributed hydrological modeling [Steiner et al., 1995; Ciach et al., 1997; Gourley and Vieux, 2005; Villarini et al., 2008; van de Beek et al., 2010]. In principle, radar is able to provide rainfall measurements corresponding to the grid scale of distributed hydrological models whereas rain-gauge-based products are based on geometrical interpolations to each model grid from a limited number of observation points.
 Despite the scaling advantages in space and time, stand-alone radar systems may not be able to serve the purpose of quantitative application. This is mainly because radar measures precipitation remotely and indirectly, and the radar signal is often contaminated with various types of noise and artifacts. As a result, operational radar QPE is often based on a combination of both radar and rain gauge information [Barnes, 1964; Smith and Krajewski, 1991; Anagnostou and Krajewski, 1998; Seo and Breidenbach, 2002; Haberlandt, 2007]. However, it is important to know, first, whether the rain gauge network density is sufficient to provide adequate bias correction to the radar data; second, whether additional information is introduced by using radar data in combination with the existing rain gauge network; and third, whether the current rain gauge density can be reduced without sacrificing the quality of the rainfall estimate. It may be expected that the quality of the radar QPE product is a nonlinear function of the rain gauge density and that the quality at some point reaches a level where additional rainfall information from extra rain gauges becomes redundant [Krajewski et al., 1991; Abtew et al., 1995; Kim et al., 2010]. Resolving these issues can be beneficial when evaluating the economic aspects of maintaining the rain gauge network, and requires that the rainfall products are tested as driving forces in, e.g., hydrological models.
 Radar precipitation has been used in various types of hydrological studies, where the improvement of simulated hydrological dynamics by using high-resolution data is documented. Among them, the most common application of radar QPE data is to simulate or predict flash flood events using rainfall-runoff models [Carpenter et al., 1999; Ntelekos et al., 2006; Villarini et al., 2009; Atencia et al., 2011; Zoccatelli et al., 2011]. However, radar-based precipitation products have never been employed in driving a groundwater model. It is acknowledged in groundwater modeling that the spatial and temporal variation of groundwater recharge plays a significant role in obtaining realistic flow paths and changes in storage [Bormann et al., 1999; Burns et al., 2001; Liang and Xie, 2003; Krause et al., 2007; Milewski et al., 2009], and that recharge is highly dependent on precipitation. Therefore, the potential of using radar QPE data in such a context may have been overlooked in previous studies. Moreover, the focus of water management at catchment scale has in recent years moved toward using an integrated water resource approach, which requires modeling of the entire water cycle including both the surface and subsurface elements [Haberlandt et al., 2001; Ludwig et al., 2003; Henriksen et al., 2008; Benito et al., 2010]. Considering the high resolution and large spatial coverage, it is likely that using radar QPE data can provide more realistic forcing at the time scales relevant for groundwater recharge.
 Dynamic, coupled surface water-groundwater models require years of observation data with relatively high quality to obtain reliable water budgets. However, application of long time series of radar QPE data may cause difficulties, since the nonprecipitation echoes as well as other types of errors left on the radar QPE product could accumulate over time and affect the model calibration [Ryzhkov and Zrnic, 1995; Anagnostou et al., 1998; Steiner et al., 1999; Borga et al., 2002; Goudenhoofdt and Delobbe, 2009; Villarini and Krajewski, 2010]. Thus, a thorough model performance evaluation using not only the radar but also rain-gauge-based precipitation product for comparison could help to provide an understanding of the physical processes in the hydrological models. Additionally, the coupled hydrological model is also an indirect tool to assess the accuracy of the rainfall input.
 The objectives of the present study are: (1) to prepare multiannual rainfall data sets and to apply both radar and rain-gauge-based rainfall products in hydrological models, (2) to analyze the added value of using higher resolution radar QPE data in hydrological modeling, and (3) to identify the role of rain gauge density on the accuracy of rainfall estimation and hydrological model simulations.
2. Study Area and Data
 The study area is a catchment located in the western part of Denmark where the Skjern River discharges to the Ringkoebing Fjord (RF), Figure 1. Glacial melt water sand from the last Ice Age constitutes the upper layer of the geology, which enables high infiltration capacity. This feature together with generally shallow groundwater table suggests that the RF catchment may be an ideal place to observe possible interactions between precipitation and groundwater. Two precipitation data sets are prepared in this study, one based on a combination of weather radar data and rain gauge data while the other is based entirely on rain gauge data.
2.1. Ringkoebing Fjord Catchment and Observation Data
 The RF catchment is a hydrological observatory and the best instrumented area in Denmark for hydrological research activities [Jensen and Illangasekare, 2011]. The catchment covers approximately 3500 km² (Figure 1). Land use consists mainly of agriculture (85%) while the rest is urban and forest areas.
 Western Denmark is dominated by a typical maritime climate. Thus, the RF catchment experiences mild winters, cool summers, and frequent precipitation. On average, precipitation is observed on more than a third of the days each year. The mean annual precipitation is estimated at 1057 mm. The catchment is drained by the largest river in Denmark, the Skjern River, which has a mean discharge of approximately 35 m³/s.
 Observation data are available from stream discharge stations (Figure 1) and groundwater monitoring wells (Figure 2). The observation points are distributed relatively homogeneously over the catchment. The groundwater observation data are collected by the local Danish authorities and stored in the Jupiter database maintained by the Geological Survey of Denmark and Greenland (GEUS). Due to an administrative restructuring of the local authorities, many of the groundwater data collected in 2005 and 2006 have not been reported to GEUS. As a result, most of the field data collected during and shortly after 2005–2006 are missing from the database. After 2007, data collection and reporting resumed, but with a much less dense network than in the period before 2005 (see Figure 2).
 The top soil in the area is highly permeable and the stream discharge is therefore dominated by inflow from groundwater (base flow). In large areas, the groundwater level is close to ground surface and subsurface drainage pipes are therefore installed to direct the excessive water to the streams. The shallow geology is dominated by Quaternary sand and gravel which forms large interconnected aquifers. Below, large Miocene sand and clay formations are found. Data from groundwater abstraction wells are the main data source for building the geological model. Six geological units were defined, three Quaternary units comprising fractured clay located close to the soil surface, sand and clay, together with three pre-Quaternary units defined as clay, Miocene mica sand and Miocene quartz sand. A geological cross section is shown in Figure 3.
2.2. Radar-Based Precipitation
 The radar data used in the study are retrieved from two C-band radars located at Roemoe and Sindal in western Denmark, as shown in Figure 4. The radars are operated by the Danish Meteorological Institute (DMI) and have a detection range of 240 km. Pseudo-CAPPI images of radar reflectivity at 2 km height are generated every 10 min. A maximum-pixel-value approach is used to generate composite sums in order to avoid the borderline effect in the area where the radar beams overlap.
 The radar QPE algorithm used in this study was developed by DMI, and its basic principles follow the method of Michelson and Koistinen [2000]. The program runs on an hourly or daily time step and the end product, referred to as ARNE, has a resolution of 2 km on a Cartesian grid. A detailed description of the ARNE algorithm can be found in He et al. [2011a], and a brief summary is given in the following.
 The radar reflectivity factor (Z) is converted to rain rate (R) using the power law Marshall-Palmer relationship [Marshall et al., 1947]. After the hourly accumulated rainfall is obtained, the raw radar images are subjected to bias adjustment based on ground observations from rain gauges. The radar pixels with collocated rain gauges are collected as gauge-radar data pairs. The bias is expressed as the logarithmic transformation of the gauge-to-radar ratio. After interpolation of the bias values to the entire grid domain, the final adjusted daily radar precipitation is obtained by multiplying the original radar composite sum by an adjustment factor field. As such, ARNE is a range-dependent bias adjustment algorithm that not only adjusts the mean bias but also conditions individual pixel values based on the location of the rain gauges. However, the method for conditioning the radar rainfall field to rain gauge records is not an exact procedure. The rain gauge data are used for establishing the bias-to-distance relationship; thus the rain gauge records are not reproduced exactly at their corresponding locations.
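 The conversion and bias steps summarized above can be sketched as follows. This is a minimal illustration and not the ARNE implementation: the coefficients a = 200 and b = 1.6 are the commonly quoted Marshall-Palmer values and are assumed here, the base-10 logarithm is an assumption, and all function names are hypothetical.

```python
import math

# Assumed Z-R coefficients for Z = A * R**B. These are commonly quoted
# Marshall-Palmer values; they are not confirmed as the ones used in ARNE.
A = 200.0
B = 1.6

def dbz_to_rain_rate(dbz):
    """Convert reflectivity in dBZ to rain rate in mm/h via Z = A * R**B."""
    z_linear = 10.0 ** (dbz / 10.0)   # dBZ -> Z in mm^6 m^-3
    return (z_linear / A) ** (1.0 / B)

def log_bias(gauge_mm, radar_mm):
    """Log-transformed gauge-to-radar ratio for one collocated pair.

    The base-10 logarithm is an assumption of this sketch.
    """
    return math.log10(gauge_mm / radar_mm)

def adjust(radar_mm, interpolated_log_bias):
    """Multiply a radar accumulation by the factor implied by a log bias."""
    return radar_mm * 10.0 ** interpolated_log_bias
```

 Working in the log domain keeps the adjustment multiplicative, so a gauge reading twice the radar value and one half the radar value are treated as errors of equal magnitude and opposite sign.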
 When the radar QPE images are accumulated, static ground clutter is magnified since no clutter removal program has been implemented by the radar operator. This accumulated clutter can cause large errors in the hydrological model since it is pure noise. Thus, in the present study the static clutter is removed by using a simple clutter filter. Histograms are generated using all pixels in the annual accumulated radar images. Under the assumption that an accumulated clutter signal is stronger than a precipitation signal, the locations of the pixels in the top 2% of the histogram are identified and their values are recalculated using the average values from adjacent pixels. However, the procedure can deal neither with dynamic clutter that changes in space and time nor with static clutter whose signal strength is similar to that of the rainfall signal.
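 The filter described above can be illustrated with a short sketch. It is not the code used in the study: the 8-pixel neighbourhood, the exclusion of flagged neighbours from the average, and the function name are assumptions made for illustration only.

```python
def remove_static_clutter(grid, top_fraction=0.02):
    """Replace the highest-valued pixels of an annual accumulation with the
    mean of their adjacent pixels. `grid` is a list of equal-length rows."""
    nrows, ncols = len(grid), len(grid[0])
    values = sorted(v for row in grid for v in row)
    n_flag = max(1, int(round(top_fraction * len(values))))
    threshold = values[-n_flag]   # smallest value still inside the top fraction
    # Ties at the threshold are flagged too, so slightly more than the
    # requested fraction may be treated as clutter.
    flagged = [[grid[r][c] >= threshold for c in range(ncols)]
               for r in range(nrows)]

    cleaned = [row[:] for row in grid]
    for r in range(nrows):
        for c in range(ncols):
            if not flagged[r][c]:
                continue
            # Assumption: average over the 8 surrounding pixels, skipping
            # neighbours that are themselves flagged as clutter.
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(nrows, r + 2))
                for cc in range(max(0, c - 1), min(ncols, c + 2))
                if (rr, cc) != (r, c) and not flagged[rr][cc]
            ]
            if neighbours:   # pixel is left unchanged if every neighbour is flagged
                cleaned[r][c] = sum(neighbours) / len(neighbours)
    return cleaned
```

 Because the threshold is purely rank-based, such a filter always modifies the top 2% of pixels, whether or not they are clutter, which is consistent with the limitation noted above for clutter of rainfall-like strength.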
2.3. Rain Gauge Data
 Ground-based precipitation data are obtained from the rain gauges located across the RF catchment. In the present study, the 10 km rainfall grid product provided by the Danish Meteorological Institute (DMI), hereafter DMI10, is used as a comparison to the radar-based rainfall product. Since many of the rain gauge stations in the study area are manual gauges with daily observation frequency, both the radar and the rain-gauge-based precipitation products used in the hydrological model represent accumulated daily precipitation. Inverse distance weighting interpolation is used to project point observation data onto the 10 km rectangular grid. Instead of using a fixed search radius, where all stations within a defined distance are included, the method searches the four sectors around the center point and uses only the nearest station in each sector. The stations used in this method are always the best available in terms of geographical spreading [Scharling et al., 2006].
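 The sector-based search can be sketched as below. This is an illustration under stated assumptions rather than the DMI procedure: the four sectors are taken to be the quadrants around the grid center, the distance exponent of 2 is assumed, and the function name is hypothetical.

```python
import math

def sector_idw(x0, y0, stations, power=2.0):
    """Estimate rainfall at (x0, y0) from the nearest station in each quadrant.

    `stations` is a list of (x, y, rain_mm) tuples. The quadrant layout and
    the exponent `power` are assumptions of this sketch.
    """
    nearest = {}   # quadrant index -> (distance, rain)
    for x, y, rain in stations:
        dx, dy = x - x0, y - y0
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            return rain   # a station sits exactly on the grid center
        quadrant = (0 if dx >= 0 else 1) + (0 if dy >= 0 else 2)
        if quadrant not in nearest or dist < nearest[quadrant][0]:
            nearest[quadrant] = (dist, rain)
    if not nearest:
        raise ValueError("no stations supplied")
    weighted = [(1.0 / d ** power, rain) for d, rain in nearest.values()]
    return sum(w * rain for w, rain in weighted) / sum(w for w, _ in weighted)
```

 Selecting one station per sector prevents a cluster of gauges on one side of the grid cell from dominating the estimate, which is the geographical spreading argument made in the text.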
 In principle, the radar image adjustment described in section 2.2 takes account of all the rain gauges in the country, but since the multiplicative factor is inversely proportional to the distance, it is estimated that rain gauges located further than 50 km away will have little or no contribution to the adjustment of a particular pixel. It is difficult, however, to identify the exact number of rain gauges that affect the resulting radar images of the RF catchment area. Therefore, a 100 km × 80 km rectangle is selected to demonstrate the spatial layout of rain gauges both within and in the vicinity of the catchment. As seen in Figure 2, the number of rain gauges has been reduced over the country and over the catchment due to system upgrades and budget cuts. In 2006, when the number of rain gauges is at its peak, there are 87 gauges inside the rectangle. However, in 2009 and 2010 the number of operational gauges in the catchment is 49 and 40, respectively. Hence, the gauge density has been reduced by more than a factor of two.
 The DMI10 product has long been considered as the “standard” rainfall data set in Denmark, and most of the previous hydrological models that use gridded rainfall data as model input have employed such data [van Roosmalen et al., 2009; Fu et al., 2011; Seifert et al., 2012; Stisen et al., 2012].
 Both the radar and rain-gauge-based rainfall products are highly dependent on the accuracy of the rain gauge data. Thus, bias correction of the rain gauge observations is crucial to the success of the hydrological modeling. The rain gauge catch correction method proposed by Stisen et al. [2011, 2012] is used in this study where air temperature, wind speed, and rainfall intensity on a daily basis at grid scale are accounted for.
 Our intention is to construct two hydrological configurations where different rainfall estimation schemes can be tested. To make sure that both models are valid for their respective conditions, each model also needs to be calibrated against observation data. As illustrated in Figure 5, the overall simulation period is 1996–2010 with the first 10 years used as a warm-up period. The DMI10 product is used during the warm-up period to ensure identical hydrological conditions at the beginning of 2006 for the different rainfall scenarios introduced in the later period. Radar QPE data are prepared for 2006–2010. Due to the reduction in rain gauges, model calibrations are carried out for the period 2007–2009 and the calibrated models are validated in 2006 and 2010. In this way, both the effect of rain gauge density and the impact of alternative rainfall products on the hydrological model performance are evaluated. When designing the experiment shown in Figure 5, a number of considerations were made regarding data discontinuity. Rain gauge data are available for the entire simulation period (1996–2010), whereas the radar data are only available from 2006 to 2010. Therefore, the discontinuity problem does not exist in the rain-gauge-based model. In the radar-based model, a switch of rainfall product takes place at the transition from 2005 to 2006. However, both years have the “full” rain gauge network, which keeps the bias level basically the same. Thus, one would expect the radar QPE to give a similar mean rainfall to the rain gauge product had radar data existed before 2006. Moreover, the model calibration starts in 2007, which means there is one extra year for the groundwater stage to reach equilibrium again even if a difference in the mean value between the rainfall products were present.
3.1. Hydrological Model and Model Setup
 The hydrological model for the RF catchment is developed using the MIKE SHE code which is a deterministic, distributed and physically based modeling system [Abbott et al., 1986] providing a fully coupled groundwater-surface water description. The basic setup of the RF model is in line with the Danish National Water Resources Model, the DK-model, which operates at daily temporal scale on a 500 m grid domain [Henriksen et al., 2003; Hojberg et al., 2012; Stisen et al., 2012].
 The current model configuration includes a 2-D diffusive wave approximation of the Saint Venant equations for overland flow; a kinematic routing method for flow in the river system; a two-layer water balance method for the unsaturated zone describing the distribution of infiltration between evapotranspiration and groundwater recharge; a linear reservoir model for flow in subsurface drains; a 3-D Boussinesq equation for flow in the saturated zone; a Darcy flow method that includes both river bed and aquifer conductance for the river-aquifer interaction; and a degree-day approach for snow accumulation and melting. More information about the MIKE SHE modeling system and the descriptions of the involved hydrological processes can be found in Abbott et al., Refsgaard and Storm, and Graham and Butts.
 As the backbone of the groundwater model component in MIKE SHE, a geological model is established for the catchment. The geological model consists of 17 geological units where the hydraulic conductivity, specific yield, and specific storage values are specified for each unit. To accelerate the numerical computation, 11 computational layers are specified which delineate the main aquifers and aquitards. A detailed description of the geological setup of the RF catchment model can be found in Nyegaard et al.
3.2. Model Optimization
 The model calibration scheme is in accordance with the guidelines given by Madsen, where the whole optimization process can be divided into three key steps. The first step is model parameterization, where the sensitivities of the model parameters are analyzed and choices of parameters for calibration are made. In the second step, calibration criteria are defined and the objective functions are formulated. The algorithm used for calibration is specified in the third step.
 Due to the complexity of the MIKE SHE model structure, multiobjective calibration is adopted which enables various hydrological processes to be optimized simultaneously [Gupta et al., 1999]. Using the multiobjective calibration also prevents overconditioning to a particular water compartment due to limited field data. This feature is very beneficial in the current study since the entire hydrological cycle is considered altogether.
 Model calibration is carried out against observation data for hydraulic head and stream discharge, where PEST [Doherty, 2003] is used as the optimization tool. PEST has been used extensively in the field of hydrological research [Moore and Doherty, 2005; Skahill and Doherty, 2006; Christensen and Doherty, 2008]. The model optimization method used by PEST is the Levenberg-Marquardt Algorithm (LMA) with Jacobian matrix updating, which is a gradient-based local parameter search algorithm.
 The RF model used for the present study has 96 parameters that can be individually specified. Based on a sensitivity analysis, a subset of nine parameters was selected for optimization. Another 14 parameters were tied to the nine free parameters. The rest of the parameter values were fixed at values specified based on previous model applications in the catchment. To minimize the number of free parameters, it is assumed that the ratios between the summer and winter root depths are fixed, and the ratios between the root depths for different soil types are also fixed.
 All free parameters are assumed to be log-normally distributed. No constraints were specified to limit the parameter search, implying that the optimized parameter values might end up with unrealistic numbers. This specification was chosen on purpose, since unrealistic parameter values indicate that errors may have occurred in the model input or the model formulation [Stisen et al., 2011]. The selected free parameters as well as the initial values are summarized in Table 1. The initial parameter values are obtained from calibration of a larger scale hydrological model that covers the entire mid-Jutland, including the RF catchment [Stisen et al., 2012].
Table 1. List of Parameters As Well As Their Initial Values Used for Calibrating the Skjern Model
Parameter                                          Initial Value
Hydraulic conductivity for Quaternary sand         3.94 × 10⁻⁴
Hydraulic conductivity for Quaternary clay         3.42 × 10⁻⁷
Hydraulic conductivity for Miocene quartz sand     8.40 × 10⁻⁴
Hydraulic conductivity for Miocene mica sand       9.27 × 10⁻⁵
Hydraulic conductivity for pre-Quaternary clay     6.73 × 10⁻⁸
Hydraulic conductivity for Top 3 m till/moraine    1.48 × 10⁻⁴
Time constant for drain flow                       2.86 × 10⁻⁸
River-aquifer leakage coefficient                  1.38 × 10⁻⁶
Summer root depth
3.3. Model Performance Evaluation
 Six objective functions are defined for both model calibration and independent model validation. The objective functions are listed in Table 2 along with their mathematical expressions and the number of observations in each group. As seen from Table 2, groundwater is more intensively sampled than surface water. Therefore, weighting factors in the multiobjective calibration are assigned to each observation so that the aggregated relative weights of the objective functions are equal.
Table 2. List of Objective Functions Used for Calibrating the Skjern Catchment Model With PEST
Objective Function                                   Number of Observations
Nash-Sutcliffe efficiency of daily discharge
Mean relative error in the total water balance
Mean relative error in the summer water balance
RMSE of the individual hydraulic head time series
RMSE of the mean hydraulic heads in 2000–2005
Mean error of hydraulic head in each model layer
 Summer water balance is introduced as an objective function to constrain the simulated low flows during the dry season more tightly, whereas the simulated water balance during the wet season is rather reliable based on our past experience. Because of the limited availability of groundwater head data explained above, mean hydraulic head data from the period 2000–2005, when more abundant observation data are available, are also used for calibration. The change in the dynamics of simulated groundwater head is captured by using time series of hydraulic head data from the fewer observation wells in 2007–2009.
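 For reference, the three types of score named in Table 2 can be written out as a short sketch. The expressions are the standard textbook forms and are assumed here, since the exact formulas are not reproduced in the table; in particular, the sign convention of the relative error (simulated minus observed) and the simple 1/n weighting rule are assumptions, and all function names are hypothetical.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def relative_water_balance_error(obs, sim):
    """Relative error of accumulated flow (assumed sign: simulated minus observed)."""
    return (sum(sim) - sum(obs)) / sum(obs)

def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def equalizing_weights(group_sizes):
    """Per-observation weight such that every group sums to the same total weight.

    This 1/n rule is one simple way to balance groups of different size; it
    does not account for differences in residual magnitude between groups.
    """
    return {name: 1.0 / n for name, n in group_sizes.items()}
```

 With such weights, a group containing many head observations and a group containing few discharge observations contribute the same aggregate weight, which is the balancing idea described for the multiobjective calibration.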
 Since the resolutions of the precipitation products (2 and 10 km) are coarser than the grid size of the hydrological model (500 m), each precipitation grid cell is subdivided into 500 m cells that inherit its value, without any interpolation, to match the grid of the RF model. When the precipitation input is switched between radar and rain gauge products, the remaining parts of the RF model setup are unchanged. Therefore, as the hydrological models are calibrated individually against field observations, the differences in the calibrated parameter values only reflect the influence of the rainfall forcing. Based on the calibrated models, validations are carried out with respect to surface water, groundwater, and the overall water balance.
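 The subdivision without interpolation amounts to repeating each coarse value over a block of fine cells. A minimal sketch follows, with a hypothetical function name; the factors 4 and 20 follow from the 2 km and 10 km products and the 500 m model grid.

```python
def split_grid(coarse, factor):
    """Subdivide each coarse cell into factor x factor fine cells with the same value.

    `coarse` is a list of rows; no interpolation is performed.
    """
    fine = []
    for row in coarse:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(fine_row[:] for _ in range(factor))
    return fine

# 2 km radar cells -> 500 m model cells: factor 4
# 10 km gauge cells -> 500 m model cells: factor 20
```

 Because values are copied rather than interpolated, the areal mean of the rainfall field is preserved exactly, and the effective spatial resolution of each product remains that of its original grid.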
4.1. Precipitation Estimated by Radar and Rain Gauge
 Although the radar QPE algorithm uses rain gauge data for bias adjustment, it is still necessary to scrutinize the residual bias. One effective and commonly used method for this purpose is to calculate the mean field bias (MFB) between two images [Smith and Krajewski, 1991; Anagnostou et al., 1998; Seo et al., 1999; Borga et al., 2002; Goudenhoofdt and Delobbe, 2009]. Mean field bias is defined as:
$$\mathrm{MFB} = \frac{\sum_{i=1}^{N} G_i}{\sum_{i=1}^{N} R_i}$$
 where G_i are the rain gauge observations over a certain area, R_i are the radar estimated values at the pixels that contain G_i, and N is the number of gauge-radar pairs. In the present study, daily MFB is calculated for collocated rain gauge-radar data pairs with radar-based QPE as the denominator. The data set is screened for precipitation values larger than 0.3 mm/day in order to reduce the background noise. The data set is also log-transformed for easy visualization.
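 The daily calculation can be illustrated with a short sketch. It assumes that MFB is the ratio of summed gauge values to summed radar values, that the 0.3 mm/day screening is applied to both members of a pair, and that the log transform is base 10; none of these details is confirmed by the text, and the function name is hypothetical.

```python
import math

THRESHOLD_MM = 0.3   # screening threshold quoted in the text (mm/day)

def daily_log_mfb(gauge, radar, threshold=THRESHOLD_MM):
    """Log10 of sum(G) / sum(R) for one day of collocated gauge-radar pairs.

    Pairs are kept only if both values exceed `threshold` (an assumption).
    Returns None when no pair passes the screening.
    """
    pairs = [(g, r) for g, r in zip(gauge, radar)
             if g > threshold and r > threshold]
    if not pairs:
        return None
    total_gauge = sum(g for g, _ in pairs)
    total_radar = sum(r for _, r in pairs)
    return math.log10(total_gauge / total_radar)
```

 A value of zero corresponds to no mean bias, positive values to radar underestimation relative to the gauges, and negative values to overestimation.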
 As seen in Figure 6, the distribution of the log-transformed residual bias from the radar QPE follows the shape of the normal distribution. This confirms that the radar rainfall bias adjustment needs to be operated in a log-based domain, which is one of the assumptions in the ARNE algorithm. The bell-shaped point cloud peaks sharply at zero, indicating that the majority of the mean bias has been removed by the ARNE algorithm. On the other hand, it is neither practical nor necessary to remove all the mean bias considering the different nature of these two rainfall products. Toward the lower precipitation intensities, it is observed that the differences between the two estimates can be as large as a factor of 10, which is typically caused by radar image artifacts. Radar image artifacts can originate from both meteorological and nonmeteorological echoes; however, in the present case, they are mainly due to clutter in the study area. According to DMI, bright band contamination is very rare in western Denmark and the VPR correction is not applied. Overall, for the annual water balance the simulated catchment flows are essentially controlled by large rainfall events. As heavy rainfall occurs very rarely in the RF catchment, the overall result of the bias removal is acceptable.
 The areal precipitation in the RF catchment for both the radar and the rain-gauge-based precipitation products is illustrated in Figure 7. Annual averaged rainfall maps are calculated for three periods: 2006, 2007–2009, and 2010, which are consistent with the model calibration and validation periods. The radar-based precipitation product has a five times higher spatial resolution and is obviously more pixelated than the rain-gauge-based product.
 Apart from the mean bias, another common problem of using radar QPE in a hydrological context is that the bias is a function of the distance from the radar site. There are two main factors that could cause radar rainfall estimates to show biases with range: first, the height of the radar beam increases with range, and thus intersects the vertical profile of reflectivity at different heights; and second, the sampling volume of the rainfall observation increases with range, as the radar beam becomes wider [Zawadzki, 1984; Kitchen et al., 1994; Ryzhkov and Zrnic, 1995; Borga and Tonelli, 2000; Krajewski and Vignal, 2001; van de Beek et al., 2010; Villarini and Krajewski, 2010]. For this reason, a second order polynomial is established in the ARNE algorithm to correct the bias based on a distance relationship. It is seen in Figure 7 that after the adjustment, the spatial patterns of the two rainfall products are more similar than if only the mean bias is removed from the accumulated radar image [He et al., 2011a]. This indicates that the distance induced bias is basically removed from the radar rainfall product, and that the annual water budget calculated from these two products will depend mainly on the mean values inside the study area.
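 The idea of a second order polynomial relating bias to distance can be illustrated as follows. This is a generic least-squares sketch and not the ARNE code: it assumes the polynomial is fitted to the base-10 log of the gauge-to-radar ratio as a function of distance from the radar, and the function names are hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums needed for the 3x3 system (X^T X) c = X^T y.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    # Back substitution.
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        c[i] = (m[i][3] - sum(m[i][j] * c[j] for j in range(i + 1, 3))) / m[i][i]
    return c

def range_adjustment_factor(distance, coeffs):
    """Multiplicative factor implied by the fitted log10 bias at a given distance."""
    c0, c1, c2 = coeffs
    return 10.0 ** (c0 + c1 * distance + c2 * distance ** 2)
```

 In use, `x` would hold the distances of the gauge-radar pairs from the radar and `y` their log biases; expressing distance in a scaled unit (for example hundreds of kilometers) keeps the normal equations well conditioned.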
4.2. Optimized Hydrological Model Parameters
 Verification of radar-based rainfall estimation against rain gauge data alone may be insufficient. This is because the large differences observed between the two data sources are likely to be representativeness errors due to the different supporting scales of the two methods. In the present study, a distributed hydrological model with spatially distributed model outputs serves as a tool for additional verification of the rainfall signal. Before forward model simulations are carried out, model parameter optimizations are performed using DMI10 and ARNE as precipitation input while the rest of the model setup is unchanged. Nine parameters are calibrated using the initial values listed in Table 1 and the objective functions in Table 2.
 The nine calibration parameters selected can be grouped into three categories: (1) hydraulic conductivities controlling groundwater flow; (2) drainage and leakage coefficients controlling flow contributions to stream discharge; and (3) root depth controlling evapotranspiration and percolation out of the root zone. The resulting estimates of the optimized parameters are shown in Figure 8. The hydraulic conductivities for the subsurface sediments are very similar for the two model optimizations. The hydraulic conductivity of the top soil (Kx_6) controls the dynamics of the groundwater flow in case the groundwater table is close to the ground surface. In the RF catchment more than half of the area has a groundwater table at less than 3 m depth, which makes the conductivity of this geological layer sensitive to changes in precipitation at daily time scale. Therefore, the difference between the estimates for Kx_6 is relatively larger than for other hydraulic conductivities. Since ARNE gives higher values of estimated precipitation amount than DMI10 during 2007–2009, it is not surprising that the largest difference is seen for the root depth as actual evapotranspiration is closely related to root depth. The calibrated root depth based on ARNE input is about 40% higher than when using DMI10 in order to obtain acceptable overall water balance.
 In general, the model parameters that cover a wide range of hydrological processes do not show any significant differences between the two methods.
4.3. Model Calibration and Validation Results
 Independent validation is essential in hydrological model development and application, since the model parameters may be overconditioned during the calibration period. In that case, the model cannot generate reliable simulations for periods or situations that are different from those in the calibration period. Moreover, results from the hydrological model can be seen as an indicator of the quality of the rainfall products. For these reasons, model performance is evaluated using the objective functions shown in Table 2 for the two validation periods, 2006 and 2010, individually. Based on the field observations available, the model validations are performed for stream discharge, groundwater head, and the total water balance. Scores of the objective functions from the calibration and validation periods are presented side by side for comparison in Tables 3-5.
4.3.1. Stream Discharge
 Nash-Sutcliffe efficiency (NSE) and two water balance scores are shown in Table 3. The discharge stations are sorted by catchment size; their locations can be found in Figure 1. During the calibration period (2007–2009), the radar-based model is superior to the rain-gauge-based model in terms of NSE. However, in the same period the rain-gauge-based model is more favorable in terms of water balance scores, showing smaller bias for most stations both with respect to annual accumulation and accumulation over the summer months (June, July, and August). This indicates that ARNE generates a larger mean bias but a smaller residual variance over the calibration period. For stream discharge we consider NSE to be more powerful for model evaluation since it accounts for the timing of the peaks and also the shapes of the recession curves. Thus, the radar-based model is considered more accurate during calibration in this case. In Figure 9, simulated hydrographs at an upstream station (250021) and a downstream station (250097) are shown. Based on visual inspection, the differences between the model simulations are larger at the upstream site; however, the differences between the Nash-Sutcliffe scores are similar for the two stations in the calibration period 2007–2009, with values of 0.73 and 0.67 at the upstream station, and 0.77 and 0.72 at the downstream station. This trend is also obtained at other upstream stations not included in the figure. However, both models underestimate discharge in the second half of 2008, which explains the larger error in the water balance scores. It is also noted that during the winter between 2006 and 2007, both models produce discharges that are much higher than observed. This problem is presumably caused by a model deficiency in the two-layer description of the unsaturated zone. Before the large peaks, 10 successive days with significant rainfall were observed. Since the unsaturated zone of the two-layer model ends at the root depth, it is likely that the buffering capacity is insufficient or the time delay in the unsaturated zone is too short. As a result, the infiltration excess runoff is discharged directly to the rivers, whereas in reality the catchment provides more buffering capacity than the model.
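 The contrast drawn above between a smaller residual variance and a larger mean bias can be illustrated with a minimal sketch. The NSE follows its standard definition; the water balance score is assumed here to be a relative volume error in percent, which may differ from the exact form given in Table 2. The function names and the discharge values are hypothetical and serve only as illustration.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance


def water_balance_error(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Relative volume error in percent (assumed form); positive means overestimation."""
    total_obs = sum(observed)
    return 100.0 * (sum(simulated) - total_obs) / total_obs


# Hypothetical daily discharges (m3/s), not data from the study.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim_biased = [1.5, 2.5, 3.5, 4.5, 5.5]    # correct dynamics, constant offset
sim_unbiased = [3.0, 1.0, 3.0, 5.0, 3.0]  # correct volume, poor dynamics

for name, sim in [("biased", sim_biased), ("unbiased", sim_unbiased)]:
    print(name, nash_sutcliffe(obs, sim), water_balance_error(obs, sim))
```

 In this toy case the biased series scores the higher NSE although its volume error is larger, which mirrors the situation described for the two models during calibration.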
Table 3. Model Performance for Simulated Stream Discharge Using Different Precipitation Inputs During Both Calibration and Validation Periods, Expressed by the Nash-Sutcliffe Coefficient and Deviations in Water Balance
 In the validation periods, 2006 and 2010, the differences between model performances are relatively small, which suggests that outside the calibration period the differences in predictive accuracy with respect to stream discharge caused by using different rainfall products are very limited. The difference between the two models at the same station is much smaller than the variation for the same model at different subcatchments. In addition, it was expected that the performance of the radar-based model would be better in smaller catchments due to the higher resolution. However, such scale dependency is not evident in the results. One possible explanation is that the selected catchments are simply too large to verify this hypothesis. The smallest catchment covers an area of 46.5 km², which corresponds to nearly 12 radar pixels, or half of an interpolated rain gauge grid cell. Therefore, as long as the mean precipitation over this area is basically the same between the two estimates, the internal distribution does not make a significant impact.
4.3.2. Groundwater Head
 Using a multiannual radar QPE data set as input to a hydrological model provides an opportunity to evaluate the impact of radar QPE on the simulated groundwater behavior, since the interaction between rainfall and groundwater is much less instantaneous than the rainfall-runoff process and thus requires a much longer rainfall record. Table 4 shows the RMSE of groundwater hydraulic head calculated for the models using both radar and rain-gauge-based precipitation inputs, and compared against observations from a control period (2000–2005) where abundant data are available. The model performances based on the two rainfall scenarios are very close to one another both in the calibration and validation periods, with the ARNE-based model generally performing slightly better. It is noted that at layer 3, where the shallow aquifer is located, the performance of the radar-based model is about 10% better than that of the rain-gauge-based model in all three periods. Since the observations from groundwater wells are relatively local, the advantage introduced by using high-resolution rainfall data as model input is apparent for simulating shallow groundwater heads. At deeper groundwater layers, for instance layers 5 and 8, the results from the radar-based model are also superior. However, although the groundwater recharge could dominate the simulated hydraulic head, it is a very indirect proxy of precipitation. Therefore, it is hardly conclusive whether the improvement is caused by using radar QPE as the rainfall input.
Table 4. Root Mean Squared Error of the Simulated Hydraulic Head at Different Model Computational Layers Using Radar and Rain Gauge Based Precipitation Inputs During the Calibration and Validation Periods (unit: m). Calculations are made with reference to the mean hydraulic head in 2000–2005.
[Table columns include: No. of Observations]
 The radar and rain gauge simulated groundwater heads for 2006–2010 are illustrated in Figure 10. It is unnecessary to differentiate between the three periods since the differences are subtle. The groundwater head data are extracted from layer 3 of the saturated zone, which is located approximately 10 to 20 m below the ground surface. On average, the spatial distribution of the simulated groundwater heads is very similar by visual comparison. However, the radar-driven hydrological model predicts higher groundwater elevation in the western part of the catchment and lower elevation in the eastern part compared to the results obtained by the rain-gauge-driven model. Initially, it was anticipated that the choice of rainfall product would have limited effect on the simulated groundwater head over a time span of half a decade. However, Figure 10 indicates that the spatial distribution of precipitation applied in the hydrological model has a relatively significant impact on the simulated groundwater heads: at many locations in the catchment the absolute differences between model simulations are more than 2 m. Since the RMSE of the simulated groundwater head is in the range of 2–4 m as indicated by Table 4, a 2 m difference between the model simulations could have a significant impact on the model performances.
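 A comparison of this kind can be sketched as a cell-by-cell difference between two simulated head grids, followed by the share of cells whose absolute difference exceeds a threshold such as 2 m. The grids below are hypothetical placeholders rather than model output, and the function names are assumptions made for this illustration.

```python
from typing import List

Grid = List[List[float]]


def head_difference(radar: Grid, gauge: Grid) -> Grid:
    """Cell-by-cell difference (radar minus gauge) of two head grids in metres."""
    return [[r - g for r, g in zip(row_r, row_g)]
            for row_r, row_g in zip(radar, gauge)]


def fraction_exceeding(diff: Grid, threshold_m: float) -> float:
    """Share of grid cells whose absolute difference exceeds the threshold."""
    cells = [abs(d) for row in diff for d in row]
    return sum(1 for d in cells if d > threshold_m) / len(cells)


# Hypothetical layer 3 heads (m), not values from the study.
radar_heads = [[30.0, 32.5, 35.0], [28.0, 31.0, 36.0]]
gauge_heads = [[27.5, 32.0, 35.5], [28.5, 33.5, 36.0]]

diff = head_difference(radar_heads, gauge_heads)
share = fraction_exceeding(diff, threshold_m=2.0)
print(f"{share:.0%} of cells differ by more than 2 m")
```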
4.3.3. Overall Hydrological Model Performances
 Overall water balances for the two models are summarized in Table 5 for both the calibration and validation periods. The average precipitation estimated by ARNE is almost identical to DMI10 in our testing period, with only marginal residual bias between the two products. As a result, Table 5 indicates that the two precipitation products have basically equal performance in providing sufficient accuracy for closing the water balance. It should also be noted that the model calibration has compensated for some of the differences in the rainfall forcing. Therefore, if an uncalibrated model with the same set of parameters were used for both precipitation estimates, the difference between the model outcomes would be larger.
Table 5. Overall Water Balance for the Optimized Hydrological Models Over the Entire RF Catchment Using Different Precipitation Inputs During Both Calibration and Validation Periods (unit: mm/year)
[Table rows include: Groundwater storage change]
 In order to rate the model performance for different rain gauge layouts over time, measures for aggregation and averaging are needed. In the present study, RMSEs of the groundwater heads are calculated using time series observations, as seen in Figure 2a (2). RMSE scores from each groundwater well are weighted equally, whereas for the surface water weighted mean Nash-Sutcliffe efficiencies are used. The weighting is done considering the size of the subcatchments represented by the discharge stations, and the weighting factors are directly proportional to the subcatchment size. The results are shown in Figure 11.
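 The aggregation described above can be sketched as follows: an RMSE per well from its head time series, an unweighted mean over wells, and an NSE mean over discharge stations weighted in proportion to subcatchment area. Station labels, areas, and scores are hypothetical, and the function names are assumptions made for this illustration.

```python
import math
from typing import Dict, Sequence


def rmse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Root mean squared error of one head time series."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def mean_rmse(rmse_by_well: Dict[str, float]) -> float:
    """Every groundwater well carries the same weight."""
    return sum(rmse_by_well.values()) / len(rmse_by_well)


def area_weighted_nse(nse_by_station: Dict[str, float],
                      area_by_station: Dict[str, float]) -> float:
    """Weights are directly proportional to subcatchment area."""
    total_area = sum(area_by_station[s] for s in nse_by_station)
    weighted = sum(nse_by_station[s] * area_by_station[s] for s in nse_by_station)
    return weighted / total_area


# Hypothetical scores and areas (km2), not values from the study.
well_scores = {"well_1": 2.0, "well_2": 3.0, "well_3": 4.0}
station_nse = {"upstream": 0.6, "downstream": 0.8}
station_area = {"upstream": 100.0, "downstream": 300.0}

print(rmse([10.0, 11.0, 12.0], [11.0, 12.0, 13.0]))
print(mean_rmse(well_scores))
print(area_weighted_nse(station_nse, station_area))
```

 With area weighting, the larger downstream subcatchment dominates the lumped NSE, so a strong change at a small upstream station is only partly visible in the aggregated score.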
 Both the radar-based and the rain-gauge-based models show some dependency on the rain gauge density, with increasing RMSEs for groundwater head and decreasing NSE for stream discharge as the density decreases. RMSE increases by 16 cm and 28 cm from 2006 to 2010 for ARNE and DMI10, respectively. For the Nash-Sutcliffe coefficient, decreases of 0.17 and 0.20 are found for the two data sets. Hence, the decrease in performance is more apparent for the rain-gauge-based model.
 Relative changes in the performance criteria, RMSE for groundwater and NSE for stream discharge, are also apparent between simulations with and without the use of radar data. In the calibration period (2007–2009), both groundwater head and stream discharge are better described when radar QPE is used as input to the hydrological model, where a decrease in RMSE of 6% and an increase in NSE of 4.4% are found. In 2006, when the number of gauges is relatively high, the added value of using radar data is relatively small: 3.9% for groundwater and none for stream discharge. For the last period, 2010, when the gauge density is relatively low, the opposite result is found, as the improvements in both RMSE and NSE are larger than for the calibration period. Hence, the results indicate that the value of the radar data increases as the gauge density decreases.
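 The percentages quoted above are relative changes between two scores. A minimal sketch is given below, assuming the rain-gauge-based score is taken as the baseline, which the text does not state explicitly. The scores are round hypothetical numbers chosen for readability and are not results of the study.

```python
def relative_change(baseline: float, value: float) -> float:
    """Change of `value` relative to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (value - baseline) / abs(baseline)


# Hypothetical scores: gauge-based model as baseline, radar-based model as alternative.
rmse_gauge, rmse_radar = 3.0, 2.7    # m; lower is better
nse_gauge, nse_radar = 0.50, 0.55    # dimensionless; higher is better

rmse_change = relative_change(rmse_gauge, rmse_radar)  # negative means lower error
nse_change = relative_change(nse_gauge, nse_radar)     # positive means better fit

print(f"RMSE change: {rmse_change:.1f}%  NSE change: {nse_change:.1f}%")
```

 Because RMSE improves when it falls and NSE improves when it rises, the sign convention has to be read per criterion.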
4.4. Model With Reduced Rain Gauge Density in 2006
 The weak model performance observed in 2010 could be due to abrupt changes in environmental conditions in comparison to the previous period. If so, the decline in model performance cannot be explained by the change in rain gauge density. Therefore, the ARNE algorithm is run for the year 2006 where only rain gauges that are still in operation in 2010 are included. The results are shown by the red bars in Figure 11. The drop in rain gauge number has a clear negative influence on both the simulated groundwater head and stream discharge performances compared to the 2010 situation. The relative changes in the performance criteria RMSE and NSE are calculated by comparing the results of using the full gauge network from 2006 to the reduced network from 2006 and the actual network from 2010. In both situations, substantial reductions in stream discharge performance of 25% and 22% are found, whereas the effect on groundwater is less significant, with increases in RMSE of only 1.9% and 4.3%. Therefore, reducing the rain gauge density in the QPE algorithm has a larger impact on simulated stream discharge than on simulated groundwater head.
 The change in model performance caused by the change in rain gauge density involved in the QPE algorithm also has spatial implications. The spatial arrangement of the rain gauges that are still in use in 2010 and the rain gauges that are removed between 2007 and 2009 is shown in Figure 12. It is seen that the closed stations are centered on two clusters: one to the south and another to the east of the RF catchment. The cluster to the south is not expected to have a large influence on the simulated discharge since the rainfall collected from that area can hardly be manifested at any of the discharge stations. However, the east cluster is very close to the upstream discharge stations, where two of the stations, namely 250018 and 250021, show a considerable decrease in NSE as shown in Table 6. Therefore, when the rain gauge number is halved in 2006, the impact is more significant on the simulated upstream discharge in the RF catchment.
Table 6. Changes of Nash-Sutcliffe Coefficients of Simulated Stream Discharge Caused by Changes in Rain Gauge Density in Radar Based QPE in 2006
 The large reduction in NSE seen at the upstream stations is likely due to a drastic change in the estimated rainfall mean. This hypothesis is confirmed by comparing the actual rain gauge measurements with the ARNE estimated rainfall for the two rain gauge densities at five locations shown in Figure 12 and Table 7. The correlation coefficients for the two rainfall scenarios are nearly identical at all five stations, indicating that the temporal dynamics of ARNE are unchanged regardless of the number of rain gauges involved. However, very large differences are observed in the mean bias, where ARNE with the reduced rain gauge network shows a negative tendency at all stations. Therefore, Table 7 provides evidence that by removing half of the rain gauges from the QPE algorithm, the estimated mean rainfall can change considerably, leading to a performance decrease of the hydrological model, especially at streams located in the upstream part of the catchment.
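 The distinction between an unchanged correlation and a shifted mean can be illustrated with a short sketch. Here the reduced-network estimate is assumed to be a uniformly scaled copy of the full-network estimate, which is a deliberate simplification and not how ARNE behaves in detail. Rainfall values and function names are hypothetical.

```python
import math
from typing import Sequence


def accumulated_bias(estimated: Sequence[float], observed: Sequence[float]) -> float:
    """Accumulated estimate minus accumulated observation (mm over the period)."""
    return sum(estimated) - sum(observed)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily rainfall (mm), not data from the study.
gauge_obs = [0.0, 5.0, 10.0, 2.0, 8.0]
qpe_full = [0.0, 6.0, 9.0, 3.0, 7.0]
qpe_reduced = [0.9 * p for p in qpe_full]  # assumed uniform underestimation

for name, qpe in [("full network", qpe_full), ("reduced network", qpe_reduced)]:
    print(name, round(pearson_r(qpe, gauge_obs), 3),
          round(accumulated_bias(qpe, gauge_obs), 2))
```

 A uniform scaling leaves the correlation coefficient untouched while shifting the accumulated rainfall, which is the pattern reported for the two ARNE variants.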
Table 7. Comparisons Between Rain Gauge Observed Rainfall and ARNE Estimated Rainfall in 2006, With Full Rain Gauge Network and Reduced Rain Gauge Network in ARNE
[Table columns: Mean Bias (mm/year); Correlation Coefficient (r)]
5. Discussion and Conclusions
 The present study highlights the comparison of model simulations from an overall water balance perspective, with much consideration given to the simulated groundwater behavior, which is made feasible by using a five-year radar QPE data set. Furthermore, the high dependence of model performance on the areal mean precipitation underlines the importance of the rain gauge density involved in the radar QPE algorithm.
 Radar QPE has been compared with rain-gauge-based rainfall estimation and also quality checked using a hydrological model. Based on accumulated rainfall maps, the differences between ARNE and DMI10 are small both with respect to mean bias and spatial patterns. However, considering the improvement in the spatial resolution, ARNE is more favorable than the DMI10 product from a hydrological modeling perspective since some of the local rainfall missed by the interpolation of the rain gauge data may be captured by the radar QPE.
 The lumped performance measures suggest overall better groundwater head simulations for the radar-based model than the rain-gauge-based model regardless of the number of rain gauges involved, where a difference of approximately 5% in RMSE can be expected between the two models. For simulation of stream discharge, the benefit outlined above may not have a large influence in medium to large size river basins, such as the Skjern catchment and its subcatchments, mainly due to the spatial averaging and smoothing in the processes. The positive impact of radar-based rainfall estimation on groundwater simulation is likely explained by the relatively slow response of groundwater head to rainfall. The catchment also acts as a filter that screens out part of the noise and artifacts in the radar QPE. Therefore, the simulated groundwater head is much less sensitive to radar artifacts and clutter. In addition, the groundwater head measurements are more local than the discharge measurements. As a result, the spatial rainfall details revealed by using the radar data have demonstrated advantages over the rain-gauge-based precipitation estimates in the simulation of groundwater heads.
 In hydrological studies, weather radar has reportedly been used to assist flood forecasting, since the infiltration excess runoff is directly related to the intensity and spatial distribution of rainfall [Kuczera and Williams, 1992; Sun et al., 2000; Carpenter and Georgakakos, 2004; Ntelekos et al., 2006; Cole and Moore, 2008]. The present study, from a quantitative perspective, demonstrates the potential of using radar QPE for groundwater modeling at places where surface and subsurface hydrological processes are closely linked. This finding could help to extend the application of radar-based QPE data to a broader range of applications, which otherwise have not been given enough attention.
 It has been suggested in the present study that the added value of using radar increases when the number of rain gauges decreases. In 2006, when abundant gauges are available, the differences in RMSE and NSE between the ARNE and DMI10 based simulations are relatively small. As the number of gauges is reduced from 2006 to 2010, the added value of using radar becomes apparent. We may assume that with the help of radar it is possible to sustain the quality of the resulting rainfall product when the number of rain gauges is reduced. However, our results indicate that the quality of the radar-based product is lower than that of the original gauge-based product, which is contrary to the intention.
 To further analyze this problem, a synthetic scenario was created to reduce the rain gauge number in 2006 to the same level as in 2010, which provides an opportunity to analyze the impact of the rain gauge density on both the radar-based rainfall estimation and the hydrological simulations under exactly the same climatic and hydrological conditions. The results imply that maintaining a relatively high rain gauge density, e.g., 4 gauges/100 km², is crucial even with the aid of the radar. With many uncertainties in the radar-based QPE still unsolved, we consider that rain gauges, especially the ones close to the upstream part of the RF catchment, should not have been removed for the time being from a hydrological modeling perspective. The NSE score obtained by using ARNE with the reduced rain gauge network in 2006 at discharge station 250021 is similar to the NSE score at the same discharge station in 2010: 0.33 and 0.25, respectively. Therefore, the negative impact on the upstream station of closing down the neighboring rain gauges is consistent. It can be concluded that closing down half of the rain gauge stations in the study area sacrifices the quality of the rainfall estimation and subsequently the performance of the hydrological model; however, without the extra data obtained from the radar the effect would have been worse.
 There are two main reasons why model performance decreases when fewer rain gauges are used in the QPE algorithm. First, it could be due to the change in the estimated rainfall mean. Several studies have indicated that in distributed hydrological modeling, the estimated rainfall mean plays a significant role in sustaining realistic model simulations, especially when the model results have to be extrapolated [Bandaragoda et al., 2004; Ivanov et al., 2004; Jayawickreme and Hyndman, 2007]. Our study supports this: when the estimated mean precipitation changes from 1141 mm/yr to 1086 mm/yr in 2006, which is caused by the reduction of rain gauge density from around 4 gauges/100 km² to 2 gauges/100 km², there is also a significant decrease in model performance. Second, less accurate characterization of the spatial variation of rainfall (rainfall noise) could also have a negative impact on the hydrological simulation. Model calibration can usually compensate for the rainfall bias to a large extent, but is much less effective against the rainfall noise. With the model calibrated for 2007–2009, one could expect poorer performance when validating for a different time period. This is true in 2010. However, in 2006, more rainfall stations help to reduce not only the rainfall bias but also the rainfall noise, which results in a better performance than in the calibration period. We believe that both factors are important, and that their combined effect determines the final impact on the hydrological simulations.
 The QPE production method applied in the present study can also have implications for the conclusions. One of the main objectives in QPE is to make adjustments to the radar rainfall images and bring them closer to what can be observed at ground level. One approach to achieve that is to study individual uncertainty sources and make physically based corrections [Bech et al., 2000; Vignal et al., 2000; Anagnostou et al., 2004; Brandes et al., 2004]. Rain gauge data are not always involved in this process, and thus the reduction of rain gauge numbers will not have any significant impact. An alternative approach is to consider all the uncertainty sources as a whole, and this usually requires the use of rain gauge data. In the present study, the second approach is taken.
 The overall conclusions from the present study are expected to be valid at other locations with similar hydrological regimes, i.e., rather homogeneous precipitation, flat terrain, and groundwater dominated hydrology. The same trend in terms of rain gauge density would also be expected in mountainous areas or other areas with higher spatial variation in terrain and precipitation.
 This work has been a part of the HOBE—Center for Hydrology (www.hobe.dk), which is funded by the Villum Foundation in Denmark.