[27] In the first experiment, precipitation is uncertain and no grain size uncertainty is added. Only PM measurements are assimilated. Here we investigate the effects of uncertainty magnitude, spatial distribution, and accuracy. The spatial distribution component is especially interesting because the PM measurement is coarse (25 km sensor resolution) relative to the state variables, which are represented at a much finer scale (1 km resolution).
4.1.1. Effect of Uncertainty Magnitude
[28] To investigate the effect of uncertainty magnitude, we use identical uncertainty parameters for both the filter and the truth in each case. We perform two separate experiments: in the first, the coefficient of variation of the precipitation is varied over a range of values; in the second, the correlation length is varied. Figures 1a and 1b show the domain-averaged and pixelwise mean fractions of open loop RMSE corrected. In Figure 1a, the PM measurements correct around 70% of the domain-averaged SWE RMSE regardless of the coefficient of variation within this range. On the other hand, as the coefficient of variation increases, the fractional improvement in the pixelwise mean RMSE decreases dramatically. This was not expected, since increasing model uncertainty generally tends to increase fractional improvement in DA schemes.
[29] We hypothesize that the unexpected behavior seen in Figure 1a arises because (1) the 25 km PM measurements can resolve only the mean, and not the spatial pattern, of the 1 km state variables and (2) less and less of the error is resolved as the coefficient of variation increases. We demonstrate this using the update on 25 November 2002. Let ɛ_{t,p} be the difference between the true and filter-estimated snow depth at time t for pixel p. One measure of error is the “spatial RMSE” at a point in time, RMSE_{t}, which is defined by:
The spatial RMSE can be expressed in terms of the spatial standard deviation σ_{ɛ,t} and the spatial mean μ_{ɛ,t} (i.e., bias) of ɛ_{t,p}, both evaluated over the n_{p} pixels [Walpole et al., 1998]:
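A form of these two relations that is consistent with the definitions above is sketched here. It assumes that σ_{ɛ,t} is the population standard deviation (normalized by n_{p}); if the sample standard deviation (normalized by n_{p} − 1) is intended instead, the variance term carries an additional factor of (n_{p} − 1)/n_{p}.

```latex
\begin{align}
\mathrm{RMSE}_t &= \sqrt{\frac{1}{n_p}\sum_{p=1}^{n_p}\epsilon_{t,p}^{2}}, \\
\mathrm{RMSE}_t^{2} &= \sigma_{\epsilon,t}^{2} + \mu_{\epsilon,t}^{2},
\qquad
\mu_{\epsilon,t} = \frac{1}{n_p}\sum_{p=1}^{n_p}\epsilon_{t,p},
\qquad
\sigma_{\epsilon,t}^{2} = \frac{1}{n_p}\sum_{p=1}^{n_p}
  \left(\epsilon_{t,p}-\mu_{\epsilon,t}\right)^{2}.
\end{align}
```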
These values are tabulated in Table 2 for the measurement on 25 November 2002, as a function of the precipitation coefficient of variation. For this update, we found that (1) as expected, as the coefficient of variation increases, a larger fraction of μ_{ɛ,t} (the spatially averaged bias) is corrected; (2) σ_{ɛ,t} is not corrected by the filter (σ_{ɛ,t} is nearly identical for the prior and for the posterior filter estimate); and (3) as the precipitation coefficient of variation increases, the posterior RMSE_{t} is dominated more and more by σ_{ɛ,t} (rather than μ_{ɛ,t}). Thus the coarse PM measurement is able to resolve less and less of the total error as the precipitation coefficient of variation increases.
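The mechanism described in points (1) to (3) can be illustrated with a minimal synthetic sketch. It is not the experiment reported here: the pixel count, the prior bias, the error spreads, and the idealized update (a spatially uniform shift that removes 90% of the mean error) are all assumptions chosen for illustration.

```python
import numpy as np

def spatial_error_stats(err):
    """Return (rmse, mean, std) of a 1-D array of pixel errors.

    The std uses the population normalization (ddof=0), so that
    rmse**2 == std**2 + mean**2 holds to rounding error.
    """
    mu = err.mean()
    sigma = err.std()
    rmse = np.sqrt(np.mean(err ** 2))
    return rmse, mu, sigma

rng = np.random.default_rng(0)
n_p = 625      # assumed: 25 x 25 pixels of 1 km under one 25 km footprint
bias = 0.05    # assumed domain-mean prior error (m)

results = []
for spread in (0.02, 0.06, 0.10, 0.14, 0.18):   # assumed spatial error spreads (m)
    prior_err = bias + spread * rng.standard_normal(n_p)
    # Idealized coarse update: a uniform shift removing 90% of the mean error.
    post_err = prior_err - 0.9 * prior_err.mean()
    rmse0, mu0, s0 = spatial_error_stats(prior_err)
    rmse1, mu1, s1 = spatial_error_stats(post_err)
    frac_corrected = 1.0 - rmse1 / rmse0
    results.append((spread, frac_corrected, s1 / rmse1, s0, s1))
    print(f"spread={spread:.2f}  RMSE corrected={frac_corrected:.2f}  "
          f"sigma/RMSE (posterior)={s1 / rmse1:.2f}")
```

Because a uniform shift cannot change the spatial standard deviation, the fraction of RMSE corrected shrinks as the spread grows relative to the bias, which is the qualitative behavior summarized in Table 2.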
Table 2. Prior and Posterior Error Statistics Shown for the Update on 25 November 2002 as a Function of the Precipitation Coefficient of Variation^{a}

Precipitation Coefficient of Variation  RMSE_{t}  Absolute Value of μ_{ɛ,t} for Posterior Estimate  σ_{ɛ,t} for the Prior Estimate  σ_{ɛ,t} for the Posterior Estimate  Fraction of μ_{ɛ,t} Corrected  Ratio of Posterior σ_{ɛ,t} and Posterior RMSE_{t}
0.1  0.03  0.024  0.02  0.02  0.29  0.46
0.3  0.06  0.023  0.06  0.06  0.76  0.88
0.5  0.10  0.010  0.10  0.10  0.93  0.99
0.7  0.14  0.001  0.14  0.15  0.99  1.00
0.9  0.18  0.003  0.17  0.19  0.99  1.00
[30] In Figure 1b, when the correlation length is less than 100 km, the filter corrects less than 40% of the pixelwise mean RMSE. Recall that the resolution of the sensor is 25 km. As the correlation length of the precipitation error increases, the 25 km observation can resolve a greater fraction of the error, since the state variables become more and more homogeneous. Even for a precipitation correlation length of 150 km (a factor of six larger than the domain size) with a precipitation coefficient of variation of 0.5, however, there is significant spatial variability in the precipitation perturbations. The difference between the open loop and true snow depth on 1 March (not shown) varies from −0.2 m to 1.2 m across the domain; the open loop snow depth varies from 0.25 m to 1.25 m at that time.
4.1.2. Effect of Uncertainty Spatial Distribution
[31] The radically different results from the two RMSE metrics in Figure 1a indicate that the spatial distribution of SWE uncertainty is an important factor in these results. Since we have assumed that precipitation uncertainty is constant in time in setting up these experiments, it is adequate to consider a single PM update (on 25 November 2002) in investigating the effects of the spatial distribution of uncertainty on the filter. Figures 2a and 2b show the true and filter ensemble mean prior snow depth at this update time, respectively. Figure 2c shows the prior snow depth error, calculated as the difference between the true and prior snow depth; in almost all parts of the domain, the filter underestimates the truth, especially in the south-central area. Figure 2d shows the posterior error, calculated as the difference between the true and posterior snow depth; although the update has reduced the RMSE by 36%, the filter now overestimates snow depth in the northeastern area but still underestimates it in the south-central area. Figure 2e shows the change in the snow depth (state increment) due to the update; as expected, the spatial pattern is very similar to that of the state uncertainty estimate, which is shown in Figure 2f and is calculated as the standard deviation across the prior ensemble. The scatterplot between these two quantities is shown in Figure 2g; the correlation coefficient is 0.72. Note that the spatial pattern of prior estimated uncertainty (Figure 2f) is different from that of the actual prior error in Figure 2c; hence the filter will update the state suboptimally as long as the actual spatial distribution of the error is unknown, which is why the posterior estimate is too high in some areas (see Figure 2d).
As in other studies [e.g., Crow and Van Loon, 2006], we model the precipitation error as a lognormal variable; this means that the ensemble standard deviation of the state will be, to some degree, a linear function of the ensemble mean of the state, as shown in Figure 2h. The implication of this relationship between the ensemble mean and uncertainty is that when modeling state variables at a finer resolution than the measurements, the spatial pattern of the update will be strongly influenced by the spatial pattern of the mean.
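The link between the ensemble mean and the ensemble spread under multiplicative lognormal error can be sketched numerically. The example below is an illustration only: the nominal accumulation field, the ensemble size, and the use of an independent multiplier per pixel are assumptions, and a real land surface model adds processes (melt, compaction) that weaken the proportionality.

```python
import numpy as np

rng = np.random.default_rng(1)
cv = 0.5                    # assumed coefficient of variation of the multiplier
n_ens, n_pix = 2000, 50     # assumed ensemble size and number of pixels

# Parameters of a lognormal multiplier with unit mean and the requested CV.
sigma_ln = np.sqrt(np.log(1.0 + cv ** 2))
mu_ln = -0.5 * sigma_ln ** 2
mult = rng.lognormal(mean=mu_ln, sigma=sigma_ln, size=(n_ens, n_pix))

# Hypothetical nominal accumulation per pixel (m), spanning a range of means.
nominal = np.linspace(0.1, 1.0, n_pix)
ens = mult * nominal        # multiplicative perturbation of each pixel

ens_mean = ens.mean(axis=0)
ens_std = ens.std(axis=0, ddof=1)

# Least-squares line through (mean, std); the slope should be close to cv.
slope, intercept = np.polyfit(ens_mean, ens_std, 1)
corr = np.corrcoef(ens_mean, ens_std)[0, 1]
print(f"slope={slope:.3f}, intercept={intercept:.3f}, correlation={corr:.3f}")
```

In this idealized setting the spread is simply the coefficient of variation times the mean, so pixels with more snow receive proportionally larger uncertainty and hence larger updates.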
[32] The mechanism behind this update can be visualized by noting that the spatial pattern of the change in state variables is due to the spatial pattern of the Kalman gain in equation (1), since each 25 km measurement is constant in space. Figure 3 shows maps of the Kalman gain for all twelve measurements, in units of meters per kelvin of brightness temperature (m K^{−1}). Several of the channels show a maximum or minimum value for the Kalman gain in the east-central area (where the prior estimate of state uncertainty is high), including the 6.925, 10.65, and 36.5 GHz channels for both polarizations (Figures 3a, 3b, 3e, 3g, 3h, and 3k). The relative weight given to the measurements versus the prior states can be assessed by calculating a Kalman gain for the model-predicted brightness temperatures:
where K_{z} represents the Kalman gain for the model-predicted brightness temperatures and is a 12 × 12 square matrix for this update; that is, it is not spatially variable. If the values of K_{z} approach unity, then maximum weight is given to the measurement, and the Kalman update reduces to a linear interpolation as described by Durand and Margulis [2006]. If the values of K_{z} approach zero, on the other hand, then no weight is given to the measurement, and there is no update. For this update, the values of the diagonal of the K_{z} matrix varied between 0 and 0.4, with larger values for the higher frequencies.
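A minimal sketch of how the two gains could be computed from an ensemble is given below. It assumes the standard ensemble Kalman filter forms K = C_yz (C_zz + C_v)^{−1} and K_z = C_zz (C_zz + C_v)^{−1}, where C_yz is the state-measurement cross covariance, C_zz the covariance of the predicted measurements, and C_v the measurement error covariance. These symbols, the function name, and the synthetic ensemble are assumptions for illustration; equation (1) is not reproduced here.

```python
import numpy as np

def ensemble_gains(Y, Z, C_v):
    """Return (K, K_z) from an ensemble.

    Y   : (n_state, n_ens) prior states
    Z   : (n_obs, n_ens) model-predicted measurements
    C_v : (n_obs, n_obs) measurement error covariance
    """
    n_ens = Y.shape[1]
    Ya = Y - Y.mean(axis=1, keepdims=True)      # state anomalies
    Za = Z - Z.mean(axis=1, keepdims=True)      # predicted-measurement anomalies
    C_yz = Ya @ Za.T / (n_ens - 1)
    C_zz = Za @ Za.T / (n_ens - 1)
    S = C_zz + C_v
    K = np.linalg.solve(S.T, C_yz.T).T          # K   = C_yz S^{-1}
    K_z = np.linalg.solve(S.T, C_zz.T).T        # K_z = C_zz S^{-1}
    return K, K_z

rng = np.random.default_rng(2)
n_state, n_obs, n_ens = 100, 12, 200
# Hypothetical prior ensemble of 1 km snow depths (m).
Y = 0.5 * rng.lognormal(mean=0.0, sigma=0.4, size=(n_state, n_ens))
# Hypothetical observation operator: each channel responds linearly to the
# footprint-mean depth, plus independent noise (K). Sensitivities are made up.
sens = np.linspace(-5.0, -40.0, n_obs)[:, None]
Z = (250.0 + sens * Y.mean(axis=0, keepdims=True)
     + 0.5 * rng.standard_normal((n_obs, n_ens)))
C_v = (2.0 ** 2) * np.eye(n_obs)                # assumed 2 K measurement error

K, K_z = ensemble_gains(Y, Z, C_v)
print(K.shape, K_z.shape)
print(np.round(np.diag(K_z), 2))
```

Each column of K is a spatial map for one channel, whereas K_z has no spatial dimension; shrinking C_v drives K_z toward the identity, and inflating it drives K_z toward zero.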
[33] The product of the Kalman gain and the difference between the predicted and actual (spatially uniform) measurement produces the update due to each channel [Durand and Margulis, 2006], shown in Figure 4. These maps reveal which measurements were the most influential on a given update. In Figure 4, it is clear that the same spatial pattern holds for the update due to each measurement, and that two channels (36.5H and 23.8V in Figures 4e and 4j) are most effective in correcting the estimate. The commonly used retrieval algorithm by Chang et al. [1987] is based on a similar combination of the 36.5H and 19.0H channels. The fact that the 36.5H and the 23.8V channels were most useful in correcting the snow depth supports the physical basis of using retrieval algorithms such as Chang et al. [1987] to characterize shallow snow. The 36.5V and 89.0V channels (Figures 4k and 4l) were also very helpful in this update; indeed, most of the channels proved to be useful throughout the course of a winter in a 1D test of this methodology [Durand and Margulis, 2006].
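The per-channel decomposition can be sketched as follows, assuming the mean update is the gain multiplied by the innovation vector, so that column j of the gain scaled by the innovation of channel j gives that channel's contribution. The gain and brightness temperature values are random placeholders, the function name is hypothetical, and the sign convention (actual minus predicted) is an assumption.

```python
import numpy as np

def per_channel_increments(K, z_obs, z_pred):
    """Split the mean state update K @ (z_obs - z_pred) into one map per channel."""
    innov = np.asarray(z_obs) - np.asarray(z_pred)   # shape (n_obs,)
    return K * innov[np.newaxis, :]                  # shape (n_state, n_obs)

rng = np.random.default_rng(4)
n_state, n_obs = 100, 12
K = 0.01 * rng.standard_normal((n_state, n_obs))     # placeholder gain (m per K)
z_pred = 250.0 + rng.standard_normal(n_obs)          # placeholder predicted Tb (K)
z_obs = z_pred + 3.0 * rng.standard_normal(n_obs)    # placeholder observed Tb (K)

contrib = per_channel_increments(K, z_obs, z_pred)
total = contrib.sum(axis=1)                 # summing channels gives the full update
influence = np.abs(contrib).mean(axis=0)    # mean absolute increment per channel
ranking = np.argsort(influence)[::-1]       # most influential channel first
print(ranking)
```

Ranking channels by mean absolute increment is only one possible summary; a map of each column, as in Figure 4, retains the spatial pattern.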
[34] If the spatial distribution of the uncertainty were accurate (i.e., matched the spatial distribution of the actual error), the filter would be able to almost fully recover the truth with time. Thus it is clear that the estimate of the spatial distribution of the uncertainty is as important as the spatial estimate of the mean for the case of a spatially coarse (i.e., PM) measurement. In this DA context, the spatial distribution of precipitation uncertainty is essentially controlled by the spatial distribution of the mean precipitation and by the random number schemes. In reality, the uncertainty varies in space because of other factors, e.g., distance to the nearest precipitation gage. Development of more sophisticated methods of estimating precipitation uncertainty will be an important step in application of land surface data assimilation methods. However, the spatial uncertainty estimate will always itself be somewhat uncertain; the effects of uncertainty accuracy are examined in section 4.1.3.
4.1.3. Effect of Uncertainty Accuracy
[35] To investigate the effect of uncertainty accuracy, we use uncertainty parameters for the filter that differ from those of the truth. We perform two separate experiments in which first the filter coefficient of variation and second the filter correlation length of the precipitation are varied over a range of values, while the true value stays constant. Figures 1c and 1d show the effect of incorrect precipitation error assumptions on the fraction of SWE RMSE corrected (evaluated pixelwise and over the entire basin) for misspecification of the coefficient of variation and of the correlation length, respectively. In Figure 1c, the domain RMSE is sensitive only to underestimation, while the pixelwise RMSE metrics are sensitive to both overestimation and underestimation of the coefficient of variation. In Figure 1d, all error metrics are sensitive to underestimation, but not to overestimation, of the correlation length. Furthermore, very poor results are observed only for severe (greater than one order of magnitude) underestimation of the correlation length; in these cases the filter degrades the estimate with respect to the open loop. For the correlation length, error values are slightly lower for slight underestimation of the parameter than for the actual parameter. Indeed, the maximum efficiency roughly corresponds to the sensor resolution of 25 km. Whether this is due to some intrinsic property of the multiscale DA system is an interesting question that merits further investigation. Filter results are evidently more sensitive to misspecification of the coefficient of variation than to misspecification of the correlation length.
[36] The fact that the maximum estimation efficiency does not correspond to the true error statistics is somewhat unexpected, although not unprecedented in hydrologic DA studies [Crow and Van Loon, 2006]. To investigate whether this was due simply to the choice of random numbers in our ensemble, we reproduced this analysis for three additional ensemble seeds. In each of these cases (not shown), the same general pattern persisted: estimation efficiency decreases for both underestimation and overestimation of the coefficient of variation, and the location of the maximum estimation accuracy varies slightly from run to run. Thus these additional runs bolster confidence in the reliability of the pattern in Figure 1c.
[37] In Figure 5, the potential for utilizing the innovation mean to diagnose incorrect precipitation coefficient of variation specification is explored using the same experiments analyzed in Figures 1c and 1d. Four different metrics (defined above) for the difference between the observed and expected innovations are assessed. For the innovation mean metric averaged across the time series (Figure 5a), there is clearly a minimum near the correct coefficient of variation. The metric proposed by Dee [1995] (Figure 5b) is sensitive to underestimation of the coefficient of variation, but less sensitive to overestimation. Ostensibly because of the nonlinearities in the LSM and RTMs, this metric approaches unity asymptotically for overestimation of the coefficient of variation, whereas according to Dee [1995] it should be less than unity for overestimation of the error covariance. A similar asymptotic behavior was seen in the soil moisture estimation study of Crow and Van Loon [2006], although in that study the metric generally crossed unity and converged asymptotically to a value less than unity. The normalized 1-norm of the difference between expected and sample innovation covariance (Figure 5c) is a monotonic function of error covariance. The normalized trace of the covariance difference (Figure 5d) is clearly sensitive to both underestimation and overestimation of the error covariance. In this experiment, in which precipitation correlation length and grain size statistics are all known, two of the four metrics could be used in an adaptive filtering scheme to tune the precipitation coefficient of variation. The other two, on the other hand, would not be of use in diagnosing input misspecification.
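The linear-theory expectation for a normalized innovation statistic can be illustrated with a scalar Kalman filter. This sketch is not one of the four metrics as defined in this study: it uses the time mean of the squared innovation divided by the filter's own expected innovation variance, and the dynamics coefficient, error variances, and function name are assumptions. In a linear system the statistic is near unity when the assumed model error is correct, above unity when it is underestimated, and below unity when it is overestimated.

```python
import numpy as np

def mean_normalized_sq_innovation(q_filter, q_true=1.0, r=1.0,
                                  a=0.9, n_steps=20000, seed=0):
    """Time mean of nu**2 / (P_forecast + r) for a scalar linear system.

    q_true   : model error variance used to generate the synthetic truth
    q_filter : model error variance assumed by the filter
    r        : measurement error variance (known to the filter)
    """
    rng = np.random.default_rng(seed)
    x_true, x_est, p = 0.0, 0.0, 1.0
    total = 0.0
    for _ in range(n_steps):
        # Synthetic truth and measurement.
        x_true = a * x_true + np.sqrt(q_true) * rng.standard_normal()
        z = x_true + np.sqrt(r) * rng.standard_normal()
        # Filter forecast with the assumed model error variance.
        x_est = a * x_est
        p = a * a * p + q_filter
        # Innovation and its expected variance under the filter's assumptions.
        nu = z - x_est
        s = p + r
        total += nu * nu / s
        # Kalman update.
        k = p / s
        x_est += k * nu
        p = (1.0 - k) * p
    return total / n_steps

for q_f in (0.1, 1.0, 10.0):
    value = mean_normalized_sq_innovation(q_f)
    print(f"assumed q = {q_f:5.1f} -> statistic = {value:.2f}")
```

Departures from this behavior, such as the asymptotic approach to unity noted above, would then point to nonlinearity or other violated assumptions rather than to the statistic itself.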
[39] In general, the metric appears most promising for correcting precipitation uncertainty parameters using PM measurement innovations. More importantly, however, all four statistics are far more sensitive to underestimation of the correlation length than to overestimation; each of the statistics has much higher values (not shown) corresponding to very low (less than 25 km) estimates of the correlation length. Thus, although our experiments indicate that minima are found for these statistics, it is doubtful that they could be used directly to correct an overestimated correlation length, because of their low sensitivity to overestimation. An underestimation of the correlation length, however, could likely be diagnosed. It should be noted that in the presence of the bias introduced by using a statistical outlier to represent the true simulation, the assumptions governing the derivation of the expected values of the four metrics defined in section 3.3 no longer hold. For an adaptive application, it may be necessary to use a bias-aware filter [e.g., Dee, 2005] to estimate the input bias before the innovations prove to be useful in correcting misspecified uncertainty parameters.
[40] All of these metrics should theoretically be temporally uncorrelated if the assumptions made in the EnKF derivation are satisfied. Crow and Van Loon [2006] found that an adaptive filter based either on the temporal correlation of the innovations or on the metric would predict identical uncertainty parameters. The temporal autocorrelation function for each of these innovation metrics was also examined (not shown). The autocorrelation function of the innovation signal was not helpful in diagnosing the error, but tended to oscillate in time; this oscillatory behavior was also reported by Reichle et al. [2002b]. Indeed, the temporal sequence of the innovation metrics themselves exhibited oscillatory behavior. For this reason, the Fourier transform of the sequences was taken. The magnitude of the frequency coefficients identified using the Fourier transform showed a dependence on the input uncertainty misspecification very similar to that of the time average and thus is not shown. It is possible that if the measurement uncertainty Λ_{v} in equation (2) were also unknown, we would need to utilize the temporal correlation of the innovation sequence in addition to the metrics introduced above in order to estimate both uncertainty parameters. In that case, use of the innovation temporal correlation to estimate the uncertainty parameters would need to be further investigated.
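The two diagnostics mentioned here, the sample autocorrelation function and the Fourier magnitude spectrum of a metric sequence, can be sketched on synthetic data. The sequences below are placeholders (white noise, and white noise plus an assumed 16-step oscillation), not the innovation metrics of this study, and the time axis is in units of update steps.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D sequence for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(3)
n = 512
t = np.arange(n)
white = rng.standard_normal(n)                          # temporally uncorrelated case
oscillating = white + 2.0 * np.sin(2 * np.pi * t / 16)  # assumed periodic component

acf_white = autocorrelation(white, 20)
acf_osc = autocorrelation(oscillating, 20)

# Magnitude spectrum of the oscillating sequence and its dominant period.
spectrum = np.abs(np.fft.rfft(oscillating - oscillating.mean()))
freqs = np.fft.rfftfreq(n, d=1.0)                       # cycles per update step
peak = np.argmax(spectrum[1:]) + 1                      # skip the zero frequency
peak_period = 1.0 / freqs[peak]

print("max |acf| of white sequence, lags 1-20:", np.abs(acf_white[1:]).max())
print("acf of oscillating sequence at lag 16:", acf_osc[16])
print("dominant period (update steps):", peak_period)
```

For a temporally uncorrelated sequence the autocorrelation at nonzero lags stays within roughly ±2/√n of zero, so values well outside that band, or a clear spectral peak, indicate that the whiteness assumption is violated.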