Multisensor snow data assimilation at the continental scale: The value of Gravity Recovery and Climate Experiment terrestrial water storage information



[1] This investigation establishes a multisensor snow data assimilation system over North America (from January 2002 to June 2007), toward the goal of better estimation of snowpack (in particular, snow water equivalent and snow depth) via incorporating both Gravity Recovery and Climate Experiment (GRACE) terrestrial water storage (TWS) and Moderate Resolution Imaging Spectroradiometer (MODIS) snow cover fraction (SCF) information into the Community Land Model. The different properties associated with the SCF and TWS observations are accommodated through a unified approach using the ensemble Kalman filter and smoother. Results show that this multisensor approach can provide significant improvements over a MODIS-only approach, for example, in the Saint Lawrence, Fraser, Mackenzie, Churchill & Nelson, and Yukon river basins and the southwestern rim of Hudson Bay. At middle latitudes, for example, the North Central and Missouri river basins, the inclusion of GRACE information preserves the advantages (compared with the open loop) shown in the MODIS-only run. However, in some high-latitude areas and given months the open loop run shows a comparable or even better performance, implying considerable room for refinements of the multisensor algorithm. In addition, ensemble-based metrics are calculated and interpreted domainwide. They indicate the potential importance of accurate representation of snow water equivalent autocovariance in assimilating TWS observations and the regional and/or seasonal dependence of GRACE’s capability to reduce ensemble variance. These analyses contribute to clarifying the effects of GRACE’s special features (e.g., a vertical integral of different land water changes, coarse spatial and temporal resolution) in the snow data assimilation system.

1. Introduction

[2] Snow is a very important component of the Earth’s climate system. At the continental and interannual scales, snow cover fraction (SCF) and snow water equivalent (SWE) show large variations, depending on atmospheric circulation patterns. The high albedo and low thermal conductivity of snow significantly affect the land surface energy and water budgets. Consequently, accurate estimation of snowpack properties at a large scale is important for various research areas, for example, (1) hydrological prediction and water resources management, especially for regions in which freshwater availability is heavily dependent on snowmelt [Barnett et al., 2005]; (2) evaluation of coupled general circulation models in terms of their ability to represent observed snow dynamics [e.g., Frei and Gong, 2005]; (3) climate trend quantification in cold regions; and (4) land-atmosphere-ocean interaction associated with snow cover [e.g., Gong et al., 2003].

[3] In recent years derivation of high-quality snow data sets, SWE, in particular, has relied increasingly on data assimilation technology, which optimally blends numerical model results and remotely sensed information to generate more accurate and physically consistent products. Compared with ground observations, which are scattered and often do not represent regional averages, satellite observations effectively expand the spatial coverage to the regional and continental scale. Among satellite observations, SCF products are unique in their estimation of SWE distribution compared with other products, such as microwave-based estimates. Snow has distinct reflectance effects in visible and infrared bands [Hall et al., 2002], which can be measured at relatively high spatial resolutions [Hall et al., 2002; Hall and Riggs, 2007]. Further, SCF retrieval with these algorithms does not require information about the microscale internal properties of snowpack, for example, density, grain size, and liquid water content, although an implicit relation may be involved. Consequently, remotely sensed SCF information has been increasingly used to correct SWE estimates from land surface models, via the ensemble Kalman filter (EnKF) technology [e.g., Andreadis and Lettenmaier, 2006; Clark et al., 2006; Su et al., 2008]. However, a number of limitations degrade the quality of assimilated SWE products, or hamper application in specific climatic and geographic environments, including the following:

[4] 1. A low correlation between SCF and SWE when the model grid averaged SWE exceeds a threshold. In this situation small changes in SCF lead to a wide range of SWE for a given location. Several studies [Clark et al., 2006; Su et al., 2008] have found that SCF assimilation performs best with partial snow cover (e.g., ephemeral snow in mountains and grasslands) but performs poorly when the SCF signal is saturated (near 100% SCF) and insensitive to further SWE increments (e.g., during the accumulation season over boreal forest and tundra).

[5] 2. Parameter errors in the observational operator. A key component of the SCF approach is a parameterized relationship (observation operator or function) between SCF and SWE at each model grid. Various observation functions have been utilized in the literature [e.g., Andreadis and Lettenmaier, 2006; Clark et al., 2006; Su et al., 2008], each having some key parameters characterizing geographic and scale effects. These distributed parameters are usually difficult to measure (since they rely on intensive calibration) and their errors can degrade SWE retrievals.

[6] 3. Errors in satellite-derived SCF data. Although the EnKF algorithm can account for observational noise, the SCF error magnitude (variance) can be difficult to quantify in certain environments, such as mountains [Hall and Riggs, 2007], or where the surface is obscured by cloud cover and vegetation [Hall and Riggs, 2007]. These environments can also lead to systematic errors not accommodated by current data assimilation systems.

[7] These three types of errors tend to be magnified when present simultaneously. For example, if the SCF signal is saturated during a boreal forest winter, a small amount of bias in SCF data or uncertainty in parameters may lead to significant errors in the observation operator and hence the SWE update.

[8] All these limitations motivate us to develop a new methodology for improving continental-scale SWE and relevant estimates, using additional information from other satellites. Since SCF observations mainly characterize the spatial distribution of snowpack, the inclusion of further model constraints such as mass and energy terms should be complementary. An analogous approach has been applied in estimating soil moisture, evapotranspiration, and other hydrological states and fluxes [e.g., Renzullo et al., 2008], where multiple data resources are shown to improve estimates. Nevertheless, there has been little research devoted to exploring the nature of multisensor or multifrequency (for radiometer) snow data assimilation. Among the first studies were those of Durand and Margulis [2006, 2007], but their synthetic experiments are confined to point- and river-basin-scale snowpack and they focus on radiometric observations.

[9] Monthly global terrestrial water storage (TWS) estimates have been available from the Gravity Recovery and Climate Experiment (GRACE) satellite system since 2002. GRACE provides estimates of total surface mass change from month to month using changes in the Earth's gravity field, with a spatial resolution comparable to satellite altitude, about 400 km. Thus, unlike visible band instruments, GRACE operates under all-weather conditions (as do microwave radiometers) and measures integrated total land water change from canopy to deep groundwater. Much research has focused on extracting from GRACE meaningful trends and variabilities of individual TWS components (e.g., soil moisture, snow, groundwater) [e.g., Rodell et al., 2004; Niu et al., 2007b; Syed et al., 2008]. More recently Zaitchik et al. [2008] incorporated GRACE TWS data into a land surface model using an ensemble Kalman smoother (EnKS) to improve water storage and flux estimates in the Mississippi River basin (where snowpack was simply updated without direct calculation of SWE increments from its ensemble statistics).

[10] In this study we propose to use GRACE TWS measurements to complement Moderate Resolution Imaging Spectroradiometer (MODIS) SCF data assimilation over North America. Integration of radiometer (at visible and infrared bands) and gravity measurements into a land surface model over such a large domain has received little attention. Accordingly, we focus on the following questions: How can we jointly assimilate the two types of snow information that have distinct physical and geographic features? Can the GRACE data assimilation improve SWE and snow depth retrieval, relative to MODIS-only data assimilation? How do these improvements (multisensor versus single sensor), if any, vary geographically, and with what underlying mechanisms? Specific attention is given to the special properties of GRACE TWS data (e.g., its spatial and temporal resolution). The performance of GRACE data regarding the controlling factor for TWS assimilation and the capability of reducing ensemble variance is interpreted quantitatively. The central purpose is to develop observational and algorithmic techniques for accurately characterizing SWE and other cold-region hydrological variables.

[11] Section 2 introduces data and methods applied in the data assimilation experiments. Section 3 describes in detail how these experiments were implemented. The results are analyzed in section 4, and section 5 provides interpretations and comments on specific features of the multisensor data assimilation. Concluding remarks are given in section 6.

2. Data and Methodology

2.1. Moderate Resolution Imaging Spectroradiometer (MODIS) and Gravity Recovery and Climate Experiment (GRACE) Satellite Data Sets

[12] Daily MODIS SCF data at 0.05° resolution (MOD10C1) are used in this study. MODIS uses seven spectral bands to retrieve land surface properties. Its snow mapping algorithm estimates SCF using a normalized difference snow index (NDSI) [Hall et al., 2002] and is able to distinguish between snow and cloud [Hall et al., 2002]. Su et al. [2008] described the MODIS data processing used here, including spatial upscaling of raw data and cloud parameter selection for quality control.

[13] GRACE monthly gravity fields are represented as spherical harmonics to degree and order 60, with most time-variable atmospheric and oceanic gravitational effects removed during data processing. Remaining gravity changes are interpreted as monthly changes in vertically integrated water storage components, for example, snow, soil moisture, and groundwater storage. GRACE estimates are smoothed with a Gaussian averaging kernel with a 500 km radius and filtered to remove longitudinal stripes that are a recognized noise component in current solutions [Chen et al., 2008]. GRACE estimates are then represented on 4° × 4° tiles using bilinear interpolation for compatibility with the model grid configuration (1° × 1° over North America). These time series span the period from November 2002 to May 2007, with missing values in December 2002 and June 2003. Here the TWS data are in their multiyear anomaly form, in which the long-term mean TWS is subtracted from each monthly value. In practice, GRACE time series values are computed from observations taken from about one-half month before to one-half month after the date assigned to the sample.
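The anomaly form described above can be illustrated with a minimal sketch. It assumes a single tile's monthly TWS series stored as a NumPy array, with missing months marked as NaN; the function name `tws_anomaly` and the sample values are hypothetical and are not taken from the GRACE processing chain.

```python
import numpy as np

def tws_anomaly(tws_monthly):
    """Subtract the long-term mean from each monthly TWS value.

    Missing months (NaN) are ignored when computing the mean and
    remain NaN in the output.
    """
    tws = np.asarray(tws_monthly, dtype=float)
    return tws - np.nanmean(tws)

# Hypothetical monthly TWS values (mm) with one missing month
series = np.array([120.0, np.nan, 80.0, 100.0])
anomaly = tws_anomaly(series)
```

The same subtraction would have to be applied to the model TWS, using the model's own long-term mean, before the two anomaly series are compared.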

[14] According to their physical features, MODIS and GRACE contain different information relevant to SWE retrieval. MODIS measures SCF at a high spatial resolution compared to both GRACE and the numerical model, observing both accumulation and ablation with approximately the same level of accuracy. GRACE measures total column water change at a far coarser spatial and temporal scale and does not include information about the individual components contributing to water storage change. However, SWE is expected to be the dominant variable component of TWS in winter in cold regions [Niu et al., 2007b]. Its relative contribution diminishes during melting, when MODIS is thought to be more closely correlated with SWE. GRACE estimates are not affected by vegetation (except as its mass changes), topography, or cloud cover, and GRACE is useful under conditions where the value of MODIS is limited by these influences.

2.2. Observation-Based Climatologic Snow Water Equivalent (SWE) and Snow Depth Data Sets

[15] The Canadian Meteorological Center (CMC) snow depth and SWE climatology data (1969–1997) [Brown et al., 2003] are used for validation purposes. These provide the daily snow depth and SWE at a 0.25° resolution. The snow estimates were obtained by combining abundant station observations in the United States and Canada with model simulations by an optimum interpolation scheme [Brown et al., 2003]. These observationally based data sets are regarded as the best available reference for this research.

2.3. Land Surface Model

[16] The Community Land Model (CLM) [e.g., Bonan et al., 2002; Oleson et al., 2004] is used as the land surface model to assimilate GRACE and MODIS data. It numerically simulates energy, momentum, and water exchanges between the land surface and the overlying atmosphere. Its snow model has multiple layers (one to five, depending on snowpack thickness) and accounts for processes such as liquid water retention, diurnal cycling of thawing-freezing, snowpack densification, snowmelt, and surface frost and sublimation.

[17] The CLM used in our experiments includes an enhanced frozen soil hydrology scheme and a new aquifer dynamical scheme, among other modifications [Niu and Yang, 2006; Niu et al., 2007a]. The aquifer model, by explicitly simulating groundwater dynamics, facilitates assimilation of TWS observations. Water storage in the saturated zone is a prognostic variable and directly represented in the calculation of TWS in each grid, facilitating the combination of GRACE data with model estimates [Zaitchik et al., 2008]. In addition, CLM’s sophisticated representation of frozen soil hydrology has improved its ability to characterize soil moisture and runoff variability in cold regions, thus reducing the systematic error in TWS estimation. These enhancements reduce model biases (for more discussion see Niu and Yang [2006] and Niu et al. [2007b]). The negative effects of model biases cannot be eliminated by updating state variables alone, and model bias can have complex effects on the data assimilation system [e.g., De Lannoy et al., 2007].

2.4. Ensemble Kalman Filter and Smoother

[18] The EnKF [Evensen, 1994, 2003] and EnKS are used to incorporate MODIS and GRACE data into the CLM, respectively. The choice of algorithms, that is, the EnKF for MODIS SCF assimilation and the EnKS for GRACE TWS assimilation, depends on the nature of satellite data sets, as explained here.

[19] The EnKF was first introduced by Evensen [1994] as a Monte Carlo–based approximation to the Kalman filter in a numerical modeling system. More recently, it has been applied in numerous land data assimilation studies [e.g., McLaughlin, 2002; Reichle et al., 2002; Crow, 2003]. It treats some crucial model inputs, such as forcing data, model parameters, and model initial conditions, as random variables, and ensembles of these inputs are generated to represent their distributions. Each ensemble member is propagated forward using the model until a measurement becomes available. The measurement is assimilated into the model simulation by using the ensemble of state variables to represent a low-rank approximation of the joint probability density function between state variables and measurements. Given these properties, the EnKF is able to characterize highly nonlinear land hydrological processes and their associated uncertainties. The EnKF update, however, is optimal only when certain assumptions are met: (1) unbiased measurements and background (model simulated) variables, (2) Gaussian-type random inputs (e.g., the forcing errors), and (3) linear relationships between states and measurements. Such sequential data assimilation accounts for the temporal sampling discrepancy between CLM (3 h) and MODIS data (daily), as discussed by Su et al. [2008], who provide additional details.

[20] The EnKS [Dunne and Entekhabi, 2005, 2006] is theoretically similar to the EnKF but allows for (1) observations that are defined at different times than the model state and (2) observations that span multiple periods of time, comprising, for example, both current and historical model states. Because the GRACE data are at monthly intervals and the CLM runs at 3 h intervals, the EnKS is used to compare model estimates of TWS with those of GRACE and derive updates for state variables. The update of the EnKS is

$$X^{a}_{i,t} = X^{b}_{i,t} + K_{t,T_1}\left[Y_{T_1} + v^{i}_{T_1} - H_{T_1}\left(X^{b}_{i,t}\right)\right] \qquad (1)$$

where $X^{a}_{i,t}$ represents the updated $i$th ensemble member of the state vector, and $X^{b}_{i,t}$ represents the corresponding ensemble member simulated by the model. The ensemble state vector is defined at time $t$, which can be a daily average value and differs from the GRACE observation $Y_{T_1}$ (monthly TWS anomaly with respect to a multiyear mean) defined for a month $T_1$. $H_{T_1}(X^{b}_{i,t})$ is the observation function for the month $T_1$, as obtained by integration of model TWS components over all grids within the GRACE tile (4° × 4° in this study) and over all days within month $T_1$. The noise term $v^{i}_{T_1}$ is randomly drawn from a Gaussian distribution (with zero mean and variance equal to that of the observation error) to ensure an adequate spread of the analysis ensemble members [Burgers et al., 1998]. The Kalman gain $K_{t,T_1}$ is obtained with the same ensemble approach as used in the EnKF,

$$K_{t,T_1} = \mathrm{Cov}\left[X^{b}_{t}, H_{T_1}\left(X^{b}_{t}\right)\right]\left\{\mathrm{Cov}\left[H_{T_1}\left(X^{b}_{t}\right), H_{T_1}\left(X^{b}_{t}\right)\right] + R_{T_1}\right\}^{-1} \qquad (2)$$

where $R_{T_1}$ is the autocovariance of the observation error. The autocovariance of the model simulated observation, $\mathrm{Cov}[H_{T_1}(X^{b}_{t}), H_{T_1}(X^{b}_{t})]$, and the cross-covariance, $\mathrm{Cov}[X^{b}_{t}, H_{T_1}(X^{b}_{t})]$, are estimated from the inner products of the corresponding ensemble [$X^{b}_{i,t}$ and $H_{T_1}(X^{b}_{i,t})$] anomalies, that is, the ensemble minus their mean.
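The EnKS update and ensemble-estimated Kalman gain described above can be sketched for a single GRACE tile and a single model day as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `enks_update`, the array layout, and the choice of a scalar observation per tile are all hypothetical.

```python
import numpy as np

def enks_update(X_b, HX_b, y_obs, obs_var, rng):
    """Perturbed-observation ensemble update for one scalar observation.

    X_b     : (n_state, n_ens) background ensemble at model time t
    HX_b    : (n_ens,) simulated monthly observation for each member
    y_obs   : scalar observed monthly TWS anomaly
    obs_var : scalar observation error variance
    rng     : numpy.random.Generator used to draw the observation noise
    """
    n_ens = X_b.shape[1]
    # Ensemble anomalies (members minus their mean)
    X_anom = X_b - X_b.mean(axis=1, keepdims=True)
    H_anom = HX_b - HX_b.mean()
    # Cross-covariance and autocovariance from inner products of anomalies
    cov_xh = X_anom @ H_anom / (n_ens - 1)   # shape (n_state,)
    var_hh = H_anom @ H_anom / (n_ens - 1)   # scalar
    gain = cov_xh / (var_hh + obs_var)       # shape (n_state,)
    # One zero-mean Gaussian noise draw per member keeps the analysis spread
    noise = rng.normal(0.0, np.sqrt(obs_var), size=n_ens)
    innovation = y_obs + noise - HX_b        # shape (n_ens,)
    return X_b + np.outer(gain, innovation)

# Toy usage: 3 state variables, 25 members (the ensemble size used in this study)
rng = np.random.default_rng(0)
X_b = rng.normal(100.0, 10.0, size=(3, 25))
HX_b = X_b.sum(axis=0)                       # hypothetical observation function
X_a = enks_update(X_b, HX_b, y_obs=310.0, obs_var=20.0**2, rng=rng)
```

Because the gain uses the cross-covariance between the state at time t and the simulated monthly observation, the same observation can update states defined on any day of the month.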

[21] The EnKS provides a reasonable way to match GRACE estimates of changes in TWS for each model simulated day. In reality the GRACE estimates incorporate varied information about any particular geographical location because orbits do not repeat. Thus it is difficult to set a uniform frequency and associated time accurately characterizing the satellite track in each grid. Our approach appears to be justifiable because, with slow temporal variation at large spatial scales, the monthly samples should adequately describe the TWS.

3. Implementation of Data Assimilation Experiments

[22] Three experiments are designed to address the questions stated in section 1. One is the open loop (OL) simulation, where the CLM alone is used to estimate SWE and other land variables. The second is a MODIS-only (MOD) data assimilation experiment similar to that used by Su et al. [2008]. The third is the joint MODIS-GRACE (MOD_GR) data assimilation experiment.

[23] All simulations are driven by the meteorological forcing data set from the Global Land Data Assimilation System (GLDAS) at a 1° × 1° resolution. The GLDAS forcing data are observationally derived fields including precipitation, air temperature, air pressure, specific humidity, and short-wave and long-wave radiation. Biases in the forcing data (e.g., precipitation) are not explicitly accounted for in this data assimilation approach, because of the lack of such information. The CLM is run from January 2002 to June 2007. The ensemble runs do not assimilate observations until November 2002. In MOD_GR, the relative error in the lognormal perturbation of precipitation is 65%, and the e-folding scale of horizontal error correlations is 3° (in both latitude and longitude) for precipitation and temperature. A temporal correlation of 3 days is assumed for the forcing perturbation. The selection of these forcing parameters is based on previous research on GRACE data assimilation [e.g., Andreadis and Lettenmaier, 2006; Zaitchik et al., 2008]. These parameters are also applied in MOD. Additional experiments show that changing the forcing error parameters from those of Su et al. [2008] to the preceding values did not influence the MOD run significantly; the main purpose here is to keep them the same in both MOD and MOD_GR. All other ensemble parameters in MOD and MOD_GR are the same as those of Su et al. [2008]. We recognize that a more comprehensive description of forcing error needs to be included in further research (only errors in precipitation and temperature are considered here, and bias is not considered). The ensemble size is 25, which has been demonstrated to be suitable for large-scale snow data assimilation by Su et al. [2008]. Here we focus on describing the implementation of MOD_GR with EnKS.
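As an illustration of the precipitation perturbation described above, the following sketch generates temporally correlated lognormal multipliers for one grid cell. Several details are assumptions made only for this example and are not stated in the text: the 65% relative error is interpreted as the coefficient of variation of a unit-mean multiplier, the 3 day temporal correlation is implemented as a first-order autoregressive process at a 3 h step, and the 3° horizontal correlation is omitted. The function name `precip_multipliers` is hypothetical.

```python
import numpy as np

def precip_multipliers(n_steps, n_ens, rel_err=0.65, tau_days=3.0,
                       dt_hours=3.0, seed=0):
    """Unit-mean lognormal multipliers with AR(1) temporal correlation."""
    rng = np.random.default_rng(seed)
    # Parameters of the underlying normal so that the multiplier has
    # mean 1 and coefficient of variation rel_err (assumed interpretation)
    sigma2 = np.log(1.0 + rel_err**2)
    mu = -0.5 * sigma2
    # Lag-one correlation for an e-folding time of tau_days
    rho = np.exp(-dt_hours / (24.0 * tau_days))
    z = np.empty((n_steps, n_ens))
    z[0] = rng.standard_normal(n_ens)
    for k in range(1, n_steps):
        z[k] = rho * z[k - 1] + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_ens)
    return np.exp(mu + np.sqrt(sigma2) * z)

# One day of 3 h steps for a 25-member ensemble
multipliers = precip_multipliers(n_steps=8, n_ens=25)
# perturbed_precip = precip[:, None] * multipliers   # precip: hypothetical (8,) array
```

A multiplicative lognormal form keeps perturbed precipitation nonnegative, which an additive Gaussian perturbation would not guarantee.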

[24] On each day in a given month the MODIS SCF is integrated into the ensemble simulation at every 1° × 1° model grid as performed in the MOD experiment. Here a “fixed interval state inflation” is applied, which periodically augments the ensemble spread of SWE at all tiles in every CLM grid. Specifically, the SWE simulation takes the form

$$x_{\mathrm{SWE},t+1} = f\left(x_{\mathrm{SWE},t} + \omega_t,\; p_t\right) \qquad (3)$$

every $q$ days. Here $f$ represents the CLM, $\omega_t$ represents the perturbation on the state variable $x_{\mathrm{SWE},t}$ with variance of $Q$, and $p_t$ represents the perturbed forcing. At other steps the SWE is simulated by the function $f$ without any state inflation:

$$x_{\mathrm{SWE},t+1} = f\left(x_{\mathrm{SWE},t},\; p_t\right) \qquad (4)$$

Note that if SWE is ≤0 after inflation, then the inflation is not applied.

[25] This inflation scheme is tailored to the needs of ensemble simulation of SWE. The absence of inflation may lead to too small a spread of the ensemble, degrading EnKF and EnKS performance. Conversely, too frequent an inflation can cause excessive ensemble spread and unrealistic updates. Augmenting the state at each time step is not necessary since the main source of SWE uncertainties, the forcing uncertainty, has been dealt with elsewhere. For this reason, the selection of q and Q is largely application specific, and here q = 6 and Q = 36 mm² (which are representative in both MOD and MOD_GR, as demonstrated later) are taken as domainwide values for both MOD and MOD_GR. This study does not address the development of more objective ways to select these parameters, such as the adaptive filter algorithm [Reichle et al., 2008]. Our initial tests found that for a MOD run, this state inflation using parameters q1 and Q1 within a reasonable range (q1 ≥ 3 and Q1 ≤ 60 mm², which includes the case of no inflation) affected the data assimilation results (relative to CMC observation data) only slightly; for a MOD_GR run, a similar range (3 ≤ q2 ≤ 8 and 20 mm² ≤ Q2 ≤ 100 mm²) exists in which changes in q2 and Q2 also barely affected the data assimilation results (relative to CMC observation data). Parameters that are significantly outside the preceding ranges would dramatically alter the results in the two experiments by degrading the quality of the SWE estimates compared with the CMC observation data set (detailed results are not presented here).
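The fixed interval state inflation can be sketched as below, using the domainwide values q = 6 and Q = 36 mm² given above. The sketch assumes that the perturbation is additive and Gaussian, that the day counter starts at the beginning of the assimilation period, and that the non-positive check is made member by member; these details and the function name `inflate_swe` are illustrative assumptions.

```python
import numpy as np

def inflate_swe(swe_ens, day_index, q=6, Q=36.0, rng=None):
    """Add zero-mean noise with variance Q (mm^2) to SWE every q days.

    Members whose SWE would become <= 0 keep their unperturbed value,
    so the inflation is skipped for them.
    """
    rng = np.random.default_rng() if rng is None else rng
    swe = np.asarray(swe_ens, dtype=float)
    if day_index % q != 0:
        return swe.copy()                    # no inflation on this day
    perturbed = swe + rng.normal(0.0, np.sqrt(Q), size=swe.shape)
    return np.where(perturbed > 0.0, perturbed, swe)

# 25-member SWE ensemble (mm) on an inflation day
rng = np.random.default_rng(0)
swe = np.full(25, 50.0)
swe_inflated = inflate_swe(swe, day_index=12, rng=rng)
```

With Q = 36 mm² the added spread has a standard deviation of 6 mm, small relative to deep winter snowpack but large enough to prevent ensemble collapse between observations.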

[26] At the end of the month when all the MODIS data have been assimilated, the GRACE TWS information (on 4° × 4° tiles) is distributed into the archived state vector (on 1° × 1° grids) consisting of the daily averaged SWE, snow depth, canopy snow, soil moisture, and aquifer storage, according to the ensemble calculated error statistics for each water storage component and the TWS variable. This spatial and temporal disaggregation is accomplished by using the EnKS (equation (1)) with the GRACE observation error set to 20 mm, consistent with previous studies [e.g., Zaitchik et al., 2008]. Because the observation $Y_{T_1}$ in equation (1) is an anomaly value (with respect to a mean over the GRACE observation period), $H_{T_1}(X^{b}_{i,t})$ takes into account the model TWS climatology at each GRACE tile. The state vector at the last step of each month is updated with equation (1), then propagated by the model to the next month to repeat the above data assimilation cycle.
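The observation function used in this step (aggregation of model TWS over the grids in a GRACE tile and over the days in a month, expressed as an anomaly) can be sketched as follows. The array layout, the optional area weights, and the function name `simulated_tws_anomaly` are assumptions for illustration; the text does not specify how grid cells are weighted.

```python
import numpy as np

def simulated_tws_anomaly(tws_daily, tile_climatology, weights=None):
    """Model-simulated monthly TWS anomaly for one GRACE tile.

    tws_daily        : (n_ens, n_days, n_cells) model TWS (mm) for the
                       1-degree cells inside one tile over one month
    tile_climatology : scalar long-term mean model TWS (mm) for the tile
    weights          : optional (n_cells,) area weights (equal if None)
    Returns an (n_ens,) array, one simulated observation per member.
    """
    tws = np.asarray(tws_daily, dtype=float)
    tile_mean = np.average(tws, axis=2, weights=weights)   # (n_ens, n_days)
    monthly_mean = tile_mean.mean(axis=1)                  # (n_ens,)
    return monthly_mean - tile_climatology

# 25 members, 30 days, 16 cells in a 4 x 4 degree tile (synthetic values)
rng = np.random.default_rng(0)
tws = rng.normal(300.0, 15.0, size=(25, 30, 16))
hx = simulated_tws_anomaly(tws, tile_climatology=280.0)
```

The resulting per-member values play the role of the simulated observation in the ensemble update, paired with the GRACE anomaly for the same tile and month.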

[27] The method presented utilizes each daily ensemble of state variables ($X^{b}_{i,t}$ in equation (1)) to calculate the corresponding Kalman gain to give a theoretically robust estimate of data assimilation increments, although these updates are not involved in the model propagation. The memory of the state update is carried forward by reinitializing the simulation (with the updated state vector) at the last step of each month. Our adaptation of the EnKS could be improved, given the complexity resulting from the temporal scale difference between GRACE data and the model.

4. Results

[28] We focus on evaluating the estimates of SWE and other snow variables, even though other updated land surface states and fluxes are also provided by the MOD_GR simulation. An extensive assessment of the improvement in all the other hydrological variables from the multisensor snow data assimilation should be addressed in future research (our initial analyses find that the impacts of MOD and MOD_GR on estimation of water and energy fluxes, e.g., latent and sensible heat and runoff, are small at the monthly scale).

4.1. Monthly SWE Difference Between MODIS-Only (MOD) and Joint MODIS-GRACE (MOD_GR) Experiments

[29] The spatial distribution of SWE differences between single-sensor (MOD) and multisensor (MOD_GR) experiments illustrates the incremental value of GRACE information. Figure 1 shows that this monthly averaged field has considerable spatial heterogeneity in the cold season (January–April). Large changes in the SWE estimate of the MOD_GR run (compared to MOD) are concentrated in high-latitude regions. In particular, SWE estimates are found to be lower in boreal forests, with the difference ranging from 20 mm in January to 100 mm in April. In the tundra region of the Arctic, SWE estimates are higher by up to 30 mm (in April) after assimilating GRACE data. All of these differences correlate well with the period of snow accumulation, which starts around January and peaks in April. In contrast, the northern Great Plains and Midwest mountainous area show little change.

Figure 1.

Difference in monthly snow water equivalent (SWE; mm) between joint Moderate Resolution Imaging Spectroradiometer (MODIS)–Gravity Recovery and Climate Experiment (GRACE; MOD_GR) and MODIS-only (MOD) data assimilation experiments in the cold season of 2003.

[30] Monthly difference fields in other years demonstrate similar patterns, although the sign of the difference may vary from year to year. A detailed investigation of the sign and its spatial and temporal variation would involve quantitative analyses of several complicated factors, including the SCF parameterization curve, MODIS data bias, and others. The interaction of these components may be complex, as discussed in section 1, and it is not a focus here.

[31] Generally, the distinct patterns of differences shown in Figure 1 are consistent with the remarks about strengths and limitations of SCF data assimilation presented in section 1. In regions where the SCF signal is saturated (boreal forest and tundra) and the correlation between SCF and SWE is low, the TWS signal is still sensitive to SWE variation and so the EnKS algorithm corrects the MOD estimate. For areas where MOD is expected to achieve its best performance (e.g., the northern Great Plains), the impact of GRACE is less evident. The large change in northwest Pacific coastal regions may reflect the potential of GRACE (1) to alleviate MODIS errors in mountainous regions, (2) to correct parameter errors in the observation function, and (3) to reduce the influence of forcing bias in that area. However, it is difficult to identify the relative contribution of these factors given the lack of evaluation tools (e.g., abundant station measurements).

[32] SWE differences alone do not verify the MOD_GR approach, so it is important to directly evaluate the MOD_GR along with other simulations.

4.2. Terrestrial Water Storage Anomaly

[33] The long-term (November 2002 to May 2007) monthly TWS anomaly averaged over eight large river basins (Figure 2) in North America, as simulated by OL, MOD, and MOD_GR and observed by GRACE, is shown in Figures 3 and 4. For those basins where boreal forests dominate, for example, the Mackenzie River basin and the Churchill & Nelson River basin, winter TWS is generally overestimated by MOD and OL relative to GRACE in the first 2 years (November 2002 to April 2004). The MOD_GR run agrees better with GRACE. Since the winter TWS anomaly is mainly attributed to SWE in those regions [Niu et al., 2007b], these results correspond well to the SWE difference depicted in Figure 1. In the following winters, the difference between GRACE and OL-MOD may vary, but as expected, MOD_GR agrees better with GRACE than the other two simulations.

Figure 2.

The eight North American river basins analyzed.

Figure 3.

Monthly terrestrial water storage (TWS) anomaly (November 2002 to May 2007) from the simulations, open loop (OL), MOD, and MOD_GR, and the GRACE observation, averaged over four of the North American river basins shown in Figure 2. For OL the TWS anomaly in other months in the years 2002 and 2007 are also shown for reference.

Figure 4.

Same as Figure 3, but for another four river basins in North America.

[34] In the Saint Lawrence and Fraser River basins, where the snow classes [Sturm, 1995] are maritime and alpine, respectively, results are comparable to those just presented. Overestimation of TWS (especially in Fraser) by MOD in most winters reveals the deficiency of MODIS updates in these geographic environments (related to vegetation cover and/or mountains). The poor performance of MOD in the Fraser River basin may be due to forcing or parameter errors, since the SCF signal should be responsive to SWE variations in that mountainous area.

[35] In the Yukon River basin, all estimates agree reasonably well with each other except for a closer match between MOD_GR and GRACE in the winter of 2007. Spatial heterogeneity of the SWE differences shown in Figure 1 may explain this agreement; that is, overestimation of MOD relative to MOD_GR in the northern part is offset by underestimation in the south. The agreement in the Columbia River basin may have different causes, such as high-quality forcing, a reliable connection between SCF and SWE as described by the observation function, and accurate treatment of MODIS error in the ensemble approach. The hydroclimatologic conditions in this river basin have been studied intensively, and our analyses of the reliability of the forcing data and SCF parameterization in this region support the above attribution. Details are not presented here.

[36] In the North Central and Missouri river basins, a region of flat topography, low vegetation, and unsaturated SCF where MOD has been found to perform well [Su et al., 2008], the MOD and MOD_GR TWS estimates are similar to each other and to GRACE. In particular, Figure 3 shows an increase in TWS in the first two winters over the Missouri River basin in MOD and MOD_GR relative to OL, presumably from the increase in the SWE estimate as validated by Su et al. [2008] with independent satellite data sets. Overall, these results imply that the MODIS data have well constrained the SWE simulation in these two basins, while GRACE has had little impact, though it does not degrade the SWE and TWS estimates.

4.3. Climatologic Monthly SWE and Snow Depth

[37] SWE and snow depth estimates generated by different simulations are directly evaluated through their comparison with the CMC monthly climatology. Our experiments estimate a climatology of SWE for five consecutive winters, as limited by the relatively short length of satellite data.

[38] Figure 5 shows the difference of (multiyear) average April SWE (mm) between simulation results (MOD_GR, MOD and OL) and CMC. In the central northern areas the CMC SWE is systematically lower than model or data assimilation values. However, in many places (especially at high latitudes), the difference between CMC and MOD_GR is significantly smaller than that between CMC and MOD. Those improvements over MOD across the boreal forest are consistent with those shown in the TWS anomaly comparison (Figures 3 and 4). These effects are most prominent in the southern Mackenzie River basin, the southwest rim of Hudson Bay, the center of the Churchill & Nelson River basin, and north of the Rocky Mountains. The OL has a performance comparable to or even better than MOD_GR in a significant portion of the (central) high-latitude areas as displayed in Figures 3 and 4. In the midlatitude part of the domain the differences among OL, MOD, and MOD_GR are much less significant, for example, the Columbia River basin, where the results agree with the pattern shown in Figure 4. Similar to what was shown by Su et al. [2008], in some boreal forest regions MOD SWE differs from CMC by a much larger magnitude than the difference between OL and CMC, indicating that in some locations the MODIS snow data assimilation system may degrade results relative to OL, probably owing to structural problems in the observation function (e.g., negligible correlation between SCF and SWE, parameter error) or other components. Similar effects brought by model deficiencies have been documented previously [Zaitchik et al., 2008].

Figure 5.

The climatological monthly average SWE (mm) for April derived from the Canadian Meteorological Center (CMC) long-term observation data, and the SWE difference in April between MOD_GR and CMC, between MOD and CMC, and between OL and CMC.

[39] Table 1 lists the mean absolute error (MAE) of monthly SWE (millimeters) from January to June for eight river basins (at middle and high latitudes) in North America. The MAE is calculated for each river basin and month by averaging the absolute difference between the simulation results and CMC (monthly) over the grids in that river basin. A significance test of the MAE is performed, and the letters in parentheses in Table 1 show where MOD_GR and OL have significantly lower MAEs (t test, p < 0.01) than some of their counterparts in the associated river basin and month. In the Columbia River basin the errors of the three experiments are not distinguishable at the p < 0.01 level for all months (January to June). In the North Central, Missouri, and Saint Lawrence river basins, MOD_GR (also MOD; not shown here) shows a significantly lower MAE than does OL. Also, MOD_GR shows a consistently lower error than MOD in the Saint Lawrence river basin. The MAE results for MOD and MOD_GR in high-latitude areas are similar to those in Figure 5, indicating consistently better performance of MOD_GR over MOD there. In some high-latitude river basins (e.g., Mackenzie, Churchill & Nelson, and Yukon) and given months, OL can show a performance either indistinguishable from or better than that of MOD_GR, and in most of these areas OL also performs better than MOD. Together with Figure 5, these results indicate potential deficiencies in the MOD_GR experiment. In particular, the bias and variance magnitude of the GRACE observational error could be important. As indicated by Wahr et al. [2006], GRACE error variance is probably latitude dependent, diminishing at high latitudes where the satellite track density increases, whereas a spatially uniform error variance was assumed for GRACE TWS in this research. Other problems associated with GRACE estimates, such as bias and spatial leakage, including leakage of tide model errors from adjacent oceans, may be important. Another recognized error source is the diminished variance introduced by smoothing GRACE data to suppress noisy spherical harmonics to a great degree. Such problems may contribute to the poor performance relative to OL. Further improvement of GRACE estimates, including bias correction and regional error magnitude (variance) description, is appropriate but outside the scope of this paper. Note also that, in contrast to the above spatially averaged error statistics, ensemble results (at the daily temporal scale) from the different simulations at grid scale show a much more complex pattern. The relative magnitude of the ensemble spreads of OL, MOD, and MOD_GR can vary significantly across grids in different river basins or even within the same basin. This lack of evaluation of ensemble results needs to be addressed in further research.
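The basin MAE described above can be illustrated with a minimal sketch. It assumes gridded monthly SWE stored as flat lists and a Boolean basin mask; the function names, the six-cell example values, and the choice of a paired t statistic on per-grid absolute errors are all hypothetical, since the text specifies only "t test, p < 0.01" and not the test variant.

```python
import math
from statistics import mean, stdev

def basin_mae(sim, ref, basin_mask):
    """Mean absolute error over the grid cells flagged in basin_mask."""
    diffs = [abs(s - r) for s, r, m in zip(sim, ref, basin_mask) if m]
    if not diffs:
        raise ValueError("basin mask selects no grid cells")
    return sum(diffs) / len(diffs)

def paired_t_statistic(err_a, err_b):
    """t statistic of the per-grid error differences (assumed paired test)."""
    d = [a - b for a, b in zip(err_a, err_b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical monthly SWE (mm) on six grid cells; values are made up.
cmc    = [40.0, 55.0, 60.0, 30.0, 10.0, 80.0]
mod_gr = [42.0, 50.0, 66.0, 33.0, 25.0, 70.0]
ol     = [30.0, 70.0, 50.0, 45.0, 12.0, 95.0]
mask   = [True, True, True, True, False, False]  # cells inside the basin

mae_mod_gr = basin_mae(mod_gr, cmc, mask)
mae_ol = basin_mae(ol, cmc, mask)

# Per-grid absolute errors inside the basin, used for the paired statistic.
abs_ol = [abs(s - r) for s, r, m in zip(ol, cmc, mask) if m]
abs_mg = [abs(s - r) for s, r, m in zip(mod_gr, cmc, mask) if m]
t_stat = paired_t_statistic(abs_ol, abs_mg)
```

A positive `t_stat` here means the OL errors exceed the MOD_GR errors on average; converting it to a p value requires the Student t distribution with `len(d) - 1` degrees of freedom, which is left out of this sketch.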

Table 1. Mean Absolute Error of Monthly Snow Water Equivalent From January to June at Eight River Basins in North America for Three Experiments^a

^a Values are in mm. MOD_GR, joint Moderate Resolution Imaging Spectroradiometer (MODIS)–Gravity Recovery and Climate Experiment (GRACE); MOD, MODIS only. Letters in parentheses indicate that a specific experiment has a significantly lower mean absolute error (MAE) (t test, p < 0.01) in the associated river basin and month than some of its counterparts; significant values are bold. In particular, a, the MOD_GR MAE is significantly lower than that of both MOD and open loop; b, the MOD_GR MAE is only significantly lower than the MOD MAE; c, the MOD_GR MAE is only significantly lower than that of the open loop; d, the open loop MAE is significantly lower than that of both MOD and MOD_GR; e, the open loop MAE is only significantly lower than the MOD MAE.

MOD_GR      53.3 (b)   68.8 (b)   89.3 (a)   116.2 (b)   62.1 (a)   12.6 (a)
Open_Loop   50.7 (e)   74.9       103.0      126.8       86.0       21.6
Saint Lawrence
MOD_GR      12.6 (a)   18.2 (a)   23.8 (a)   32.8 (a)    12.1 (a)   2.2
MOD_GR      27.7 (a)   23.3 (a)   26.5 (a)   67.8 (b)    36.3 (b)   8.8 (b)
Open_Loop   46.5       37.8       41.7       58.89 (e)   26.2 (d)   4.7 (e)
Churchill and Nelson
MOD_GR      21.6 (b)   26.9 (b)   42.1 (b)   58.1 (b)    6.2        1.9
Open_Loop   29.1       21.3 (e)   36.1 (e)   48.6 (d)    6.3        1.3
MOD_GR      52.8 (c)   46.2 (a)   52.1 (a)   58.6 (b)    33.9 (a)   26.1 (b)
Open_Loop   63.1       62.4       68.3       56.2 (e)    59.8       18.5 (e)
MOD_GR      18.0 (c)   20.9 (c)   12.9 (c)
North Central
MOD_GR      19.8 (c)   21.1 (c)   15.6 (c)

[40] Seasonal variations (from November to June) of SWE and snow depth (climatological values) for selected rectangular areas are shown in Figures 6 and 7. Figure 6 (top) shows an area located south of Hudson Bay, comparing the performance of MOD and MOD_GR for a northern densely vegetated region. From mid-December, MOD-simulated SWE and snow depth are larger than those for MOD_GR and OL, with the difference peaking in March or April and then gradually decreasing. The benchmark CMC curves are closely tracked by the MOD_GR run. MOD estimates tend to be closer to CMC than to OL during the melting season (May), demonstrating the recovery of the MODIS capability for monitoring snowpack mass variation as the SCF falls far below 100%. Figure 6 (bottom) shows an area located in the central prairies region. OL significantly underestimates snowpack, as shown by Su et al. [2008], and MOD_GR uniformly agrees better with CMC than does MOD, suggesting that the contribution of GRACE in midlatitudes may be location and scale dependent. Even at the large basin scale, there is a need for a more comprehensive assessment of the GRACE contribution when MODIS performance is adequate. We discuss this issue in section 5.2.

Figure 6.

Climatological monthly mean SWE (mm) and snow depth (m) from November to June in two rectangular regions for the simulations (MOD, MOD_GR, and OL) and the CMC long-term observation data. Locations are (top) 51°–55°N, 94°–85°W (box B in Figure 8) and (bottom) 46°–49°N, 98°–94°W (box A in Figure 8).

Figure 7.

Same as Figure 6 but for different locations. Locations are (top) 62°N–68°N, 120°W–130°W (box C in Figure 8) and (bottom) 55°N–60°N, 100°W–110°W (box D in Figure 8).

[41] Figure 7 shows another comparison in two different places: 62°N–68°N, 120°W–130°W (Figure 7, top) and 55°N–60°N, 100°W–110°W (Figure 7, bottom). Similar patterns are shown in these two places, in which MOD_GR and OL are both better than MOD and are close to each other. The OL could be slightly better in February, March, and April. These features are consistent with those analyzed in Table 1, reflecting considerable room for refinement of MOD_GR in those areas.

5. Discussion

5.1. Role of the Autocovariance of the SWE Ensemble

[42] It has been shown that GRACE TWS data can have a substantial influence on snowpack estimates over many high-latitude and some midlatitude regions. The key elements that affect how GRACE contributes in an ensemble context include its fundamental nature as a measure of total column water and its coarse spatial and temporal resolution. According to equations (1) and (2) the covariance between the analyzed variable, SWE, and the simulated TWS largely controls the magnitude of the EnKS increments given the unit innovation. We can explicitly expand this covariance (in scalar form) as

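$$
\mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, X_{T_1,L_1,\mathrm{TWS}}\right) = \mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, \langle X_{t_j,l_j,\mathrm{SWE}}\rangle\right) + \mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, \langle X_{t_j,l_j,\mathrm{SM}}\rangle\right) + \mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, \langle X_{t_j,l_j,\mathrm{wa}}\rangle\right) \tag{5}
$$

where $X_{t_i,l_i,\mathrm{SWE}}$, $X_{t_i,l_i,\mathrm{SM}}$, and $X_{t_i,l_i,\mathrm{wa}}$ represent SWE, total soil moisture, and aquifer water storage at day $t_i$, land tile $l_i$, in month $T_1$ (with $N$ days), and GRACE footprint $L_1$ (with $M$ model grids), respectively. $X_{T_1,L_1,\mathrm{TWS}}$ represents the monthly ($T_1$) average TWS over $L_1$. The symbol $\langle\cdot\rangle$ represents spatial and temporal averaging, $\langle X_{t_j,l_j}\rangle = \frac{1}{NM}\sum_{t_j \in T_1}\sum_{l_j \in L_1} X_{t_j,l_j}$. The other symbols have the same meaning as before. Here the decomposition of TWS neglects canopy snow for simplicity. If we further simplify equation (5) by considering only zero-lag covariance parts of the second and third terms on the right side, it becomes

$$
\mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, X_{T_1,L_1,\mathrm{TWS}}\right) \approx \mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, \langle X_{t_j,l_j,\mathrm{SWE}}\rangle\right) + \frac{1}{NM}\,\mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, X_{t_i,l_i,\mathrm{SM}}\right) + \frac{1}{NM}\,\mathrm{cov}\left(X_{t_i,l_i,\mathrm{SWE}},\, X_{t_i,l_i,\mathrm{wa}}\right) \tag{6}
$$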

Although incomplete, this expression provides a first-order estimate of the covariance between the daily SWE at any grid and the GRACE data, in particular, for the accumulation season. It assumes that soil moisture and groundwater anomalies are temporally and spatially not well connected to the SWE anomaly before snowmelt (a separate calculation of the lagged correlation between the SWE and the other two TWS components supports this assumption, but it is not shown here).
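The covariance expansion can be checked numerically with a small sketch. It assumes that TWS is the sum of SWE, soil moisture, and aquifer storage, averaged uniformly over the N days and M grids of one footprint, and it uses synthetic, mutually independent Gaussian ensembles rather than CLM output; all sizes, names, and values are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance across ensemble members."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

random.seed(0)
N_ENS, N_DAYS, M_GRIDS = 25, 4, 3  # hypothetical ensemble size, days, grids

def synthetic_field():
    """field[e][t][l]: synthetic value for member e, day t, grid l."""
    return [[[random.gauss(0.0, 1.0) for _ in range(M_GRIDS)]
             for _ in range(N_DAYS)] for _ in range(N_ENS)]

def footprint_mean(field):
    """Average over all days and grids, giving one value per member."""
    return [sum(sum(day) for day in member) / (N_DAYS * M_GRIDS)
            for member in field]

swe, sm, wa = synthetic_field(), synthetic_field(), synthetic_field()

ti, li = 1, 2  # day and grid of the analyzed SWE state
swe_point = [swe[e][ti][li] for e in range(N_ENS)]

# Monthly footprint-average TWS per member (canopy snow neglected).
tws_mean = [a + b + c for a, b, c in
            zip(footprint_mean(swe), footprint_mean(sm), footprint_mean(wa))]

# Full covariance and its three-term expansion.
total = cov(swe_point, tws_mean)
terms = [cov(swe_point, footprint_mean(f)) for f in (swe, sm, wa)]

# Zero-lag simplification of the second and third terms.
nm = N_DAYS * M_GRIDS
zero_lag_sm = cov(swe_point, [sm[e][ti][li] for e in range(N_ENS)]) / nm
zero_lag_wa = cov(swe_point, [wa[e][ti][li] for e in range(N_ENS)]) / nm
approx = terms[0] + zero_lag_sm + zero_lag_wa
```

Because covariance is linear in its second argument, `total` equals `sum(terms)` up to rounding error. How close `approx` comes to `total` depends on the off-lag cross-covariances, which are pure sampling noise for these independent synthetic fields.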

[43] Figure 8 provides a monthly composite description of the cross-correlation between SWE and the other two TWS components (the second and third terms on the right-hand side of equation (6)) in February 2003; other months before strong melting show similar results. Only in the colored areas is there a significant correlation (p < 5%) for more than 2 days in February. Over most of the domain, neither correlation field is significant for a majority of days within that month. This implies that the autocovariance of SWE (the first term on the right-hand side of equation (6)) largely determines the magnitude of GRACE information that can be utilized by the CLM.
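A day-counting composite of this kind could be computed as in the sketch below. The significance test actually used is not stated in the text, so the sketch assumes a two-sided test of zero Pearson correlation across ensemble members with the Fisher z approximation; the eight-member ensemble, the three example days, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def pearson(a, b):
    """Pearson correlation across ensemble members (0.0 if a series is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0.0 or sbb == 0.0:
        return 0.0
    return sab / math.sqrt(saa * sbb)

def is_significant(r, n, alpha=0.05):
    """Two-sided test of zero correlation (assumed Fisher z approximation)."""
    if n < 4:
        raise ValueError("need at least four ensemble members")
    if abs(r) >= 1.0:
        return True  # perfect correlation; atanh is undefined here
    z = math.atanh(r) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p < alpha

def count_significant_days(swe_days, other_days, alpha=0.05):
    """Number of days on which the zero-lag correlation is significant."""
    return sum(is_significant(pearson(s, o), len(s), alpha)
               for s, o in zip(swe_days, other_days))

# Hypothetical eight-member ensembles at one grid for three days.
swe_member = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
swe_days = [swe_member, swe_member, swe_member]
sm_days = [
    [2.0 * x for x in swe_member],                    # perfectly correlated
    [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 2.0],         # strongly anticorrelated
    [1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0],     # uncorrelated
]

n_days = count_significant_days(swe_days, sm_days)
colored = n_days >= 2  # threshold used for the colored grids in Figure 8
```

The `alpha` parameter makes it straightforward to repeat the count at a looser level such as 0.10.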

Figure 8.

In the colored grids, the number of days in February 2003 on which the corresponding daily correlation (zero temporal and spatial lag) between (top) SWE and soil moisture and (bottom) SWE and groundwater is significant (p < 5%) is ≥2. Relaxing the significance threshold (e.g., to p < 10%) left these two maps almost unchanged. Boxes A–D in Figure 8 (top) show the locations in Figures 6 and 7.

5.2. Ensemble Variance Reduction by Assimilation of GRACE

[44] By design, the EnKS reduces ensemble error by combining observations with model estimates. Thus it is valuable to analyze error reduction (or a related quantity) by GRACE in the multisensor data assimilation framework. In addition, the relevant metric can contribute to understanding the GRACE effects in regions where MODIS alone provides considerable improvement. The TWS anomaly shown in Figure 4 (the Missouri River basin and North Central River basin) may be inconclusive for this purpose.

[45] Here we use the following statistic, effectiveness ratio ρ, to define the ensemble variance reduction capability (in scalar form) of GRACE:

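$$
\rho = 1 - \frac{\mathrm{Var}\left(X^{a}_{\mathrm{SWE}}\right)}{\mathrm{Var}\left(X^{b}_{\mathrm{SWE}}\right)} \tag{7}
$$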

where $X^{b}_{\mathrm{SWE}}$ and $X^{a}_{\mathrm{SWE}}$ are the background and analysis SWE ensembles in the EnKS. Note also that the EnKS background $X^{b}_{\mathrm{SWE}}$ is the EnKF analysis $X^{a}_{\mathrm{SWE}}$ (EnKF), that is, the MODIS-updated ensemble. We cannot directly calculate the ensemble error here, because the ensemble may have bias and we do not have high-quality observations to quantify ensemble error. Equation (7) therefore provides a purely ensemble-based statistic, which may, to some extent, be an indicator of uncertainty reduction from the EnKS. With this in mind, the normalized index facilitates intercomparison of GRACE's capability to reduce the variance of the SWE ensemble among different regions. Its theoretical value ranges from 0 (no reduction of ensemble variance) to 1 (strongest capability for reduction of ensemble variance). To minimize the influence of sampling error in calculating ρ with equation (7), we have derived daily river basin averaged values of ρ. Figure 9 provides the results for two river basins, the Mackenzie and North Central, from January 2003 to June 2003. In the Mackenzie River basin ρ is systematically larger than that at midlatitudes, and its average effectiveness ratio can reach 25% in April but decreases in the melting season when the MODIS capability returns. In the North Central River basin this ratio is consistently below 10% and without evident seasonal variation. Other boreal forest and tundra regions have patterns similar to that of the Mackenzie River basin, and other midlatitude flat regions have patterns similar to that of the North Central River basin. However, results in mountainous regions are less consistent. It is argued that ρ (for a large basin average) is also a function of the spatial correlation structure of the MODIS-updated ensemble within the GRACE footprint, considering the spatial aggregation nature of GRACE. Further explanation of the spatial and seasonal variations of ρ is the subject of continued research.
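The effectiveness ratio and its basin average can be sketched as follows. The sketch assumes that ρ is one minus the ratio of analysis to background ensemble variance, a form chosen to match the stated range from 0 (no reduction) to 1 (strongest reduction); the function names and the five-member SWE ensembles are hypothetical.

```python
from statistics import variance

def effectiveness_ratio(background, analysis):
    """rho = 1 - Var(analysis) / Var(background) for one grid and day (assumed form)."""
    var_b = variance(background)
    if var_b == 0.0:
        raise ValueError("background ensemble has zero variance")
    return 1.0 - variance(analysis) / var_b

def basin_mean_ratio(backgrounds, analyses):
    """Average rho over the grids of a basin to damp sampling error."""
    ratios = [effectiveness_ratio(b, a) for b, a in zip(backgrounds, analyses)]
    return sum(ratios) / len(ratios)

# Hypothetical five-member SWE ensembles (mm) at two grids on one day.
# The background is the MODIS-updated (EnKF) ensemble; the analysis is the
# ensemble after the GRACE (EnKS) update.
bg_grid1 = [80.0, 95.0, 100.0, 110.0, 115.0]
# Spread shrunk by a factor of 0.8 about the mean, with the mean shifted to 90.
an_grid1 = [90.0 + 0.8 * (x - 100.0) for x in bg_grid1]
bg_grid2 = [40.0, 45.0, 50.0, 55.0, 60.0]
an_grid2 = list(bg_grid2)  # GRACE update leaves this grid unchanged

rho_grid1 = effectiveness_ratio(bg_grid1, an_grid1)
rho_grid2 = effectiveness_ratio(bg_grid2, an_grid2)
rho_basin = basin_mean_ratio([bg_grid1, bg_grid2], [an_grid1, an_grid2])
```

Shrinking the spread by a factor of 0.8 reduces the variance by a factor of 0.64, so `rho_grid1` is 0.36 regardless of the shift in the ensemble mean. This illustrates why ρ measures variance reduction only and says nothing about bias.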

Figure 9.

Daily river basin averaged ρ (equation (7)), from January to June 2003, for the Mackenzie River basin and the North Central River basin.

6. Concluding Remarks

[46] This study has developed a continental-scale multisensor snow data assimilation system, which assimilates both GRACE TWS and MODIS SCF information into the CLM with the EnKS and EnKF, respectively. Through this new framework various deficiencies associated with MODIS-only data assimilation are effectively accounted for domainwide. In particular, the degraded performance of the MODIS-only approach in boreal forest, shown both in this paper and by Su et al. [2008, Figure 5], is significantly improved by using GRACE TWS information. These improvements result from the unique information provided by GRACE, which supplies complementary constraints on the ensemble simulation for various climatic and geographical locations. In addition, in regions where MODIS performs adequately, the inclusion of GRACE TWS information does not degrade the estimates, further indicating the robustness of this joint assimilation system. Comparison of MOD_GR and OL reveals more complex patterns. In the North Central, Missouri, and Saint Lawrence river basins, MOD_GR is consistently better than OL. In the Fraser and Yukon river basins, MOD_GR is better than or comparable to OL. In the Mackenzie and the Churchill & Nelson river basins, OL can show a performance comparable to or better than that of MOD_GR. These features demonstrate a need to improve the GRACE data assimilation approach, including better characterization of GRACE bias and error variance.

[47] We have also found that the measurement of the total column water (integrating several water storage components) and the coarse spatial and temporal resolution of GRACE may both benefit and complicate the task of estimating SWE. The impacts of these characteristics on the EnKS update deserve further investigation. Advantages of the finer resolution of MODIS data in the multisensor system may require more detailed analyses, to better characterize the different features associated with each data type and their interaction in the multisensor data assimilation framework.


[48] This work was funded by NASA grants NNX09AJ48G, NAG5-10209 and NAG5-12577 and NOAA grant NA07OAR4310216. Matthew Rodell is thanked for providing GLDAS forcing data. MODIS and AMSR-E snow data are provided by the National Snow and Ice Data Center (NSIDC).