A data fusion approach for mapping daily evapotranspiration at field scale
Hydrology and Remote Sensing Laboratory, Agricultural Research Service, U.S. Department of Agriculture, Beltsville, Maryland, USA
Corresponding author: C. Cammalleri, European Commission, Joint Research Centre, Institute for Environment and Sustainability, TP280 Via E. Fermi 2749, I-21027 Ispra (VA), Italy. (firstname.lastname@example.org)
 Thermal remote sensing methods for mapping evapotranspiration (ET) exploit the physical interconnection that exists between land-surface temperature (LST) and evaporative cooling, employing principles of surface energy balance (SEB). Unfortunately, while many applications in water resource management require ET information at daily and field spatial scales, current satellite-based thermal sensors are characterized by either low spatial resolution and high repeatability or by moderate/high spatial resolution and low frequency. Here we introduce a novel approach to ET mapping that fuses characteristics of both classes of sensors to provide optimal spatiotemporal coverage. In this approach, coarse resolution daily ET maps generated with a SEB model using geostationary satellite data are spatially disaggregated using daily MODIS (MODerate resolution Imaging Spectroradiometer) 1 km and biweekly Landsat LST imagery sharpened to 30 m. These ET fields are then fused to obtain daily ET maps at 30 m spatial resolution. The accuracy of the fused Landsat-MODIS daily ET maps was evaluated over Iowa using observations collected at eight flux towers sited in corn and soybean fields during the Soil Moisture Experiment of 2002, as well as in comparison with a Landsat-only retrieval. A significant improvement in ET accuracy (reducing errors from 0.75 to 0.58 mm d−1 on average) was obtained by fusing MODIS and Landsat data in comparison with the Landsat-only case, with most notable improvements when a rainfall event occurred between two successive Landsat acquisitions. The improvements are further evident at the seasonal timescale, where a 3% error is obtained using Landsat-MODIS fusion versus a 9% Landsat-only systematic underestimation.
 Agricultural water management often requires detailed information about daily crop water use and soil moisture status at field or finer scales. This need becomes particularly relevant in areas characterized by increasing limitations in freshwater availability, such as in the western and central United States [Hall et al., 2008]. This information is conveyed by daily and seasonal estimates of actual evapotranspiration (ET), which constitutes the main form of water loss in agricultural landscapes. Field-scale estimates of ET can also be used to increase crop water use efficiency (economic yield per unit water consumed) and to improve the quality of agricultural products. Additionally, comparison of current field ET with normalized values computed over some period of reference provides a means for identifying crop stress and drought impacts [Anderson et al., 2007c, 2011, 2013; Mu et al., 2013].
 Accurate and cost-effective estimates of ET over large areas (∼10²–10⁴ km²) can be achieved using remote sensing [Schmugge et al., 2002], which represents an invaluable source of information for spatially distributed modeling of the hydrologic balance. Over the last few decades, several satellite-based approaches have been developed to estimate ET at the time of a satellite overpass [Kalma et al., 2008]. These models are generally based on principles of the surface energy balance (SEB), exploiting the remotely derived land-surface temperature (LST) as a proxy indicator of surface water status. Some of these methods have been developed to minimize required model inputs by using semiempirical approaches [e.g., Roerink et al., 2000] or within-scene scaling [Bastiaanssen et al., 1998; Allen et al., 2007a] procedures, while others are more physically based, explicitly modeling the soil-vegetation-atmosphere exchange processes [Norman et al., 1995; Chehbouni et al., 2001].
 Many intercomparisons of thermal infrared (TIR) remote sensing methodologies in different environments have been reported in the literature, mainly focusing on estimates at the sensor overpass time [e.g., Timmermans et al., 2007; Choi et al., 2009; Cammalleri et al., 2012]. However, current limitations on the availability of high spatial resolution (defined here as ∼100 m) TIR data constrain the practical applicability of these methods for routine daily and seasonal ET estimations at field scales. The revisit cycle for a single Landsat system is 16 days, with cloud cover further reducing the actual temporal frequency of data acquisitions for many parts of the United States. Ju and Roy observed that only applications that require two or fewer Landsat images per year, without further constraints on temporal distribution, may be largely unaffected by cloud coverage. The land-surface water cycle has a relatively short timescale of variability; hence, its study requires high-frequency sampling. Analyses of Landsat image time series suggest that at least one clear-sky LST retrieval per month is required to adequately define cumulative ET in irrigated agricultural fields in southern Idaho, necessitating a satellite revisit cycle of 4 days or less given the local cloud climatology [Anderson et al., 2012a].
 Methodologies proposed to account for the seasonal evolution of daily ET between infrequent high-resolution remote sensing retrievals are commonly based on simple linear or spline interpolation of a conserved index such as the ratio between actual ET and reference ET, available energy (i.e., the evaporative fraction), or insolation [Allen et al., 2011a; Anderson et al., 2012b]. Some studies report reasonable performance of these methods under certain conditions [e.g., Allen et al., 2007b; Singh et al., 2011]. However, these methods assume that water availability evolves smoothly between successive high-resolution ET retrievals. This will not be the case if a rainfall or irrigation event occurs between retrievals, and such water inputs cannot be directly detected over large areas without ancillary information.
 Fortunately, data from several moderate-resolution (∼1 km) thermal sensors are currently freely available on a daily basis, including the Advanced Very High Resolution Radiometer (AVHRR), MODerate resolution Imaging Spectroradiometer (MODIS), and Advanced Along Track Scanning Radiometer (AATSR). These sensors provide an additional source of information about moisture dynamics at coarser scales that is not often used to improve the temporal accuracy of high spatial resolution products. Over rainfed landscapes, moisture inputs due to rainfall events between Landsat overpasses may often be reasonably captured at the 1 km TIR pixel scale. Some authors have suggested multisensor ET temporal interpolation schemes that use proportional-distribution methods at both pixel [Chemin and Alexandridis, 2004] and scene [Bastiaanssen et al., 2002] scales to combine ET retrievals at different spatial resolutions. Beyond this, multisensor data fusion procedures have not been significantly investigated for reconstructing ET dynamics at field scale.
 While data fusion is commonly used in reconstructing time series of spectral reflectances or vegetation indices [e.g., Gao et al., 2006; Zurita-Milla et al., 2008; Bhandari et al., 2012], applications to ET map merging are less straightforward. Data fusion techniques were originally developed to fuse low-level products (e.g., reflectance), for which relatively small bias and high spatial similarity between sensors are expected. Furthermore, spectral reflectances and vegetation indices typically vary relatively slowly over multiday intervals, simplifying the process of multitemporal image fusion. In contrast, LST and surface flux conditions can change rapidly at hourly time steps. Diurnal cycles in LST complicate fusion of images that are not collocated in time. In addition, strong LST dependencies on view angle—particularly over partially vegetated surfaces—add incompatibility between TIR data acquired by different sensors. The combined effects of view angle and time-of-day differences between thermal data sensors significantly complicate direct fusion of LST products.
 For these reasons, we have chosen to investigate techniques for fusing maps of daily ET retrieved from multiple TIR sensors, rather than fusing LST itself. In moving from a low-order product like LST to a higher-order product like daily ET, impacts of diurnal variability in LST and view angle effects can be accommodated within the ET retrieval methodology, if it is specifically designed to use TIR data collected over a range in times and view angles. On the other hand, because daily ET is a higher-order product retrieved through several processing steps using multiple input data sources, we expect less consistency between daily ET maps retrieved from different sensors at different spatial scales than we would for reflectance products. Consistency among sensor products is a key factor for success in any data fusion experiment, and the ET retrieval methodology has to be specifically designed to accommodate this aspect. Most data fusion methods have difficulties in predicting disturbance events (e.g., rapid increases in water availability due to rainfall events) if the changes caused by disturbance are transient and not recorded in at least one of the available high-resolution scenes [Hilker et al., 2009]. Moreover, data fusion approaches are often reliant on the existence of “homogeneous” pixels at both fine and coarse resolution [Gao et al., 2006], which is somewhat more problematic for ET than for reflectance or vegetation indices. ET maps at 30 m resolution can be highly heterogeneous due to small-scale variability in soil type, vegetation type, phenological stage, moisture inputs, and meteorological conditions.
 The aim of this work is to evaluate the capability of a multisensor, multiresolution modeling framework for reconstructing daily and seasonal dynamics in actual ET at field scale. The methodology uses daily 10 km ET estimates as a normalization basis, obtained from the Atmosphere-Land EXchange Inverse (ALEXI) surface energy balance model using hourly thermal data from geostationary satellites. These coarse-scale ALEXI fluxes are used to constrain higher-resolution ET retrievals from both MODIS (1 km, daily) and Landsat (30 m, 16 day) obtained via the DisALEXI flux disaggregation procedure. The ALEXI/DisALEXI approach ensures consistency at the 10 km scale among the ET maps obtained at different spatial resolutions, a necessary condition for merging different sources of data through advanced data fusion algorithms. Importantly, the approach can accommodate TIR data acquired at different view angles and times of day, thus minimizing temporal gaps in both high- and low-resolution datastreams. The MODIS and Landsat maps are then combined using the spatial and temporal adaptive reflectance fusion model (STARFM) [Gao et al., 2006] in order to obtain a final product with the best quality of each data set (fine spatial resolution and high temporal frequency).
 The fusion methodology was applied to MODIS and Landsat images acquired in 2002 over central Iowa, the site of the Soil Moisture Experiment of 2002 (SMEX02), an extensive survey campaign conducted during a period of rapid crop development from June to August. Flux observations were collected at eight micrometeorological flux towers in corn and soybean fields during this experiment, providing a spatially dense validation data set for evaluating remotely sensed flux retrievals over a range in spatial scales. Previous studies [Anderson et al., 2005, 2007a] have investigated the accuracy of instantaneous and daily ALEXI/DisALEXI fluxes using a subset of the available data set, as well as an ALEXI cloudy day gap-filling algorithm. Results of these studies suggested errors of 10% at the time of Landsat overpass on clear days and 15% at daily time steps for ALEXI ET under all-sky conditions. In this study, we use the full available flux data set from June to August, covering crop phenological stages from near-emergence to peak maturity, to assess value added by fusing Landsat and MODIS ET retrievals to provide water use estimates at daily time steps and subfield scales.
2.1. ALEXI/DisALEXI Multiscale Energy Balance Scheme
 The regional Atmosphere-Land Exchange Inverse (ALEXI) model [Anderson et al., 2007b] and the associated flux disaggregation scheme, DisALEXI [Norman et al., 2003; Anderson et al., 2004b], are built upon the two-source energy balance (TSEB) land-surface representation introduced by Norman et al. The TSEB partitions available energy, the difference between net radiation (Rn, W m−2) and soil heat flux (G0, W m−2), into turbulent fluxes of sensible (H, W m−2) and latent (λE, W m−2) heat, computed separately for the soil (s) and canopy (c) components of a mixed pixel:
$$R_n - G_0 = H + \lambda E = (H_s + H_c) + (\lambda E_s + \lambda E_c) \tag{1}$$
 Equations used to model long-wave and short-wave Rn components can be found in Kustas and Norman , while G0 is computed as a fraction of Rn,s using the relationship proposed by Santanello and Friedl .
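 Using a system of energy balance equations for the soil and canopy, the TSEB also partitions remotely sensed surface radiometric temperature (TRAD) observed at view angle θ, into nominal soil (Ts, K) and canopy (Tc, K) temperatures following the relationship:
$$T_{RAD}(\theta) = \left[ f_\theta\, T_c^4 + (1 - f_\theta)\, T_s^4 \right]^{1/4} \tag{2}$$
where fθ is the apparent vegetation fraction at the sensor view angle:
$$f_\theta = 1 - \exp\left( \frac{-0.5\, \Omega_\theta\, \mathrm{LAI}}{\cos\theta} \right) \tag{3}$$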
 LAI is the leaf area index (m2 m−2), and Ωθ is a view angle dependent vegetation clumping factor, typically assigned by vegetation class and/or growth stage [Li et al., 2005]. As shown in the model schematic in Figure 1, sensible heat flux components, Hc and Hs, are computed from Tc and Ts (equation (2)) assuming a temperature gradient-transport in-series resistance network between the soil, canopy, and atmosphere [Shuttleworth and Wallace, 1985], and using air temperature (Ta, K) at a reference height above the canopy as an upper boundary condition. An initial estimate of potential (unstressed) canopy transpiration (λEc) is obtained using the Priestley-Taylor relationship [Priestley and Taylor, 1972] applied to the canopy component of net radiation (Rn,c). Then effects of vegetation stress are incorporated by iteratively reducing the effective Priestley-Taylor coefficient until λEs, obtained as a residual from equation (1), is positive, indicating evaporation rather than condensation, which is considered unlikely midday under clear-sky conditions [Norman et al., 1995].
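The view-angle dependence described by equations (2) and (3) can be illustrated with a short sketch. It assumes the standard TSEB forms, namely an exponential dependence of fθ on Ωθ·LAI/cos θ with an extinction factor of 0.5, and fourth-power mixing of Tc and Ts. The function names and input values are hypothetical and are not taken from the study.

```python
import math


def apparent_vegetation_fraction(lai, clumping, view_angle_deg):
    """Equation (3), assumed form: f = 1 - exp(-0.5 * Omega * LAI / cos(theta))."""
    theta = math.radians(view_angle_deg)
    return 1.0 - math.exp(-0.5 * clumping * lai / math.cos(theta))


def composite_radiometric_temperature(t_canopy, t_soil, f_theta):
    """Equation (2), assumed form: fourth-power mixing of Tc and Ts (kelvin)."""
    return (f_theta * t_canopy**4 + (1.0 - f_theta) * t_soil**4) ** 0.25


# Illustrative inputs only (not measurements from SMEX02)
f_nadir = apparent_vegetation_fraction(lai=2.0, clumping=1.0, view_angle_deg=0.0)
f_oblique = apparent_vegetation_fraction(lai=2.0, clumping=1.0, view_angle_deg=45.0)
t_rad = composite_radiometric_temperature(t_canopy=300.0, t_soil=315.0, f_theta=f_nadir)
print(round(f_nadir, 3), round(f_oblique, 3), round(t_rad, 2))
```

At oblique view angles the sensor sees more canopy and less soil, so fθ increases and the composite temperature moves toward Tc. This is the view angle effect that the retrieval must account for when combining sensors.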
 As can be seen in Figure 1, the surface-to-air temperature gradient is critical to the sensible heat flux determination, and therefore to the overall energy balance assessment in the TSEB, so care must be given to specifying the air temperature boundary condition, Ta, for regional applications. The ALEXI model was specifically designed to accommodate errors due to inconsistency between TRAD and Ta inputs by applying the TSEB in a time-differencing mode [Anderson et al., 1997] to thermal observations collected by geostationary satellites over the morning hours. The TSEB is coupled with a simple slab model [McNaughton and Spriggs, 1986] of atmospheric boundary layer (ABL) development, internally simulating the effect of land-atmosphere feedback on Ta at the blending height. In this experiment, the instantaneous latent heat fluxes retrieved at the second TSEB application time (t2, just before local noon) are upscaled to daytime-integrated ET estimates, ETd (mm d−1), assuming a self-preservation of the evaporative fraction, Λ = λE/(Rn − G0), during daytime hours [Crago, 1996]:
$$ET_d = 1.1\, \Lambda\, \frac{R_{n,d} - G_{0,d}}{\lambda} \tag{4}$$
where λ is the latent heat of vaporization (2.45 MJ kg−1) and the coefficient 1.1 is introduced to adjust for an observed 10% underestimation of daytime average Λ by midday values [Brutsaert and Sugita, 1992]. In equation (4), average daytime values of net radiation and soil heat flux, Rn,d and G0,d (MJ m−2 d−1), respectively, are derived from hourly insolation and meteorological data as described by Anderson et al. [2012b].
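As a worked illustration of the upscaling step, the sketch below evaluates equation (4) under the assumption that it takes the form ETd = 1.1 Λ (Rn,d − G0,d)/λ, with Λ computed from the instantaneous fluxes at t2. The function names and flux values are hypothetical.

```python
LATENT_HEAT_MJ_PER_KG = 2.45   # lambda, as given in the text
MIDDAY_CORRECTION = 1.1        # adjusts for the ~10% midday underestimation


def evaporative_fraction(le_inst, rn_inst, g0_inst):
    """Lambda = latent heat flux / available energy (instantaneous, W m-2)."""
    return le_inst / (rn_inst - g0_inst)


def daytime_et_mm(le_inst, rn_inst, g0_inst, rn_day, g0_day):
    """Daytime ET (mm d-1) from instantaneous fluxes and daytime energy (MJ m-2 d-1)."""
    ef = evaporative_fraction(le_inst, rn_inst, g0_inst)
    # MJ m-2 d-1 divided by MJ kg-1 gives kg m-2 d-1, i.e. mm d-1 of water
    return MIDDAY_CORRECTION * ef * (rn_day - g0_day) / LATENT_HEAT_MJ_PER_KG


# Illustrative values only
et_d = daytime_et_mm(le_inst=350.0, rn_inst=550.0, g0_inst=50.0,
                     rn_day=16.0, g0_day=1.3)
print(round(et_d, 2))
```

With these inputs the evaporative fraction is 0.7 and the daytime available energy is 14.7 MJ m−2 d−1, which corresponds to 6 mm of water equivalent before the evaporative fraction and the 1.1 factor are applied.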
 Due to the time-differential model structure, ALEXI is relatively insensitive to time-invariant biases in the retrieved LST data [Anderson et al., 1997], providing a reasonably robust assessment of surface energy fluxes at continental scales, but limited to the coarse spatial resolution of geostationary satellite data (3–10 km). In order to apply the TSEB modeling framework to finer-resolution LST data, generally only available from polar orbiting platforms as instantaneous (one or two times per day at best) snapshots, the DisALEXI disaggregation scheme was developed [Norman et al., 2003]. As schematically described in Figure 1b, the air temperature map diagnosed by ALEXI at time t2 provides an initial upper boundary condition for TSEB applications at the high-resolution TIR data spatial scale and is iteratively tuned at the ALEXI pixel scale to enforce consistency between ALEXI and reaggregated DisALEXI daily H fields [Cammalleri et al., 2012; Anderson et al., 2013]. The latter is adopted as a normalization basis rather than instantaneous H (as used by Norman et al.  and Anderson et al. [2004b, 2005]) in order to accommodate the use of high-resolution LST acquired at different times of day, as in the case of Landsat and MODIS on both the Terra and Aqua satellites. To eliminate boxlike patterns at the scale of the native ALEXI grid, the iteratively refined Ta maps were smoothed by means of a moving-average window filter, 10 km in width, before the final application of DisALEXI to the MODIS and Landsat LST fields. This normalization process ensures that the MODIS and Landsat-scale disaggregated flux maps are consistent at the 10 km ALEXI grid scale, a key factor in successful data fusion.
2.2. Fusion of Landsat- and MODIS-Derived Daily ET Maps
 Using DisALEXI, we can obtain a time series of near-daily ET at 1 km resolution using MODIS, and approximately biweekly to monthly snapshots of daily ET distributions at field scale (30 m) with Landsat TIR. The final step fuses these two datastreams to capture changes in water availability between Landsat overpass dates, as schematically described in Figure 2 (hereafter referred to as the Landsat-MODIS approach). First, gaps in the MODIS time series due to cloud cover were filled in order to obtain a continuous daily data set at 1 km resolution using a simple procedure based on the preservation of the ratio between actual and reference ET (ET0) described by Anderson et al. Daily maps of ET0 are computed following the FAO-56 procedure [Allen et al., 1998] using the same hourly insolation and meteorological fields used in ALEXI. Time series of actual-to-reference ET ratio (ETd/ET0) were computed at each pixel, and then these time series were smoothed and gap-filled using a second-order Savitzky-Golay smoothing filter (with a window of ±3 days). ETd maps were retrieved by multiplying the temporally smoothed ETd/ET0 maps and the ET0 maps. This smoothing process is relatively straightforward when applied to MODIS- or GOES-derived ET maps because the time series are relatively dense (typically with gaps on the order of a few days). It becomes more problematic when applied to temporally sparse ET time series from Landsat, with gaps of weeks to months [Allen et al., 2007b; Anderson et al., 2012b].
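The gap-filling step for a single pixel can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the second-order filter is realized as a local least-squares quadratic fit over the valid ETd/ET0 values within ±3 days, evaluated at the center day, and that days with fewer than three valid neighbors are left unfilled. All names and values are hypothetical.

```python
def _solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def smooth_ratio(ratio, half_window=3):
    """Local quadratic fit to the valid ratios within +/- half_window days.

    None marks a gap (e.g., a cloudy day). Days with fewer than three valid
    neighbours stay None in this sketch.
    """
    n = len(ratio)
    out = []
    for day in range(n):
        lo, hi = max(0, day - half_window), min(n, day + half_window + 1)
        pts = [(j - day, ratio[j]) for j in range(lo, hi) if ratio[j] is not None]
        if len(pts) < 3:
            out.append(None)
            continue
        s = [sum(x**p for x, _ in pts) for p in range(5)]      # sums of x^p
        t = [sum(y * x**p for x, y in pts) for p in range(3)]  # sums of y*x^p
        normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
        out.append(_solve3(normal, t)[0])  # fitted value at the centre day (x = 0)
    return out


def gap_fill_et(et, et0, half_window=3):
    """Smooth ETd/ET0 in time, then multiply back by ET0 to recover daily ETd."""
    ratio = [None if e is None else e / r for e, r in zip(et, et0)]
    smoothed = smooth_ratio(ratio, half_window)
    return [None if s is None else s * r for s, r in zip(smoothed, et0)]


# Illustrative series (mm d-1); the ratio is 0.8 on every clear day
et0 = [5.0, 6.0, 5.5, 6.2, 5.8, 6.1, 5.9]
et = [4.0, None, 4.4, 4.96, None, 4.88, 4.72]
filled = gap_fill_et(et, et0)
print([round(v, 2) for v in filled])
```

Because the ratio is constant in this example, the filled values on cloudy days follow the day-to-day variation in ET0, which is the intended effect of preserving ETd/ET0 rather than ETd itself.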
 The temporally sparse Landsat and daily (gap-filled) MODIS ETd maps were combined using the spatial and temporal adaptive reflectance fusion model, developed by Gao et al. . STARFM uses comparisons of one or more pairs of observed Landsat/MODIS maps, collected on the same day, to predict maps at Landsat-scale on other MODIS observation dates. If the MODIS stream is gap-filled to be time continuous, this results in daily Landsat-scale maps.
 For the generic date, t0, between Landsat clear-sky acquisitions, STARFM predicts a high-resolution Landsat-like scene based on a weighted function of Landsat (L) and MODIS (M) data acquired on date(s) tk and the MODIS data on the prediction date itself. For each high-resolution pixel, the weighted function is computed within a search window centered on that pixel:
$$L(x_{w/2}, y_{w/2}, t_0) = \sum_{i=1}^{w} \sum_{j=1}^{w} \sum_{k=1}^{n} W_{ijk} \left[ M(x_i, y_j, t_0) + L(x_i, y_j, t_k) - M(x_i, y_j, t_k) \right] \tag{5}$$
where w is the search window size, xi and yj give the pixel location in the coregistered Landsat and MODIS scenes within the search window, tk is the acquisition date for both Landsat and MODIS data, n is the number of Landsat/MODIS pairs used by the algorithm, and the weighting factor, W, is parameterized in terms of the similarities between pixels L(xi,yj,tk) and M(xi,yj,tk) within the search window. In particular, a normalized inverse distance weight function is used:
$$W_{ijk} = \frac{1 / C_{ijk}}{\sum_{i=1}^{w} \sum_{j=1}^{w} \sum_{k=1}^{n} \left( 1 / C_{ijk} \right)} \tag{6}$$
where Cijk represents a factor that combines the similarities between L(xi,yj,tk) and M(xi,yj,tk) ET fluxes, the temporal distance between M(xi,yj,tk) and M(xi,yj,t0), and the spatial distance between the central pixel and the candidate pixel within the search window. More details about the STARFM procedure can be found in Gao et al. .
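The weighting scheme of equations (5) and (6) can be sketched for a single target pixel and a single Landsat/MODIS pair (n = 1). This is a simplified illustration under stated assumptions: Cijk is taken as the plain product of the Landsat-MODIS difference, the MODIS temporal difference, and a relative spatial distance; a small `eps` (hypothetical) prevents division by zero; and the candidate-pixel screening of the full STARFM algorithm is omitted. MODIS maps are assumed to be already resampled to the Landsat grid.

```python
import math


def starfm_predict_pixel(landsat_tk, modis_tk, modis_t0, row, col,
                         half_window=1, eps=1e-6):
    """Predict the fine-scale value at (row, col) on date t0 from one image pair."""
    nrows, ncols = len(landsat_tk), len(landsat_tk[0])
    inverse_c, candidates = [], []
    for i in range(max(0, row - half_window), min(nrows, row + half_window + 1)):
        for j in range(max(0, col - half_window), min(ncols, col + half_window + 1)):
            spectral = abs(landsat_tk[i][j] - modis_tk[i][j])    # L vs M similarity
            temporal = abs(modis_tk[i][j] - modis_t0[i][j])      # M(tk) vs M(t0)
            distance = 1.0 + math.hypot(i - row, j - col) / half_window
            c_ij = (spectral + eps) * (temporal + eps) * distance
            inverse_c.append(1.0 / c_ij)
            # Term in brackets in equation (5)
            candidates.append(modis_t0[i][j] + landsat_tk[i][j] - modis_tk[i][j])
    total = sum(inverse_c)                                       # normalization, eq. (6)
    return sum(w / total * v for w, v in zip(inverse_c, candidates))


# Illustrative 3 x 3 ET maps (mm d-1): MODIS sees a uniform 1 mm d-1 increase
landsat_tk = [[4.0] * 3 for _ in range(3)]
modis_tk = [[3.5] * 3 for _ in range(3)]
modis_t0 = [[4.5] * 3 for _ in range(3)]
predicted = starfm_predict_pixel(landsat_tk, modis_tk, modis_t0, row=1, col=1)
print(round(predicted, 3))
```

In this uniform case the MODIS change between tk and t0 is transferred unchanged to the Landsat scale, so the bias between the two sensors on the pair date is preserved in the prediction.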
 The STARFM approach is effective in capturing continuous seasonal changes such as phenological changes. In such cases, two pairs of Landsat and MODIS images, n = 2 in equation (5), bracketing the change period are helpful. However, the two-pair option is less effective in reproducing more abrupt changes due to the conversion of surface types or changes in meteorological or surface moisture conditions (e.g., due to a rainfall event) that were not captured in both Landsat images. In this study, STARFM was applied using only a single pair of available Landsat and MODIS images (n = 1). An automated procedure was developed to select the most suitable pair for each prediction date based on an analysis of correlation between MODIS ET maps. The MODIS ET map for the prediction day, M(t0), is compared with all the available M(tk) maps retrieved on Landsat overpass dates, and the map that yields the highest spatial correlation is selected, along with its Landsat pair, as an input to STARFM for that prediction date.
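The automated pair selection can be sketched as a search for the Landsat overpass date whose MODIS ET map is most correlated with the MODIS map on the prediction date. The sketch assumes a Pearson correlation over flattened, non-constant maps; the function names and map values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally sized, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_pair_date(modis_t0, modis_on_landsat_dates):
    """Return the overpass DOY whose MODIS map best matches the prediction day.

    modis_on_landsat_dates maps DOY -> flattened MODIS ET map on that date.
    """
    return max(modis_on_landsat_dates,
               key=lambda doy: pearson(modis_t0, modis_on_landsat_dates[doy]))


# Illustrative flattened maps (mm d-1) for two overpass dates
candidates = {
    182: [2.0, 2.5, 3.0, 3.5],
    214: [4.0, 3.0, 4.5, 3.2],
}
modis_prediction_day = [4.2, 3.1, 4.4, 3.0]
chosen = select_pair_date(modis_prediction_day, candidates)
print(chosen)
```

Selecting by spatial pattern rather than by proximity in time lets a prediction day after a rainfall event be paired with a later, wetter Landsat date instead of the nearest earlier one.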
 In order to obtain a measure of the improvement in daily and seasonal ET estimates due to the introduction of MODIS data in the modeling framework, a benchmark time series was also derived by using only the temporally sparse Landsat estimates (referred to as Landsat-only interpolation). Following Anderson et al. [2012b], this daily ET time series was obtained by temporally interpolating the ratio between ETd and ET0 on the Landsat overpass days using a spline function and deriving the ET maps on days without a Landsat scene by multiplying that ratio by ET0 (see Figure 2). Essentially, this is equivalent to the gap-filling method that was applied to the MODIS datastream.
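The Landsat-only benchmark can be illustrated for one pixel as follows. For brevity this sketch substitutes piecewise-linear interpolation for the spline used in the study, so it shows the structure of the method rather than its exact numerical behavior. The ratio and ET0 values are hypothetical; the overpass days are two of the clear-sky dates listed in the text.

```python
def interpolate_ratio(doy, overpass_doys, overpass_ratios):
    """Piecewise-linear ETd/ET0 between overpass days; constant outside the range."""
    if doy <= overpass_doys[0]:
        return overpass_ratios[0]
    if doy >= overpass_doys[-1]:
        return overpass_ratios[-1]
    pairs = list(zip(overpass_doys, overpass_ratios))
    for (d0, r0), (d1, r1) in zip(pairs, pairs[1:]):
        if d0 <= doy <= d1:
            return r0 + (r1 - r0) * (doy - d0) / (d1 - d0)


def landsat_only_et(doy, et0, overpass_doys, overpass_ratios):
    """Daily ET on a day without Landsat data: interpolated ratio times ET0."""
    return interpolate_ratio(doy, overpass_doys, overpass_ratios) * et0


# Illustrative ratios on DOY 182 and 214 (the 32 day gap in this study)
doys = [182, 214]
ratios = [0.6, 1.0]
et_198 = landsat_only_et(doy=198, et0=5.0, overpass_doys=doys, overpass_ratios=ratios)
print(round(et_198, 2))
```

Any interpolant of this kind changes the ratio gradually across the gap, so a step increase in water availability partway through the interval is spread over all the days between the two overpasses.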
3. Experimental Site and Materials
 The Landsat-MODIS data fusion methodology was evaluated over a rainfed agricultural landscape in the Walnut Creek watershed in central Iowa, a 5100 ha area located 5 km south of Ames (41°75′N, 93°41′W, Figure 3). The site is mainly occupied by corn and soybean production fields ranging in size between 40 and 160 ha. The topography of the area is almost flat, with elevations ranging from 265 to 363 m above sea level. The region has a typical humid continental climate, with cold winters and warm summers. During the growing season (May-September), maximum temperature ranges from 24.5°C to 29.4°C, while minimum temperature ranges from 9.5°C to 16.0°C [Hatfield et al., 1999]. Precipitation mainly occurs between April and September, with an average monthly value of about 100 mm. Rainfall events during the spring and summer often occur as intense showers [Hatfield et al., 1999].
3.1. Micrometeorological Observations
 During May-September 2002, the remote sensing Soil Moisture EXperiment (SMEX02) was conducted in this area, testing passive microwave retrievals of soil moisture during a period of rapid vegetation growth. The related Soil Moisture-Atmosphere Coupling EXperiment (SMACEX) monitored evolution in surface energy fluxes in corn and soybean fields [Kustas et al., 2005]. SMACEX included 12 flux tower installations across the watershed to measure temporal dynamics in energy, water, and carbon fluxes [Kustas et al., 2003b]. However, of these only eight stations (demarcated by dots in Figure 3) provided continuous records from June until August, from emergence to peak biomass. Sites 003, 161, and 162 were cultivated with soybean, while the other five towers (006, 024, 025, 033, and 151) were sited in corn fields.
 Instrumentation at each flux tower included a 3-D sonic anemometer, fast-response H2O and CO2 density open-path infrared gas analyzer, four-component net radiometer, soil flux plates, soil thermocouples, infrared thermometer, and thermohygrometer. Postprocessing of the high-frequency eddy covariance (EC) data to 30 min averages is described by Hatfield et al. , as well as information on the installation setup. Soil heat flux data were corrected for heat storage in the soil layer above the flux plate, and energy balance closure was enforced in the EC data by preserving the observed Bowen ratio H/λE [Twine et al., 2000]. For model evaluation at daily time steps, the closed fluxes were integrated between sunrise and sunset, excluding nighttime fluxes that are less reliably measured using EC techniques.
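The closure correction can be sketched as follows: the available energy Rn − G0 is redistributed between H and λE so that their sum balances while the observed Bowen ratio H/λE is unchanged. The function name and flux values are hypothetical.

```python
def close_energy_balance(rn, g0, h_obs, le_obs):
    """Force H + LE = Rn - G0 while preserving the observed Bowen ratio H/LE."""
    available = rn - g0
    bowen = h_obs / le_obs
    le_closed = available / (1.0 + bowen)
    h_closed = available - le_closed
    return h_closed, le_closed


# Illustrative 30 min fluxes (W m-2): 50 W m-2 of the available energy is unaccounted for
h_closed, le_closed = close_energy_balance(rn=500.0, g0=50.0, h_obs=100.0, le_obs=300.0)
print(round(h_closed, 1), round(le_closed, 1))
```

Both turbulent fluxes are scaled by the same factor, so the residual is assigned to H and λE in proportion to their measured magnitudes.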
3.2. Data Inputs and Processing
 Both the ALEXI and DisALEXI models require a combination of several remote sensing and meteorological inputs, as described by Anderson et al. [2007b, 2012b]. The main input data sources used in this case study are outlined in Table 1 and described briefly below.
Table 1. Summary of the Primary Inputs to the ALEXI and DisALEXI (Landsat and MODIS) Models
 GOES data used in ALEXI were collected and processed for the study period as described by Anderson et al. [2007b]. In addition, daily Terra MODIS data tiles over the study area were collected, as well as available imagery from both Landsat 5 (Thematic Mapper, TM) and 7 (Enhanced Thematic Mapper, ETM+) for two Landsat worldwide reference system (WRS) scenes covering the study area (paths 26 and 27, row 31). The nominal frequency of coverage for the two adjacent Landsat paths was twice per 8 days; however, due to cloud coverage only five scenes were predominantly clear during the period June-August, on day of year (DOY) 158, 174, 182, 214, and 238, with an average frequency interval of 21 days and a maximum gap of 32 days between DOY 182 and 214.
3.2.2. Regional Meteorological Data
 Required surface meteorological fields and atmospheric lapse rate (needed in the ALEXI ABL submodel) were developed with the Fifth-Generation Pennsylvania State University/National Center for Atmospheric Research Mesoscale Model (MM5) [Dudhia, 1993], run at a spatial resolution of 36 km using initial and boundary conditions from the National Centers for Environmental Prediction FNL (Final) Analysis. Hourly insolation data were obtained from hourly GOES-based products at 20 km resolution [Diak and Gautier, 1983; Otkin et al., 2005]. The hourly insolation (Rs) and surface meteorological fields (Ta, u, and ea) used in ALEXI were also used to force clear-sky DisALEXI runs (both Landsat and MODIS), to upscale instantaneous clear-sky ALEXI/DisALEXI fluxes to daily totals, and to gap-fill ALEXI and DisALEXI (MODIS) on cloudy days.
3.2.3. Vegetation Characterization
 LAI maps at 1 km resolution were generated from the 8 day composite (MOD15A2, Collection 5) product [Myneni et al., 2002] and bilinearly interpolated to daily time steps. The accuracy of this product has been estimated to be 0.3 m2 m−2 for cropland and 0.5 m2 m−2 for needleleaf forest [Wang et al., 2004; Tan et al., 2005]. These maps were used for MODIS disaggregation and spatially aggregated to the 10 km CONUS grid for use in ALEXI. The 30 m resolution LAI maps used for Landsat disaggregation were generated using a regression tree approach trained by MODIS 1 km sample data, as described by Gao et al. . This regression tree approach facilitates development of Landsat-scale LAI maps that are maximally consistent with the MODIS LAI product used in ALEXI and in the MODIS-based disaggregation. Direct observations of LAI collected during SMEX02 [Anderson et al., 2004a] were used to evaluate the Landsat-derived maps, indicating an accuracy of 0.2–0.3 m2 m−2 [Gao et al., 2012].
 Vegetation heights were assigned using LAI and land-cover classifications, and surface roughness parameters were derived using the commonly used relationships suggested by Brutsaert . The 1 km land-cover classification developed by University of Maryland (UMD) was used for ALEXI and DisALEXI/MODIS estimates, as described by Anderson et al. [2007b], whereas the 30 m National Land Cover Data (NLCD) map [Homer et al., 2007] was adopted in DisALEXI/Landsat.
3.2.4. Land-Surface Temperature Estimates
 LST maps at 10 km resolution were generated from the GOES Sounder thermal band (10.2–11.2 µm) imagery, atmospherically corrected according to the procedure proposed by French et al.  using atmospheric profiles from MM5 and emissivity estimates dependent on vegetation cover fraction [Anderson et al., 2007b].
 The Terra (MOD11_L2) instantaneous swath LST product at 1 km resolution [Wan and Li, 1997] was used to derive TRAD inputs on a daily basis for the MODIS DisALEXI application. The analyses included only pixels that were considered clear at the 99% confidence interval as defined by the MOD35 cloud mask product [Ackerman et al., 1998]. The 1 km LST fields were sharpened using MODIS 1 km composite normalized difference vegetation index (NDVI) products to reduce effects of off-nadir pixel smearing. The sharpening was performed using the TsHARP procedure [Kustas et al., 2003a], which is based on a self-preservation of the NDVI-TRAD relationship at multiple spatial resolutions.
 Landsat LST maps, at the native spatial resolution of 120 m (Landsat 5) or 60 m (Landsat 7), were derived from thermal band observations following the procedure described by Li et al. , atmospherically correcting at-sensor brightness temperature via MODTRAN® [Berk et al., 1989]. The native resolution LST maps were then sharpened to the 30 m resolution of the Landsat optical bands using the TsHARP sharpening procedure [Kustas et al., 2003a].
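The idea behind thermal sharpening can be sketched as follows: a relationship between NDVI and LST is fitted at the coarse (native thermal) resolution, applied to the fine-resolution NDVI, and the coarse-scale residual is added back so that each coarse pixel keeps its original mean temperature. The sketch assumes a simple linear NDVI-LST fit, which may differ from the functional form used in TsHARP; all names and values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def sharpen_lst(coarse_ndvi, coarse_lst, fine_ndvi_blocks):
    """Sharpen coarse LST using fine NDVI.

    fine_ndvi_blocks[k] lists the fine-pixel NDVI values inside coarse pixel k.
    """
    slope, intercept = fit_line(coarse_ndvi, coarse_lst)
    sharpened = []
    for ndvi_c, lst_c, fine_ndvi in zip(coarse_ndvi, coarse_lst, fine_ndvi_blocks):
        # Part of the coarse LST not explained by the NDVI relationship
        residual = lst_c - (intercept + slope * ndvi_c)
        sharpened.append([intercept + slope * v + residual for v in fine_ndvi])
    return sharpened


# Illustrative data: three coarse pixels, two fine pixels in each
coarse_ndvi = [0.3, 0.5, 0.7]
coarse_lst = [310.0, 305.0, 300.0]            # kelvin
fine_ndvi_blocks = [[0.2, 0.4], [0.5, 0.5], [0.6, 0.8]]
sharp = sharpen_lst(coarse_ndvi, coarse_lst, fine_ndvi_blocks)
print([[round(v, 1) for v in block] for block in sharp])
```

With a linear fit, the mean of the sharpened values inside each coarse pixel equals the original coarse LST whenever the fine NDVI averages to the coarse NDVI, which is one way to read the self-preservation assumption.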
4. Results and Discussion
4.1. Analysis of Surface Energy Fluxes on Landsat Overpass Dates
 The 30 m maps of surface fluxes modeled by DisALEXI using Landsat imagery allow a direct comparison with observations acquired by the eight flux towers located in the SMEX02 study site at the instrument footprint scale (∼100 m). In these comparisons, the 30 m modeled fluxes were aggregated using a weighted average over the tower source area, as estimated using the footprint algorithm of Kormann and Meixner . A scatterplot between observed and modeled instantaneous fluxes, sampled at the time of the Landsat overpass on the five clear-sky Landsat dates (DOY 158, 174, 182, 214, and 238), is shown in Figure 4a, while daytime-integrated fluxes are compared in Figure 4b. These plots demonstrate general agreement over a wide range of variability in turbulent fluxes, with instantaneous λE ranging between 200 and 500 W m−2, and corresponding daytime values between 10 and 20 MJ m−2 d−1. For both instantaneous and daytime comparisons, the scatter in the turbulent fluxes is comparable to that for both Rs and Rn, suggesting that model accuracy is strongly influenced by errors in those critical forcing variables. It is important to note that none of the DisALEXI inputs, including meteorological variables, was derived from local observations collected at the validation sites.
 Statistical metrics describing these comparisons, namely mean absolute difference (MAD), root-mean-square difference (RMSD), and mean bias error (MBE), are summarized in Table 2, indicating a “reference” model performance at 30 m for days when Landsat data were actually available. For instantaneous flux retrievals, MAD and RMSD indices suggest an average error on the order of 40–50 W m−2 for Rn, H, and λE and of 15 W m−2 for G0. The RMSD obtained for λE is comparable to that reported by Anderson et al., French et al., and Choi et al. in prior evaluations of TSEB, DisALEXI, and other SEB models using data from SMEX02. The relative error (RE, ratio between MAD and observed average flux) in instantaneous λE retrieval was 13.1%, comparable with the typical uncertainty of flux tower measurements [Allen et al., 2011b]. Relative errors in λETd are reduced to 8% at the daily timescale due to cancellation of random errors. These errors associated with direct retrievals of daytime latent heat on actual Landsat scenes can be used as a reference for assessing the performance of the daily fused Landsat-like products, reconstructed using MODIS data between Landsat overpasses.
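The metrics named above can be computed as in the sketch below. It assumes that differences are taken as modeled minus observed, so that a negative MBE indicates underestimation, and that RE is expressed as a percentage. The function name and flux values are hypothetical.

```python
import math


def flux_metrics(observed, modeled):
    """MAD, RMSD, MBE (same units as the inputs) and RE (%)."""
    diffs = [m - o for o, m in zip(observed, modeled)]
    n = len(diffs)
    mad = sum(abs(d) for d in diffs) / n
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)
    mbe = sum(diffs) / n
    re = 100.0 * mad / (sum(observed) / n)   # MAD relative to the mean observed flux
    return {"MAD": mad, "RMSD": rmsd, "MBE": mbe, "RE": re}


# Illustrative instantaneous latent heat fluxes (W m-2)
observed = [300.0, 400.0, 500.0]
modeled = [330.0, 380.0, 500.0]
metrics = flux_metrics(observed, modeled)
print({k: round(v, 2) for k, v in metrics.items()})
```

MAD and RMSD measure the spread of the errors, while MBE isolates the systematic component, which is why a retrieval can show a small MBE together with a larger RMSD.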
Table 2. Statistical Metrics Comparing Measured and Modeled Instantaneous and Daytime-Integrated Surface Fluxes on Landsat Overpass Dates at 30 m Resolution
Instantaneous fluxes: MAD (W m−2), RMSD (W m−2), MBE (W m−2)
Daytime-integrated fluxes: MAD (MJ m−2 d−1), RMSD (MJ m−2 d−1), MBE (MJ m−2 d−1)
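As an illustration of the metrics named in Table 2, the sketch below computes MAD, RMSD, MBE, and RE for paired observed and modeled fluxes. The sample values are hypothetical, and the sign convention for MBE (modeled minus observed) is an assumption taken from the bias definition used for Table 3.

```python
import math


def flux_metrics(observed, modeled):
    """Return MAD, RMSD, MBE, and RE (%) for paired flux samples."""
    if len(observed) != len(modeled) or not observed:
        raise ValueError("observed and modeled must be non-empty and paired")
    n = len(observed)
    # Differences follow the assumed convention: modeled minus observed.
    diffs = [m - o for o, m in zip(observed, modeled)]
    mad = sum(abs(d) for d in diffs) / n
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)
    mbe = sum(diffs) / n
    # RE is the ratio between MAD and the observed average flux.
    re_percent = 100.0 * mad / (sum(observed) / n)
    return {"MAD": mad, "RMSD": rmsd, "MBE": mbe, "RE": re_percent}


# Hypothetical instantaneous latent heat fluxes (W m-2).
observed_flux = [300.0, 400.0, 500.0]
modeled_flux = [330.0, 380.0, 510.0]
metrics = flux_metrics(observed_flux, modeled_flux)
```

Because positive and negative differences cancel in MBE but not in MAD or RMSD, a small MBE alongside a larger MAD points to scatter rather than systematic bias.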
4.2. Evaluation of Daily Landsat-Scale ET Time Series
 Daytime-integrated ET at daily time steps and 30 m resolution was computed over the Walnut Creek Watershed for the period June-August (DOY 150–240), 2002 using the Landsat-MODIS and Landsat-only methodologies described in section 2.2, and time series were extracted at the eight flux tower sites for comparison with observations. The plots in Figure 5 report the evolution of vegetation cover fraction (computed from equation (3) with θ = 0) over this period, along with rainfall recorded by the rain gauge closest to each site (see Figure 3). For the corn sites (Figure 5a), vegetation fraction rapidly increased up until DOY 185–195, with a slowly decreasing trend thereafter. The soybean sites (Figure 5b) show a slower rate of increase in green vegetation cover fraction compared to corn, reaching maximum coverage around DOY 215. Site 025 displays a different behavior from the other corn sites and is somewhat more similar to the soybean pattern. Anderson et al. [2004a] associated the depressed growth pattern of site 025 (named WC25) with moisture stress related to relatively high local sand content in the soil profile. Two major rainfall events occurred during the study period: one around DOY 190, during the growing stage for soybean (and corn in site 025) and near maximum cover in most corn sites, and the second around DOY 215–220 after all sites reached peak cover. There was no substantial difference in the temporal distribution of rainfall events across the different sites.
 Figure 6 shows Landsat-only and Landsat-MODIS results at each of the eight flux tower sites in comparison with observed daily fluxes. Figure 6 (top left) indicates which Landsat-MODIS observed image pair (red lines) was used as an input to STARFM to predict Landsat-scale ET on any given day (grey boxes surrounding the red lines). For example, MODIS ET maps for DOY 189–233 were most highly correlated to the MODIS ET map on Landsat imaging date 214, so this Landsat-MODIS image pair was used in STARFM to drive predictions over this date range. This is reasonable, because a major rainfall event occurred around DOY 190 and moisture conditions became abruptly wetter at that point, as reflected in the daily MODIS retrievals. This period includes most of the rainfall events that occurred during the study period, and it also suffers from a lack of Landsat acquisitions, as shown by the 32 day gap between the closest Landsat acquisitions.
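The pair-selection rule described above (choose the Landsat date whose MODIS ET map is most highly correlated with the MODIS ET map on the prediction day) can be sketched as follows. This is a minimal illustration using Pearson correlation over flattened maps; the function names and sample values are hypothetical, and the actual implementation may differ in how correlation is computed or masked.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally sized flattened maps."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_a * var_b)


def best_pair_date(modis_today, modis_on_landsat_dates):
    """Return the Landsat DOY whose MODIS map best matches today's map."""
    return max(
        modis_on_landsat_dates,
        key=lambda doy: pearson(modis_today, modis_on_landsat_dates[doy]),
    )


# Hypothetical flattened MODIS ET maps (mm d-1) on two Landsat dates.
modis_on_landsat_dates = {
    182: [3.0, 3.5, 4.0, 4.5],
    214: [5.0, 4.0, 5.5, 4.2],
}
modis_today = [5.2, 4.1, 5.6, 4.4]  # hypothetical prediction-day map
selected_doy = best_pair_date(modis_today, modis_on_landsat_dates)
```

In this toy case the prediction-day map shares its spatial pattern with the DOY 214 map, so that pair is selected even if DOY 182 is closer in time.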
 In Figure 6, actual ET has been normalized by ET0 to remove day-by-day fluctuations in the ET datastream due to changes in radiation forcing and meteorological conditions. This normalization better emphasizes the detailed behavior of the water stress and crop growth dynamics at the different sites, and the extent to which the MODIS data add value between Landsat overpasses. Observations at all sites show a general pattern of increasing ETd/ET0 during the first part of the experiment (before DOY 185) and a plateau in the second part. This uniform generalized behavior can be explained by similarities in meteorological conditions and growing dynamics across the sites. However, a more detailed analysis highlights some significant differences among the sites. In particular, the corn sites 006, 024, and 033 (Figures 6a, 6c, and 6e, respectively) show an almost linear increasing trend from DOY 165 to 180 followed by an extended stage where ETd/ET0 remains relatively stable around a value of 0.85 (roughly corresponding to 5.5–6.0 mm d−1). At these sites, neither major rainfall event during the simulation period caused any significant change in apparent water stress, likely because these fields had reached a high cover fraction before these events occurred. In contrast, the soybean and stressed corn sites, especially fields 003, 161, and 025 (Figures 6b, 6f, and 6d, respectively), display a significant increase in the ratio ETd/ET0 from 0.6 to 0.9 after the first rainfall event (DOY 185, see Figure 6 (top right)), representing a change in ET from about 4 to 6 mm d−1. At these sites, the crops were at an early growing stage (fraction cover of about 0.6), and the observations show an impulsive increase in soil evaporation after the rainfall.
 The absence of substantial jumps in observed ETd/ET0 associated with the rainfall events at sites 006, 024, 033, and 162 enables the simple spline interpolation of Landsat estimates (Landsat-only) to reasonably reproduce the ET temporal dynamics. For the other sites, the Landsat-only interpolation underestimates the observed ET from DOY 185 to 210 because it does not capture the increase in the ETd/ET0 ratio associated with the rainfall input between Landsat overpasses. For instance, at sites 003 and 161 (soybean) Landsat-only underestimates ETd/ET0 by 0.2 around DOY 195, which corresponds to an underestimation in daily ET of about 1.5 mm d−1. The Landsat-MODIS fused results (orange lines in Figure 6) better reproduce this rapid change in moisture status, particularly at sites 003, 025, and 161. The daily MODIS 1 km LST datastream is able to describe temporal variations in water availability that occurred over the landscape after the rainfall and impart this information to the fused datastream. A response to a second rainfall event around DOY 225 might be reflected in the observed ETd/ET0 ratio at sites 003 and 151. This response was not captured by the Landsat-MODIS fused datastream because there was no clear-sky MODIS scene during that period due to persistently cloudy conditions associated with the wet weather. This highlights a limitation of the fusion approach: clear MODIS acquisitions are often unavailable during the persistently cloudy periods that accompany rainfall, which unfortunately is when the moisture status is expected to change most significantly.
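The Landsat-only baseline can be illustrated with a minimal sketch in which the ETd/ET0 ratio is interpolated between Landsat dates and then rescaled by the daily ET0. Piecewise-linear interpolation is used here purely as a simple stand-in for the spline; the function names, ratios, and ET0 value are hypothetical.

```python
def interpolate_ratio(landsat_doys, landsat_ratios, doy):
    """Piecewise-linear ETd/ET0 between Landsat dates (stand-in for a spline)."""
    points = list(zip(landsat_doys, landsat_ratios))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if d0 <= doy <= d1:
            return r0 + (r1 - r0) * (doy - d0) / (d1 - d0)
    raise ValueError("doy lies outside the range of Landsat dates")


def landsat_only_et(landsat_doys, landsat_ratios, doy, et0):
    """Daily ET (mm d-1) as the interpolated ratio times that day's ET0."""
    return interpolate_ratio(landsat_doys, landsat_ratios, doy) * et0


# Hypothetical ratios on Landsat dates DOY 182 and 214, and ET0 on DOY 198.
doys = [182, 214]
ratios = [0.6, 0.9]
ratio_198 = interpolate_ratio(doys, ratios, 198)
et_198 = landsat_only_et(doys, ratios, 198, et0=7.0)
```

Any interpolation of this kind changes smoothly between overpasses, so a step increase in ETd/ET0 shortly after DOY 182 would be underestimated until the next Landsat date, which is the behavior reported for the soybean sites.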
 The fused data products yield improved seasonal-cumulative ET estimates in comparison with measurements at the tower sites (Figure 7). Due to the systematic underestimation by the Landsat-only daily ET datastream during DOY 190–210, the seasonal water loss, quantified as the sum of daily ET, was in general underestimated, by 20 mm on average over all sites. This systematic bias is reduced using the Landsat-MODIS approach to an average of 7 mm, essentially unbiased (Figure 7). Improvement can be observed for all the sites, with the sole exception of site 024, which was very well represented by the Landsat-only spline. In this case, some noisy values in the MODIS time series (i.e., around DOY 180 and 210, see Figure 6c) served to introduce additional errors into the fused product, whereas the smooth behavior of the Landsat-only method seems to better capture on average the observed ETd/ET0 dynamics. Improvements to the simple Savitzky-Golay filter will be investigated to reduce undesired noise in the Landsat-MODIS estimates.
 Statistical performance metrics computed for all test sites at seasonal timescales are summarized in Table 3. Fusing Landsat-MODIS ET fields reduces MAD on average from 0.75 to 0.58 mm d−1 in comparison with Landsat-only interpolations and improves relative error at all sites except 024, which is satisfactory in both cases. The systematic underestimation in cumulative ET by the Landsat-only approach is also confirmed by computing the slope (b) of regression lines forced through the origin to provide an indication of relative bias between sites and retrieval approaches (Table 3; note that intercept values for unforced regressions were negligible). For Landsat-only, b was less than 1 for all sites except site 024 and was 0.90 on average, indicating a negative bias of approximately 10%. This bias is reduced to 2% (b = 0.98) for the Landsat-MODIS fused datastream. Accordingly, a substantial reduction (in absolute values) in the bias in cumulative flux (Δcum) is observed for almost all the sites (Table 3).
Table 3. Statistical Metrics Comparing Locally Measured and Remotely Modeled Daily ET at the Eight Flux Tower Sites (a)
MAD (mm d−1), reported for both the Landsat-only and Landsat-MODIS retrievals
(a) The bias (modeled − measured) in seasonal-cumulative ET at each site is indicated as Δcum.
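The two bias indicators used in Table 3 can be sketched as follows: the slope b of a regression line forced through the origin, computed as Σ(obs·mod)/Σ(obs²), and the cumulative bias Δcum, computed as the difference between summed modeled and summed measured daily ET. The daily values below are hypothetical.

```python
def slope_through_origin(observed, modeled):
    """Least-squares slope b of modeled = b * observed (no intercept)."""
    denominator = sum(o * o for o in observed)
    if denominator == 0:
        raise ValueError("observed values must not all be zero")
    return sum(o * m for o, m in zip(observed, modeled)) / denominator


def cumulative_bias(observed, modeled):
    """Delta_cum: modeled minus measured seasonal-cumulative ET (mm)."""
    return sum(modeled) - sum(observed)


# Hypothetical daily ET (mm d-1) with a uniform 10% underestimation.
observed_et = [4.0, 5.0, 6.0]
modeled_et = [3.6, 4.5, 5.4]
b = slope_through_origin(observed_et, modeled_et)
delta_cum = cumulative_bias(observed_et, modeled_et)
```

A slope of 0.90 in this toy case corresponds to the roughly 10% negative bias discussed for the Landsat-only retrievals, while Δcum expresses the same shortfall in millimeters over the period.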
 Although the improvements at daily timescale appear small in some cases (reduction of average RE from 14.2% to 11.1%), it is important to note that the accuracy of the Landsat-MODIS fused results is more consistent across the sites, whereas the simple Landsat-only approach returns satisfactory results only for those sites where rainfall events do not cause notable changes in crop stress and soil evaporation. This finding is further corroborated by the standard deviation of MAD values, which is 0.18 mm d−1 for Landsat-only and only 0.10 mm d−1 for the data fusion product. The improvement at the seasonal scale is evident in Table 3, especially in the reduction of systematic biases in seasonal-cumulative ET observed with the Landsat-only approach.
4.3. Spatial Patterns in Daily ET
 The differences observed at local scale at the location of the flux tower installations can also be observed at larger scales over the landscape by comparing spatial patterns in Landsat-only and Landsat-MODIS time series. On the basis of the results obtained at flux tower sites, we would expect major differences in these retrievals just after the large rainfall event around DOY 190. For this reason, two pairs of Landsat-like ET maps (retrieved between Landsat overpasses) are compared in Figure 8: a first pair obtained for DOY 184 (2 days after the Landsat image acquired on DOY 182) with Landsat-only (Figure 8a) and Landsat-MODIS (Figure 8c), and a second pair obtained with the same models (Figures 8b and 8d, respectively) for DOY 196 (14 days after the Landsat image acquired on DOY 182 and 18 days before the image acquired on DOY 214).
 For DOY 184 (Figures 8a and 8c), prior to the rainfall event, the Landsat-only and Landsat-MODIS modeling systems yield similar ET patterns, with some shift toward higher ET in the Landsat-MODIS retrieval over most of the study area. In contrast, the magnitude of the modeled ET fluxes is quite different for the second pair of images (DOY 196, Figures 8b and 8d), with a notable underestimation in Landsat-only compared to Landsat-MODIS across the whole area. This result is expected because the MODIS LST data on DOY 196 were able to capture the large-scale increase in water availability associated with the rainfall event and to convey this information to the Landsat spatial scale through STARFM. This variation in water availability was captured by neither the Landsat image on DOY 182 nor the one on DOY 214 and was thus lost in the Landsat-only datastream. Moreover, a coherent pattern in ET enhancement emerges in the middle of the Landsat-MODIS ET map (Figure 8d), which is not evident in the Landsat-only retrieval (Figure 8b).
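The way a coarse-scale change is carried to the fine scale can be illustrated with the simplest single-pixel form of a STARFM-type prediction: the fine-resolution value on the pair date plus the change seen at coarse resolution between the pair date and the prediction date. This sketch deliberately omits the neighborhood search and the spectral, temporal, and distance weighting that the full STARFM algorithm applies; the function name and values are hypothetical.

```python
def fuse_pixel(fine_on_pair_date, coarse_on_pair_date, coarse_on_prediction_date):
    """Single-pixel, unweighted STARFM-type prediction (simplified sketch).

    Adds the coarse-resolution temporal change to the fine-resolution value
    observed on the pair date. The full algorithm instead forms a weighted
    sum of such terms over similar neighboring pixels.
    """
    coarse_change = coarse_on_prediction_date - coarse_on_pair_date
    return fine_on_pair_date + coarse_change


# Hypothetical ET values (mm d-1): Landsat-scale ET on the pair date,
# and MODIS-scale ET on the pair date and on the prediction date.
predicted_fine_et = fuse_pixel(
    fine_on_pair_date=4.8,
    coarse_on_pair_date=5.0,
    coarse_on_prediction_date=5.6,
)
```

Under this simplification, any wetting signal detected at 1 km shifts every 30 m pixel inside the MODIS pixel by the same amount, which is why the approach is expected to work best when moisture inputs are uniform at kilometer scales.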
 Finally, the maps in Figure 9 depict growing-season cumulative ET generated with the Landsat-only (Figure 9a) and Landsat-MODIS (Figure 9b) approaches, computed for the period between DOY 150 and 240. The modeled cumulative ET values range between 50 and 600 mm, corresponding to 0.5–6.5 mm d−1. The highest values were obtained in forest and wooded grassland areas (see Figure 3), whereas the lowest values correspond to urban areas and a few sparsely vegetated crop fields.
 A comparison between the two spatial distributions shows no significant differences in seasonal ET in areas characterized by extreme ET values, as in the urban area around Ames (central north) and wooded riparian regions. On the other hand, significant differences can be observed in cropland, especially in the central part of the study area, where Landsat-only yields significantly lower values than Landsat-MODIS, consistent with the biases observed at the tower sites, as reported in the previous section. This finding is highlighted by the histograms reported on the right side of each map, showing the frequency distribution of seasonal ET separately for corn and soybean pixels. The two crops were discriminated by means of the detailed land-cover map generated by Doraiswamy et al.  for the study area. These histograms demonstrate the lower ET values estimated by the Landsat-only method for both corn and soybean compared to Landsat-MODIS. Moreover, it appears that while the shape of the distribution from both methods is similar for soybean, the variance in seasonal ET among corn fields is significantly lower in Landsat-MODIS than in the Landsat-only retrievals, resulting in a more strongly peaked distribution. These differences would have ramifications in applications assessing differential water use across agricultural districts and between growers, as in the case of on-demand water management based on field-scale water loss estimates or precision farming activities.
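The per-crop comparison of seasonal ET distributions can be sketched by grouping pixel values by land-cover class and summarizing each group. The example below reports the mean and population variance rather than a full histogram; the pixel values, class labels, and function name are hypothetical.

```python
from statistics import mean, pvariance


def seasonal_et_by_crop(seasonal_et, crop_labels):
    """Mean and population variance of seasonal ET (mm) for each crop class."""
    if len(seasonal_et) != len(crop_labels):
        raise ValueError("one crop label is required per pixel")
    groups = {}
    for et, crop in zip(seasonal_et, crop_labels):
        groups.setdefault(crop, []).append(et)
    return {
        crop: {"mean": mean(values), "variance": pvariance(values)}
        for crop, values in groups.items()
    }


# Hypothetical seasonal-cumulative ET (mm) at five pixels.
pixel_et = [480.0, 500.0, 520.0, 400.0, 440.0]
pixel_crop = ["corn", "corn", "corn", "soybean", "soybean"]
crop_stats = seasonal_et_by_crop(pixel_et, pixel_crop)
```

Running the same summary on the Landsat-only and Landsat-MODIS maps would expose both the shift in mean and the change in spread that the histograms in Figure 9 show for corn.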
5. Summary and Conclusions
 Considering the relevance of daily and seasonal estimates of water consumption at field scale in cropped landscapes, reliable methodologies to accurately assess ET are required to adequately support water management practices. The current lack of concurrent high spatial and temporal resolution TIR data significantly limits the applicability of single-sensor remote sensing-based methodologies. However, multisensor approaches, which have shown promise in crop growth modeling, may enhance routine estimation of daily ET at field scale by combining the best qualities of each currently available TIR data set. The results reported here focus on the use of a multisensor and multiresolution technique to produce consistent daily ET maps over a wide range of scales (from field to continental) and spatial resolutions (from 30 to 10,000 m).
 The modeling framework evaluated here uses ALEXI estimates at continental scales and coarse resolution (10 km) as a boundary condition to retrieve ET maps at both MODIS (daily) and Landsat (approximately biweekly to monthly) spatial resolution (1 km and 30 m, respectively). The approach employs the STARFM data fusion procedure to combine these datastreams, producing daily ET maps at the finest resolution. The retrieval of ET maps at the 30 m scale facilitates direct comparison between model outputs and observations acquired by eight flux towers during the SMEX02 experiment in June-August 2002. Estimates of disaggregated surface energy fluxes at the Landsat overpass time were in substantial agreement with in situ observations (RE of about 13% for λE). When upscaled to daytime-integrated ET, the direct Landsat retrievals on Landsat imaging dates showed an absence of bias and MAD on the order of 0.2–0.3 mm d−1 (RE of about 8%). This flux disaggregation procedure also represents a practical way to evaluate the performance of the coarse (10 km) resolution ALEXI estimates, which are generally difficult to evaluate directly due to the mismatch between the observation and modeling scales. From this point of view, the use of ALEXI as a boundary condition for the DisALEXI procedure appears to minimize potential biases due to errors in specification of the vertical temperature gradient when TSEB is applied at finer resolutions.
 A comparison between daily and seasonal ET data derived by the proposed Landsat-MODIS fusion approach and fluxes from a benchmark case using Landsat data only suggests that significant improvements can be obtained under certain circumstances by incorporating the MODIS LST data. The discriminating factor in this analysis seemed to be the occurrence of rainfall events (which commonly cause a rapid increase in crop water availability and soil surface evaporation) between two consecutive high-resolution (Landsat) acquisitions, especially when those rainfall events occurred in the early stages of crop growth. This was the case for soybean fields, which were characterized by a fraction cover of about 0.6 when the main rainfall event occurred. The effect of the rainfall on the unstressed corn fields with near-full cover was less evident, and in these cases the data fusion provided small improvements. The fusion procedure appears to correctly capture the actual dynamics of crop water use in a wide range of cases, whereas a simple spline interpolation of Landsat estimates (normalized with ET0) tends to underestimate the observed ET due to the lack of information about hydrological inputs that occur between high-resolution acquisitions. A similar behavior can be expected when a rapid onset of crop stress occurs between two successive Landsat overpasses due to extended dry conditions.
 The analysis of statistical metrics, such as MAD and RE, further corroborates these findings, suggesting a moderate but significant reduction of the errors at nearly all flux sites due to Landsat-MODIS data fusion, especially those in soybean fields. The accuracy of the fusion results was more uniform and consistent across the different sites as compared to the simple Landsat-only ETd/ET0 spline interpolation, which performs well only in the absence of rainfall events or over unstressed crops with full coverage (corn fields in this study). Furthermore, the improvement in terms of seasonal-cumulative ET achieved by introducing MODIS data is evidenced by a significant decrease in negative bias with respect to observed seasonal-cumulative flux, reduced from −20 mm for Landsat-only to −7 mm (3% of the seasonal-total ET) for the Landsat-MODIS fusion time series.
 These results suggest that over these rainfed agricultural areas, the capability of the 1 km resolution MODIS TIR data to resolve the spatial scale of observed rainfall events improved the reliability of Landsat-scale ET estimates, which capture field-to-field variability in water use. Over large agricultural districts with relatively uniform management practices, the contribution of both rainfall and irrigation will tend to even out spatial differences in ET fluxes over similar crop types, reducing spatial heterogeneity in water use at kilometer scales. In such areas, data fusion will add value to seasonal ET assessments as demonstrated in this study. In areas with more sparsely distributed irrigated fields, however, the main moisture inputs will be at field scales significantly smaller than a MODIS pixel and data fusion may not be as effective. Further tests are underway to explore the capability of MODIS data to account for moisture variations that occur over agricultural landscapes with varying densities of irrigation activity. Additional studies are being conducted to evaluate utility of ET data fusion at grassland and forested sites within different climatic regions in the United States.
 The authors would like to thank Jerry Hatfield and John Prueger from the USDA-ARS Laboratory for Agriculture and the Environment for supporting the continued operation, maintenance, collection, and processing of the eight eddy covariance flux tower systems used in this study. Support for this research was provided by NASA (grant NNH11AQ82I). The U.S. Department of Agriculture (USDA) prohibits discrimination in all its programs and activities on the basis of race, color, national origin, age, disability, and where applicable, sex, marital status, familial status, parental status, religion, sexual orientation, genetic information, political beliefs, reprisal, or because all or part of an individual's income is derived from any public assistance program. (Not all prohibited bases apply to all programs.) Persons with disabilities who require alternative means for communication of program information (Braille, large print, audiotape, etc.) should contact USDA's TARGET Center at (202) 720–2600 (voice and TDD). To file a complaint of discrimination, write to USDA, Director, Office of Civil Rights, 1400 Independence Avenue, S.W., Washington, DC 20250-9410, or call (800) 795–3272 (voice) or (202) 720–6382 (TDD). USDA is an equal opportunity provider and employer.