Assessing the impact of end-member selection on the accuracy of satellite-based spatial variability models for actual evapotranspiration estimation

Authors

  • Di Long,

    Corresponding author
    1. Bureau of Economic Geology, Jackson School of Geosciences, The University of Texas at Austin, Austin, Texas, USA
    2. Department of Biological & Agricultural Engineering, Texas A&M University, College Station, Texas, USA
    • Corresponding author: Di Long, Bureau of Economic Geology, Jackson School of Geosciences, The University of Texas at Austin, Austin, TX 78758, USA. (di.long@beg.utexas.edu)

  • Vijay P. Singh

    1. Department of Biological & Agricultural Engineering, Texas A&M University, College Station, Texas, USA
    2. Department of Civil & Environmental Engineering, Texas A&M University, College Station, Texas, USA

Abstract

[1] This study examines the impact of end-member (i.e., hot and cold extremes) selection on the performance and mechanisms of error propagation in satellite-based spatial variability models for estimating actual evapotranspiration, using the triangle, surface energy balance algorithm for land (SEBAL), and mapping evapotranspiration with high resolution and internalized calibration (METRIC) models. These models were applied to the soil moisture-atmosphere coupling experiment site in central Iowa on two Landsat Thematic Mapper/Enhanced Thematic Mapper Plus acquisition dates in 2002. Evaporative fraction (EF, defined as the ratio of latent heat flux to available energy) estimates from the three models at field and watershed scales were examined using varying end-members. Results show that the end-members fundamentally determine the magnitudes of EF retrievals at both field and watershed scales. The hot and cold extremes exercise a similar impact on the discrepancy between the EF estimates and the ground-based measurements, i.e., given a hot (cold) extreme, the EF estimates tend to increase with increasing temperature of cold (hot) extreme, and decrease with decreasing temperature of cold (hot) extreme. The coefficient of determination between the EF estimates and the ground-based measurements depends principally on the capability of remotely sensed surface temperature (Ts) to capture EF (i.e., depending on the correlation between Ts and EF measurements), being slightly influenced by the end-members. Varying the end-members does not substantially affect the standard deviation and skewness of the EF frequency distributions from the same model at the watershed scale. However, different models generate markedly different EF frequency distributions due to differing model physics, especially the limiting edges of EF defined in the remotely sensed vegetation fraction (fc) and Ts space. 
In general, the end-members cannot be properly determined because (1) they do not necessarily exist within a scene, varying with the spatial extent, resolution, and quality of satellite images being used and/or (2) different operators can select different end-members. Furthermore, the limiting edge of EF = 0 in the fc-Ts space varies with the model, with SEBAL-type models having inherently an increasing curvilinear limiting edge of EF = 0 with fc. The spatial variability models therefore require careful calibration in order to deduce reasonable EF-limiting edges and then confine the magnitudes of EF estimates.

1. Introduction

[2] Actual evapotranspiration (ETa), comprising vegetation transpiration, surface evaporation, and interception from the vegetative surface, plays a key role in the exchange of water and heat between the land surface and the lower atmosphere. An accurate understanding of the magnitude and distribution of ETa on the Earth's surface is of importance to many disciplines, e.g., hydrology, agriculture, ecosystem, meteorology, and forestry, and to many related applications, e.g., water resources allocation, irrigation scheduling, crop yield forecasting, weather prediction, drought monitoring, and vulnerability of forests to fire [Anderson et al., 2007; Bastiaanssen et al., 2005; Long and Singh, 2010; Mackay et al., 2007; McVicar and Jupp, 1998; Norman et al., 2003; Verstraeten et al., 2008]. Over the last three decades, satellite remote sensing has provided an unprecedented opportunity for capturing the variability in ETa across a variety of spatial and temporal scales that are not attainable by conventional techniques (e.g., weighing lysimeter, energy balance Bowen ratio (EBBR), and eddy covariance (EC) systems). There are a wide variety of models for ETa estimation developed by incorporating remotely sensed land surface temperature (Ts) and other critical variables, e.g., albedo (α) and fractional vegetation cover (fc).

[3] The models, classified as the “spatial variability models” by Kalma et al. [2008], are unique in interpreting the contextual relationship between the normalized difference vegetation index (NDVI) or fc and Ts [e.g., Batra et al., 2006; Carlson et al., 1995a; Gillies et al., 1997; Jiang and Islam, 2001; Price, 1990], or between α and Ts [e.g., Roerink et al., 2000; Verstraeten et al., 2005] to deduce the evaporative fraction (EF, defined as the ratio of latent heat flux (LE) to available energy (A)) and ETa. In addition, the surface energy balance algorithm for land (SEBAL) [Bastiaanssen et al., 1998] and a variant, the mapping ET with high resolution and internalized calibration (METRIC) [Allen et al., 2007], pertain to the spatial variability model, as they incorporate the spatial variability in Ts and two constant end-members for a specific scene of image [Bastiaanssen et al., 2002], termed the “hot pixel” and the “cold pixel,” to deduce sensible heat (H) and LE by quasi-linear interpolation of extremes.

[4] Several studies have evaluated a range of remote sensing-based ETa models, which provide insights into the performance of these models under varying soil moisture, environmental, and climatic conditions. French et al. [2005a] examined the utility of the two-source energy balance model (TSEB) [Norman et al., 1995] and SEBAL at the soil moisture-atmosphere coupling experiment (SMACEX) site characterized primarily by rainfed corn and soybean in central Iowa in 2002. A bias of ∼ −80 W m−2 for LE estimates from SEBAL [French et al., 2005b] was attributable to the inability to fully distinguish wet and dry extremes at the study site. Timmermans et al. [2007] further examined the utility of TSEB and SEBAL over a subhumid grassland (Southern Great Plains '97) and a subarid rangeland (Monsoon '90) by performing a sensitivity analysis of the two models and comparing their H estimates over different land cover types, confirming that TSEB and SEBAL tend to be most sensitive to Ts, and SEBAL may not be applicable over sparsely vegetated areas. An intercomparison of SEBAL, surface energy balance system (SEBS) [Su, 2002], and TSEB was performed, and ETa estimates from these models were compared with the counterparts from the soil and water assessment tool (SWAT) over a watershed (1850 km2) in the Chao River basin in North China [Gao and Long, 2008]. In general, there was consistency in lumped ETa estimates from these remote sensing-based models, whereas large differences in the frequency distributions of ETa were observed. Gonzalez-Dugo et al. [2009] compared METRIC, TSEB, and another empirical one-source model at the SMACEX site, showing slightly higher accuracy of ETa retrievals from the three models than other studies due to the use of measured net radiation (Rn) and soil heat flux (G). Choi et al. [2009] extended their study on a trapezoid NDVI-Ts model, METRIC, and TSEB by comparing output of these models and found significant discrepancies in the spatial distributions of H and LE estimates between different models. Table 1 lists published studies regarding the evaluation of triangle, SEBAL, and METRIC models.

Table 1. Studies on Evaluation of Three Spatial Variability Models, Including Triangle, SEBAL, and METRIC Models (a)

Model | Literature | Name of Experiment/Study Site, Area, Climate, and Landscape Properties | DOY (Soil Moisture Condition), Sensor Type, and Number of End-Member Pair for Each Scene | Ground-Based Measurement, Energy Closure Approach, and Number of Total Measurements | Accuracy of EF or LE (W m−2)
Triangle | Batra et al. [2006] | Southern Great Plains (SGP), ∼200,000 km2, from humid in the east to semiarid in the west, mixed farming, interrupted forest, tall and short grass | 115, 163, 190, 229, 276, and 287 in 2001; 91, 99, and 287 in 2002; 82, 90, 91, 262, 285, and 295 in 2003; 15 MODIS, 12 NOAA16, and 6 NOAA14; 1 | EBBR stations, BR, 60 for MODIS and NOAA16, 21 for NOAA14 | Bias: −29 W m−2 for MODIS, 12 W m−2 for NOAA16, and −28 W m−2 for NOAA14; RMSE: 53 W m−2 for MODIS, 51 W m−2 for NOAA16, and 55 W m−2 for NOAA14; R2: 0.84 for MODIS, 0.79 for NOAA16, and 0.77 for NOAA14
 | Jiang et al. [2009] | Southern Florida, 124,000 km2, humid climate, sawgrass, sugar cane, and marsh | Days of clear sky and available ground-based measurements in 1998 and 1999, NOAA14, 1 | EBBR stations and lysimeters, BR | Bias: 0.049; RMSE: 0.158
SEBAL | French et al. [2005a, 2005b] | SMACEX, ∼670 km2, humid climate, corn and soybean fields | 174 (intermediate) in 2002, ASTER, 1 | EC, BR, 8 | Bias: −82 W m−2
 | Timmermans et al. [2007] | (1) Monsoon '90, ∼10 km2, subhumid, grassland; (2) SGP97, ∼45 km2, semiarid, rangeland | (1) 213 (dry), 216 (wet), 221 (intermediate) in 1990, airborne instruments, 1; (2) 180 (wet) and 183 (intermediate) in 1997, airborne instruments, 1 | (1) Variance method, -, 24; (2) EC, BR, 8 | MAD: 61 W m−2; MAPD: 23%; RMSE: 70 W m−2
 | Wang et al. [2009] | Las Cruces in New Mexico, 3600 km2, pecan and alfalfa, semiarid climate | 247 (intermediate) in 2002, ASTER, 1 | EC, no correction, 2 | Relative error: 4.3%–13%
METRIC | Choi et al. [2009] | SMACEX, ∼670 km2, humid climate, corn and soybean fields | 174 (intermediate), Landsat TM; 182 (dry), Landsat ETM+ in 2002, 1 | EC towers, BR and RE, 18 | Bias: 53 W m−2 for RE, 19 W m−2 for BR; RMSE: 75 W m−2 for RE, 55 W m−2 for BR
 | Gonzalez-Dugo et al. [2009] | SMACEX, ∼670 km2, humid climate, corn and soybean fields | 174 (intermediate), Landsat TM; 182 (dry), Landsat ETM+; 189 (wet), Landsat ETM+ in 2002, 3 | EC towers, BR and RE, 29 | RMSE: 35–44 W m−2 for RE, 39–48 W m−2 for BR; R2: 0.85–0.89 for BR, 0.81–0.83 for BR

(a) BR means the Bowen ratio method and RE means the residual energy balance method for energy balance closure for eddy covariance systems. Biases in Batra et al. [2006] and Choi et al. [2009] were calculated as observed fluxes minus estimated fluxes, whereas the other studies used estimated fluxes minus observed fluxes.

[5] In general, these published studies compared in detail flux estimates with ground-based measurements in terms of root-mean-square difference (RMSD) and bias that describe the magnitude of model-measurement discrepancies. Input variables, applicability of models under a certain environment, sensitive variables, and the distinction between the one-source and the two-source models have been fully discussed [e.g., French et al., 2005a; Timmermans et al., 2007]. However, how the end-member selection compounded by subjectivity impacts the LE estimates, and whether there is a common mechanism for error propagation remain unclear and warrant further investigation. Model intercomparison has suggested notable differences in flux estimates from varying models over an entire modeling domain or a specific land cover type, though in some cases, the one-source and two-source models produced comparable discrepancies with respect to ground-based measurements [e.g., Choi et al., 2009; French et al., 2005a; Timmermans et al., 2007]. Few studies investigated error propagation in terms of model physics and reported the coefficient of determination (R2). Exploring the fundamental reasons for differences in LE estimates over a modeling domain and examining detailed mechanisms of error propagation could be of great value for a greater understanding of the deficiencies in model physics and improvements to a range of remote sensing-based ETa approaches.

[6] The objectives of this study, therefore, were to (1) examine how the end-members of the triangle model, SEBAL, and METRIC impact the resulting EF retrievals at the field scale; (2) examine how the end-members impact the frequency distributions of the EF estimates at the watershed scale; (3) examine if the limiting edges of EF within the fc-Ts space explicitly or implicitly involved in the models can depict the reality; and (4) discuss the common mechanisms of error propagation and potential ways to resolve uncertainties in the spatial variability models.

2. Background Theory

[7] In general, all spatial variability models are based on the equation of energy balance on the Earth's surface:

$$ R_n - G = A = H + LE \tag{1} $$

where Rn is the net radiation (W m−2); G is the soil heat flux (W m−2); A is the available energy (W m−2); H is the sensible heat flux (W m−2); and LE is the latent heat flux (W m−2). Rn can be expressed as

$$ R_n = (1 - \alpha) S_d + \varepsilon \varepsilon_a \sigma_c T_a^4 - \varepsilon \sigma_c T_s^4 \tag{2} $$

where Sd is the downwelling shortwave radiation (W m−2); ε is the surface emissivity (dimensionless); εa is the atmospheric emissivity (dimensionless); σc is the Stefan-Boltzmann constant (5.67 × 10−8 W m−2 K−4); and Ta is the air temperature (K).
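As a rough illustration of how the radiation terms defined above combine, the following sketch evaluates net radiation from absorbed shortwave, absorbed atmospheric longwave, and emitted surface longwave radiation. The functional form is the standard radiation balance and is assumed here; the function name and all input values are hypothetical and are not SMACEX measurements.

```python
SIGMA_C = 5.67e-8  # Stefan-Boltzmann constant, W m-2 K-4 (value quoted in the text)


def net_radiation(albedo, s_down, emis_surface, emis_air, t_air_k, t_surf_k):
    """Net radiation (W m-2), assuming the standard surface radiation balance."""
    shortwave_net = (1.0 - albedo) * s_down                       # absorbed shortwave
    longwave_in = emis_surface * emis_air * SIGMA_C * t_air_k ** 4  # absorbed sky longwave
    longwave_out = emis_surface * SIGMA_C * t_surf_k ** 4           # emitted by the surface
    return shortwave_net + longwave_in - longwave_out


# Illustrative inputs only (hypothetical values)
rn = net_radiation(albedo=0.2, s_down=800.0, emis_surface=0.98,
                   emis_air=0.8, t_air_k=302.0, t_surf_k=305.0)
print(round(rn, 1))
```

Because Ts enters through the emitted longwave term, a warmer surface yields a smaller Rn for otherwise identical inputs.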

[8] SEBAL and METRIC compute G as a fraction of Rn as [Allen et al., 2007; Bastiaanssen, 2000]

display math(3)

[9] Latent heat flux is therefore calculated as the residual of equation (1) when Rn, G, and H are derived in sequence. Triangle models directly deduce EF from the NDVI or fc-Ts space by interpolating between “warm” and “cold” edges defining nominal boundaries to this distribution of data points (Figure 1). Daily ETa (mm d−1) is often calculated by using remotely sensed EF, which is assumed to be equivalent to the 24 h average EF, to partition daily net radiation, as SEBAL and the triangle model do [Bastiaanssen et al., 2002; Jiang et al., 2009; Long et al., 2010]. However, this assumption can result in uncertainties in the ETa estimates under partly cloudy conditions throughout a day, across areas where nocturnal transpiration is large [Van Niel et al., 2011], and across forested areas where rainfall interception and subsequent evaporation from the canopy can be appreciable [Schellekens et al., 1999]. METRIC computes daily ETa by using the reference ET fraction at the satellite overpass time, which is assumed equal to the 24 h average ET fraction to partition daily reference ET [Allen et al., 2007]. Comparison based on EF instead of LE or ETa in this study was intended to isolate the effect of Rn on LE, adopting a straightforward scale to assess the impact of end-members on the output from triangle, SEBAL, and METRIC models.
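The EF-based upscaling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any specific model: the overpass EF is taken as the 24 h mean, daily soil heat flux is neglected, and the latent heat of vaporization is fixed at a typical value. The function name and inputs are hypothetical.

```python
LAMBDA_V = 2.45  # latent heat of vaporization, MJ kg-1 (assumed typical value)


def daily_eta_mm(ef, rn_daily_mj):
    """Daily ETa (mm d-1) from overpass EF and daily net radiation (MJ m-2 d-1).

    Assumes EF at overpass equals the 24 h average EF and that the daily
    soil heat flux is negligible, so daily available energy ~ daily Rn.
    """
    # 1 kg of water per m2 corresponds to a depth of 1 mm
    return ef * rn_daily_mj / LAMBDA_V


print(daily_eta_mm(0.7, 15.0))
```

Any error in EF propagates proportionally into daily ETa under this scheme, which is one reason the comparison in this study is made on EF.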

Figure 1.

Conceptual scatterplot of remotely sensed fc and Ts. Colored circles represent surfaces/pixels with varying fc and Ts. Green color in dots represents relatively high soil surface water content, and yellow color represents relatively low soil surface water content in terms of the concept of soil surface water content isopleths [Carlson, 2007; Carlson et al., 1995b]. Trapezoid ABCD represents a trapezoidal framework of the fc-Ts space [e.g., Long and Singh, 2012a; Moran et al., 1994], triangle ABC represents a triangular framework of the fc-Ts space [e.g., Jiang and Islam, 2001; Sandholt et al., 2002], and rectangle ABCE represents a degenerate triangular framework of the fc-Ts space [e.g., Batra et al., 2006; Jiang et al., 2009]. Warm edges 2, 4, and 5 represent hot extremes possibly selected in triangle, SEBAL, and METRIC models. Cold edges 1–3 represent cold extremes possibly selected in triangle, SEBAL, and METRIC models. Point i represents a pixel within the fc-Ts space. Distance l represents the difference in Ts between point i and warm edge 1. Distance l′ represents the difference in Ts between point i and warm edge 2. Distance m represents the difference in Ts between cold edge 1 and warm edge 1 for point i. Distance m′ represents the difference in Ts between cold edge 1 and warm edge 2 for point i.

2.1. Triangle Models

[10] Figure 1 also shows a conceptual representation of triangle models. EF for a pixel i from the triangle model [e.g., Jiang and Islam, 2001] is estimated by interpolating parameter ϕ of the cold edge (ϕ = 1.26, corresponding to cold edge 1 in Figure 1) and the warm edge (ϕ = 1.26NDVIi/NDVImax or ϕ = 1.26fc,i/fc,max, corresponding to warm edge 1 in Figure 1) based on the ratio of the Ts difference between warm edge 1 and pixel i (l in Figure 1) to the Ts difference between warm edge 1 and cold edge 1 (m in Figure 1). Later, the warm edge of the triangle model was simplified as a constant hot extreme (point A and warm edge 2 in Figure 1) [Batra et al., 2006; Jiang et al., 2009]. The ϕ value for the horizontal warm edge is taken to be zero, and the ϕ value for a pixel i is interpolated based on the ratio of the Ts difference between warm edge 2 and pixel i (l′ in Figure 1) to the Ts difference between warm edge 2 and cold edge 1 (m′ in Figure 1). EF from the triangle model can be written as [Batra et al., 2006; Choi et al., 2009; Jiang et al., 2009]

$$ EF = \phi_{\max} \frac{T_{\max} - T_s}{T_{\max} - T_{\min}} \cdot \frac{\Delta}{\Delta + \gamma} \tag{4} $$

where ϕmax is equal to 1.26; Tmax and Tmin are the temperatures (K) of the hot and cold extremes throughout a scene; Δ is the slope of the saturated vapor pressure curve at Ta (kPa °C−1); and γ is the psychrometric constant (kPa °C−1). The quantity Δ/(Δ+γ) remains fairly invariant under generally homogeneous meteorological fields (i.e., Ta, vapor pressure ea, and wind velocity u). It is apparent from equation (4) that variation in EF with Ts follows an inverse linear relation; Tmax and Tmin play a key role in determining EF.
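The linear interpolation behind the simplified triangle model can be sketched as below. The interpolation form follows the description in the text (ϕ = 0 at the hot extreme and ϕ = 1.26 at the cold extreme); the function name, the pixel temperature, and the value of Δ/(Δ+γ) are hypothetical. The end-member temperatures are the DOY 174 values listed in Table 3.

```python
PHI_MAX = 1.26  # value of phi at the cold edge (from the text)


def triangle_ef(ts, t_max, t_min, delta_ratio):
    """EF from linear interpolation of phi between hot and cold extremes.

    delta_ratio is Delta / (Delta + gamma), treated here as a scene constant.
    """
    phi = PHI_MAX * (t_max - ts) / (t_max - t_min)
    return phi * delta_ratio


# Same pixel (Ts = 32 degC), same cold extreme, two candidate hot extremes
ef_hot1 = triangle_ef(32.0, t_max=42.3, t_min=26.1, delta_ratio=0.78)
ef_hot3 = triangle_ef(32.0, t_max=43.1, t_min=26.1, delta_ratio=0.78)
print(ef_hot1, ef_hot3)
```

Under these assumptions, choosing the warmer hot extreme raises the EF of every intermediate pixel, which illustrates how operator choices shift the magnitude of EF retrievals.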

2.2. SEBAL

[11] SEBAL calculates H by assuming a linear relationship between ΔT and Ts across an entire image where ΔT (K) is the difference between the aerodynamic temperature and Ta [Bastiaanssen et al., 1998, 2005]. This assumption obviates the specification of roughness length for heat transfer parameterized in the one-source models that cannot be assessed on the basis of generic rules in heterogeneous landscapes [Bastiaanssen et al., 2005; Carlson et al., 1995c]. Coefficients of the linear relationship are determined by a hot extreme and a cold extreme selected by the operator from the satellite image. For the hot extreme, LE is taken to be zero; thus, H for the hot extreme is equal to its available energy. For the cold extreme, H is taken to be zero and LE for the cold extreme is equal to the available energy. Sensible heat flux from SEBAL is calculated as

$$ H = \rho c_p \frac{a T_s + b}{r_{ah}} \tag{5} $$

where ρ is the air density (kg m−3); cp is the specific heat of air at constant pressure (J kg−1 K−1); rah is the aerodynamic resistance (s m−1) [Long and Singh, 2012b]; and a (dimensionless) and b (K) are the coefficients of the assumed linear relationship between Ts and ΔT derived by the two extremes:

$$ a = \frac{r_{ah,hot} A_{hot}}{\rho c_p (T_{\max} - T_{\min})} \tag{6} $$
$$ b = -a T_{\min} \tag{7} $$

where subscript hot denotes variables for the hot extreme. Long et al. [2011] showed that Tmax, Tmin, and Ahot are the most sensitive variables associated with end-members in SEBAL, and the hot extreme plays a more prominent role than does the cold extreme. Latent heat flux can be eventually obtained by equation (1) after Rn, G, and H are calculated.
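The two boundary conditions stated above (H equal to available energy at the hot extreme, H equal to zero at the cold extreme) are enough to fix the coefficients a and b. The sketch below works this through; the air density, specific heat, and aerodynamic resistance values are assumptions chosen for illustration, the function names are hypothetical, and a single-pass calculation is shown rather than any iterative stability correction. End-member values are those of case 1 for DOY 174 in Table 3, with temperatures converted to kelvin.

```python
RHO = 1.15   # air density, kg m-3 (assumed)
CP = 1004.0  # specific heat of air at constant pressure, J kg-1 K-1 (assumed)


def sebal_coefficients(a_hot, rah_hot, t_max, t_min):
    """Coefficients of dT = a*Ts + b from the hot and cold extremes.

    Hot extreme: LE = 0, so H = A_hot and dT_hot = A_hot * rah_hot / (rho * cp).
    Cold extreme: H = 0, so dT = 0 at Ts = Tmin.
    """
    dt_hot = a_hot * rah_hot / (RHO * CP)
    a = dt_hot / (t_max - t_min)
    b = -a * t_min
    return a, b


def sensible_heat(ts, a, b, rah):
    """Sensible heat flux (W m-2) for a pixel with surface temperature ts (K)."""
    return RHO * CP * (a * ts + b) / rah


t_max, t_min = 42.3 + 273.15, 26.1 + 273.15
a, b = sebal_coefficients(a_hot=481.7, rah_hot=40.0, t_max=t_max, t_min=t_min)
print(a, b, sensible_heat(305.0, a, b, rah=40.0))
```

Because a scales inversely with Tmax − Tmin, a small change in either end-member temperature rescales H, and hence LE, for every pixel in the scene.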

[12] From the perspective of the fc-Ts space, the two extremes in SEBAL bound two limiting edges of EF [Long and Singh, 2012b]. Sensible heat flux and EF for each fc class can be calculated using the following equations:

$$ H_i = \rho c_p \frac{a T_{s,i} + b}{r_{ah,i}} \tag{8} $$
$$ EF_i = \frac{A_i - H_i}{A_i} \tag{9} $$

where subscript i denotes variables in the fc class i.
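To make the two limiting edges explicit, EF for an fc class can be written in closed form. The lines below are a sketch that assumes the cold-extreme condition gives b = −a Tmin and the hot-extreme condition gives a(Tmax − Tmin) = Ahot rah,hot/(ρ cp), as described in the text; they are not reproduced from the original equations.

```latex
\begin{align*}
EF_i &= 1 - \frac{H_i}{A_i}
      = 1 - \frac{\rho c_p \, a \,(T_{s,i} - T_{\min})}{r_{ah,i}\, A_i}, \\
T_{s,i} = T_{\min} &\;\Rightarrow\; EF_i = 1, \\
T_{s,i} = T_{\max} &\;\Rightarrow\; EF_i = 1 - \frac{A_{hot}\, r_{ah,hot}}{A_i\, r_{ah,i}} .
\end{align*}
```

Under these assumptions, EF reaches zero at Tmax only for a class that shares the available energy and aerodynamic resistance of the hot extreme. For other classes the temperature at which EF = 0 differs from Tmax, and since Ai itself depends on Ts, that temperature is only implicitly defined.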

2.3. METRIC

[13] METRIC inherits the key assumption of linear correlation of Ts with ΔT from SEBAL and interpolation of H for all pixels in an image, except the end-members in terms of coefficients a and b [Allen et al., 2007]. It differs from SEBAL in a slight modification of the energy balance for the two extremes, considering conditions deviating somewhat from the reality. This means that METRIC does not strictly require LE = 0 at the hot extreme, where a soil water balance model [Allen et al., 1998] is used to infer residual soil evaporation. The cold extreme is tied to measurements of reference ET made at a ground station somewhere within the modeling domain. The use of reference ET, which is sometimes larger than available energy for the cold extreme, partly accounts for the impact of advection on the energy balance for the cold extreme [Choi et al., 2009]. However, the modification does not substantially alter the model physics of SEBAL, so METRIC may exhibit performance and mechanisms of error propagation similar to those of SEBAL. Coefficients a and b of METRIC are incorrectly expressed in Allen et al. [2007]. They are, in fact, in the forms below (note that coefficients a and b in this study correspond to b and a in equation (29) in Allen et al. [2007]):

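$$ a = \frac{dT_{hot} - dT_{cold}}{T_{\max} - T_{\min}} \tag{10} $$
$$ b = dT_{hot} - a T_{\max} \tag{11} $$
$$ dT_{hot} = \frac{(A_{hot} - LE_{hot})\, r_{ah,hot}}{\rho c_p} \tag{12} $$
$$ dT_{cold} = \frac{(A_{cold} - LE_{cold})\, r_{ah,cold}}{\rho c_p} \tag{13} $$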

where subscript cold denotes variables for the cold pixel. It is apparent that in SEBAL LEhot in equation (12) is zero, and dTcold becomes zero due to LEcold = Acold. In METRIC, LEcold is inferred by the American Society of Civil Engineers (ASCE)-standardized Penman-Monteith reference ET multiplied by a reference ET fraction (ETrF) of 1.05 [Allen et al., 2007], and LEhot is calculated by the reference ET multiplied by ETrF for the hot extreme, which is estimated by a soil water balance model [Allen et al., 1998].
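The relationship between the METRIC and SEBAL coefficients described above can be checked numerically. The sketch below assumes that the temperature difference at each extreme is the residual sensible heat (A − LE) scaled by rah/(ρ cp) and that a and b follow from a straight line through the two extremes; the air properties, resistances, and LE values are hypothetical, and the function name is illustrative. End-member values are those of case 1 for DOY 174 in Table 3.

```python
RHO = 1.15   # air density, kg m-3 (assumed)
CP = 1004.0  # specific heat of air at constant pressure, J kg-1 K-1 (assumed)


def metric_coefficients(a_hot, le_hot, rah_hot, t_max,
                        a_cold, le_cold, rah_cold, t_min):
    """Coefficients of dT = a*Ts + b with nonzero LE_hot and nonzero dT_cold."""
    dt_hot = (a_hot - le_hot) * rah_hot / (RHO * CP)
    dt_cold = (a_cold - le_cold) * rah_cold / (RHO * CP)
    a = (dt_hot - dt_cold) / (t_max - t_min)
    b = dt_hot - a * t_max
    return a, b


t_max, t_min = 42.3 + 273.15, 26.1 + 273.15

# SEBAL limit: LE_hot = 0 and LE_cold = A_cold, so dT_cold = 0
a_s, b_s = metric_coefficients(481.7, 0.0, 40.0, t_max, 593.0, 593.0, 30.0, t_min)

# METRIC-like case: residual evaporation at the hot extreme and
# LE_cold exceeding A_cold (hypothetical values), so dT_cold < 0
a_m, b_m = metric_coefficients(481.7, 48.0, 40.0, t_max, 593.0, 620.0, 30.0, t_min)
print((a_s, b_s), (a_m, b_m))
```

In the SEBAL limit the intercept reduces to b = −a Tmin, which is one way to confirm that the line passes through dT = 0 at the cold extreme.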

[14] Note that all three spatial variability models require the selection of end-members from satellite images by the operator. Efforts to automate the selection of end-members have been made [e.g., McVicar and Jupp, 1999, 2002] and are still ongoing for SEBAL and METRIC [Allen et al., 2007; Choi et al., 2009], which would reduce subjectivity. Characteristic variables of the end-members, e.g., Tmax and Tmin, are subsequently derived and constitute input of equations (4) and (6)-(13). It is assumed that points A and B in Figure 1 represent the realistic hot and cold extremes that can meet the basic assumptions of these models. There are three possibilities for a hot extreme and a cold extreme to be selected or specified, corresponding to warm edges 2, 4, and 5, and cold edges 1–3 in Figure 1.

3. Materials

3.1. Study Site

[15] The SMACEX campaign was conducted in central Iowa, USA, ranging in latitude between 41.87°N and 42.05°N and in longitude between 93.83°W and 93.39°W (Figure 2), from 15 June (day of year (DOY) 166) through 8 July (DOY 189) in 2002. It provided extensive measurements of soil, vegetation, and meteorological properties and states for a greater understanding of mechanisms of water and heat exchanges with the atmosphere [Kustas et al., 2005]. The field campaign was primarily conducted in the Walnut Creek watershed, just south of Ames in central Iowa. Rainfed corn and soybean fields dominate the Walnut Creek watershed. These crops grew rapidly during the campaign.

Figure 2.

Location and false color composite of Landsat TM imagery acquired on 23 June 2002, of the SMACEX site in Ames, central Iowa, USA. The Walnut Creek watershed is delineated in yellow and the main Walnut Creek and its branch are shown in green. The meteorological-flux (METFLUX) network, comprising 12 field sites, is shown in numbered green circles nested with cross wires. Letter C denotes corn and S denotes soybean for the major crop type at each EC tower.

[16] The mean annual rainfall of this region is 835 mm, and the climate is classified as humid. Precipitation during the campaign occurred a few days prior to 15 June (DOY 166), with a minor rainfall event wetting the soil surface in most areas with either 0–5 mm or 5–10 mm of precipitation on 20 June (DOY 171). This was followed by a drydown period for the Walnut Creek watershed until 4 July (DOY 185) [Kustas and Anderson, 2009; Kustas et al., 2005]. The topography is characterized by low relief and poor surface drainage, with a mean elevation of ∼300 m. The extensive measurements of surface fluxes in combination with diverse agricultural crops make the SMACEX site an ideal test bed for evaluating a range of remote sensing-based models [e.g., Anderson et al., 2007; Choi et al., 2009].

3.2. Flux Tower Measurements

[17] A network consisting of 12 fully operational meteorological-flux (METFLUX) towers was deployed within or in the vicinity of the Walnut Creek watershed (flux tower (FT) 13, 14, 151, 152, 161, 162 within the watershed; FT03, 06, 23, 24, 25, and 33 outside the watershed), employing EC systems at 12 field sites, of which five sites were corn and seven sites were soybean (Figure 2). These towers were instrumented with a variety of sensors for measuring turbulent fluxes of LE and H, as well as radiation components (i.e., incoming and outgoing shortwave and longwave radiation) and soil heat fluxes at 30 min intervals. Additional in situ hydrometeorological observations encompassed 10 min averaged Ta, relative humidity, and wind speed and direction. Air temperature and relative humidity were measured at heights ranging between 1.16 m and 2.66 m for different EC towers but remained unchanged during the campaign. Wind velocity was measured at heights ranging between 1.83 m and 5.03 m for differing EC towers during the campaign, five of which in corn fields were elevated by 1–2 m between DOY 179 and DOY 181 to accommodate growth in the corn. Observed EF calculated by observed LE over the sum of observed LE and H, i.e., preserving the Bowen ratio, at these EC towers for two image acquisition dates [e.g., Anderson et al., 2005, 2007; French et al., 2005b] was used to evaluate the three spatial variability models. If the energy balance closure is achieved by the residual method, i.e., ground-based LE derived from observed Rn-G-H, storage corrections to soil heat flux measurements should be performed. Table 2 contains three statistical metrics quantifying the discrepancies between measurements and estimates. EF estimates from the three spatial variability models were averaged over the estimated upwind source-area/footprint for each flux tower using the approach proposed by Li et al. [2008]. Details of these sensors and processing of measurements can be found in Kustas et al. [2005] and Prueger et al. [2005].
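The two closure approaches mentioned above differ in how the ground-based reference is formed. A minimal sketch follows, with hypothetical function names and flux values that are not tower data.

```python
def ef_bowen_ratio(le, h):
    """Observed EF = LE / (LE + H); the measured ratio H/LE is preserved."""
    return le / (le + h)


def le_residual(rn, g, h):
    """Ground-based LE from the residual method: LE = Rn - G - H."""
    return rn - g - h


# Hypothetical tower fluxes in W m-2 (turbulent fluxes do not close the balance)
rn, g, h, le = 600.0, 80.0, 120.0, 330.0
print(ef_bowen_ratio(le, h))                  # EF with the Bowen ratio preserved
print(le_residual(rn, g, h) / (rn - g))       # EF implied by the residual method
```

With these illustrative numbers the residual method assigns the whole closure gap to LE and therefore gives a higher reference EF than the Bowen ratio approach, so the choice of closure method affects the apparent model bias.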

Table 2. Description of Statistics Used in This Study (a)

Statistical Variable | Description | Equation
μ | Sample mean | math formula
σ | Standard deviation | math formula
γ | Skewness | math formula
RMSD | Root-mean-square difference | math formula
MAPD | Mean absolute percentage difference | math formula
Bias | Bias | math formula

(a) In equations, Pi represents a sample of model simulations and Oi represents a sample of ground-based observations.
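The discrepancy metrics listed in Table 2 can be sketched with their conventional definitions. The exact forms used in this study are not recoverable from the table, so the following are assumptions: bias is taken as estimate minus observation, MAPD is expressed in percent relative to the observation, and all sums are divided by the sample size n. The sample values are hypothetical.

```python
import math


def bias(p, o):
    """Mean of (estimate - observation); sign convention assumed."""
    return sum(pi - oi for pi, oi in zip(p, o)) / len(p)


def rmsd(p, o):
    """Root-mean-square difference between estimates and observations."""
    return math.sqrt(sum((pi - oi) ** 2 for pi, oi in zip(p, o)) / len(p))


def mapd(p, o):
    """Mean absolute percentage difference, in percent of the observation."""
    return 100.0 * sum(abs(pi - oi) / abs(oi) for pi, oi in zip(p, o)) / len(p)


# Hypothetical EF estimates (p) and tower observations (o)
p = [0.60, 0.80]
o = [0.50, 1.00]
print(bias(p, o), rmsd(p, o), mapd(p, o))
```

Bias captures a systematic shift of the kind an end-member change produces, whereas RMSD and MAPD also respond to scatter.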

3.3. Remote Sensing Data Sources and Variable Derivation

[18] During SMACEX, three cloud-free scenes of Landsat Thematic Mapper (TM)/Enhanced Thematic Mapper Plus (ETM+) imagery were acquired, two of which were used in this study. One scene of Landsat TM was acquired at 10:20 A.M. (local time) on DOY 174 (23 June 2002) spanning vegetated canopy cover from 50% to 75%. The other scene of Landsat ETM+ was acquired at 10:42 A.M. (local time) on DOY 182 (1 July 2002) spanning vegetated canopy cover from 75% to 90%. Land surface temperature was retrieved from the thermal band of the Landsat images using parameters specifically for the SMACEX site [Li et al., 2004], with uncertainties within ∼1°C for the Landsat ETM+ image and ∼1.5°C for the Landsat TM image. Albedo was retrieved from the visible, near-infrared, and shortwave infrared bands of the Landsat images using the algorithm of Allen et al. [2007]. Spatial mean of Ta across all EC towers at the satellite overpass time was 29.6°C and 29.4°C for DOY 174 and DOY 182, respectively, with DOY 174 showing a smaller standard deviation (0.34°C) than DOY 182 (0.42°C). Information on antecedent precipitation, fc, and meteorological conditions suggests that DOY 174 had more homogeneous surface moisture and meteorological conditions than DOY 182.

4. Methods

4.1. Selection of End-Members

[19] A brief review of the three spatial variability models in section 'Background Theory' shows that Tmax and Tmin bound EF within a scene. EF for the remaining pixels is linearly or quasi-linearly interpolated by the two extremes [Choi et al., 2009; Long and Singh, 2012b], and hence they play a critical role in determining the resultant EF or LE across a scene. To test this hypothesis, EF was simulated by the three spatial variability models in combination with differing end-members. Three operators having knowledge of remote sensing and the spatial variability models selected end-members against the scatterplot of the fc-Ts space (Figure 3) and land cover map of the SMACEX site, with three hot extremes and three cold extremes identified for each day. The three hot and three cold pixels resulted in nine (3×3) combinations of end-members for subsequent calculation of EF (Table 3).

Figure 3.

Scatterplots of fc and Ts for the SMACEX site in central Iowa, USA, on (a) DOY 174 and (b) DOY 182 in 2002. Numbered red circles (1–3) represent hot extremes, and numbered blue circles (1–3) represent cold extremes selected by different operators.

Table 3. Hot and Cold Extremes With Their Characteristic Variables at the Soil Moisture Atmosphere Coupling Experiment (SMACEX) Site for DOY 174 and DOY 182 in 2002 (a)

Case (Hot, Cold), 174 and 182 | FT, 174 Hot | FT, 174 Cold | FT, 182 Hot | FT, 182 Cold | Tmax (°C), 174 | Tmax (°C), 182 | Ahot (W m−2), 174 | Ahot (W m−2), 182 | fc,hot, 174 | fc,hot, 182 | Tmin (°C), 174 | Tmin (°C), 182 | Acold (W m−2), 174 | Acold (W m−2), 182 | fc,cold, 174 | fc,cold, 182
1 (1, 1) | 33 | 25 | 33 | 24 | 42.3 | 48.7 | 481.7 | 442.5 | 0.17 | 0.13 | 26.1 | 28.8 | 593.0 | 610.9 | 0.84 | 0.87
2 (2, 1) | 06 | 25 | 33 | 24 | 42.6 | 49.6 | 478.3 | 440.9 | 0.33 | 0.13 | 26.1 | 28.8 | 593.0 | 610.9 | 0.84 | 0.87
3 (3, 1) | 25 | 25 | 33 | 24 | 43.1 | 51.5 | 467.0 | 442.5 | 0.27 | 0.14 | 26.1 | 28.8 | 593.0 | 610.9 | 0.84 | 0.87
4 (1, 2) | 33 | 06 | 33 | 24 | 42.3 | 48.7 | 481.7 | 442.5 | 0.17 | 0.13 | 26.8 | 29.0 | 588.1 | 638.1 | 0.92 | 0.90
5 (2, 2) | 06 | 06 | 33 | 24 | 42.6 | 49.6 | 478.3 | 440.9 | 0.33 | 0.13 | 26.8 | 29.0 | 588.1 | 638.1 | 0.92 | 0.90
6 (3, 2) | 25 | 06 | 33 | 24 | 43.1 | 51.5 | 467.0 | 442.5 | 0.27 | 0.14 | 26.8 | 29.0 | 588.1 | 638.1 | 0.92 | 0.90
7 (1, 3) | 33 | 06 | 33 | 24 | 42.3 | 48.7 | 481.7 | 442.5 | 0.17 | 0.13 | 26.9 | 30.5 | 570.9 | 550.9 | 0.94 | 0.92
8 (2, 3) | 06 | 06 | 33 | 24 | 42.6 | 49.6 | 478.3 | 440.9 | 0.33 | 0.13 | 26.9 | 30.5 | 570.9 | 550.9 | 0.94 | 0.92
9 (3, 3) | 25 | 06 | 33 | 24 | 43.1 | 51.5 | 467.0 | 442.5 | 0.27 | 0.14 | 26.9 | 30.5 | 570.9 | 550.9 | 0.94 | 0.92

(a) Column 1 shows nine combinations of three hot pixels (numbered 1–3) and three cold pixels (numbered 1–3) for both days, referring to Figure 3 showing these extremes on the scatterplots of fc and Ts for DOY 174 and DOY 182. FT means IDs of eddy covariance towers for calculating reference ET as input for METRIC.
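The nine cases in Table 3 are simply the Cartesian product of three hot and three cold extremes, numbered with the cold extreme varying slowest. A short sketch that reproduces this ordering for the DOY 174 temperatures follows; the variable names are illustrative.

```python
from itertools import product

# End-member temperatures (degC) for DOY 174, taken from Table 3
t_max_174 = {1: 42.3, 2: 42.6, 3: 43.1}  # hot extremes 1-3
t_min_174 = {1: 26.1, 2: 26.8, 3: 26.9}  # cold extremes 1-3

# Cold extreme is the outer loop, so cases run (1,1), (2,1), (3,1), (1,2), ...
cases = [(hot, cold) for cold, hot in product(t_min_174, t_max_174)]

for number, (hot, cold) in enumerate(cases, start=1):
    spread = t_max_174[hot] - t_min_174[cold]
    print(number, (hot, cold), t_max_174[hot], t_min_174[cold], round(spread, 1))
```

The printed Tmax − Tmin spread is the denominator of the interpolation in all three models, so it gives a quick sense of how much each case can rescale EF.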

[20] The hot pixels for DOY 174 were selected north, west, and south of the study domain, respectively (Figure 4a), and were identified to be a bare surface (hot pixel 1) and late plantings of a soybean crop (hot pixels 2 and 3). The cold pixels were selected from the corn fields with fc ranging from 0.84 to 0.9 in the southeast (cold pixel 1) and west (cold pixels 2 and 3). For DOY 182, the hot extremes were selected from the bare surfaces with low fc, ranging from 0.13 to 0.14 in the north of the study domain (Figure 4b). The cold extremes were concentrated in the corn fields, with fc ranging from 0.87 to 0.92 in the east. It is noted that differences in Tmax or Tmin were within 1°C for DOY 174 but within 3°C for DOY 182 (Table 3). This is attributed to the combined effect of the higher spatial resolution of Ts (60 m) and the greater soil moisture variability for DOY 182 (standard deviation of Ts: 3.3°C) compared with DOY 174 (spatial resolution of Ts: 120 m; standard deviation of Ts: 2.8°C). A larger contrast in soil surface water content and, consequently, Ts would likely result in larger differences in the selected end-members.

Figure 4.

End-members for (a) DOY 174 and (b) DOY 182 at the SMACEX site selected by different operators.

[21] Furthermore, there is no certain pattern for the distribution of end-members at the SMACEX site, which could be due in part to the large variability in convective precipitation and rapid change in soil moisture in summer. An additional hot extreme was deduced by extrapolating the warm edge of the fc-Ts space to intersect with fc = 0 [Batra et al., 2006]. However, the deduced hot extreme in conjunction with the triangle model failed to generate acceptable EF estimates and is therefore excluded from further discussion.

[22] Reference ET values of the selected end-members were calculated using meteorological data from the nearest EC towers as input for METRIC (Figure 4 and Table 3). Typical values of ETrF for hot pixels vary between 0 and 0.1, and 0.1 could be appropriate during the SMACEX campaign based on Choi et al. [2009], though the soil water balance model showed values smaller than 0.1 for the 2 days being tested.

4.2. Theoretical Expression of Limiting Edges of EF for Triangle and SEBAL-Type Models

[23] Definitions of limiting edges of EF are different in the spatial variability models, but these differences and associated potential deficiencies have not yet been systematically investigated. Furthermore, there remains a question: Are the definitions of these limiting edges reasonable? It is apparent from equation (4) that EF for the warm edge of the triangle model (segment AE in Figure 1) is zero (Ts = Tmax); EF for the cold edge is equal to the quantity 1.26Δ/(Δ+γ), which is equal to 1 when Ta = 31.2°C and to 0.93 when Ta = 25°C.
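The two cold-edge values quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the FAO-56 (Tetens-type) expression for the slope of the saturation vapor pressure curve Δ and a fixed psychrometric constant γ = 0.066 kPa/°C (γ actually varies with air pressure). The function names are hypothetical.

```python
import math


def slope_svp(ta_c):
    """Slope of the saturation vapor pressure curve, Delta (kPa/degC).

    Assumption: FAO-56 (Tetens-type) formulation; ta_c is air temperature in degC.
    """
    es = 0.6108 * math.exp(17.27 * ta_c / (ta_c + 237.3))  # saturation vapor pressure (kPa)
    return 4098.0 * es / (ta_c + 237.3) ** 2


def cold_edge_ef(ta_c, phi_max=1.26, gamma=0.066):
    """EF on the cold edge of the triangle model: phi_max * Delta / (Delta + gamma).

    Assumption: gamma is held constant at 0.066 kPa/degC for illustration.
    """
    delta = slope_svp(ta_c)
    return phi_max * delta / (delta + gamma)


for ta in (25.0, 31.2):
    print(f"Ta = {ta:4.1f} degC -> cold-edge EF = {cold_edge_ef(ta):.2f}")
```

Under these assumptions the cold-edge EF is about 0.93 at Ta = 25°C and about 1.00 at Ta = 31.2°C, in line with the values stated in the text; a different γ shifts the result slightly.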

[24] The limiting edges of EF in SEBAL are not explicitly shown because the interpolation is applied to H rather than EF. It can be derived from equations (6)-(9) that EF for the cold edge of SEBAL is equal to 1 when Ts = Tmin. Here, we derive the limiting edge of EF = 0 inherent in SEBAL. Writing the first two terms of the Taylor series of Ai at Ta results in the following:

display math(14)
display math(15)
display math(16)

where Ai(Ta) is the net energy for pixels in fc class i, in which Ts is replaced by Ta and Δ′ is the derivative part of the second term of the Taylor series of Ai(Ts).

[25] Let equation (9) be equal to zero. Combining equations (8) and (14)-(16) results in

display math(17)

where math formula is the temperatures of the driest surfaces where EF is equal to 0 for a full range of fc in SEBAL. Meteorological fields were generally uniform at the study site [Long and Singh, 2012a]. Variation in math formula with fc is, hence, caused primarily by variation in α in Ai (Ta) and the roughness length for momentum transfer, zom, in rah,i. We suggest that the functional relationship between α and fc for the limiting edge of EF = 0 can be approximated by the upper envelope of the scatterplot of fc and α [Long and Singh, 2012b], because the driest surface tends to have the largest α given an fc class. The functional relationship between zom and fc can be constructed using the method of Tasumi [2003]. Coefficients a and b are derived from the combination of end-members, which resulted in the highest EF accuracy shown in section 'EF Estimates at the Field Scale Using Different End-Members'.

5. Results

5.1. EF Estimates at the Field Scale Using Different End-Members

[26] Comparison of ground-based EF measurements and EF estimates from the three models using nine combinations of end-members is displayed in Table 4 and Figure 5. There are five key points to be made from this investigation.

Table 4. Discrepancies Between EF Estimates From Triangle, SEBAL, and METRIC Models and Corresponding Measurements Under Nine Combinations of Selected Extremes on DOY 174 and DOY 182a

                                         DOY 174                         DOY 182
Model     Cold Pixel  Hot Pixel  Case    R2    RMSD  MAPD (%)  Bias      R2    RMSD  MAPD (%)  Bias
Triangle  1           1          1       0.96  0.24  33.83     0.24      0.46  0.13  13.07     −0.09
Triangle  1           2          2       0.96  0.23  32.55     −0.23     0.46  0.12  11.58     −0.08
Triangle  1           3          3       0.96  0.22  30.50     −0.21     0.46  0.10  8.83      −0.05
Triangle  2           1          4       0.96  0.22  30.85     −0.22     0.46  0.12  12.39     −0.08
Triangle  2           2          5       0.96  0.21  29.56     −0.21     0.46  0.11  10.92     −0.07
Triangle  2           3          6       0.96  0.20  27.51     −0.19     0.46  0.09  8.26      −0.05
Triangle  3           1          7       0.96  0.22  30.40     −0.21     0.46  0.10  10.03     −0.03
Triangle  3           2          8       0.96  0.21  29.11     −0.20     0.46  0.10  9.46      −0.02
Triangle  3           3          9       0.96  0.19  27.07     0.19      0.46  0.09  9.17      0.00
SEBAL     1           1          1       0.91  0.20  27.68     0.19      0.43  0.08  7.39      0.01
SEBAL     1           2          2       0.91  0.19  25.96     −0.18     0.43  0.08  7.51      0.00
SEBAL     1           3          3       0.90  0.17  22.50     −0.16     0.43  0.08  8.00      0.02
SEBAL     2           1          4       0.91  0.17  23.07     −0.16     0.43  0.08  7.61      0.00
SEBAL     2           2          5       0.91  0.15  19.69     −0.14     0.43  0.08  7.72      0.01
SEBAL     2           3          6       0.91  0.15  19.69     −0.14     0.43  0.08  8.32      0.03
SEBAL     3           1          7       0.91  0.18  24.34     −0.17     0.44  0.09  10.35     0.04
SEBAL     3           2          8       0.91  0.17  22.64     −0.16     0.44  0.09  10.97     0.05
SEBAL     3           3          9       0.91  0.14  19.27     0.13      0.44  0.10  12.37     0.07
METRIC    1           1          1       0.90  0.13  17.88     0.13      0.43  0.08  8.25      0.03
METRIC    1           2          2       0.90  0.12  16.41     −0.12     0.43  0.08  8.79      0.04
METRIC    1           3          3       0.90  0.10  13.46     −0.10     0.43  0.09  10.20     0.05
METRIC    2           1          4       0.91  0.11  13.96     −0.10     0.43  0.08  8.55      0.03
METRIC    2           2          5       0.91  0.11  13.96     −0.10     0.43  0.08  9.13      0.04
METRIC    2           3          6       0.91  0.09  11.07     −0.08     0.43  0.09  10.70     0.06
METRIC    3           1          7       0.91  0.11  15.04     −0.11     0.44  0.10  12.52     0.07
METRIC    3           2          8       0.91  0.10  13.59     −0.10     0.44  0.11  13.36     0.08
METRIC    3           3          9       0.91  0.08  10.72     0.08      0.44  0.12  14.74     0.09

a Cold pixels 1–3 and hot pixels 1–3 are in increasing order of the magnitude of temperature (referring to Figure 5), respectively. Statistics R2, RMSD, and MAPD represent the coefficient of determination, root-mean-square difference, and mean absolute percentage difference (Table 2). The values in italics show extreme statistics from the nine cases for each model on each day.

Figure 5.

Comparison of flux tower EF measurements and EF estimates from the triangle model, SEBAL, and METRIC for nine combinations of selected extremes on DOY 174 (red symbols) and DOY 182 (blue symbols), respectively. Circles denote the combination of hot pixel 1 and the corresponding cold pixels (1–3) shown in each subplot. Diamonds denote the combination of hot pixel 2 and the corresponding cold pixels (1–3). Triangles denote the combination of hot pixel 3 and the corresponding cold pixels (1–3).

[27] First, differing combinations of end-members can result in markedly different magnitudes of the EF estimates at EC towers. The mean absolute percentage difference (MAPD) for the triangle model ranges between ∼27.1% and 33.8% on DOY 174 and between 9.2% and 13.1% on DOY 182. SEBAL and METRIC show generally smaller discrepancies, with MAPD for SEBAL ranging between 19.3% and 27.7% on DOY 174 and between 7.4% and 12.4% on DOY 182, and MAPD for METRIC ranging between 10.7% and 17.9% on DOY 174 and between 8.3% and 14.7% on DOY 182. For the three models, the highest and lowest accuracies of the EF estimates occur under two extreme cases, i.e., the greatest/smallest Ts for both end-members. For instance, the triangle model combined with hot pixel 1 and cold pixel 1, which have the lowest temperatures among the selected hot and cold extremes (referring to Table 3 and Figure 3), generated EF with the largest errors of all combinations of end-members. In contrast, the highest accuracy occurred under the combination of hot pixel 3 and cold pixel 3 for both days, which have the highest temperatures among the hot and cold extremes. SEBAL and METRIC show a similar relationship between retrieval accuracy and the combination of end-members.
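For readers who wish to reproduce statistics of the kind reported in Table 4, the sketch below computes bias, RMSD, and MAPD for paired estimates and measurements. It is an illustration only: the authoritative definitions are those in Table 2, and the MAPD form used here (mean absolute difference normalized by the mean measurement) is an assumption. The function names and the EF values are hypothetical.

```python
import math


def bias(pred, obs):
    """Mean difference between estimates and measurements."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmsd(pred, obs):
    """Root-mean-square difference."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mapd(pred, obs):
    """Mean absolute percentage difference (%).

    Assumption: mean absolute difference divided by the mean measurement.
    """
    mean_obs = sum(obs) / len(obs)
    mean_abs_diff = sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)
    return 100.0 * mean_abs_diff / mean_obs


# Hypothetical tower EF measurements and model estimates (not SMACEX data).
ef_obs = [0.60, 0.70, 0.80]
ef_pred = [0.50, 0.60, 0.70]

print(f"bias = {bias(ef_pred, ef_obs):.2f}")
print(f"RMSD = {rmsd(ef_pred, ef_obs):.2f}")
print(f"MAPD = {mapd(ef_pred, ef_obs):.1f}%")
```

A uniform underestimate, as in this toy case, yields a bias equal in magnitude to the RMSD, which is the pattern seen for the triangle model on DOY 174 in Table 4.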

[28] Second, given a cold extreme, the EF estimates from the three models increase with increasing temperature of the hot extreme and decrease with decreasing temperature of the hot extreme. Likewise, given a hot extreme, the EF estimates increase with increasing temperature of the cold extreme and decrease with decreasing temperature of the cold extreme. This means that the two end-members have a similar function in determining the magnitudes of EF estimates. They both play a key role in determining the overall discrepancy (bias, RMSD, and MAPD) between predictions and measurements. It is important to note that if the two extremes move in opposite directions, the EF estimates for pixels with moderate Ts values may remain relatively invariant; however, the EF estimates for pixels with Ts values close to the two extremes change more prominently. So while end-members involve uncertainties, there is still a possibility that EF estimates from spatial variability models show agreement with ground-based measurements at a handful of flux towers with moderate Ts values [e.g., Choi et al., 2009; Timmermans et al., 2007].
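The behavior described in this paragraph can be illustrated with a simplified triangle-type interpolation. The sketch below is a hedged illustration, not the models as implemented in this study: it assumes EF varies linearly in Ts between a warm edge (EF = 0 at Ts = Tmax) and a cold edge (EF = ef_cold at Ts = Tmin), with ef_cold = 0.99 taken from the DOY 174 cold-edge value. The function name and all temperatures are hypothetical.

```python
def triangle_ef(ts, t_min, t_max, ef_cold=0.99):
    """EF from linear interpolation in Ts between the cold and warm edges.

    Assumption: horizontal limiting edges at t_min and t_max (degC);
    values outside the edges are clipped to [0, ef_cold].
    """
    frac = (t_max - ts) / (t_max - t_min)
    frac = min(1.0, max(0.0, frac))
    return ef_cold * frac


# Hypothetical end-members and pixel temperatures (degC).
base = triangle_ef(35.0, t_min=27.0, t_max=43.0)

# Raising either end-member alone increases EF for the same pixel.
hot_up = triangle_ef(35.0, t_min=27.0, t_max=45.0)
cold_up = triangle_ef(35.0, t_min=29.0, t_max=43.0)

# Moving the extremes in opposite directions leaves a mid-range pixel
# unchanged but alters a pixel close to the warm edge.
mid_opposite = triangle_ef(35.0, t_min=25.0, t_max=45.0)
near_hot_base = triangle_ef(41.0, t_min=27.0, t_max=43.0)
near_hot_opposite = triangle_ef(41.0, t_min=25.0, t_max=45.0)

print(base, hot_up, cold_up)
print(mid_opposite, near_hot_base, near_hot_opposite)
```

In this toy setup the mid-range pixel keeps the same EF when Tmax rises and Tmin falls by 2°C each, whereas the pixel near the warm edge changes from about 0.12 to about 0.20.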

[29] Third, in general, the triangle model underestimates EF, showing negative biases for all combinations of extremes for both days (Table 4). The underestimation may be related to both selected end-members being lower than the realistic ones for the triangle model (corresponding to the combination of warm edge 5 and cold edge 3 in Figure 1). As such, as the selected end-members move incrementally upward (i.e., increasing Tmax and Tmin), errors in the EF estimates from the triangle model can be reduced to varying degrees. The same explanation applies to the underestimation of EF from SEBAL and METRIC for DOY 174, with errors generally decreasing as the two end-members increase. For DOY 182, SEBAL and METRIC overestimate EF by a magnitude comparable to that by which the triangle model underestimates it. The combination of cold pixel 1 and hot pixel 1 seemed to result in the highest accuracy for both SEBAL and METRIC, with errors generally increasing as the two end-members move progressively upward.

[30] Fourth, R2 is essentially of the same order for the different models (R2 ≥ 0.9 and ∼0.4 for DOY 174 and 182, respectively). Differences in R2 between days can be ascribed to varying degrees of correlation between Ts and EF measurements. Linear regression analysis of Ts and ground-based EF measurements shows R2 of 0.96 and 0.46 for DOY 174 and DOY 182, respectively (Figure 6). These values are the same as R2 for the triangle model and slightly higher than those for SEBAL and METRIC (R2 of ∼0.90 and ∼0.44 for DOY 174 and 182, respectively; referring to Table 4). These findings demonstrate that the spatial variability models tested here do not differ substantially in R2, which characterizes the ability of predictions to explain variability in measurements. Utility of these models appears to depend largely on the capability of Ts to capture EF. Model mechanisms and end-members play a negligible role in determining R2 over relatively homogeneous agricultural fields. The primary differences in these spatial variability models lie in the definition of limiting edges of EF, which largely determines the magnitude and frequency distributions of the EF estimates to be shown in section 'EF Estimates at the Watershed Scale Using Different End-Members'.

Figure 6.

Regression analysis of ground-based EF measurements and corresponding remote sensing-based Ts retrievals for DOY 174 and DOY 182.

[31] Fifth, R2 remains fairly invariant under varying end-members for the same model, which means that varying end-members does not affect the R2 of a model but controls the magnitudes and frequencies of EF estimates. Consistent overestimation or underestimation of EF by a spatial variability model can take place due to an inappropriate selection or specification of end-members or an unrealistic definition of its limiting edges. If ground-based measurements are available, the discrepancies between estimates and measurements can be alleviated by tuning the variables or parameters of end-members. METRIC uses ground-based reference ET and/or a priori knowledge about the crop coefficient to solve the energy balance equation for the cold extreme, and attempts to infer the possible LE for the hot extreme using a soil water balance model. In effect, spatial variability models entail a calibration procedure to constrain the EF magnitudes within a reasonable range, which is intrinsic in the spatial variability models but has not been discussed in the literature to date.
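Why R2 is insensitive to the end-members can be seen from a simplified case. The sketch below assumes an EF estimate that is exactly linear in Ts (as in a triangle-type interpolation with horizontal limiting edges, without clipping); SEBAL and METRIC are not exactly linear in Ts, which is consistent with their slightly different R2. All function names and data are hypothetical and are not SMACEX measurements.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def linear_ef(ts, t_min, t_max, ef_cold=0.99):
    """EF linear in Ts between hypothetical cold and warm edges (no clipping)."""
    return ef_cold * (t_max - ts) / (t_max - t_min)


# Hypothetical tower data: surface temperature (degC) and measured EF.
ts_obs = [29.0, 31.0, 33.0, 35.0, 37.0]
ef_obs = [0.85, 0.80, 0.66, 0.60, 0.47]

r2_ts = r_squared(ts_obs, ef_obs)  # R2 between Ts and measured EF

results = []
for t_min, t_max in [(27.0, 43.0), (28.0, 44.0), (26.0, 45.0)]:
    est = [linear_ef(t, t_min, t_max) for t in ts_obs]
    mean_bias = sum(e - o for e, o in zip(est, ef_obs)) / len(ef_obs)
    results.append((t_min, t_max, r_squared(est, ef_obs), mean_bias))

for t_min, t_max, r2, b in results:
    print(f"Tmin={t_min}, Tmax={t_max}: R2={r2:.4f} (Ts vs EF: {r2_ts:.4f}), bias={b:+.3f}")
```

Because each estimate is a linear transformation of Ts, its R2 against the measurements equals that of Ts itself for every pair of end-members, while the bias shifts with the end-members.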

5.2. EF Estimates at the Watershed Scale Using Different End-Members

[32] Frequency distributions of EF estimates from the triangle model and SEBAL across the study domain for DOY 174 are shown in Figures 7 and 8 (those for METRIC for DOY 174 and all models for DOY 182 are provided in the Supporting Information). Statistical metrics of spatial mean, standard deviation, and skewness (Table 2) for these EF frequency distributions for both days are presented in Table 5. Four key points emerge from these results.

Figure 7.

Frequency distributions of EF estimates from the triangle model on DOY 174 for nine combinations of extremes corresponding to cases 1–9 in Table 3, showing statistics of spatial mean, standard deviation, and skewness for each case.

Figure 8.

Frequency distributions of EF estimates from the SEBAL model on DOY 174 for nine combinations of extremes corresponding to cases 1–9 in Table 3, showing statistics of spatial mean, standard deviation, and skewness for each case.

Table 5. Statistics of Frequency Distributions of EF Estimates From Triangle, SEBAL, and METRIC Models for Nine Combinations of Selected Extremes on DOY 174 and DOY 182a

                                   DOY 174               DOY 182
Model     Cold Pixel  Hot Pixel    μ     σ     γ         μ     σ     γ
Triangle  1           1            0.48  0.17  0.01      0.59  0.16  0.17
Triangle  1           2            0.49  0.17  0.01      0.61  0.16  −0.17
Triangle  1           3            0.50  0.17  0.01      0.64  0.14  −0.17
Triangle  2           1            0.50  0.18  0.01      0.60  0.16  −0.17
Triangle  2           2            0.51  0.18  0.01      0.62  0.16  −0.17
Triangle  2           3            0.52  0.17  0.01      0.65  0.14  −0.17
Triangle  3           1            0.50  0.18  0.01      0.65  0.18  −0.17
Triangle  3           2            0.51  0.18  0.01      0.67  0.17  −0.17
Triangle  3           3            0.53  0.17  0.01      0.69  0.15  0.17
SEBAL     1           1            0.54  0.18  0.19      0.67  0.16  0.43
SEBAL     1           2            0.55  0.18  −0.19     0.69  0.16  −0.43
SEBAL     1           3            0.58  0.17  −0.19     0.71  0.14  −0.43
SEBAL     2           1            0.56  0.19  −0.19     0.67  0.17  −0.43
SEBAL     2           2            0.57  0.19  −0.19     0.69  0.16  −0.43
SEBAL     2           3            0.60  0.17  −0.19     0.72  0.14  −0.43
SEBAL     3           1            0.56  0.19  −0.20     0.72  0.17  −0.44
SEBAL     3           2            0.58  0.19  −0.19     0.73  0.16  −0.44
SEBAL     3           3            0.60  0.18  0.19      0.75  0.15  0.43
METRIC    1           1            0.60  0.15  0.18      0.71  0.14  0.43
METRIC    1           2            0.61  0.15  −0.18     0.72  0.13  −0.43
METRIC    1           3            0.62  0.14  −0.18     0.74  0.12  −0.42
METRIC    2           1            0.61  0.16  −0.19     0.71  0.14  −0.43
METRIC    2           2            0.62  0.15  −0.18     0.73  0.13  −0.43
METRIC    2           3            0.64  0.15  −0.18     0.75  0.12  −0.42
METRIC    3           1            0.62  0.16  −0.18     0.74  0.15  −0.43
METRIC    3           2            0.63  0.15  −0.19     0.76  0.14  −0.43
METRIC    3           3            0.65  0.15  −0.19     0.78  0.13  −0.43

a Cold extremes 1–3 and hot extremes 1–3 are in increasing order of the magnitude of temperature (referring to Figure 3), respectively. Greek letters μ, σ, and γ denote the mean, standard deviation, and skewness of the EF estimates. The values in italics show extreme statistics of each model for the nine cases on each day.

[33] First, the three spatial variability models generated larger EF estimates when the hot or cold extreme moved upward (increasing Tmax or Tmin). This characteristic can be demonstrated by cases 1–3, 4–6, and 7–9 with increasing Tmax and a fixed Tmin for both days, showing a progressively increasing spatial mean for cases 1–3, cases 4–6, and cases 7–9, respectively. This finding is consistent with the performance of the three spatial variability models at EC tower scales discussed in section 'EF Estimates at the Field Scale Using Different End-Members'. Differing end-members for the same model can result in different spatial means of the EF estimates.

[34] Second, varying end-members for these models does not substantially influence the standard deviation and skewness of the EF estimates. The shape of the frequency distribution for the same model remains essentially invariant under changing end-members; only the positions of these frequency distributions relative to the origin differ. This is because varying the end-members of a model does not alter its model physics and, consequently, cannot affect the pattern of output. A reasonable magnitude of the spatial mean of EF from a model can therefore be derived by properly tuning the magnitude of Tmax and/or Tmin if the detailed and accurate spatial distribution of EF or LE is not the primary concern. This finding is relevant to coarse-spatial-resolution images from operational satellites (e.g., Terra/Aqua-MODIS and GOES), which are often used to generate large-scale EF or ETa rather than to capture EF or ETa at field scales: tuning end-members derived from such images makes it possible to generate reasonable magnitudes of regional EF or ETa estimates.
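The invariance of distribution shape can be illustrated with the same simplified assumption of an EF estimate that is linear in Ts. The sketch below is not the procedure used in this study: the skewness is computed as the moment coefficient (the definition in Table 2 may differ), and the function names, scene temperatures, and end-members are hypothetical.

```python
import math


def moments(values):
    """Return (mean, population standard deviation, moment coefficient of skewness)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    skew = sum((v - mean) ** 3 for v in values) / n / std ** 3
    return mean, std, skew


def linear_ef(ts, t_min, t_max, ef_cold=0.99):
    """EF linear in Ts between hypothetical cold and warm edges (no clipping)."""
    return ef_cold * (t_max - ts) / (t_max - t_min)


# Hypothetical scene of surface temperatures (degC), skewed toward hot pixels.
ts_scene = [28.0, 29.0, 30.0, 30.5, 31.0, 32.0, 34.0, 37.0, 41.0]

stats = []
for t_min, t_max in [(27.0, 43.0), (27.0, 44.0), (28.0, 44.0)]:
    ef = [linear_ef(t, t_min, t_max) for t in ts_scene]
    stats.append(moments(ef))

for (t_min, t_max), (m, s, g) in zip([(27.0, 43.0), (27.0, 44.0), (28.0, 44.0)], stats):
    print(f"Tmin={t_min}, Tmax={t_max}: mean={m:.3f}, std={s:.3f}, skew={g:.3f}")
```

In this linear case the skewness is identical for all end-member pairs (it is the negative of the Ts skewness), the standard deviation changes only through the ratio of the Tmax − Tmin ranges, and the mean increases as either end-member is raised.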

[35] Third, different models show markedly different shapes of the frequency distributions of EF estimates (Figures 7 and 8 and Figures S1–S4). SEBAL and METRIC generated a similar frequency distribution of EF estimates for nine cases on both days, with the skewness ∼0.185 on DOY 174, and ∼0.435 on DOY 182. The EF frequency distributions from the triangle model are different, showing the skewness of 0.01 and −0.17 on DOY 174 and DOY 182, respectively. Differences in the skewness between the triangle and SEBAL-type models are due to different model physics and definitions of limiting edges of EF within the fc-Ts space.

[36] Fourth, the EF frequency distributions from the three models generally exhibit a bimodal separation of flux patterns from corn and soybean fields on DOY 174 (Figures 7 and 8, and Figure S1) and DOY 182 (Figures S2–S4), in which DOY 182 showed a more pronounced bimodal separation. This is because under rapid growth in corn and soybean, differences in heat and water fluxes between the two crops would be more marked on the latter day. Utility of these models to capture spatial differences in water and heat fluxes can be ascribed mostly to the ability of thermal infrared remote sensor data to respond to the reality of the specific time-of-day energy balance partitioning.

5.3. Explicit Expression of Limiting Edges of EF for Triangle and SEBAL-Type Models

[37] Coefficients a and b in equation (17) were derived from the combinations of extremes, which resulted in the highest EF accuracy shown in section 'EF Estimates at the Field Scale Using Different End-Members' (referring to Table 4), i.e., case 9 for DOY 174 and case 1 for DOY 182. Numerical solutions of math formula in equation (17) are shown in Figure 9.

Figure 9.

Temperatures of limiting edges of EF = 1 (or ≈1) and EF = 0 intrinsic in the triangle and SEBAL models for a full range of fc on DOY 174 and DOY 182, respectively. Letters A and B represent hot extremes on the 2 days.

[38] As illustrated in section 'Theoretical Expression of Limiting Edges of EF for Triangle and SEBAL-Type Models', temperatures of EF = 0 and EF ≈ 1 (ϕmax·Δ/(Δ+γ) = 0.99 at Ta = 29.6°C for DOY 174 and 0.988 at Ta = 29.4°C for DOY 182) for the triangle model form two horizontal limiting edges, i.e., Tmax and Tmin, throughout the fc-Ts space (Figure 9). The temperature of surfaces where EF = 1 for SEBAL is Tmin. It is interesting to note that the temperature of surfaces where EF = 0 for SEBAL is curvilinear and increases with increasing fc, which seems to contradict a realistic warm edge that decreases with increasing fc [Moran et al., 1994]. The increasing limiting edge of EF = 0 intrinsic in SEBAL arises primarily because Ai(Ta) tends to increase with increasing fc, and constants a and b do not account for the effect of variation in fc on the limiting edge of EF = 0.

6. Discussion

[39] There are three key issues regarding the applicability of spatial variability models: (1) how the end-members determine the magnitudes and distributions of EF or LE estimates; (2) whether the end-members can be appropriately selected or determined by the operator or by other automatic methods; and (3) whether the limiting edges of EF within the fc-Ts space can represent the reality for a study site of interest.

6.1. How the End-Members Determine the Magnitudes and Distributions of EF Estimates

[40] To quantify the impact of end-members on EF or LE estimates from spatial variability models, some studies have performed sensitivity analyses and evaluated their accuracy with a relatively dense network of EC or EBBR towers under a certain climate and environmental condition. Note that taking only one pair of end-members from a single scene may not give a full understanding of how variation in end-members impacts the magnitude, distribution, and accuracy of model estimates. Timmermans et al. [2007] performed a sensitivity analysis of SEBAL by tuning Tmax and Tmin by ±2 K, and found variations in the H estimates on the order of 20–25%. Long et al. [2011] assessed the sensitivity of SEBAL by using 29 MODIS scenes acquired across varying meteorological conditions and diverse landscapes, indicating that variations of ±2 K in Tmax (Tmin) can result in variations in the H estimates on the order of 10–15%, and variations of ±5 K in Tmax (Tmin) can result in variations in the H estimates on the order of 25–50%. Both hot and cold pixels in SEBAL have a similar effect on the magnitudes of variation in H and therefore LE estimates [Long et al., 2011; Timmermans et al., 2007]. Results of this study at both field (tower) and watershed scales demonstrate that varying magnitudes of Tmax or Tmin can result in a shift in the mean, but cannot greatly alter the shape of the EF frequency distribution, as evidenced by generally similar standard deviation and skewness. The frequency distribution of EF is closely related to that of Ts, which is demonstrated across all EC towers in section 'EF Estimates at the Field Scale Using Different End-Members' and by some other studies [e.g., Choi et al., 2009; Gao and Long, 2008]. It can therefore be generalized that for all spatial variability models, elevating Tmax or Tmin can lead to increases in the resulting EF or LE estimates. Conversely, lowering Tmax or Tmin can result in decreases in EF or LE estimates.
This finding would be useful for calibration of the spatial variability models because systematic biases of some spatial variability models could be removed or reduced by modifying the magnitudes of Tmax or Tmin, or other associated characteristic variables, e.g., Ahot, which is analogous to elevating or lowering Tmax or Tmin. It is important to note that tuning the characteristic variables of end-members, or the end-member selection itself, could reconcile the discrepancies between estimates and measurements for pixels with moderate Ts values in some cases; distortion of the flux estimates could, however, occur over relatively dry or wet surfaces. For instance, increasing Tmax while decreasing Tmin may not result in appreciable differences in EF estimates for surfaces with moderate Ts values because the two effects offset each other. However, erroneous EF estimates for pixels close to the upper and lower limiting edges would occur.

6.2. Whether the End-Members Can Be Appropriately Selected or Determined

[41] Selection of end-members is another critical issue that has been controversial in the remote sensing-based ET estimation community. The model developers have acknowledged difficulties and subjectivity in selecting end-members [e.g., Allen et al., 2007; Bastiaanssen et al., 2010], and other researchers have confirmed this through a number of theoretical and experimental studies [e.g., Choi et al., 2009; French et al., 2005a, 2005b; Long and Singh, 2012b; McVicar and Jupp, 1998; Timmermans et al., 2007]. This is because different extents of study sites and spatial resolutions of satellite images could have variable contrast in soil moisture and fc, which may fundamentally determine the magnitudes of Tmax and Tmin and therefore result in differing EF and LE retrievals. For instance, selection of the hot extreme from a humid agricultural field (e.g., the SMACEX site) could be problematic, and the cold extreme may not exist over arid and semiarid regions. Moreover, seasonal variation in vegetation and crops could also result in different contrasts in soil moisture and fc. Satellite images acquired in late autumn, winter, and early spring could exhibit more homogeneous wetness and surface cover conditions, which may result in underperformance of spatial variability models. Therefore, determination of end-members and applicability of spatial variability models may also depend on the season. Furthermore, evaluation of SEBAL and METRIC herein has shown that during rapid crop growth in rainfed fields coinciding with marked variation in soil moisture, locations of end-members may vary from day to day. This may not hold true over irrigated areas where the cold extreme may be fixed to certain surfaces if contamination by clouds can be satisfactorily removed from a scene [Allen et al., 2007; Long et al., 2011]. In practice, satellite images of varying extents and spatial resolutions would be available, and it is not uncommon to obtain images contaminated by clouds.
These issues exacerbate difficulties and uncertainties in selecting end-members [Marx et al., 2008; Verstraeten et al., 2005]. It is noted that METRIC tends to do better at the cold extreme by using ground-based reference ET to resolve the energy balance for the cold end. However, the determination of actual ET for the hot extreme appears to be more difficult and sometimes needs consultation with the model developers [e.g., Choi et al., 2009].

[42] Limiting edges for triangle models can also be derived by regression analysis of fc-Ts or α-Ts scatterplots [e.g., Jiang and Islam, 2001; Verstraeten et al., 2005]. To apply the triangle model to arid areas where the lower limiting edge of EF rarely exists in a scene, the upper limiting edge was extended to intersect with fc = 1 to infer a constant lower limiting edge [Tang et al., 2010]. However, this method still overestimates the lower limiting edge, as evidenced by pixels with relatively high fc and lower Ts than the inferred lower limiting edge [see Tang et al., 2010, Figure 7]. Derivation of limiting edges remains far from satisfactory, and determination of end-members for spatial variability models has not been fully resolved.

[43] On the other hand, TSEB and SEBS tend to be context independent, not being affected by uncertainties in the selection of end-members [e.g., French et al., 2003; Kalma et al., 2008]. This is achieved by (1) a more realistic description of the two-source scheme involved in TSEB [Kustas and Anderson, 2009; Kustas et al., 2007] or (2) increasing the input effort of meteorological variables (e.g., Ta and ea) to derive wet and dry limits for each pixel for SEBS [Su, 2002]. Some approaches do not necessitate end-member selection, either, e.g., the normalized difference temperature index (NDTI) [McVicar and Jupp, 1998, 1999, 2002], the two-source trapezoid model for ET (TTME) [Long and Singh, 2012a], and a modified TTME model [Yang and Shang, 2013]. Critical in TSEB and SEBS is the use of the Priestley-Taylor equation to determine the wet limit of transpiration from vegetation canopy for TSEB and the use of relatively coarse meteorological fields to determine the wet and dry limits of LE based on the Penman-Monteith equation for each pixel for SEBS. These issues combined with associated procedures were investigated [e.g., Agam et al., 2010] and warrant further study. To fully understand longer-term evaporative dynamics, including the recently widely reported declines in observed atmospheric evaporative ability made by measuring pan evaporation [McVicar et al., 2012], fully physically based estimates of potential ET are advocated.

6.3. Whether the Limiting Edges of EF Within the fc-Ts Space Can Represent the Reality for a Study Site of Interest

[44] Notable differences in the frequency distributions of EF estimates between triangle and SEBAL-type models were observed. Distinct patterns between the one-source models and the two-source models have also been found [e.g., Choi et al., 2009; French et al., 2005a; Gao and Long, 2008; McCabe and Wood, 2006; Timmermans et al., 2007], though they were able to generate comparable estimates of LE or ETa at field scales. We conclude that the end-members are one of the primary reasons responsible for the different frequency distributions. Analyses in sections 'Theoretical Expression of Limiting Edges of EF for Triangle and SEBAL-Type Models' and 'Explicit Expression of Limiting Edges of EF for Triangle and SEBAL-Type Models' demonstrate that the upper limiting edges of EF between the triangle model and the SEBAL-type model are different, with the limiting edges of the triangle model being explicitly shown and remaining constant across the fc-Ts space, and the upper limiting edge of the SEBAL-type models being implicitly shown and increasing as fc increases. These limiting edges can only be taken as a rough approximation of reality. Different wet and dry limits for each fc class for different models can result in different magnitudes of EF estimates and consequently the varying frequency distributions over the modeling domain.

[45] Timmermans et al. [2007] indicated that tuning Tmax or Tmin for a specific land cover could minimize the discrepancies between H estimates and ground-based measurements, which, however, was compromised by degrading H estimates for other land cover types. This finding can be further explained by the study herein. For instance, tuning Tmax so as to bring the upper limiting edge of the SEBAL-type models for low fc surfaces closer to the theoretical limiting edges depicted by a trapezoid space [Long and Singh, 2012a; Moran et al., 1994] (referring to trapezoid ABCD in Figure 1) is likely to increase the discrepancies over high fc surfaces. This may also explain why tuning the ETrF for the bare surface in METRIC can lead to improvements in the LE estimates for one crop but degrade the estimates for another [Choi et al., 2009].

[46] The critical studies mentioned earlier suggest that determination of end-members of spatial variability models tends to be far from satisfactory, depending largely on subjectivity, spatial extent, and/or resolution of satellite images. As output of these spatial variability models is not deterministic in most cases, utility and robustness of these models are severely impaired. Efforts have been made to more realistically depict the limiting edges of spatial variability models. Based on field experiments and theoretical analysis, Moran et al. [1994] developed a trapezoid framework for the vegetation index-Ts space to configure the limiting edges of crop water deficit, with the hot edge decreasing with increasing vegetation index. McVicar and Jupp [1998] developed the NDTI approach, which calculates extreme temperatures at meteorological stations by inverting a specific-time-of-day resistance energy balance model (REBM). We suggest that derivation of end-members based on the framework of Moran et al. [1994] would be an effective way to amend the limiting edges of spatial variability models, and therefore provide more reliable spatial patterns of EF or LE estimates.

7. Conclusion

[47] We examined the impact of end-member selection on the performance and mechanisms of error propagation of three satellite-based spatial variability models for ETa estimation, i.e., the triangle model, SEBAL, and METRIC. Varying end-members can result in markedly different magnitudes of EF estimates at both field and watershed scales. The hot and cold extremes exercise a similar impact on the discrepancy between EF estimates and ground-based measurements, i.e., given a hot (cold) extreme, the EF estimates tend to increase with increasing cold (hot) extreme, and decrease with decreasing cold (hot) extreme. Predictability of all spatial variability models depends primarily on the capability of Ts to capture EF and definitions of limiting edges of EF within the remotely sensed fc-Ts space. In most cases, the end-members cannot be appropriately determined from the fc-Ts space, because (1) they do not necessarily exist within a scene, varying with the spatial extent and resolution of satellite images being used; and/or (2) different operators can select different end-members. In addition, the limiting edge of EF = 0 in the fc-Ts space varies with the model, with SEBAL-type models showing an increasing curvilinear edge, which contradicts a decreasing edge shown in a trapezoidal framework [Moran et al., 1994]. Varying end-members cannot substantially affect the standard deviation and skewness of the EF frequency distribution. However, different models can generate remarkably different EF frequency distributions due to differing limiting edges. As such, the spatial variability models require careful calibration to infer reasonable EF limits and then LE and ETa estimates. In water resources management, these spatial variability models should be used with great caution, because ETa estimates can be significantly different due to slight differences in selected end-members.

Acknowledgments

[48] We greatly thank the National Snow and Ice Data Center for providing the SMACEX data sets to perform this study. The authors are grateful to the Associate Editor and three reviewers who provided thorough and constructive reviews. The manuscript has improved as a result. This work was financially supported by the United States Geological Survey (USGS, project 2009TX334G).

Ancillary