Evaluation of WaPOR V2 evapotranspiration products across Africa

The Food and Agriculture Organization of the United Nations (FAO) portal to monitor water productivity through open-access of remotely sensed derived data (WaPOR) offers continuous actual evapotranspiration and interception (ETIa-WPR) data on a 10-day basis across Africa and the Middle East from 2009 onwards at three spatial resolutions. The continental level (250 m) covers Africa and the Middle East (L1). The national level (100 m) covers 21 countries and 4 river basins (L2). The third level (30 m) covers eight irrigation areas (L3). To quantify the uncertainty of WaPOR version 2 (V2.0) ETIa-WPR in Africa, we used several validation methods. We checked the physical consistency against water availability and the long-term water balance and then verified the continental spatial and temporal trends for the major climates in Africa. We directly validated ETIa-WPR against in situ data from 14 eddy covariance (EC) stations. Finally, we checked the level consistency between the different spatial resolutions. Our findings indicate that ETIa-WPR performs well, but with some noticeable overestimation. ETIa-WPR shows the expected spatial and temporal consistency with respect to climate classes. At point scale, ETIa-WPR shows mixed results when compared to EC flux towers, with an overall correlation of 0.71 and a root mean square error of 1.2 mm/day. The level consistency is very high between L1 and L2. However, the consistency between L1 and L3 varies significantly between irrigation areas. In rainfed areas, ETIa-WPR overestimates at low ETIa-WPR and underestimates when ETIa is high. In irrigated areas, ETIa-WPR values appear to consistently overestimate ETa. The relative soil moisture content (SMC), the input of quality layers and local advection effects were some of the identified causes. The quality assessment of the ETIa-WPR product is enhanced by combining multiple evaluation methods. Based on the results, the ETIa-WPR dataset is of sufficient quality to contribute to the understanding and monitoring of local and continental water processes and water management.


KEYWORDS
accuracy assessment, actual evapotranspiration, consistency, continental Africa, direct validation, Penman-Monteith, remote sensing, water resources management

1 | INTRODUCTION
Actual evapotranspiration (ETa) is the second largest process in the terrestrial water budget after precipitation (PCP). ETa is also an essential component of plant growth and, therefore, the carbon cycle. Available water resources are becoming, or already are, scarce in many basins worldwide (Degefu et al., 2018). The acceleration of the water cycle from a climate change perspective will further influence water availability not only for human consumption but also for our food sources (Rockström, Falkenmark, Lannerstad, & Karlberg, 2012). Accurate estimates of ETa are therefore required for several management tasks, including, but not limited to, water accounting, water footprint, basin-wide water balances, irrigation, crop management and monitoring of climate change and its impact on crop production. These activities require ETa at varying extents and spatio-temporal resolutions.
Validation of these remote sensing products is an essential step in understanding their applicability and characterizing their uncertainty. This uncertainty can indicate whether the ETa product is suitable as input into different water management activities, along with the associated risk when making a decision based on the product. Many studies exist that attempt to validate large remote sensing-based ETa datasets. Most studies are focused on one or two validation methods at one scale. The most common validation methods are either point or pixel scale against ground-truth data, like eddy covariance (EC) measurements (e.g. Mu et al., 2011), or spatial intercomparison of a product over regions, land classes or biomes (e.g. Mueller et al., 2011). Some authors validate multiple products against each other for spatial and temporal patterns and against ground-truth data (e.g. Hu, Jia, & Menenti, 2015; Nouri et al., 2016). Liu et al. (2016) evaluated basin-scale ETa estimates against the water balance method. Velpuri, Senay, Singh, Bohms, and Verdin (2013) compared MOD16 (1 km) at point scale to EC and at basin scale to the water balance. With few exceptions, for example Velpuri et al. (2013), these validation efforts did not evaluate the product at multiple scales, from pixel to basin or region.
Best-practice validation strategies for big remote sensing datasets have been proposed by Zeng et al. (2015, 2019). They recommend multi-stage validation activities that include combinations of direct validation, physical validation and cross-comparisons. In practice, many developers of remote sensing products include all or at least a combination of these activities during their validation. To name a few, these include the MODIS MODLAND product (Morisette, Privette, & Justice, 2002; Morisette, Privette, Justice, & Running, 1998); Copernicus Global Land Service products Dry Matter Productivity (Swinnen, Van Hoolst, & Toté, 2015); and ASTER land surface temperature (Schneider, Ghent, Prata, Corlett, & Remedios, 2012).
In regions such as Africa, where little observational data is available, validation should utilize all available avenues for ascertaining product quality, with a multi-step, multi-phase validation strategy that includes direct validation (with ground measurements), physical consistency checks and cross-comparisons. By using validation best practice and combining multiple validation techniques, the limitations due to the sparseness of available data are reduced, and the product quality is understood from a multi-scale perspective.
The latest database of continental products for Africa and the Middle East, released in 2019, is now available on the FAO portal to monitor water productivity through open-access of remotely sensed derived data (WaPOR; https://wapor.apps.fao.org/home/WAPOR_2/2). It provides the highest available spatial resolution for an operational open-access ETa and interception (ETIa-WPR) product at the continental scale. This article presents a multi-scale validation of the version 2 (V2.0) ETIa-WPR. The results from each validation procedure were analysed individually and then as a whole to determine trends and draw conclusions about the product quality.
2 | DATA AND METHODS

| The dataset
The analysis dataset is the ETIa-WPR V2.0 product available on the WaPOR portal (https://wapor.apps.fao.org/home/WAPOR_2/1). The ETIa-WPR is based on a modified version of the ETLook model (ETLook-WaPOR) described in Bastiaanssen, Cheema, Immerzeel, Miltenburg, and Pelgrum (2012). The ETLook-WaPOR model uses the Penman-Monteith equation, adapted to remote sensing input data, to estimate ETa (FAO, 2020a). The Penman-Monteith approach combines the energy balance equation and the aerodynamic equation and is described in FAO Irrigation and Drainage Paper 56 (FAO-56; Allen, Pereira, Raes, & Smith, 1998). The ETIa-WPR defines soil evaporation and transpiration separately using Equations (1) and (2), where E and T (mm/day) are the evaporation and transpiration, respectively, and λ is the latent heat of vaporization. R_n (MJ m^-2 day^-1) is the net radiation of the soil (R_n,soil) and canopy (R_n,canopy), and G (MJ m^-2 day^-1) is the ground heat flux. ρ_air (kg/m^3) is the density of air, C_P (MJ kg^-1 °C^-1) is the specific heat of air, (e_sat − e_a) (kPa) is the vapour pressure deficit (VPD), r_a (s/m) is the aerodynamic resistance, and r_s (s/m) is the soil resistance or the canopy resistance when the Penman-Monteith model is used to estimate evaporation or transpiration, respectively. δ = d(e_sat)/dT (kPa/°C) is the slope of the curve relating saturated water vapour pressure to air temperature, and γ is the psychrometric constant (kPa/°C). This approach partitions the ETIa-WPR into evaporation and transpiration using the modified versions of Penman-Monteith, which differentiate the net available radiation and resistance formulas based on the vegetation cover according to the ETLook model (Bastiaanssen et al., 2012). A major difference between ETLook-WaPOR and ETLook is the source of remote sensing data for the soil moisture. In the original ETLook, soil moisture is derived from passive microwave, whereas in the WaPOR approach soil moisture is derived from land surface temperature (LST).
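As an illustration of the Penman-Monteith form described above, the following Python sketch evaluates the combination equation for a single surface. It is not the ETLook-WaPOR implementation: the constants (latent heat, air density, specific heat, psychrometric constant), the FAO-56 saturation vapour pressure formula and the input values are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Assumed constants for illustration (typical near-sea-level values).
SECONDS_PER_DAY = 86400.0
LATENT_HEAT = 2.45      # lambda, MJ/kg
RHO_AIR = 1.2           # air density, kg/m^3
C_P = 1.013e-3          # specific heat of air, MJ/kg/degC
GAMMA = 0.067           # psychrometric constant, kPa/degC


def sat_vapour_pressure(t_air):
    """Saturated vapour pressure e_sat (kPa) at t_air (degC), FAO-56 form."""
    return 0.6108 * math.exp(17.27 * t_air / (t_air + 237.3))


def slope_svp(t_air):
    """Slope d(e_sat)/dT (kPa/degC) of the saturation curve at t_air."""
    return 4098.0 * sat_vapour_pressure(t_air) / (t_air + 237.3) ** 2


def penman_monteith(rn, g, t_air, e_a, r_a, r_s):
    """Water flux (mm/day) from the Penman-Monteith combination equation.

    rn, g : net radiation and ground heat flux (MJ m^-2 day^-1)
    e_a   : actual vapour pressure (kPa)
    r_a   : aerodynamic resistance (s/m)
    r_s   : soil or canopy resistance (s/m)
    """
    delta = slope_svp(t_air)
    vpd = sat_vapour_pressure(t_air) - e_a
    radiative = delta * (rn - g)
    # Convert the per-second aerodynamic term to a per-day energy flux.
    aerodynamic = SECONDS_PER_DAY * RHO_AIR * C_P * vpd / r_a
    latent_heat_flux = (radiative + aerodynamic) / (delta + GAMMA * (1.0 + r_s / r_a))
    return latent_heat_flux / LATENT_HEAT


# Hypothetical inputs: one call per component, each with its own radiation and resistance.
evaporation = penman_monteith(rn=4.0, g=0.5, t_air=25.0, e_a=2.0, r_a=50.0, r_s=300.0)
transpiration = penman_monteith(rn=8.0, g=0.0, t_air=25.0, e_a=2.0, r_a=50.0, r_s=100.0)
```

Calling the function once with soil inputs and once with canopy inputs mirrors the separate evaporation and transpiration estimates; how the ground heat flux is shared between the two terms is left to the caller in this sketch.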
The WaPOR database provides ETIa-WPR in three spatial resolutions dependent on the location and extent. The products available specifically for Africa are shown in Table 1 and are available online on the WaPOR portal (https://wapor.apps.fao.org/home/WAPOR_2/1).
Interception (I) is the process where the leaves intercept rainfall. Intercepted rainfall evaporates directly from the leaves and requires energy that is not available for transpiration. Interception (mm/day) is a function of the vegetation cover, LAI and PCP. (Rienecker et al., 2011). The weather data is resampled using a bilinear interpolation method to the 250 m resolution. The temperature is also resampled based on elevation data.

| Validation approach and workflow
The validation approach comprises three components: physical validation, direct validation and level consistency (Figure 1). The physical validation and direct validation were undertaken on the L1 product for the period 2009-2018. The physical validation (Section 2.3) includes an assessment of the water balance and water availability (Section 2.3.1) and a spatial and temporal consistency check (Section 2.3.2) for the extent of Africa. The water balance utilizes other existing continental datasets to complete the water balance and is therefore also considered cross-validation. The spatial and temporal consistency check assesses whether spatial and temporal patterns are captured. The direct validation (Section 2.4) involves a comparison to ETa estimations from EC stations. The level consistency (Section 2.5) checks the consistency between levels and therefore indicates whether the quality of the L1 product is representative of the L2 and L3 products.

| Water balance and water availability
FIGURE 1 Validation approach used in the validation of the ETIa-WPR product in Africa

The basin-scale performance of ETIa-WPR is analysed for 22 major hydrological basins of Africa (Lehner & Grill, 2013) through three approaches (Figure 2). First, the ETIa-WPR was compared to the PCP on an annual basis to compare the water consumed through ETIa with the water available from PCP. Second, the basin-scale water balance approach compared the long-term ETIa-WPR product to the long-term ETa derived from the water balance (ETa-WB). In many studies, the long-term water balance (>1 year) for large basins assumes a negligible change in storage (Hobbins, Ramírez, & Brown, 2001; Wang & Alimohammadi, 2012; Zhang et al., 2012). The long-term water balance, taken from 2009 to 2018 in this case, is therefore defined using Equation (2).

ETa-WB = PCP − Q

where PCP is the long-term precipitation, Q is the long-term basin runoff or streamflow, and ETa-WB is the long-term ETa derived from the water balance. The PCP product found in the WaPOR portal was obtained from the Climate Hazards Group Infrared Precipitation with Stations (CHIRPS) dataset (Funk et al., 2015). The long-term Q was obtained from the Global Streamflow Characteristics Dataset (GSCD) (Beck, De Roo, & Van Dijk, 2015).
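The long-term water balance comparison can be sketched in a few lines of Python. This is only an illustration under the negligible-storage-change assumption stated above; the function names and the basin values are hypothetical and are not taken from the study.

```python
def eta_water_balance(pcp_mm_yr, q_mm_yr):
    """Long-term ETa from the water balance (mm/yr), assuming negligible storage change."""
    return pcp_mm_yr - q_mm_yr


def relative_bias_percent(etia_wpr_mm_yr, eta_wb_mm_yr):
    """Relative difference of ETIa-WPR with respect to ETa-WB, in percent."""
    return 100.0 * (etia_wpr_mm_yr - eta_wb_mm_yr) / eta_wb_mm_yr


def exceeds_water_availability(etia_wpr_mm_yr, pcp_mm_yr):
    """True when annual ETIa-WPR is larger than the annual precipitation."""
    return etia_wpr_mm_yr > pcp_mm_yr


# Hypothetical basin: long-term means in mm/yr (illustrative numbers only).
pcp, q, etia = 1000.0, 250.0, 825.0
eta_wb = eta_water_balance(pcp, q)
bias = relative_bias_percent(etia, eta_wb)
over_budget = exceeds_water_availability(etia, pcp)
```

A basin where ETIa-WPR exceeds PCP is not necessarily wrong, since irrigation or inflows from outside the basin can supply additional water, so the flag is best read as a prompt for closer inspection.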

| Spatial and temporal consistency
The temporal and spatial trends were assessed over the African continent by observing mean ETIa-WPR, SMC and NDVI for all climate zones during the study period on a dekadal basis. The Köppen-Geiger classification (Figure 2) is used to consider the mean dekadal values for the main climatic zones in Africa (Kottek, Grieser, Beck, Rudolf, & Rubel, 2006). A sample of 30,000 stratified random pixels is used to represent the continent. This corresponds to less than 0.01% of the total image but is considered suitable to represent seasonal trends for the major climate zones. The arid or desert class (B) dominates Africa (57.2%), followed by the tropical class (A; 31.0%) and then the warm temperate class (C; 11.8%). The largest sample count corresponds to the largest climatic zones, with sample counts proportional to zone area (a linear 1:1 relationship between area and count). The data is further disaggregated
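The area-proportional stratified sampling described above can be sketched as follows. The area shares are the percentages quoted in the text; the allocation rule (largest remainder), the function names and the toy pixel lists are assumptions made for illustration and may differ from the procedure actually used.

```python
import random

# Area shares of the main Köppen-Geiger classes in Africa, as quoted in the text.
AREA_SHARE = {"A": 0.310, "B": 0.572, "C": 0.118}


def allocate_samples(total, shares):
    """Allocate `total` samples proportionally to `shares` (largest-remainder rule)."""
    raw = {zone: total * share for zone, share in shares.items()}
    counts = {zone: int(value) for zone, value in raw.items()}
    leftover = total - sum(counts.values())
    # Hand any remaining samples to the zones with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda zone: raw[zone] - counts[zone], reverse=True)
    for zone in by_remainder[:leftover]:
        counts[zone] += 1
    return counts


def stratified_sample(pixels_by_zone, counts, seed=0):
    """Draw a random sample without replacement from each zone's pixel list."""
    rng = random.Random(seed)
    return {
        zone: rng.sample(pixels_by_zone[zone], min(n, len(pixels_by_zone[zone])))
        for zone, n in counts.items()
    }


counts = allocate_samples(30000, AREA_SHARE)

# Toy example with hypothetical pixel identifiers per zone.
toy_pixels = {"A": list(range(0, 500)), "B": list(range(500, 1500)), "C": list(range(1500, 1700))}
toy_sample = stratified_sample(toy_pixels, allocate_samples(100, AREA_SHARE))
```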

| Direct validation
The ETIa-WPR is compared to the in situ ETa from EC fluxes (ETa-EC) at a dekadal scale using 14 locations (13 across Africa and 1 in the Spain extension area) (Figure 2). The country, station code, vegetation, climate zones and available data for comparison, for both WaPOR and the local site, are shown in Table 3. GEOS-5 (VPD and RET) and MSG (RET only), as compared to being derived from satellite images. GEOS-5 and MSG are available daily and satellite image gaps do not influence the quality of the VPD and RET. The RET-EC was estimated using the same method adopted by WaPOR (FAO, 2020a), which is based on FAO-56 (Allen et al., 1998), and was derived from in situ (EC) meteorological data.
where r_s is taken as 70 s/m, r_a is taken as 208/u_obs and u_obs is the observed wind speed (m/s) at 10 m.

| Level consistency
irrigation areas for all pixels. Table 4 shows the description of each L3 irrigated area. The EC station at Zankalon is located in an L3 area. Therefore, as part of the level consistency, all three levels were also compared to the ETa-EC at this station. The method described in Section 2.4 was used to extract the L2 and L3 ETIa-WPR at the station.

| Spatial and temporal consistency
The mean ETIa-WPR, SMC and NDVI were plotted for all climate zones for the northern and southern hemispheres. Figure 6 shows some examples of the largest sub-zones per main climate; wet tropical-savanna (Aw), arid-desert-hot (Bwh) and temperate dry winter-warm summer (Cwb

| Direct validation
The agreement between ETIa-WPR and ETa-EC is shown in Figure 7 and Table 6. Figure 7 shows the time series of ETIa-WPR and ETa-EC for all available in situ data from all EC stations. Table 6 shows the corresponding metrics for each station, including correlation (r), root mean square error (RMSE), bias, mean absolute percentage error (MAPE), the coefficient of determination (R²) and the average NDVI and LST quality for the comparison period. A good overall correlation (r = 0.71) is found across all sites and observations, although substantial variations exist between sites. Results are consistent between years for most sites, and the ETIa-WPR typically captures seasonality at most sites.
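For reference, the agreement metrics listed above can be computed as in the following Python sketch. The definitions are the common textbook ones and are an assumption here: bias is taken as the mean of estimate minus observation, MAPE as the mean absolute percentage error relative to the observation, and R² as the squared Pearson correlation. The function name and the sample values are hypothetical.

```python
import math


def validation_metrics(estimated, observed):
    """Return r, RMSE, bias, MAPE (%) and R² for paired estimates and observations."""
    n = len(observed)
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r = cov / math.sqrt(var_e * var_o)
    rmse = math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)
    bias = mean_e - mean_o
    # MAPE is undefined for zero observations; those pairs must be removed beforehand.
    mape = 100.0 * sum(abs(e - o) / abs(o) for e, o in zip(estimated, observed)) / n
    return {"r": r, "rmse": rmse, "bias": bias, "mape": mape, "r2": r * r}


# Hypothetical dekadal values in mm/day (illustrative only).
etia_wpr = [2.0, 3.0, 4.0]
eta_ec = [1.0, 2.0, 3.0]
metrics = validation_metrics(etia_wpr, eta_ec)
```

In this toy case the estimates are perfectly correlated with the observations yet carry a constant bias of 1 mm/day, which shows why correlation alone is not enough to judge agreement.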
The best-performing sites in terms of correlation and R² are SN-

SD-DEM. SD-DEM does overestimate ETIa-WPR when ETa-EC is low and NDVI is low. These sites are also associated with having high-quality LST and NDVI layers (the average LST quality for the comparison period is equal to or less than 1). This site has an inferior NDVI quality layer and a very low correlation with VPD. As a result, errors in the input meteorological data may highly influence ETa-EC estimates at the site.
The results show noticeable improvement for all metrics on average across all sites on a monthly scale (Figure 8 and Table 7). ETa-EC when ETa-EC is less than 1.6 mm/day and underestimating ETa-EC when ETa-EC is greater than 1.6 mm/day.
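The aggregation from dekadal to monthly values can be sketched as below. The rule that a month is kept only when all of its dekads have valid observations follows the Figure 8 caption; the day-weighted averaging, the dekad definition (days 1-10, 11-20 and 21 to month end), the dictionary layout and the function names are assumptions for illustration.

```python
import calendar


def dekad_lengths(year, month):
    """Number of days in the three dekads of a month (third dekad runs to month end)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [10, 10, days_in_month - 20]


def monthly_mean_daily(dekadal, year, month):
    """Day-weighted monthly mean (mm/day), or None if any dekad is missing.

    `dekadal` maps (year, month, dekad) to a mean daily value in mm/day,
    with dekad in {1, 2, 3}; missing observations are absent or None.
    """
    values = [dekadal.get((year, month, d)) for d in (1, 2, 3)]
    if any(v is None for v in values):
        return None
    lengths = dekad_lengths(year, month)
    return sum(v * n for v, n in zip(values, lengths)) / sum(lengths)


# Hypothetical dekadal means in mm/day; March has a missing second dekad.
series = {
    (2015, 2, 1): 1.0, (2015, 2, 2): 2.0, (2015, 2, 3): 3.0,
    (2015, 3, 1): 2.5, (2015, 3, 3): 3.5,
}
february = monthly_mean_daily(series, 2015, 2)
march = monthly_mean_daily(series, 2015, 3)
```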

| Level consistency
The consistency between the L1 and L2 ETIa data products is high. The ETIa-WPR RMSE, between L1 and L2, for The median correlation for all dekads in the study period is 0.84, and the median R² is 0.84. The RMSE is highest when the ETIa-WPR is highest. The RMSE temporal trend is in line with the seasonal trend in the Awash and displays the two seasons associated with the intertropical convergence zone. The correlation is above 0.73 on 95% of dekads, and lowest on dekads when the mean ETIa-WPR is highest.

FIGURE 8 The relationship between monthly mean daily ETIa-WPR (mm/day) plotted against monthly mean daily ETa-EC (mm/day). Only months with valid observations for all dekads within that month are included. The dotted black line represents the linear regression, and the red line represents the 1:1 line

TABLE 7 Statistics comparing monthly ETIa-WPR with ETa-EC in 14 locations; more information about sites is available in Table 3
The Koga has the lowest consistency of the schemes. Although the RMSE between L1 and L3 is lower, ranging from 0.3 to 0.7 mm/day, the median correlation is 0.67, and the median R² is 0.45. Zankalon performed slightly better, with a median correlation of 0.71 and a median R² of 0.51. The RMSE is higher in Zankalon than in the Koga, but this reflects the higher ETIa-WPR values found in the area.
The ODN had the same RMSE (0.6 mm/day) as Zankalon and the highest range of RMSE (0.2-1.6 mm/day). The correlation and R² are also similar, with median values of 0.73 and 0.53, respectively. All schemes show similar median per cent bias (9-12%). The only scheme that shows a systematic bias is ZAN, where L1 shows consistently higher ETIa-WPR values than L3.
The 10-daily average ETa-EC and ETIa-WPR for all three spatial resolutions at EG-ZAN are shown in Figure 10. (Glenn et al., 2007; Wilson et al., 2002).
Underestimation bias is larger than overestimation bias and increases with increasing ETIa-WPR. However, Africa as a continent is dry with The WaPOR SMC is considered, on average, high in arid regions (e.g. Figure 6) and therefore, ETIa-WPR is likely not effectively accounting for soil moisture limitations. The high SMC results in an overestimation of the evaporation component in particular, as NDVI is low and therefore the region is dominated by the evaporation component of ETIa-WPR. Arid regions should be largely regulated by water availability rather than energy. Conversely, under well-watered conditions, the Penman-Monteith method is primarily driven by Rn (i.e. energy limited) (Rana & Katerji, 1998). As Penman-Monteith is a linearized approximate solution, problems may occur in extreme conditions and errors may arise in the soil evaporative term (Leca, Parisi, Lacointe, & Saudreau, 2011). Majozi et al. (2017b) (Rana & Katerji, 1998). Extreme conditions include when aerodynamic resistance is high, >50 s/m (Paw, 1992). High aerodynamic resistance can occur in sparse vegetation, when surface temperature is much greater than air temperature (e.g. water-stressed conditions) and when wind speed is very low (Dhungel, Allen, Trezza, & Robison, 2014; Paw, 1992). Cleverly et al. (2013) and Steduto, Todorovic, Caliandro, and Rubino (2003) found that, when the standard aerodynamic resistance values were used, the Penman-Monteith method over- and underestimated RET when RET was low and high, respectively, and suggested that the aerodynamic resistance should vary with climatic variables as it is responsive to relative humidity gradients.

FIGURE 12 Upper: number of observations for a given ETa-EC range. Lower: bias of dekadal ETIa-WPR (mm/day), as compared to ETa-EC, plotted against the increasing ranges of ETa-EC (mm/day) for observations at natural vegetation sites (orange bar), irrigated agriculture sites (blue bar) and all sites (grey bar). Note that the ETa-EC in non-irrigated sites was only greater than 5.5 mm/day for three observations; they are not included in the bias calculations shown in the figure as this is not considered a representative sample size
It is recommended to further verify the behaviour of the SMC.  (Table 4), as does the EG-SAA and EG-SAB. Therefore, these sites may be particularly influenced by this effect as 0.2 ha is 3% of an L1-250 m pixel, 20% of an L2-100 m pixel and 200% of an L3-30 m pixel (e.g. see Figure 10).
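The pixel-share figures quoted above follow from simple area arithmetic, reproduced in this small Python sketch (the function name is hypothetical). A 0.2 ha field is 2,000 m², which is about 3.2% of a 250 m pixel (62,500 m²), 20% of a 100 m pixel (10,000 m²) and roughly 222% of a 30 m pixel (900 m²), consistent with the rounded values in the text.

```python
def field_share_of_pixel(field_area_ha, pixel_size_m):
    """Field area as a percentage of one square pixel of the given edge length."""
    field_area_m2 = field_area_ha * 10_000  # 1 ha = 10,000 m^2
    return 100.0 * field_area_m2 / pixel_size_m ** 2


# Share of a 0.2 ha field for the three WaPOR levels.
shares = {level: field_share_of_pixel(0.2, size)
          for level, size in (("L1", 250), ("L2", 100), ("L3", 30))}
```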

| Why is WaPOR misrepresenting ETIa when ETIa is high in humid conditions?
ETIa-WPR is not representing ETa well in water-unlimited conditions with high humidity. The Penman-Monteith method is not suitable for very low VPD (or high humidity) (Paw & Gao, 1988). Further, for tall crops, the VPD can have a considerable influence on the error (Rana & Katerji, 1998). The method is not suitable in these conditions because it assumes a linear relationship between saturated vapour pressure and air temperature. Paw (1992) advised that non-linear equations be used in extreme conditions to keep errors below 10-15%.
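The effect of the linear assumption can be illustrated with a short Python sketch that compares the exact saturated vapour pressure at the surface temperature with its first-order (tangent) approximation around the air temperature. The FAO-56 saturation formula and the temperatures used are assumptions for illustration; the sketch shows the general behaviour of the linearization and does not quantify the error of ETIa-WPR itself.

```python
import math


def sat_vapour_pressure(temp_c):
    """Saturated vapour pressure (kPa) at temp_c (degC), FAO-56 form."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def slope_svp(temp_c):
    """Slope of the saturation vapour pressure curve (kPa/degC) at temp_c."""
    return 4098.0 * sat_vapour_pressure(temp_c) / (temp_c + 237.3) ** 2


def linearised_svp(t_surface, t_air):
    """First-order estimate of e_sat(t_surface), expanded around t_air."""
    return sat_vapour_pressure(t_air) + slope_svp(t_air) * (t_surface - t_air)


def linearisation_error_percent(t_surface, t_air):
    """Relative error (%) of the linearised estimate against the exact value."""
    exact = sat_vapour_pressure(t_surface)
    return 100.0 * (linearised_svp(t_surface, t_air) - exact) / exact


# Hypothetical cases: a small and a large surface-air temperature difference.
small_gap = linearisation_error_percent(t_surface=27.0, t_air=25.0)
large_gap = linearisation_error_percent(t_surface=45.0, t_air=25.0)
```

Because the saturation curve is convex, the tangent always underestimates the exact value, and the error grows quickly as the surface temperature departs from the air temperature.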
Quality of input data is likely affecting the quality of the ETIa-WPR in these regions. Low-quality or missing RH data means VPD is calculated from Tmin. In humid climates, condensation occurs during the night, which leads to an overestimation of VPD (Allen et al., 1998), as was found when Penman-Monteith was applied without RH data in humid regions of Ecuador (Córdova, Carrillo-Rojas, Crespo, Wilcox, & Célleri, 2015). In water-unlimited regions, the overestimation of VPD can lead to higher ETa, as it is easier for the flux to occur when there is less moisture in the air. Further, these regions frequently contain low-quality NDVI and LST layers. This results, for example, in overestimation of radiation at GH-ANK, skewing results at this location. The NDVI and LST quality layers are therefore a good indicator of the quality of the ETIa in these regions.

| Product consistency
There is very high consistency between L1 and L2 products. The high consistency is partly explained by the use of a downscaled MODIS product before the introduction of PROBA-V in 2014 and the SMC component, which is based on MODIS for both L1 and L2 for the entire database period. The high consistency suggests that at a given scale, for example basin scale, the 100 m product provides no additional value over the 250 m product. However, for higher-resolution applications, the L2 product does show spatial variation not captured by the L1 product (e.g. Figure 11) and may provide better insight into intra- and inter-field level variations.
The consistency between the L1 and L3 products is mixed. The Awash and ODN L3 areas show high consistency between L1 and L3. In the Koga, there is a strong positive bias for L1 ETIa-WPR, while the agreement between L1 and L3 in the Koga and in Zankalon is lower. These errors are likely largely attributable to the different input temporal and spatial resolutions available from the satellite platforms, combined with high spatial and temporal heterogeneity in the area (e.g. Koga and Zankalon have much smaller irrigated fields and higher crop diversity than the Awash and ODN; see Table 4). All levels have a dekadal timestep. However, the satellite revisit period varies, with revisits of 1, 2 and 16 days for MODIS (L1), PROBA-V (L2) and Landsat (L3), respectively, with daily meteorological data input. The variation in the revisit period can lead to differences when interpolating images to a dekadal timescale, particularly in rainy periods and during the growing season (Gao, Masek, Schwaller, & Hall, 2006). Uncertainty of up to 40% has been attributed to the difference between a 16-day revisit and a 4-day revisit, depending on climate and season (Guillevic et al., 2019), though this was without daily meteorological data as a tool for interpolation. Conversely, the L3 dataset can capture more spatial variability for a given image as compared to the L1 and L2 data, which is highly important when using non-linear models. Therefore, the L3 dataset is expected to perform better in areas of higher spatial heterogeneity (Sharma, Kilic, & Irmak, 2016).
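To make the effect of the revisit period concrete, the following Python sketch counts how many overpasses can fall inside one 10-day dekad for the nominal revisit periods quoted above. It ignores cloud cover, swath overlap and multi-satellite constellations, and the function name and day offsets are hypothetical, so it gives an upper bound on usable images rather than an actual count.

```python
def acquisitions_in_window(window_days, revisit_days, first_pass_day):
    """Count overpasses on days first_pass_day, first_pass_day + revisit_days, ...

    Days are counted from 0 at the start of the window; only days before
    window_days are inside the window.
    """
    return len(range(first_pass_day, window_days, revisit_days))


DEKAD_DAYS = 10

# Nominal revisit periods (days) from the text: MODIS (L1), PROBA-V (L2), Landsat (L3).
modis = acquisitions_in_window(DEKAD_DAYS, 1, first_pass_day=0)
proba_v = acquisitions_in_window(DEKAD_DAYS, 2, first_pass_day=1)
landsat_hit = acquisitions_in_window(DEKAD_DAYS, 16, first_pass_day=3)
landsat_miss = acquisitions_in_window(DEKAD_DAYS, 16, first_pass_day=12)
```

With a 16-day revisit, some dekads contain no overpass at all, so the L3 dekadal value relies entirely on interpolation between neighbouring images and on the daily meteorological inputs.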

| CONCLUSIONS
The WaPOR products for Africa and the Middle East provide the highest resolution continuous near real-time products available so far to monitor ETIa. Current validation efforts need to be continued and intensified to confirm the suitability of these products for various uses. However, the sparseness of available ground-truth measurements makes direct validation against in situ data insufficient as a sole means to validate the ETIa product over continental Africa. To compensate for insufficient ground-truth locations, we added physical consistency and level consistency checks as part of the validation analyses.
The ETIa-WPR product is responsive to general trends in the magnitude of ETIa for most climates and shows good correlations at both local (EC) and basin (WB) scales. In dry irrigated areas, WaPOR appears to be overestimating ETIa, particularly at the coarse resolution.

DATA AVAILABILITY STATEMENT
Data available on request from the authors, except for SMC and NDVI data layers which were provided by the FRAME consortium.