Global canopy rainfall interception loss derived from satellite earth observations

Information on global canopy rainfall interception loss is essential for understanding the dynamics of land surface processes and the water cycle. In addition, large uncertainty remains regarding the global variation in this factor. The development of satellite earth observations has provided a great opportunity for the estimation of global rainfall interception loss to reveal the spatiotemporal variations. In the current study, the analytical Gash model was adapted and applied to estimate global rainfall interception loss from 2001 to 2015 on the basis of satellite remote‐sensing products, for example, gross rainfall amount and rate and leaf area index. The Dalton‐type equation was adopted to estimate the wet canopy evaporation rate in the revised Gash model. The estimation on the basis of the Dalton‐type equation showed better agreement with the ground observations than the classical Penman–Monteith equation in terms of estimated rainfall interception loss and its ratio to gross rainfall. Large spatial variations in global rainfall interception loss were found, and high values appeared in the regions with dense vegetation cover and high gross rainfall amounts, for example, tropical rainforests, whereas high rainfall interception to gross rainfall ratios occurred in the regions with low rain rates and high vegetation cover. This study constitutes a significant step forward in understanding global hydrological cycle on the basis of earth observation products.

been widely conducted to measure interception loss, these in situ observations are still limited to specified locations and times, and long-term interception loss monitoring is rare (Komatsu, Shinohara, Kume, & Otsuki, 2008;Soto-Schonherr & Iroume, 2016;Yan et al., 2015;Zhang, Zhao, Li, Huang, & Tan, 2016). There is also large uncertainty when upscaling these site-specific observations to the plot or regional scale (Manfroi et al., 2006), especially in regions with sparse vegetation or heterogeneous landscapes. These seriously constrain the availability of information on the spatial and temporal variations in interception, which could be utilized for further regional or global analyses and applications in global change studies.
With the development of remote-sensing technologies, satellite earth observations have become an attractive option for determining the spatial and temporal variations in variables related to land surface processes due to the advantages of vast coverage, fine spatial representations, low costs, and the applicability in ungauged areas. The estimation of land surface water and heat flux has long been highlighted in remote-sensing studies (Bastiaanssen et al., 1998;Norman et al., 1995;Mu et al., 2007;Su 2002). However, remote-sensing-based rainfall interception loss research was only recently proposed (Cui & Jia, 2014;Cui, Jia, Hu, & Zhou, 2015;de Jong & Jetten, 2007;Galdos, Alvarez, Garcia, & Revilla, 2012;Miralles, Gash, Holmes, De Jeu, & Dolman, 2010;Nieschulze, Erasmi, Dietz, & Holscher, 2009;Zheng & Jia, 2016). Initially, simple relationships between variables retrieved by remote sensing and interception were introduced. For example, regression relations between satellite-observed vegetation parameters and the canopy water storage capacity were built to calculate regional interception (de Jong & Jetten, 2007). Moreover, a regression model of normalized difference vegetation index was directly constructed to predict the spatial variations in rainfall interception, and this model was applied to a tropical forest (Nieschulze et al., 2009). The second approach, which was based on a physical model, was developed and utilized to improve the integration of remote-sensing variables, for example, a simple conceptual method was presented to integrate moderate resolution imaging spectroradiometer normalized difference vegetation index and leaf area index (LAI) data to estimate the distribution of interception by utilizing a physical-based model (Galdos et al., 2012). The global estimation of interception was also presented by an analytical model driven by multiple satellite observations, mainly including precipitation, lighting intensity, and canopy cover information, which were further included in the Global Land-Surface Evaporation: The Amsterdam Methodology (GLEAM) model for global evapotranspiration estimation (Miralles, De Jeu, Gash, Holmes, & Dolman, 2011). The Poisson distribution was further adopted to express the subpixel heterogeneous of vegetation index in the analytical model to improve the estimation of interception loss over heterogeneous surfaces on the basis of remote-sensing observations of forest canopy structure parameters and gross rainfall (Cui et al., 2015;Cui & Jia, 2014). This model was developed under the assumption that the Poisson distribution was more suitable than the widely adopted normal distribution for describing the effects of heterogeneous forest coverage density on interception loss. These former studies mostly utilized satellite-observed vegetation index and P g data to estimate interception, and they ignored the impact of rainfall characteristics on the interception estimation. However, it has long been recognized that rainfall interception loss depends strongly on the characteristics of a rain event, the canopy structure, and meteorological variables, which control the evaporation during and after rainfall (Gash, 1979;Gash, Lloyd, & Lachaud, 1995;Muzylo et al., 2009;Rutter, Robins, Morton, & Kershaw, 1972;Staelens, De Schrijver, Verheyen, & Verhoest, 2008). The remote-sensing-based interception estimation fails in considering these control factors, and it is most likely caused by the knowledge gap between remote-sensing science and hydrological modelling.
This study aims to contribute to the development of a physical model driven by satellite observations to estimate global rainfall interception loss. To do this, the widely used Gash analytical model (Gash, 1979;Gash et al., 1995)

| Revised Gash model
The Gash analytical model was first established on the basis of dense forests in the middle-high latitude forest regions and was further adapted for application in sparse forest canopies (Gash, 1979;Gash et al., 1995). This model assumes that rainfall is intercepted during a discrete storm, and the intercepted water is lost completely before the next storm. There are three phases of rainfall interception loss: (a) wetting phase as rainfall reaches the canopy; (b) saturation phase as the canopy reaches its maximum water storage capacity; and (c) drying phase after rainfall has ceased.
Evaporation is calculated for each phase with an average evaporation rate for all rainfall events. Although it is an event-based model, it has been successfully applied on a daily basis with the assumption of one storm per rain day (Cui & Jia, 2014;Zheng & Jia, 2016).
Different from the initial Gash model, which estimates the rainfall interception by the canopy and trunks separately, this study estimates the total rainfall interception loss (E I ), considering the canopy and trunk as an integration following Zheng and Jia (2016).
where Fc is the vegetation cover fraction; P g,i (millimetre) is the gross rainfall of the ith of n total rainfall events; Ē/ R is the ratio of monthly averaged wet canopy evaporation rate (Ē; millimetre per hour) to the monthly averaged rainfall rate ( R ; millimetre per hour); and P 0 g is the threshold value of precipitation required to saturate the vegetation, which is estimated as where S v is the vegetation capacity (millimetre) equal to the sum of the storage capacity of the canopy and that of the trunks.
where LAI is the leaf area index; S L is canopy storage capacity per unit leaf area (millimetre) given according to the International Geosphere-  (Levia & Germer, 2015;Wallace et al., 2013). For simplification, the current study assumed a linear relation between S T and canopy height to interpolate S T , which is reasonable because the trunk and branch volume of a high tree is generally large.
The value of Ē/ R is considered the most sensitive parameter in the Gash model (Gash et al., 1995;Cui & Jia, 2014). Ē and ‾ R were assumed to be constant within one month, and they were estimated where h r is the monthly rainfall hours estimated from a 30-min precipitation product, P g,i is the rainfall amount of ith of n total rainfall events within the month from precipitation product, and E wet is the wet canopy evaporation rate. A former study also estimated R on the basis of the empirical relationship with the lighting frequency (Miralles et al., 2010). E wet was previously estimated following the Penman-Monteith equation (Cui & Jia, 2014;Holwerda, Bruijnzeel, Scatena, Vugts, & Meesters, 2012;Rutter et al., 1972). However, recent studies suggested that net radiation plays a minor role in wet canopy evaporation, and net radiation is typically very small due to cloud cover, and the energy required to sustain wet canopy evaporation has been mainly attributed to advection energy (van Dijk et al., 2015), and the where ρ is the air density (kilogram pe cubic metre); c p is the specific heat of air at constant pressure (J kg −1 K −1 ); λ is the latent heat of vaporization (MJ kg −1 ); γ is the psychrometric constant (kPa K −1 ); e * s −e a is the water vapour pressure deficit between the saturated surface and the overlying air during raining days (kilopascal); and r a is the bulk boundary layer resistance (second per metre), which is estimated mainly based on wind speed and canopy height (Allen, Pereira, Raes, & Smith, 1998). The Penman-Monteith equation was applied following the Food and Agriculture Organization 56 Irrigation and Drainage Paper (Allen et al., 1998), and the surface resistance was set to 0 constantly for E wet, PM estimation (Cui et al., 2015).

| Data sources and practical applications
The standard moderate resolution imaging spectroradiometer land cover product (MCD12), which has 17 land cover types on the basis of the International Geosphere-Biosphere Programme classification, was collected to separate different plant types. The LAI data were collected from the Global LAnd Surface Satellite (GLASS) product (Xiao et al., 2014) to estimate Fc on the basis of Equation (7). Briefly, the GLASS LAI product was produced mainly based on general regression neural networks, with a spatial resolution of 1 km and temporal resolution of 8 days. In addition, Fc in Equation (1) is estimated according to its relationship with LAI following the Beer's law (Carlson & Ripley, 1997;Nilson, 1971) where a is the light extinction coefficient and LAI is the green leaf area index.
The global gridded near-surface meteorological forcing data, including air temperature, air pressure, dew point temperature, wind speed, and downward shortwave and longwave radiation fluxes were retrieved from the ERA5 data set (http://apps.ecmwf.int/datasets/) to estimate the wet canopy evaporation in Equation (6) and averaged to E in Equation (4). The downloaded 6-hr meteorological forcing data with 0.125 resolution were averaged to daily and 8-day resolutions and then downscaled to 1 km using statistical downscaling approaches following Hu and Jia (2015).
The global canopy height map at 1-km spatial resolution was collected, and it was obtained on the basis of the laser altimeter data onboard ICES using the regression tree approach (Simard, Pinto, Fisher, & Baccini, 2011). The canopy heights of high vegetation, including forests and shrubs, were directly set as the values from the global canopy height map. The canopy heights of short vegetation, for example, grasses and crops, were parameterized following former studies (Cui et al., 2015).
The CMORPH V1.0 precipitation products, including the CMORPH raw data set and blended data set, were collected (ftp.cpc. ncep.noaa.gov/precip/CMORPH_V1.0) to provide the rainfall information. The CMORPH V1.0 raw precipitation product, with a temporal resolution of 30 min and a spatial resolution of 8 km, was also adopted to obtain the monthly averaged rain rate following Equation (5). Because the CMORPH V1.0 raw precipitation product only covers the zone between 60 S to 60 N, for the region beyond 60 N, which is not covered by CMORPH V1.0 raw precipitation product, the monthly averaged rain rate was set to constant as the mean synoptic rainfall rate (Miralles et al., 2010). The CMORPH V1.0-blended precipitation product at 0.25 spatial resolution is an integrated product that combines infrared and passive microwave satellite observations, and it has been blended with gauge observations to provide high-accuracy precipitation data with global coverage (Joyce, Janowiak, Arkin, & Xie, 2004;Xie & Xiong, 2011). The rainfall data were obtained by separating the total precipitation into liquid (rainfall) and solid (e.g., snowfall) forms according to the air temperature threshold method (Clark et al., 2006;Yang, Dickinson, Robock, & Vinnikov, 1997). The rainfall data sets were further resampled to 1 km by using the bilinear interpolation method. In this study, only rainfall interception is estimated, and snowfall interception loss is omitted.
The interception loss was finally estimated on the basis of Equation (1) at a resolution of 8 days and 1 km. It was assumed that there is only one rainfall event a day, under which the total rainfall events (n in Equation (1)) were set as the number of rainfall days within the 8-day period. The interception loss at 1-km resolution from 2001 to 2015 was obtained. The estimated rainfall interception loss data at 1-km resolution were further aggregated to 0.25 resolution for analysis and mapping.

| Validation
To validate the retrieved variables, in situ observation data were commonly adopted to compare with the estimation. The in situ canopy rainfall interception loss is conventionally observed indirectly as the difference between P g measured above the canopy or in a neighbouring open area and the sum of the throughfall and stemflow sampled simultaneously on the forest floor (Llorens & Domingo, 2007;Rodda & Dixon, 2012;Staelens et al., 2008;Zimmermann & Zimmermann, 2014 (Miralles et al., 2010).
However, the mismatch between the estimation and observation periods will surely increase the uncertainty for the validation.
In the current study, to minimize the inconsistencies between the estimations and observations, we collected only the reported interception observations with detailed information on the observation period and location that were consistent with our estimation, and we abandoned those that did not match temporally or spatially.
Initially, 90 literatures were collected and reviewed, and observations from 13 literatures were found to provide precise observation location and period that match our estimation and hence were finally adopted. Table 1 provides the basic information of the selected data sets for validation.
The collected data were compared with both the estimated E I and its ratio to P g (hereafter called interception ratio) at 1 km for validation. The annual averaged or growing season averaged values were compared because seasonal data were generally not available. were calculated in this study to illustrate the difference between the estimations and ground observations.
where y represents the observed value, y 0 represents the estimated value, n is the number of observations, and y max and y min are the maximum and minimum values of the observed values, respectively.

| Comparison of estimated rainfall interception loss with in situ observations
In addition to the direct comparison of the estimated E I with the ground observations, E I /P g was also compared (Figures 1 and 2). The    These results indicate that the Dalton-type equation is more suitable for wet canopy evaporation than the Penman-Monteith equation. Thus Equation (6) was ultimately adopted in the current study for global rainfall interception loss estimation, and the following analysis was based on the wet canopy evaporation from the Daltontype equation. show very high E I and E I /P g . In the midlatitude regions, for example, from 20 N to 40 N, both E I and E I /P g were very low, which is consistent with the patterns of P g and Fc. The second E I peak in the Northern Hemisphere was found near approximately 60 N, where the absolute value was lower than that in the tropical regions. The E I /P g peaked at approximately 60 N in the Northern Hemisphere, where the extreme values were 60%, which is related to the high Fc and low R in this region.

| Comparison with existing rainfall interception loss data set
The estimated E I and E I /P g showed overall similar global pattern as GLEAM (Figures 3 Figure 4), and it could be also illustrated by F I G U R E 5 Pixel-by-pixel scatterplot to compare the (a) rainfall interception loss amount and (b) interception ratio estimated in the current study to Global Land-Surface Evaporation: The Amsterdam Methodology product. Blue solid lines represent the fitting curves, and black dash lines present the y = x lines Figure 5 that presents the pixel-by-pixel comparison. Some discrepancies could be found, especially in the tropical regions and high latitude region in the Northern Hemisphere where the estimated E I and E I /P g were lower than GLEAM (Figures 3 and 4). The estimated E I and E I /P g generally agrees well with the GLEAM results, with high correlation coefficients (.91 and.78 for E I and E I /P g , respectively) and low MB and NRMSE ( Figure 5). The slope values of fitting the estimate E I and E I /P g with GLEAM are very close to 1, indicating the consistency with GLEAM.

| The proposed method and data set
Rainfall interception is an important hydrological process that links vegetation and the atmosphere, which impacts both water resource management and climate change (van Dijk et al., 2015). The evaporation rates of intercepted rainfall greatly exceed the transpiration rates because evaporation is not physiologically controlled by plants, especially in areas with tall vegetation cover where aerodynamic conductance is high (Muzylo et al., 2009). Hence, interception loss is usually regarded as a substantial net addition to plant transpiration and soil evaporation when obtaining the total evapotranspiration of a plantsoil unit . Compared with plant transpiration, interception loss is usually considered as a nonbeneficial water use as it does not benefit plant productivity (Karimi et al., 2013). Physically based models are more attractive for interception loss estimation due to their solid physical and mathematical fundamentals and are less data set specific than empirical models. The Rutter model and the Gash model are the most famous physically based models (Gash, 1979;Gash et al., 1995;Rutter et al., 1972). The Gash model has been widely applied to different land cover types, for example, rainforest, coniferous forest, broadleaf forest, shrubs, and crops, and has been proven to be able to estimate regional and global interception loss with high accuracy and efficiency when combined with a satellite remote-sensing data set (Cui et al., 2015;Miralles et al., 2010).
The method proposed in the current study could be validated by the relatively high correlation and low bias and NRMSE compared with the ground observations collected from the literature (Figure 1).
For the small rainfall amounts, the error of the estimated E I /P g may be relatively large, whereas the error of the estimated E I was relatively small. For large rainfall amounts, the estimated E I /P g was low and very close to the ground observations, and the error of the estimated interception loss was mainly caused by the uncertainty in the input rainfall data.
However, we also noticed that observation errors introduce uncertainty when validating the results. Even small errors in the observations of P g , throughfall, and stemflow could result in high relative errors in the interception loss measurements. For example, a 2.5% error in the P g and throughfall observation would cause a 22% error in the interception loss estimation (Muzylo et al., 2009).
Although it seems simple to measure rainfall with a gauge, the uncertainty of rainfall measurements could be very large, for example, wind exposure will cause an error of approximately 5~80% in the rainfall measurements depending on the location and the gauge type (Nespor, 1999;Rodda & Dixon, 2012). Furthermore, the throughfall error is typically approximately 3.5%, whereas the stemflow error could reach 60% (Llorens & Domingo, 2007). Even though these uncertainties are noticeable, there is no generally accepted method to correct the bias to obtain ground truth data on interception loss.

| Difference with GLEAM
Previous studies have also utilized satellite observations to drive physical models to estimate interception loss, for example, the GLEAM estimation, which also estimated interception loss under the Gash model framework (Miralles et al., 2010). Our method differed from GLEAM in several aspects.
Our estimation treated leaves, branches, and trunks as one unit, so the trunk storage capacity affected evaporation losses both during and after rainfall following Cui and Jia (2014). Conversely, GLEAM estimates the interception loss from the canopy and trunk separately, which could account for evaporation from the trunks after rainfall but does not allow for evaporation from the trunks during rainfall (Wallace et al., 2013). When trunk storage is small, the difference is also small but as trunk storage increases, large difference can be found (Wallace et al., 2013).
The current study calculated the rainfall hours on the basis of the CMORPH rainfall data with a temporal resolution of 30 min and estimated the rainfall rate accordingly, which was a direct and effective method. While, GLEAM adopted an indirect way to obtain the rainfall rate, which is based on the separating of the convective and frontal rainfall according to the lightning intensity retrieved from the Optical Transient Detector and the Lightning Imaging Sensor. This process could lead to the systematic underestimation of convective rainfall at high rainfall rates because the presence of lightning is a sufficient condition for the occurrence of convection but not a necessary one (Miralles et al., 2010). In addition, previous studies suggested the use of ground gauge observations to estimate the rainfall rate and interpolated this value to the regional scale when observation data with high temporal resolution (e.g., 1 hr or less), and proper spatial representativeness were available (Cui et al., 2015).
We retrieved the monthly averaged wet canopy evaporation on the basis of physical-based equation, and GLEAM adopted a fix value for the monthly averaged wet canopy evaporation for global application regardless of the large spatiotemporal variations in the wet canopy evaporation rate. Previous studies reported Ē generally ranging from 0.17 to 0.5 mm hr −1 (Holwerda et al., 2012;Wallace et al., 2013), and our estimation also shows the large spatial variation of Ē ( Figure S1).
Furthermore, the data sets used as inputs for the current estimation were different than those used for GLEAM. For instance, we used CMORPH V1.0-blended precipitation products, whereas GLEAM V3 is based on ensemble precipitation products (Martens et al., 2017).
We combine the laser altimeter-based plant height data and the MCD12 land cover product to separate high forest from short vegetation, whereas GLEAM V3 uses the global MOD44B continuous vegetation product as an input to identify the high and short vegetation within a pixel.

| Factors impacting the interception loss algorithms
Rainfall is the water source of interception loss, and vegetation canopy is the surface of intercepted water storage and evaporation.
Hence, P g and Fc are two main inputs for E I estimation as shown in Equation (1), and they overall dominate the global variation of E I .
Previous studies have concluded that interception is linearly related to the P g amount, and the linear fitting equation varies site to site (Saito et al., 2013;Soto-Schonherr & Iroume, 2016;Staelens et al., 2008;Su, Zhao, Xu, & Xie, 2016;Sun, Onda, & Kato, 2014). Besides the P g amount, the rain rate, which is the ratio of P g amount to rainfall hours, also impacts the interception loss (Cuartas et al., 2007;Zhang et al., 2016). A higher rain rate will result in a lower interception because the wetting period is relatively short with a certain amount of rainfall, whereas frontal rainfall is accompanied by relatively long wetting period (Muzylo et al., 2009;Staelens et al., 2008;van Dijk et al., 2015); thus, the temporal shift in the rain regime also contributes to the interception loss variation. According to a sensitivity analysis, interception could decrease by more than 12% if the rain rate increased by 10% (Cui & Jia, 2014). The canopy structure impacts the interception loss mainly through the canopy water storage capacity (Dietz et al., 2006;Fathizadeh, Hosseini, Zimmermann, Keim, & Darvishi Boloorani, 2017;Komatsu et al., 2008;Pypker et al., 2005;Voss, Zimmermann, & Zimmermann, 2016), and both the complete loss of forest by deforestation and the decrease in forest leaf area due to drought could cause a reduction in the canopy water storage capacity.
Other meteorological parameters impact rainfall interception mainly through the impact on the wet canopy evaporation rate (Ē Teklehaimanot & Jarvis, 1991;Rutter et al., 1972;Pereira, Gash, David, Monteiro, & Valente, 2009;Llorens, Poch, Latron, & Gallart, 1997;Šraj, Brilly, & Mikoš, 2008). The most accurate way to determine Ē in Gash model is based on the linear regression of tree interception loss and rainfall if the interception loss is known in advance, and the physical Penman-Monteith equation has been widely adopted to estimate the regional wet canopy evaporation.
However, many studies have reported that the wet canopy evaporation rates generated by these two methods showed large differences at the site scale (Schellekens, Scatena, Bruijnzeel, & Wickel, 1999;van Dijk et al., 2015;Vernimmen et al., 2007;Waterloo, Bruijnzeel, Vugts, & Rawaqa, 1999). Moreover, other studies also indicated that the measured interception loss could exceed the wet canopy evaporation estimated by the Penman-Monteith equation (Saito et al., 2013), indicating the large uncertainty in the wet canopy evaporation estimation. The reasons for this difference seem to be closely related to the uncertainties in air humidity measurements, aerodynamic conductance rainfall duration estimations, as well as the evaporative cooling effect and the horizontal advection (Holwerda et al., 2012;Pereira et al., 2016;van Dijk et al., 2015). In addition, the Dalton-type equation was recently suggested to be more suitable for wet canopy evaporation estimation (Carlyle-Moses & ,Pereira, Gash, David, Monteiro, & Valente, 2009. Thus, in the current study, we compared the wet canopy evaporation estimated from the widely applied Penman-Monteith equation and Dalton-type equation and found that the estimation on the basis of Ē by Dalton-type equation agreed better with the ground observations in terms of both interception loss and interception ratio.
Hence, the Dalton-type equation was adopted to retrieve Ē for global E I estimation.

| CONCLUSIONS
The Gash analytical model was revised and applied to estimate the global rainfall interception loss in the current study mainly based on satellite remote-sensing products, including CMORPH precipitation data, GLASS LAI data, and laser altimeter-based canopy height data. The Dalton-type and Penman-Monteith equations were compared to estimate the wet canopy evaporation rate in the revised Gash model, and the Dalton-type equation was ultimately adopted due to its strong agreement with the ground observations. The accuracy of the revised Gash model was validated on the basis of the ground observations collected by previous studies. Both the interception loss amount and interception ratio data showed good agreement with the ground observations. Meanwhile, the spatial variation in the estimated interception was consistent with the spatial variation of GLEAM interception loss data, indicating its good global estimation performance.
The variation in interception could be explained by the variations in P g , rain rate, and vegetation cover. Large interception values were found to accompany high P g , low rain rates, and high canopy cover fractions, whereas high interception ratios accompanied low rain rates and high canopy cover fractions. Thus this pattern results in the high interception values during the wet period in tropical regions and high interception ratios in the dense forest of the middle-high latitudes in the Northern Hemisphere.