Comparison of evapotranspiration estimates using the water balance and the eddy covariance methods

The eddy covariance method estimates the energy flux of latent heat for evapotranspiration. However, the imbalance between the land surface energy output and input is well known. Energy balance closure is most commonly not achieved, and therefore the eddy covariance method potentially underestimates actual evapotranspiration. Notwithstanding, the method is one of the most established measurement techniques for estimating evapotranspiration. Here, evapotranspiration from eddy covariance (ET_EC) is cross‐checked with evapotranspiration calculated as the residual of the water balance (ET_wb). The water balance closure using ET_EC is simultaneously validated. Over a 6‐yr period, all major terms of the water balance are measured, including precipitation, recharge from percolation lysimeters, and soil moisture content from a cosmic‐ray neutron sensor, a capacitance sensor network, and time domain reflectometry (TDR). In addition, we estimate their respective uncertainties. The study demonstrates that both monthly and yearly ET_EC and ET_wb compare well and that the water balance is closed when ET_EC is used. Concurrently, incoming available energy (net radiation minus ground heat flux) on average exceeds the turbulent energy fluxes (latent heat flux and sensible heat flux) by 31%, exposing the surface energy imbalance. Consequently, the imbalance in the energy balance using the eddy covariance method is caused only to a lesser degree by errors in the latent heat estimates and can mainly be attributed to errors in the other energy flux components.


INTRODUCTION
The eddy covariance (EC) method is one of the most established measurement techniques for estimating evapotranspiration [...] the land surface. All terms can be either positive or negative, and thereby switch the direction of the energy flux.
It is well documented that the land surface energy balance equation is rarely closed using energy fluxes estimated by the EC method (Foken et al., 2011; Leuning, van Gorsel, Massman, & Isaac, 2012; Oncley et al., 2007; Wilson et al., 2002). In most cases, the available energy, represented by the difference between incoming net radiation (R_n) and outgoing ground heat flux (G), is larger than the sum of the outgoing turbulent fluxes of sensible heat (H) and latent heat (LE) (Foken, 2008). The relative energy balance closure ratio (EBC_ratio) represents the imbalance of the incoming and outgoing energy fluxes and is computed as

$$\mathrm{EBC}_{\mathrm{ratio}} = \frac{H + LE}{R_n - G} \tag{2}$$

Depending on the type of surface, ranging from forest to bare soil, an EBC_ratio between 70 and 90% is often reported (Foken, Wimmer, Mauder, Thomas, & Liebethal, 2006; Stoy et al., 2013; Wilson et al., 2002). Estimation of R_n and G is often considered not to be the cause of surface energy imbalance (Foken et al., 2011). Instead, an underestimation of the turbulent fluxes appears more likely and, as a consequence, the EC method may underestimate the actual ET (Twine et al., 2000). The imbalance of energy output and input to the land surface cannot be explained by measurement errors alone (Foken, 2008; Mauder et al., 2006). Therefore, our understanding of the physics of the systems must be incomplete (Culf, Foken, & Gash, 2004). Several authors state that undetected vertical transport of LE and H, at large spatial and temporal scales, may explain the energy balance problem (Foken et al., 2011; Stoy et al., 2013). Large-scale eddies are related to landscape heterogeneity.
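The closure ratio discussed above can be sketched in a few lines. This is only an illustration, assuming the common definition of the ratio as turbulent fluxes divided by available energy; the function name, the sign convention of the residual, and the flux values are hypothetical and are not taken from the study.

```python
def energy_balance_closure(rn, g, h, le):
    """Return (closure ratio, residual) for fluxes given in the same units."""
    available = rn - g   # available energy: net radiation minus ground heat flux
    turbulent = h + le   # turbulent fluxes: sensible plus latent heat
    if available == 0:
        raise ValueError("available energy is zero; the ratio is undefined")
    # Residual sign convention (available minus turbulent) is an assumption
    return turbulent / available, available - turbulent


# Hypothetical daily mean fluxes in W m-2 (not measured values from the study)
ratio, residual = energy_balance_closure(rn=100.0, g=10.0, h=30.0, le=42.0)
print(f"closure ratio = {ratio:.2f}, residual = {residual:.1f} W m-2")
```

A ratio below 1 means that the turbulent fluxes do not account for all of the available energy, which is the situation most often reported.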
It is important to evaluate the potentially incorrect measurement of ET by use of the EC method (ET_EC). The water balance method enables validation of ET_EC. Comparing ET_EC with ET estimated from the water balance method (ET_wb) allows insights into the performance of the EC method. The water balance method estimates ET_wb directly from the mass balance of water and applies to a given soil volume over a specific period of time:

$$ET_{wb} = P + I - R - \Delta SWC \tag{3}$$

Here, P is precipitation, I is irrigation, R is recharge, and ΔSWC is the change in soil water content over a specific period. The ΔSWC is negative when water is removed from the system and positive if the system gains water. Calculations are normalized such that all variables are taken as millimeters of water over a given time period.
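As a minimal sketch of the residual calculation described above, the snippet below computes ET_wb from the four terms, with the stated sign convention that ΔSWC is positive when the soil gains water. The function name and the numbers are hypothetical.

```python
def et_water_balance(p, i, r, d_swc):
    """ET as the water balance residual; all terms in mm over the same period.

    d_swc is positive when the soil gains water and negative when it loses water.
    """
    return p + i - r - d_swc


# Hypothetical rainy day: 5 mm rain, 1 mm recharge, soil storage gains 2.5 mm
wet_day = et_water_balance(p=5.0, i=0.0, r=1.0, d_swc=2.5)

# Hypothetical dry day: no input, 0.2 mm recharge, soil storage loses 3 mm
dry_day = et_water_balance(p=0.0, i=0.0, r=0.2, d_swc=-3.0)

print(wet_day, dry_day)
```

On the dry day the loss of soil water is almost entirely attributed to ET, which is why dry periods are the simplest case for this method.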
Core Ideas
• Latent heat flux and evapotranspiration (ET) are one and the same.
• It appears in both the water and the energy balance equation.
• The ET from eddy covariance (ET_EC) is compared with ET from the water balance (ET_wb).
• This study demonstrated no significant difference between ET_EC and ET_wb.
• The eddy covariance estimated latent heat flux is not the cause of energy imbalance.

Previous studies have compared alternative methods for estimating ET with the EC method. The studies are contradictory, and the conclusions depend on the estimation method used. The different methods for estimating ET_wb are, for example, weighing and nonweighing lysimeters, soil moisture probes, and water balance modeling. Studies using weighable lysimeters as reference often find ET_EC to be smaller than ET_wb, thus concluding that the EC method underestimates LE (Alfieri et al., 2012; Chávez, Howell, & Copeland, 2009; Ding et al., 2010; Gebler et al., 2015; Mauder et al., 2018; Wohlfahrt et al., 2010). Despite an incomplete energy balance, studies using different kinds of soil sensors (Vásquez et al., 2015), soil sampling campaigns (Imukova, Ingwersen, Hevart, & Streck, 2016), and water balance modeling (Scott, 2010) generally obtain good agreement between ET_EC and ET_wb. In such cases, LE is not the major cause of the energy balance gap; instead, bias in the other energy fluxes or unconsidered energy storage terms must cause the energy balance closure problem.

This study is based on long-term data collection carried out within the framework of the Danish Hydrological Observatory, HOBE (Jensen & Refsgaard, 2018). As part of this program, an agricultural field observatory was established. For evaluating the water and energy balances over several years, the observatory measured all relevant water and energy fluxes. The two independent measurement techniques for estimating ET, the EC method and the water balance method, are evaluated and compared. Concurrently, the uncertainties of the individual water balance components are estimated, thereby giving an uncertainty estimate for ET_wb. The use of multiple measurement devices, especially for measurement of precipitation, enables this uncertainty estimation.
Until now, only a few published studies have compared long-term measurements of ET using the EC method with other methods for estimating ET. Moreover, the previous long-term studies all assess ET from weighable lysimeters (Gebler et al., 2015;Hirschi, Michel, Lehner, & Seneviratne, 2017), and thereby no other estimation methods have been investigated on a long-term basis. Our hypothesis is that the imbalance in the energy balance using the EC method is, to a lesser degree, caused by errors in the LE estimates but can mainly be attributed to errors in the other energy flux components or unaccounted effects. The findings can provide useful quality control on the measurement of LE in micrometeorological flux studies.
A data-driven approach is adopted where aggregated daily ET_wb estimates from the water balance method are compared with ET_EC for a 6-yr period (2010-2015). In contrast with previous studies using soil sensors for evaluating the water balance, this study is based on long-term daily time series. Most studies using soil sensors evaluate the water balance for time periods from a few days to 6 mo (Ding et al., 2010; Imukova et al., 2016; Schelde et al., 2011; Vásquez et al., 2015). They thereby exclude the daily and seasonal fluctuations in ET, which are highly relevant in land surface modeling. Moreover, many previous studies do not evaluate ET throughout periods with precipitation (Hirschi et al., 2017; Ingwersen et al., 2011; Schelde et al., 2011; Wilson, Hanson, Mulholland, Baldocchi, & Wullschleger, 2001). During dry periods, measurements of temporal changes in SWC are directly comparable with ET because precipitation, irrigation, and recharge in Equation 3 can be neglected. This study contains measurements of all major water balance terms, and therefore ET_wb is calculated as a residual term, both during and in between precipitation events. When computing the water balance, ET_wb is estimated using a cosmic-ray neutron sensor (CRNS) and capacitance soil moisture sensors, respectively, for estimating soil moisture change (ΔSWC) in the root zone. Below the root zone, ΔSWC is estimated using time domain reflectometry (TDR).

Study site
The study site is an agricultural field observatory (56.0373° N, 9.1608° E; 62 m asl) located in the Skjern River catchment in the western part of Denmark. The climate is temperate, with a yearly average temperature of 8.5 °C and an undercatch-corrected annual precipitation of 983 mm yr−1 in the period 2010-2015. Glacial processes formed the landscape, resulting in a terrain that is practically flat with loose glacial and meltwater deposits that dominate the subsoil. The soil, classified as a Spodosol, consists of coarse sand below a 0.25- to 0.5-m-thick organic topsoil. The groundwater level is located well below the root zone at a depth of approximately 5-6 m. There is an intensive production of winter and spring barley (Hordeum vulgare L.) at the site. The farmer applies 20-25 mm of irrigation several times during the growing season from April to June.
The hydrological observatory has been operational since 2009 (Jensen & Refsgaard, 2018). Figure 1 provides an overview of the field location and some of the installed measurement devices. The instrumentation is placed inside a 16-m × 80-m plot vegetated with short grass and weeds. The instrumentation plot is located within a 300-m × 400-m agricultural field and is surrounded by fields with similar crop heights.

Measurement setup
Instrumentation, data issues, and analysis methods vary for each flux and state variable in the energy and water balance equations. As a result, they are described individually below. Instrument locations can be found in Figure 1.

Latent and sensible heat
The EC measurement device estimates LE and H and represents a footprint on the order of hundreds of meters. The EC tower is equipped with a sonic anemometer (R3-50, Gill Instruments) at 12 m above ground level (agl). From an inlet close to the sonic anemometer, a tube moves air into an open-path CO2/H2O gas analyzer (LI-7500, LI-COR). All measurements are processed using EddyPro version 4.2.1 software (LI-COR) as described by Jensen, Herbst, and Friborg (2017). The 30-min data of LE and H fluxes from the gas analyzer are post-processed using the quality-control criteria suggested by Foken et al. (2005). Rejected LE values are replaced by gap-filled values by applying the REddyProcWeb online tool (https://www.bgc-jena.mpg.de/bgi/index.php/Services/REddyProcWebGapFilling; Reichstein et al., 2005). Information on the measurement method and post-processing of the EC data at the field site can be found elsewhere (Jensen et al., 2017; Schelde et al., 2011).
The gap-filled 30-min LE fluxes are added up to daily values and are converted to ET_EC (mm d−1) by

$$ET_{EC} = \frac{LE}{\rho_w \lambda} \tag{4}$$

where ρ_w (kg m−3) is the density of water. The heat of vaporization λ (MJ kg−1) is a function of the temperature T (°C), described by the equation given by Ding et al. (2010) (Equation 5). Daily values are only calculated if all 48 30-min values are available for the specific day. Likewise, 30-min fluxes of H are also summed to daily values.
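The conversion from latent heat flux to a water depth can be sketched as follows. Several details here are assumptions rather than the authors' implementation: the half-hourly LE values are taken to be mean fluxes in W m−2, the density of water is fixed at 1000 kg m−3, and the heat of vaporization uses a widely used linear approximation whose coefficients may differ from the equation of Ding et al. (2010). All function names are hypothetical.

```python
RHO_W = 1000.0  # density of water in kg m-3 (assumed constant)


def latent_heat_of_vaporization(t_celsius):
    """Heat of vaporization in MJ kg-1 (common linear approximation, assumed)."""
    return 2.501 - 0.002361 * t_celsius


def daily_et_from_le(le_30min_w_m2, t_mean_celsius):
    """Convert 48 half-hourly LE means (W m-2) to daily ET (mm d-1).

    Returns None for incomplete days, mirroring the rule that a daily value
    requires all 48 half-hourly values.
    """
    if len(le_30min_w_m2) != 48 or any(v is None for v in le_30min_w_m2):
        return None
    # W m-2 over 1800 s gives J m-2; dividing by 1e6 gives MJ m-2
    le_daily_mj = sum(v * 1800.0 for v in le_30min_w_m2) / 1e6
    lam = latent_heat_of_vaporization(t_mean_celsius)
    # MJ m-2 / (kg m-3 * MJ kg-1) = m of water; times 1000 gives mm
    return le_daily_mj / (RHO_W * lam) * 1000.0


# Hypothetical day with a constant LE of 50 W m-2 at 15 degrees C
et_day = daily_et_from_le([50.0] * 48, 15.0)
print(round(et_day, 2), "mm d-1")
```

Under these assumptions, a constant flux of 50 W m−2 corresponds to somewhat less than 2 mm of water per day.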
Evaluating the uncertainty for the LE data provides an indication of the accuracy of the estimate. As only one measurement device is available, uncertainty cannot be assessed directly. Instead, the random uncertainties related to sampling error and gap-filling of the measured 30-min LE data, in terms of SD, were summed to yearly values following the method described in Jensen et al. (2017) (Equation 6), where SD_LE is the uncertainty for annual values of LE, u_r,30min and u_g,30min are the 30-min uncertainties related to sampling errors and gap-filling, respectively, N is the total number of 30-min flux averaging intervals in a year, and G is the number of gap-filled 30-min flux averaging intervals. Uncertainties of daily and monthly LE values are derived in a similar manner.

Vadose Zone Journal

FIGURE 1 Experimental site. EC, eddy covariance; CRNS, cosmic-ray neutron sensor; TDR, time domain reflectometry

Precipitation and irrigation
The experimental site is equipped with six precipitation gauges:
1. An international reference pit gauge measures liquid precipitation (Goodison, Louie, & Yang, 1998). The pit gauge comprises a Pluvio2 automatic weighing precipitation device (OTT Hach Environmental) placed within a pit with the gauge orifice at the soil surface level. In order to reduce rain splash, a metal grid covers the surrounding pit. Wind correction is not necessary for the pit gauge.
2. A Double Fence International Reference (DFIR) (Goodison et al., 1998) measures solid precipitation during the cold season (i.e., approximately from November to March). The DFIR comprises a Pluvio2 automatic weighing precipitation gauge placed at the center of the double fence with the gauge orifice at 3.0 m agl. The solid precipitation data are corrected for undercatch based on wind speed and precipitation intensity (Allerup, Madsen, & Vejen, 1997). The details on how to construct a DFIR can be found in Goodison et al. (1998).
3-4. Two standard unshielded weighing precipitation gauges, hereafter named Gauge Unshielded A and B. Both gauges are Pluvio2 automatic weighing precipitation devices with the gauge orifice at 1.5 m agl. The liquid precipitation data are corrected for undercatch based on wind speed and precipitation intensity (Allerup et al., 1997).
5. A standard unshielded Rimco 7499 tipping bucket precipitation gauge. The gauge orifice is at 1.5 m agl, and the measurements are corrected for undercatch similar to Gauges 3 and 4.
6. A shielded Pluvio2 automatic weighing precipitation gauge with the gauge orifice at 1.5 m agl. The measurements from this gauge are not corrected for undercatch.
The precipitation time series used in this study are on an hourly basis. Outliers, identified using a criterion of 1.5 times the interquartile range, are replaced with the mean of the remaining precipitation gauges at the specific time step. The hourly precipitation datasets are then aggregated to daily values, resulting in a 66-97% data coverage (Table 1). Daily values are only calculated if all 24 1-h values are available for the specific day. As mentioned above, the DFIR is only operational during wintertime and therefore has a relatively low data coverage of 32%.
A precipitation time series is constructed, termed the "best assembled precipitation dataset" (BAPD). This time series is developed based on expert judgement by combining data from the available precipitation gauges at the site, considering which gauge gives the most reliable estimate at a given point in time given the available data, experience, and weather conditions. Table 2 lists the overall guideline on how to prioritize the individual precipitation gauges. In the construction of BAPD, the pit gauge data comprise 74% of the hourly values, whereas 12% of the values are from Gauge Unshielded A, and 11% are from DFIR.
The precipitation gauges only partly capture irrigation applied to the surrounding field. On irrigation days, as obtained from the farmer's field management report, precipitation is set to 0, and the registered irrigation is assumed to be the input to the system. However, if the measured precipitation exceeds the recorded irrigation amount, then the surplus is maintained in the precipitation record.
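The rule for irrigation days can be expressed as a small function. This is a sketch of the rule as worded above, not the authors' processing code; the function name and the example amounts are hypothetical.

```python
def split_precip_and_irrigation(measured_p, recorded_i):
    """Return (precipitation, irrigation) in mm for one day.

    On irrigation days the gauge reading is replaced by the recorded
    irrigation; only a surplus above the recorded amount stays in the
    precipitation record.
    """
    if recorded_i <= 0:
        return measured_p, 0.0           # ordinary day: keep the gauge reading
    surplus = max(measured_p - recorded_i, 0.0)
    return surplus, recorded_i


print(split_precip_and_irrigation(3.0, 0.0))    # no irrigation recorded
print(split_precip_and_irrigation(8.0, 20.0))   # gauge caught part of the irrigation
print(split_precip_and_irrigation(26.0, 20.0))  # rain in addition to irrigation
```

The design keeps the total input on an irrigation day equal to the larger of the two records, so irrigation water caught by the gauges is not counted twice.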
On most occasions, precipitation events were detectable in all gauges, but the magnitude of the events differed among gauges, leading to uncertainty in the daily precipitation amount. The uncertainty for the BAPD dataset is estimated as the SD of the daily time series from the six precipitation gauges, excluding days with irrigation (Equation 7), where N is the number of precipitation gauges (N = 6).

Recharge
Recharge is estimated using nonweighable percolation lysimeters. The lysimeter facility consists of four buried containers with open tops and a collection well (Figure 1). The surface area of each lysimeter is 3.2 m × 3.88 m. In order to allow for normal field management across the entire field, the upper face of the lysimeters is at 0.6 m below ground level (bgl). The impermeable lower face slopes to facilitate collection of recharge water by a perforated tube and to prevent buildup of local water tables at the bottom of the lysimeters. As a result, the lysimeter depth varies from 1.7 to 2.1 m bgl. The lysimeters were backfilled, maintaining the order of the excavated soil horizons, and are therefore representative of the surrounding area in terms of soil conditions, vegetation, and management. Tipping buckets measure recharge from the individual lysimeters by counting the number of tips every 15 min. This results in four independent recharge time series. Further technical specification of the lysimeter facility at the field site can be found in Vásquez et al. (2015). The 15-min values of recharge are aggregated to daily values, and the four time series are subsequently averaged. If time steps are missing in all four time series, the data gaps in the mean daily time series are filled using linear interpolation. Data gaps account for <0.5% of the daily values and span a maximum of 14 consecutive days. Finally, the uncertainties related to the daily mean values are calculated as the SD of the four time series (Equation 8), where N is the number of lysimeters (N = 4). Calculation of SD on recharge results in an individual estimate of the uncertainty for each daily measurement. The four lysimeters are located so close together that the differences between the recharge estimates primarily relate to measurement accuracy. The calculated uncertainty relates only to a small degree to heterogeneities in environmental conditions such as topography, soil texture, and vegetation within the footprint of the EC tower.
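The daily averaging and spread across replicate lysimeters can be sketched as below. The sample SD with N − 1 in the denominator is an assumption, because the exact form of the equation is not reproduced here; the function name and the values are hypothetical.

```python
import statistics


def daily_recharge(lysimeter_values):
    """Mean and sample SD (mm d-1) across lysimeters for one day.

    None entries mark lysimeters without data on that day. The SD uses
    N - 1 in the denominator, which is an assumption.
    """
    valid = [v for v in lysimeter_values if v is not None]
    if not valid:
        return None, None                 # gap in all series; filled later
    mean = statistics.fmean(valid)
    sd = statistics.stdev(valid) if len(valid) > 1 else None
    return mean, sd


# Hypothetical daily recharge from four lysimeters (mm d-1)
r_mean, r_sd = daily_recharge([1.0, 2.0, 3.0, 2.0])
print(r_mean, r_sd)
```

The same pattern applies to any of the replicated measurements at the site, such as the precipitation gauges or the TDR clusters.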

Soil water content
Soil water content is a state variable of the water balance equation, and ΔSWC, calculated as the difference in SWC from day to day, represents the daily change in soil water content. The SWC in the root zone is estimated using various techniques representing different scales and sensing depths: (a) a single capacitance station, hereafter subscripted "point," installed within the instrumentation plot, (b) a capacitance field sensor network, hereafter subscripted "network," distributed within the surrounding agricultural fields, and (c) a CRNS. In addition, TDR probes, located within the lysimeters, estimate SWC at deeper soil layers. The paragraphs below elaborate the three SWC estimation methods. Figure 1 displays their locations.

Capacitance sensors
The capacitance sensors, 5TE from Decagon Devices (2014), measure temperature, bulk dielectric permittivity, and bulk electrical conductivity at 30-min intervals. The SWC is calculated from the measured apparent dielectric permittivity using the relationship of Topp, Davis, and Annan (1980). Since the dielectric permittivities of water and ice differ substantially, we discard measurements obtained at soil temperatures below 1 °C. The single capacitance station, placed within the instrumentation plot, collected data continuously throughout the period of 2010-2015. The station has five capacitance sensors: two at 0- to 5-cm depth, two at 20- to 25-cm depth, and one at 50- to 55-cm depth. Soil water contents are averaged to daily values on days where more than half of the data estimates are reliable; days with fewer reliable estimates are skipped. Daily changes in SWC for the point capacitance station (ΔSWC_point) are calculated for all five sensors, followed by averaging for each sensing depth. The underlying assumption in using the single capacitance station time series is that the day-to-day soil water changes within the instrumentation plot (ΔSWC_point) are comparable with ΔSWC in the surrounding agricultural area despite the different land covers.
The capacitance network consists of six measurement stations, each having the same measuring configuration as the single capacitance station. All capacitance stations in the capacitance network are located within the footprints of both the EC station and the CRNS (Figure 1). Twice a year, the stations were removed and reinstalled due to agricultural activities such as harvest, plowing, and sowing. This results in data gaps in the ΔSWC_network time series. A single capacitance sensor at 20- to 25-cm depth showed substantially smaller variation in SWC than the remaining sensors and has therefore been discarded. Furthermore, measurements from two sensors at 50- to 55-cm depth were discarded in the period of 28 June 2015 to 1 Aug. 2015, as the values were outside the expected range.
Below, the various approaches for estimation of uncertainty in ΔSWC_network and ΔSWC_point are described. The ΔSWC_network and the associated SD (SD_ΔSWC,network) are calculated for each sensing depth individually. The ΔSWC_network at each daily time step is given as the mean of ΔSWC_network for all sensors at each sensing depth (12 sensors at 0- to 5-cm depth, 11 sensors at 20- to 25-cm depth, and six sensors at 50- to 55-cm depth), and the uncertainty is given as the SD of ΔSWC_network among all sensors at the given time step (Equation 9), where N is the number of sensors at each sensing depth. The SD_ΔSWC,network thereby represents both the accuracy in the measurements and the uncertainty related to the heterogeneity in environmental conditions. The capacitance point dataset comprises only one capacitance station. To include spatial variability in the uncertainty estimate, the uncertainty of ΔSWC_point was calculated as the mean of SD_ΔSWC,network over the entire dataset (2010-2015):

$$SD_{\Delta SWC,\mathrm{point}} = \overline{SD_{\Delta SWC,\mathrm{network}}} \tag{10}$$

Hence, the SD of each individual daily measurement of ΔSWC_point is given as the mean of the SD of ΔSWC for the capacitance network, as this was assumed to be the best method to estimate uncertainty in ΔSWC_point. If Equation 9 were applied to calculate the uncertainty in ΔSWC_point, the uncertainty would be based on only five sensors (two sensors at 0- to 5-cm depth, two sensors at 20- to 25-cm depth, and one sensor at 50- to 55-cm depth). This would give an inadequate estimate of the SD, as only two sensors are available at each of the two upper sensing depths and no SD can be calculated for the deepest sensing depth, where only one sensor is available. Furthermore, only measurement accuracy and not environmental heterogeneity would be included in the uncertainty estimate.
In the calculation of the uncertainty in ΔSWC for the water balance calculations, the SDs are given in millimeters of water with (a) the capacitance sensor placed at 0-5 cm representing the top 15 cm of the soil column, (b) the capacitance sensor located at 20-25 cm representing the 15- to 30-cm soil column, and (c) the capacitance sensor located at 50-55 cm representing the 30- to 60-cm soil column. The total uncertainty in millimeters of water within the 0- to 60-cm soil column is estimated as the square root of the sum of variances of the three zones:

$$SD_{\Delta SWC,\,0\text{-}60\,\mathrm{cm}} = \sqrt{SD_{0\text{-}15\,\mathrm{cm}}^2 + SD_{15\text{-}30\,\mathrm{cm}}^2 + SD_{30\text{-}60\,\mathrm{cm}}^2} \tag{11}$$
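The combination of the three layers can be illustrated with a short sketch. It assumes that each SD is first available as a volumetric quantity (m3 m−3) and is converted to millimeters by multiplying with the thickness of the layer the sensor represents; the dictionary keys, function name, and SD values are hypothetical.

```python
import math

# Thickness in mm of the soil layer represented by each sensing depth
LAYER_THICKNESS_MM = {"0-15 cm": 150.0, "15-30 cm": 150.0, "30-60 cm": 300.0}


def total_sd_mm(sd_volumetric):
    """Combine per-layer SDs (m3 m-3) into one SD in mm for the 0-60 cm column.

    Layers are treated as independent, so their variances are summed.
    """
    variances = [
        (sd_volumetric[layer] * thickness) ** 2
        for layer, thickness in LAYER_THICKNESS_MM.items()
    ]
    return math.sqrt(sum(variances))


# Hypothetical volumetric SDs for the three layers
sd_column = total_sd_mm({"0-15 cm": 0.004, "15-30 cm": 0.003, "30-60 cm": 0.002})
print(round(sd_column, 3), "mm")
```

Because the deepest layer is twice as thick as the others, a small volumetric SD there can contribute as much to the total as a larger SD near the surface.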

Cosmic-ray neutron sensor
Daily SWC in the root zone was also estimated using the CRNS method (Andreasen et al., 2017; Zreda, Desilets, Ferré, & Scott, 2008). This method estimates SWC in the upper decimeters of the soil within hectometers of the detector (Zreda et al., 2008). Accordingly, CRNS provides estimates of SWC at a scale useful for studies of land surface processes and on the same order as the footprint of EC. Water stored in the biomass is negligible compared with the SWC and has not been considered. The method takes advantage of the inverse relationship between measured cosmic-ray neutron intensities in the epithermal energy range and SWC. Andreasen et al. (unpublished data, 2020) carried out site-specific calibration using multiple soil samples. The CRNS station at the site has been operational since March 2013.
A standard approach to estimate the uncertainty of the CRNS method relates to the measured neutron count rates (Andreasen et al., 2017). This uncertainty only reflects the uncertainty in obtaining correct neutron counts by the detector. Here, we adopt another method to estimate the uncertainty. We identified a stable period of 14 d with little variation in the capacitance network SWC dataset (5-18 Oct. 2013). We defined uncertainty in ΔSWC_CRNS as the SD of the 13 daily ΔSWC_CRNS estimates during the stable period (Equation 12), where N is the number of ΔSWC measurements with CRNS during the stable period (N = 13). We assign this uncertainty to all daily time steps of the CRNS time series.

Time domain reflectometry
In order to evaluate daily SWC below the root zone, 1-m-long custom-made TDR probes (Vásquez et al., 2015) installed vertically in each lysimeter are used. There are four clusters with nine TDR probes in each, for a total of 36 probes. The average SWC for each cluster is used for the further analysis. The tops of the TDR probes are at the level of the upper face of the lysimeters, so that the probes extend vertically from 0.6 to 1.6 m bgl. Further technical specification of the TDR probes at the field site can be found in Vásquez et al. (2015). We define the uncertainty in ΔSWC_TDR as the SD of the four clusters (Equation 13), where N is the number of TDR clusters (one in each lysimeter, N = 4). Similar to the uncertainty estimate of the recharge, the uncertainty in ΔSWC_TDR mainly represents the measurement accuracy of the TDR probes and, to a minor degree, the variability caused by differences in environmental settings.

Ancillary measurements
Meteorological instrumentation includes several anemometers across the site for measuring wind speed and wind direction. Furthermore, the flux mast is equipped with a temperature and relative humidity sensor (HMP 45C, Vaisala Oyj) and a four-component radiation sensor at 4 m agl (NR01, Hukseflux Thermal Sensors). Soil heat flux (G) is determined using two heat flux plates (HFP01, Hukseflux Thermal Sensors) placed at 0.05 m bgl. Because of agricultural management, the heat flux plates were removed and reinstalled twice a year. In the period of 2010-2015, 7% of the 30-min measurements from the soil heat flux plates are missing. Throughout the calculation of soil heat flux G, we have assumed that the changes in heat storage above the plates are negligible. All ancillary data are stored at 30-min intervals and are subsequently aggregated to daily values. Ringgaard et al. (2011) provide details about the equipment and post-processing of ancillary data.

Land surface energy balance and water balance
The relative energy balance ratio (EBC_ratio) in Equation 2 is a measure of the energy balance closure. In addition, we compute the absolute energy balance discrepancy (D_energy; Equation 14) to provide a more exhaustive analysis of the energy balance closure. Likewise, the relative water balance closure ratio (WBC_ratio; Equation 15) and the absolute water balance discrepancy (D_water; Equation 16) describe the water balance closure, with I denoting irrigation. Note that ET_EC represents the actual ET in Equations 15 and 16. When evaluating the yearly water balance, changes in soil water content (ΔSWC) are neglected. However, when computing monthly water balances, ΔSWC_point is used to account for monthly changes of soil water storage. The SWC_point dataset has only minor data gaps.

Estimation of ET wb
Equation 3 describes the water balance equation for a given soil volume. Equation 3 is only applied on days where all three components (P, R, and either ΔSWC_point, ΔSWC_network, or ΔSWC_CRNS) are available. It is assumed that daily changes in SWC in the deeper soil layers have a negligible effect on the water balance. Therefore, ET_wb is still calculated, even though there is a data gap in the daily time series of ΔSWC_TDR. The same applies if ΔSWC from one of the three capacitance sensing depths is missing. Three separate water balance approaches are applied depending on the method for estimating ΔSWC, each representing a different soil sensing volume. The ΔSWC_point and ΔSWC_network represent ΔSWC for 0- to 60-cm depth, whereas ΔSWC_CRNS represents 0- to 20-cm depth. The ΔSWC for the deeper soil layers represented by ΔSWC_TDR depends on the method for estimating SWC in the root zone. If capacitance sensors are used for estimating ΔSWC in the root zone, then SWC_TDR represents 60- to 160-cm depth, but if CRNS is used for estimating ΔSWC in the root zone, then SWC_TDR represents 20- to 160-cm depth. The water balance equations read accordingly:

$$ET_{wb,\mathrm{point}} = P + I - R - \Delta SWC_{\mathrm{point},\,0\text{-}60\,\mathrm{cm}} - \Delta SWC_{\mathrm{TDR},\,60\text{-}160\,\mathrm{cm}} \tag{17}$$

$$ET_{wb,\mathrm{network}} = P + I - R - \Delta SWC_{\mathrm{network},\,0\text{-}60\,\mathrm{cm}} - \Delta SWC_{\mathrm{TDR},\,60\text{-}160\,\mathrm{cm}} \tag{18}$$

$$ET_{wb,\mathrm{CRNS}} = P + I - R - \Delta SWC_{\mathrm{CRNS},\,0\text{-}20\,\mathrm{cm}} - \Delta SWC_{\mathrm{TDR},\,20\text{-}160\,\mathrm{cm}} \tag{19}$$

In Equations 17-19, ΔSWC in cubic meters per cubic meter is converted to millimeters according to the specified representative volume. Furthermore, ΔSWC below 160 cm is not considered.
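The three approaches differ only in the depth assigned to the root-zone sensor and, consequently, to the TDR probes. A minimal sketch under that reading follows; treating a missing TDR value as zero change reflects the stated assumption that deeper changes are negligible, and all names and numbers are hypothetical.

```python
def to_mm(d_theta, top_cm, bottom_cm):
    """Convert a volumetric change (m3 m-3) to mm of water for a soil layer."""
    return d_theta * (bottom_cm - top_cm) * 10.0  # 1 cm of soil = 10 mm


def et_wb(p, i, r, d_theta_root, d_theta_tdr, root_depth_cm):
    """Water balance ET (mm) for one day.

    root_depth_cm is 60 for the capacitance approaches and 20 for CRNS;
    the TDR layer then spans root_depth_cm to 160 cm. A missing TDR value
    (None) is treated as zero change.
    """
    d_root = to_mm(d_theta_root, 0.0, root_depth_cm)
    d_deep = 0.0 if d_theta_tdr is None else to_mm(d_theta_tdr, root_depth_cm, 160.0)
    return p + i - r - d_root - d_deep


# Hypothetical dry day: 0.5 mm recharge, soil drying in both layers
et_capacitance = et_wb(0.0, 0.0, 0.5, -0.004, -0.0005, root_depth_cm=60.0)
et_crns = et_wb(0.0, 0.0, 0.5, -0.004, -0.0005, root_depth_cm=20.0)
print(round(et_capacitance, 2), round(et_crns, 2))
```

With identical volumetric changes, the two variants give different ET because the same change is applied to different layer thicknesses, which illustrates why the representative volume matters.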
Comparing daily ET_EC with daily ET_wb allows insights into the performance of the EC method. Twenty-four hours is regarded as the smallest possible time step for comparison between the EC and the water balance approaches. Subsequently, daily ET_wb and ET_EC are aggregated to monthly and yearly values.

Autocorrelation and uncertainty estimation
In order to allow for aggregation of daily variances of ET_EC, P, R, and ΔSWC to monthly and yearly values, an autocorrelation analysis was performed. Uncertainty in daily ET_EC, P, R, and ΔSWC was tested for autocorrelation by calculating Pearson's correlation coefficient with a lag time of 1 d for the whole time series of each variable. For all datasets, the autocorrelation analysis is carried out on the daily time series of variances.
When including autocorrelation in the aggregation of uncertainties, correlated daily variances are aggregated to monthly and yearly values as

$$SD_{\mathrm{sum}}^2 = \sum_{i=1}^{N} SD_i^2 + 2 r_1 \sum_{i=1}^{N-1} SD_i \, SD_{i+1} \tag{20}$$

Here, r_1 is Pearson's autocorrelation coefficient with lag 1, N is the number of time steps, and SD^2 and SD are the variance and the SD, respectively, of the corresponding variables ET_EC, P, R, and ΔSWC. Only the term associated with autocorrelation with a lag of 1 d is included in Equation 20. All time series of variances reveal very low correlation for lag times > 1 d (data not shown); therefore, we neglect contributions from autocorrelations for higher lag times.
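The effect of the lag-1 term can be illustrated with a sketch. It assumes the standard expression for the variance of a sum with correlation between neighboring time steps only, which may differ in detail from the authors' equation; the function name and the daily SD values are hypothetical.

```python
import math


def aggregate_sd(daily_sd, r1=0.0):
    """Aggregate daily SDs to the SD of the period sum.

    Assumes correlation only between neighboring days (lag 1) with
    coefficient r1; r1 = 0 reduces to a plain sum of variances.
    """
    variance = sum(s * s for s in daily_sd)
    # Covariance contribution from each pair of consecutive days
    neighbors = sum(a * b for a, b in zip(daily_sd, daily_sd[1:]))
    return math.sqrt(variance + 2.0 * r1 * neighbors)


# Hypothetical month with a daily SD of 1 mm
independent = aggregate_sd([1.0] * 30)
correlated = aggregate_sd([1.0] * 30, r1=0.5)
print(round(independent, 2), round(correlated, 2))
```

Positive autocorrelation increases the aggregated uncertainty, so neglecting it where it is significant would make monthly and yearly totals look more certain than they are.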
Assuming that the components in the water balance, P, R, and ΔSWC, are statistically independent, the uncertainty in ET_wb is estimated as the sum of variances:

$$SD_{ET,wb}^2 = SD_P^2 + SD_R^2 + SD_{\Delta SWC}^2 \tag{21}$$

The variance associated with ΔSWC is either the sum of the variances of ΔSWC_point and ΔSWC_TDR,60-160cm, the sum of the variances of ΔSWC_network and ΔSWC_TDR,60-160cm, or the sum of the variances of ΔSWC_CRNS and ΔSWC_TDR,20-160cm.
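Under the independence assumption stated above, the component uncertainties combine in quadrature. The following sketch shows the arithmetic only; the function name and the SD values are hypothetical.

```python
import math


def sd_et_wb(sd_p, sd_r, sd_dswc_root, sd_dswc_tdr):
    """SD of ET_wb (mm) from independent component SDs (mm).

    The soil water term combines the root-zone sensor (capacitance or CRNS)
    with the TDR layer below it.
    """
    var_dswc = sd_dswc_root ** 2 + sd_dswc_tdr ** 2
    return math.sqrt(sd_p ** 2 + sd_r ** 2 + var_dswc)


# Hypothetical daily SDs: precipitation, recharge, root zone, deeper layer
sd_total = sd_et_wb(sd_p=0.3, sd_r=0.4, sd_dswc_root=1.2, sd_dswc_tdr=0.0)
print(round(sd_total, 2), "mm")
```

In this example the soil water term dominates the total, which shows how the largest single component tends to control the uncertainty of the residual.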

Evaluation of the water and energy balance terms
Below, aggregated measurements for each of the water and energy balance components are presented. Figure 2 displays daily time series of all flux and state variables in the energy and water balance equations. In order to sum daily variances of ET_EC, P, R, and ΔSWC to monthly and yearly values, the autocorrelation is evaluated. The correlation coefficients related to the variance of ΔSWC_CRNS, the variance of P, and the variance of ET_EC are not significantly different from 0 (p value > .05, data not shown); thus, aggregation of these variances does not include the autocorrelation term. Table 3 shows the Pearson's autocorrelation coefficients with a lag of 1 d. All correlation coefficients in Table 3 are significantly different from 0 (p value < .05). Therefore, the correlation must be included in the summation of variances, as shown in Equation 20. The SD of ΔSWC_point is defined as the mean of the SD of ΔSWC_network (Equation 10). Therefore, Equation 20 applies the correlation coefficient related to the variance of ΔSWC_network when calculating the variance of ΔSWC_point.

TABLE 3 Autocorrelation coefficient (Pearson's r) and number of daily values (n) for time series of variances (SD^2) of recharge (R) and daily change in soil water content (ΔSWC) from capacitance network and time domain reflectometry (TDR) with a lag of 1 d

Evapotranspiration from eddy covariance
Evapotranspiration shows a highly seasonal behavior with very low values during the cold months (ET in Figure 2a and LE in Figure 2b). The mean yearly ET EC is 453 ± 3 mm (Table 4). The ET EC in 2015 is substantially higher than in other years, probably because ongoing inspection and maintenance of the EC instrument were lacking that year. Gap filling of EC data (LE data) introduces uncertainty. Missing data values of LE originate mainly from periods with low turbulence during nighttime and from periods with rain (Ringgaard et al., 2011). Approximately 40% of the LE values are gap filled, which is in line with other studies (Moffat et al., 2007). Although ET is minor during rain events, where air humidity above the smooth surface of the agricultural field is close to saturation, evaporation of intercepted water immediately after rain events can be substantial, and missing it can entail underestimation of ET.
Estimates of ET EC and the associated uncertainty represent integrated values within the footprint. According to Schelde et al. (2011), 50% of the flux originates from a distance of <250 m from the EC mast, and 80% of the flux originates from a distance of <800 m. Jensen et al. (2017) substantiate that the variations in footprint at the experimental site are only of moderate importance for the uncertainty of ET EC .

Vadose Zone Journal
FIGURE 2 (a) Daily water balance components: evapotranspiration from eddy covariance (ET EC ), precipitation (P), irrigation (I), recharge (R), and soil water content (SWC), where the subscript "point" refers to a single capacitance station, subscript "network" refers to a capacitance field sensor network, and subscript "CRNS" refers to a cosmic-ray neutron sensor. (b) Daily values for land surface energy balance components: net radiation (R n ), latent heat flux (LE), sensible heat flux (H), and ground heat flux (G).

Precipitation and irrigation
On average, every second day is a rainy day and the average rain amount on a rainy day is ∼5 mm. The normalized yearly precipitation amounts for the five liquid precipitation gauges are in the range of 915-1,088 mm (Table 1). The mean yearly precipitation based on BAPD (2010-2015) is 983 ± 9 mm (Table 4). The yearly number of irrigation events ranges from one to seven, applying between 23 (2012) and 158 mm (2013) during the irrigation season (Figure 2, Table 4). Figure 3 shows linear regressions between measurements of daily liquid precipitation from the pit and from the other gauges installed at the field site. The slope of the linear regression line is between 0.97 and 1.05 and Pearson's correlation coefficient is between .98 and .99, indicating only minor differences between precipitation datasets. Precipitation amounts from DFIR are slightly smaller than precipitation amounts from the pit. All precipitation gauges are located within a maximum distance of ∼20 m (Figure 1). Because of this, they show quite similar precipitation intensities; however, they do not represent the spatial variability within the EC footprint, and the overall uncertainty is probably underestimated.

Recharge
A limited number of events with large fluxes, typically observed in autumn and winter, dominate the recharge (Figure 2a). During summer, recharge is generally low. The yearly recharge ranges from 514 ± 12 mm in 2010 to 810 ± 4 mm in 2015. The yearly average recharge (2010-2015) is 632 ± 6 mm (Table 4). The variation between the outflows from the four individual lysimeters is very small.

Soil water content
Agreement between SWC CRNS and the capacitance sensor network, as well as a soil sampling campaign, underlines the reliability of the SWC CRNS at the field site (Andreasen et al., unpublished data, 2020). Figures 4a, 4b, and 4c illustrate the linear regression between measured absolute SWC of the three datasets in the topsoil. The best regression is obtained on SWC network /SWC CRNS (Figure 4b) with a slope of 0.94 and Pearson's correlation coefficient of .86, and on SWC point /SWC network (Figure 4c) with a slope of 0.91 and Pearson's correlation coefficient of .81.
The daily change in soil water content (ΔSWC) is the variable that enters the water balance equation. Figures 4d, 4e, and 4f show linear regressions between the different estimation methods of daily ΔSWC. The linear regression is poorer for ΔSWC than for SWC.
Since the focus in this paper is on relative differences in soil moisture, the uncertainty in the absolute value of SWC is of no relevance for the analysis. Uncertainty in SWC is only estimated for demonstrating the difference in uncertainty in SWC and ΔSWC. The mean uncertainty, calculated from Equation 9, on daily SWC network (2010-2015) (SD SWC,network ) is 0.05 m³ m⁻³. The equivalent value for ΔSWC (SD ΔSWC,network ) is 0.004 m³ m⁻³. Both values are here given as the mean SD over the three sensing depths. Thus, the uncertainty in ΔSWC network is roughly a factor of 10 lower than the uncertainty in SWC network . This is illustrated in Figures 5a and 5b, where SWC and ΔSWC from the capacitance network and CRNS for a stable and unstable period are shown, respectively. The difference in ΔSWC network from the 29 individual capacitance sensors (Figure 5b) is much lower than the difference in SWC network (Figure 5a). The capacitance network consists of 29 sensors in total, and the estimated uncertainties include uncertainty due to spatial variability as well as measurement accuracy. As stated in Equation 10, SD ΔSWC,point is defined as SD ΔSWC,network . As expected, the spread in ΔSWC network from the 29 individual capacitance sensors is much lower for the stable period (Figure 5a) than for the unstable period (Figure 5b).
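One reason the uncertainty in ΔSWC can be much smaller than that in SWC is that constant sensor-specific offsets cancel when differencing. The sketch below illustrates this with made-up readings from four hypothetical sensors; it does not reproduce the calculation in Equation 9, and the between-sensor SD is used here only as a simple stand-in for the uncertainty.

```python
import statistics

# Hypothetical readings (m3 m-3) from four sensors on two consecutive days.
# Each sensor carries its own roughly constant offset, e.g. from calibration
# or local soil properties.
day1 = [0.20, 0.25, 0.30, 0.35]
day2 = [0.21, 0.27, 0.31, 0.37]

# Spread between sensors in absolute soil water content on day 2.
sd_swc = statistics.stdev(day2)

# Spread between sensors in the daily change; the offsets cancel here.
delta = [b - a for a, b in zip(day1, day2)]
sd_dswc = statistics.stdev(delta)
```

In this constructed case the spread in the daily change is about an order of magnitude smaller than the spread in the absolute values.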
Only one CRNS instrument is available, and therefore the uncertainty cannot be assessed directly but is instead estimated as the SD during the stable period. The estimated uncertainty in ΔSWC CRNS is 0.008 m³ m⁻³ and relates to both instrumental and data analysis uncertainties. The footprint area of the EC system is of similar size as for CRNS. The CRNS estimates inherently take into consideration the spatial heterogeneity within the footprint. The CRNS yields an integrated measure of SWC in the upper ∼20 cm of the soil column, whereas the individual capacitance sensors represent point measurements at 0- to 5-cm depth, 20- to 25-cm depth, and 50- to 55-cm depth, respectively.

Note. D water , absolute water balance discrepancy; WBC ratio , water balance closure ratio.

FIGURE 3 Linear regression between precipitation estimated with the pit gauge and remaining precipitation gauges (mm d⁻¹). DFIR, Double Fence International Reference.

Evaluation of energy and water balances
The sections below analyze the components of the water and energy balance equations and their associated uncertainties. The LE (ET EC ) is a shared component in the water and the land surface energy balance equations. Figure 2b illustrates the daily energy balance components, and Table 5 presents the yearly accumulated energy balance components. The EBC ratio for the period of 2010-2015 is in the range of 0.71-0.91, with an average of 0.79, demonstrating a lack of energy balance closure. This agrees with the widely documented fact that the energy balance does not come to closure when using the EC method (Leuning et al., 2012; Wilson et al., 2002). The EBC ratio of 0.91 in 2015 is considerably higher than in the remaining years, which is a result of a 23% higher LE flux that year compared with the average LE flux in 2010-2014. This is probably due to instrumental problems. Figure 6b displays the monthly EBC ratio for the period of 2010-2015. In warm months (approximately April-September), where the energy fluxes are high, the monthly EBC ratio is between 0.6 and 1.0. During the cold months (approximately October-March), where the energy fluxes are low, the energy fluxes are even more in imbalance. However, the energy fluxes during wintertime contribute little to the overall EBC ratio.
As indicated by an average yearly WBC ratio of 1.02 (Table 4), the average water inflow (P and I) is 2% less than the average outflow (R and ET). Nonetheless, the WBC ratio is in agreement with other water balance studies (Qu et al., 2016; Wiekenkamp et al., 2016). The average annual water balance discrepancy D water is −18 ± 12 mm; assuming normally distributed data, the discrepancy is within the 95% confidence interval of 2 SD. The year 2012 stands out with a D water of −83 ± 11 mm and a WBC ratio of 1.08. The farmer reported only one irrigation event that year, which is most likely less than the real irrigation amount. However, it was not possible to identify additional irrigation events from the precipitation and SWC records.
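The closure ratios and the 2 SD check can be sketched as follows. The definitions are assumptions consistent with the description above, namely EBC ratio = (H + LE)/(R n − G) and WBC ratio = (R + ET)/(P + I); the function names and flux values are hypothetical, whereas the two D water values are the ones reported in the text.

```python
def ebc_ratio(h, le, rn, g):
    """Assumed definition: turbulent fluxes over available energy."""
    return (h + le) / (rn - g)


def wbc_ratio(p, i, r, et):
    """Assumed definition: water outflow over water inflow."""
    return (r + et) / (p + i)


def within_two_sd(value, sd):
    """True if a discrepancy lies within +/- 2 SD of zero."""
    return abs(value) <= 2.0 * sd


# Illustrative fluxes only (W m-2 and mm yr-1, respectively).
ebc = ebc_ratio(h=30.0, le=49.0, rn=100.0, g=0.0)
wbc = wbc_ratio(p=950.0, i=50.0, r=600.0, et=420.0)

# Discrepancies as reported above: mean annual value and the year 2012.
mean_ok = within_two_sd(-18.0, 12.0)
year_2012_ok = within_two_sd(-83.0, 11.0)
```

Under these definitions a WBC ratio above 1 means that the measured outflow exceeds the measured inflow, and the 2012 discrepancy falls outside the 2 SD band, whereas the mean annual discrepancy does not.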

Water balance
Monthly WBC ratio values ( Figure 6a) are randomly distributed around a ratio of 1. Since the water balance is unbiased during the irrigation season from May to August, there is no indication that the bias on the annual WBC ratio originates from incorrect recording of irrigation amounts.

Comparison of ET EC and ET wb
Daily ET wb calculated with Equations 17-19 fluctuates strongly (data not shown). Uncertainties in the water balance components, mainly the soil water storage term, result in unrealistic negative values of daily ET wb . Hence, on short timescales, direct comparison of ET EC and ET wb is not feasible. However, when summing to monthly and annual values, the storage term becomes of minor importance, and a direct comparison is thus justified.
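The effect of aggregation can be illustrated with a minimal sketch that assumes the generic residual form ET wb = P + I − R − ΔSWC (Equations 17-19 are not reproduced here). The function name and the 4-d series are hypothetical.

```python
def et_residual(p, i, r, dswc):
    """ET as the residual of the water balance (assumed generic form)."""
    return p + i - r - dswc


# Hypothetical 4-day series (mm d-1): one rain event followed by drainage.
p = [0.0, 12.0, 0.0, 0.0]
i = [0.0, 0.0, 0.0, 0.0]
r = [0.5, 1.0, 6.0, 1.5]
dswc = [-2.0, 9.0, -5.0, -3.0]

daily_et = [et_residual(*values) for values in zip(p, i, r, dswc)]
total_et = sum(daily_et)        # aggregated residual over the period
net_storage = sum(dswc)         # storage change largely cancels out
```

In this made-up series one daily residual is negative, because recharge and storage change are not perfectly synchronized, whereas the aggregated residual is positive and the net storage change is small compared with the fluxes.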
In Figure 7, monthly values of ET EC and ET wb and their associated uncertainty intervals (±1 SD) are compared. The secondary axis of Figure 7 shows the absolute difference between monthly ET EC and ET wb (gray curve). The ET wb (blue curve) is found using three different methods for estimating ΔSWC: (a) point capacitance sensor (ET wb,point ), (b) capacitance sensor network (ET wb,network ), and (c) CRNS (ET wb,CRNS ). All three methods for estimating ΔSWC in the root zone result in monthly ET wb that match monthly ET EC remarkably well, and the uncertainty intervals are close in most months. The secondary axis of Figure 7 reveals the highest discrepancy between ET EC and ET wb in spring and summer months. The ET wb is close to zero during winter months and therefore underestimated compared with ET EC (Figure 7). However, in winter 2010-2011, frost and redirection of snowmelt infiltration disturbed the estimates of the water balance components (Figure 6a).

FIGURE 7 Monthly accumulated evapotranspiration estimated with the eddy covariance method (ET EC ) and the water balance method (ET wb ). Uncertainty band indicates ±1 SD. Soil water content (SWC) measured by (a) capacitance point sensor, (b) capacitance sensor network, and (c) cosmic-ray neutron sensor (CRNS). Secondary axes show the absolute difference between ET EC and ET wb . Blue pillars indicate periods with frost.
For the three different methods of estimating ΔSWC, Pearson's correlation coefficient on monthly ET EC and ET wb is between .92 and .96 (Figure 8). The assumption that point-estimated ΔSWC from the single capacitance station is comparable with ΔSWC in the surrounding agricultural area seems justified, as the Pearson's correlation coefficient between the single capacitance station and the capacitance network is close to 1.
In a paired t test, the null hypothesis is that the mean of the pairwise differences is zero (H 0 : μ d = 0). We assume monthly ET to be normally distributed. Through the paired t test, we found that there is no statistically significant difference between monthly ET EC and ET wb,point (paired t [df = 70] = 1.15, p value = .26). Likewise, there is no significant difference between monthly ET EC and ET wb,network (paired t [df = 29] = 0.04, p value = .97), or between monthly ET EC and ET wb,CRNS (paired t [df = 34] = 0.55, p value = .58). Consequently, all results show no significant difference between monthly ET EC and ET wb .
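For readers unfamiliar with the test, the statistic can be sketched in a few lines using only the Python standard library. The function name and the six monthly values are hypothetical and not taken from the dataset.

```python
import math
import statistics


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for H0: mean(x - y) = 0."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = statistics.mean(d)
    sd_d = statistics.stdev(d)           # sample SD of the differences
    t = mean_d / (sd_d / math.sqrt(n))   # mean difference over its standard error
    return t, n - 1


# Hypothetical monthly ET sums (mm) from the two methods.
et_ec = [10.0, 25.0, 60.0, 90.0, 70.0, 30.0]
et_wb = [8.0, 28.0, 57.0, 94.0, 69.0, 31.0]
t_stat, df = paired_t(et_ec, et_wb)
```

For df = 5 the two-sided critical value at the 5% level is 2.571, so the small |t| obtained for these made-up numbers would not lead to rejection of H 0 . A p value requires the t distribution, which is available, for example, through `scipy.stats.ttest_rel`.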
Tables 6-8 specify the individual components of the water balance (P, I, R, and ΔSWC) and their associated uncertainties (SD P , SD R , and SD ΔSWC ), together with estimated absolute ET wb and ET EC and their associated uncertainties (SD ET,wb and SD ET,EC ). All data are summed to yearly values. Note the difference in data coverage caused by periods with missing data. Yearly accumulated estimates from each method cannot be directly compared, as data coverage (n) differs. The ET wb,point dataset (2010-2015) has a data coverage of 88%, whereas data coverage (2013-2015) of the ET wb,network and ET wb,CRNS is 64 and 89%, respectively.
Tables 6-8 show that in all years and with all methods for estimating ΔSWC, propagated uncertainty on ET EC is smaller than the uncertainty on ET wb . Average yearly uncertainty in ET EC,network and ET EC,CRNS is 2 and 3 mm yr⁻¹, respectively, whereas corresponding uncertainty in ET wb,network and ET wb,CRNS is 23 and 33 mm yr⁻¹, respectively.
As we assume that yearly ET is normally distributed, the absolute difference between ET EC and ET wb is for most years within the 95% confidence interval of 2 SD. However, the difference between ET EC and ET wb in 2011, 2012, and 2013 when using SWC from the capacitance point sensor is not within the 95% confidence interval. Overall, it can be concluded that the difference between yearly ET EC and ET wb is within the expected uncertainty. Furthermore, as with the monthly ET, there is no statistically significant difference between yearly ET EC and ET wb,point (paired t [df = 5] = 1.19, p value = .29). Likewise, there is no significant difference between yearly ET EC and ET wb,network (paired t [df = 2] = 0.08, p value = .94) or between yearly ET EC and ET wb,CRNS (paired t [df = 2] = 0.41, p value = .72).

FIGURE 8 Linear regression between monthly accumulated evapotranspiration estimated with the eddy covariance method (ET EC ) and the water balance method (ET wb ). Daily change in soil water content (ΔSWC) measured by (a) capacitance point sensor, (b) capacitance sensor network, and (c) cosmic-ray neutron sensor (CRNS). Error bars indicate ±1 SD.
Linear regression between all yearly values of ET EC and ET wb leads to a Pearson correlation coefficient of .73 (Figure 9). As diff ΔSWC,point is negative (−5%), whereas diff ΔSWC,network (+2%) and diff ΔSWC,CRNS (+5%) are both positive (Tables 6-8), there is no strong indication that yearly ET EC is consistently different from ET wb . Altogether, as with the monthly estimates, all results suggest that yearly ET EC and ET wb are in agreement.

Uncertainty in measurements
The data coverage and the measurement footprint influence the uncertainty estimation. In general, the agricultural field is considered to be homogeneous; however, heterogeneity could still play a role in the estimates of the components of the water and energy balances. Measurements of precipitation, recharge, and SWC by capacitance sensors represent point-scale measurements and thus very small footprints compared with the estimation techniques of ET EC and SWC CRNS . However, even though the individual capacitance sensors provide measurements at point scale, the network represents the field scale over which the six sensors are distributed.
The short distance between the independent, replicate measurement devices for precipitation and recharge implies that the spatial variability within the agricultural field is only represented to a minor degree. Local conditions (e.g., small variations in elevation influencing the overland flow and recharge, or structures disturbing the precipitation distribution) could influence the estimated mean and uncertainty of precipitation and recharge, respectively. However, these are local effects and the field-scale uncertainty of both variables is probably underestimated. This may potentially also lead to an underestimation of the propagated uncertainty in ET wb . Bias in the estimates is particularly evident in the estimation of recharge during cold periods. Vásquez et al. (2015) showed that, at the experimental site, redirection of snowmelt infiltration due to frozen topsoil leads to overestimation of recharge and thus low values of ET wb (Figure 7).
Measurement uncertainty of the individual flux and state variables contributes differently to the propagated uncertainty in ET wb . The uncertainties of ΔSWC point and ΔSWC CRNS are given fixed values based on different assumptions, whereas it is possible to calculate the uncertainty of ΔSWC network on a daily basis, thus leading to different aggregated annual values (Tables 6-8). The ΔSWC varies among the years and among the three different estimation methods. The uncertainty of ΔSWC exceeds the yearly estimate of ΔSWC, and furthermore the uncertainty of ΔSWC considerably exceeds the uncertainty of the water balance fluxes. The yearly uncertainty in P and R is about 1-3% of the yearly mean. Thereby, uncertainty in ΔSWC contributes the most to the propagated uncertainty of ET wb , whereas uncertainty in recharge contributes the least (Tables 6-8).
At nighttime, poorly developed turbulence during stable stratification often causes the EC covariance to be based on a few values only, resulting in considerable gap filling of nighttime estimates. Moreover, the EC sensor footprint area changes according to wind speed and wind direction (Foken, 2008) and may be affected by the equipment at the field site and different crops at neighboring fields.
In order to aggregate daily variances of the individual components, autocorrelation with lag of 1 d is considered when computing the total uncertainty (Equation 20).

TABLE 6 Accumulated yearly evapotranspiration (ET) estimated by the eddy covariance method (ET EC ) and the water balance method (ET wb ), respectively, together with the accumulated yearly water balance components (precipitation [P]

Energy imbalance
Despite extensive study (see reviews by Foken, 2008; Foken et al., 2011; Leuning et al., 2012), the energy imbalance remains an unresolved problem in EC flux measurements. However, the aim of our study was to compare two independent measurement techniques for estimating ET and not to determine what causes the land surface energy imbalance.
As in standard EC studies, we neglected the minor fluxes and storage terms (Wilson et al., 2002). Changes in energy storage in vegetation and air and unaccounted changes in the upper few centimeters of soil are generally believed to be small when aggregated to daily values. As a result, storage changes are normally not accounted for in energy flux studies (Eshonkulov, Poyda, Ingwersen, Pulatov, & Streck, 2019). Nevertheless, several authors have shown that including storage terms can actually improve the energy balance (Eshonkulov et al., 2019; Heusinkveld, Jacobs, Holtslag, & Berkowicz, 2004; Leuning et al., 2012). According to Eshonkulov et al. (2019), energy storage in the vegetation is the component with the strongest potential for improving the EBC ratio . At our field site, the biomass is very small, as the vegetation is agricultural crops, and therefore it seems justified to neglect both energy storage in vegetation as well as other storage terms.

FIGURE 9 Linear regression between yearly accumulated evapotranspiration estimated with the eddy covariance method (ET EC ) and the water balance method (ET wb ). Daily change in soil water content (ΔSWC) measured by capacitance point sensor, capacitance sensor network, and cosmic-ray neutron sensor (CRNS). Error bars indicate ±1 SD.
Studies report better energy balance closure when improving the estimation of G (Eshonkulov et al., 2019; Heusinkveld et al., 2004). In our study, G is assumed to be negligible during data gap periods, which account for only 7% of the 30-min data points. Disregarding G during those periods potentially leads to errors in the EBC ratio . Furthermore, we do not consider change in heat storage above the plates used for measuring G (installed at 0.05 m bgl). Neglecting both factors seems justified, as G on a daily basis only accounts for ∼2% of R n .
The inability to fulfill the assumption of homogeneous land surface conditions within the footprint is likely to be a major cause of inaccurate measurements at many flux stations (Leuning et al., 2012). Even though the landscape is flat, homogeneous, and with short vegetation, which presumably are ideal conditions for the EC method, the imbalance is present (Wilson et al., 2002), exactly as we observe in our study. Mesoscale transport of water in large spatial eddies cannot be captured by a single EC tower (Foken et al., 2011; Stoy et al., 2013). Those large-scale eddies are generated by surface heterogeneity. Given their spatial size and slow motion, they cannot be detected by the typical half-hourly averaging. Thereby, low-frequency contributions are lost during normal data processing techniques because of inadequate temporal averaging periods. This may cause systematic underestimation of H and LE (Charuchittipan, Babel, Mauder, Leps, & Foken, 2014; Leuning et al., 2012). Altogether, it is likely that no single factor is able to completely explain the energy imbalance.
The partitioning of the energy balance residual between the two turbulent fluxes is unresolved. A common practice to overcome the nonclosure is to adjust the energy fluxes by preserving the Bowen ratio, B = H/LE (Twine et al., 2000). Hereby, it is assumed that measurements of R n and G are unbiased and that both LE and H are biased and underestimated (Twine et al., 2000). However, this approach is heavily debated because unconsidered factors such as landscape heterogeneity and large-scale eddies might play a role (Foken et al., 2011). As an alternative to preserving the Bowen ratio, other studies suggest that the entire or larger parts of the energy balance residual should be attributed to either LE (Wohlfahrt et al., 2010) or H (Charuchittipan et al., 2014; Ingwersen et al., 2011; Mauder et al., 2018).
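The Bowen ratio adjustment can be sketched as below. This is a minimal illustration of the general approach, not a correction applied in this study; the function name and flux values are hypothetical.

```python
def bowen_ratio_closure(rn, g, h, le):
    """Force energy balance closure while preserving B = H / LE.

    Assumes Rn and G are unbiased and that the whole residual is shared
    between H and LE in proportion to the measured Bowen ratio.
    """
    available = rn - g
    bowen = h / le
    le_corr = available / (1.0 + bowen)
    h_corr = bowen * le_corr
    return h_corr, le_corr


# Illustrative daily mean fluxes (W m-2) with an EBC ratio of 0.8.
h_corr, le_corr = bowen_ratio_closure(rn=120.0, g=20.0, h=30.0, le=50.0)
```

With these numbers both fluxes are scaled by the same factor (1/0.8), so LE rises from 50 to 62.5 W m⁻². This shows why the adjustment increases ET EC in proportion to the energy imbalance.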

Comparison of ET EC and ET wb
The long-term (6-yr) high-quality measurements of the water balance components underpin a closure of the water balance when applying ET EC . Moreover, comparison of the LE flux (ET EC ) against ET wb showed no consistent bias. Studies using different kinds of soil sensors (Vásquez et al., 2015), soil sampling campaigns (Imukova et al., 2016), and studies based on water balance modeling (Scott, 2010; Wilson et al., 2001) generally obtain a good agreement between ET EC and ET wb , despite an incomplete energy balance. In these cases, LE appears not to be the major component of the energy balance gap, and consequently the energy balance closure problem must be caused by bias in the other energy fluxes or unconsidered energy storage terms.
Our study is in accordance with the results obtained by Schelde et al. (2011) and Vásquez et al. (2015), who compared soil water depletion from the root zone during dry periods based on measurements with vertical TDR probes and EC measurements from the same experimental site as ours. They concluded that correcting ET EC for lack of surface energy balance closure, especially during the growing seasons, would further increase the difference between ET EC and ET wb . Our results also agree with Scott (2010). Based on annual water balances for three small watersheds, he documented that EC can provide unbiased estimates of ET. Imukova et al. (2016) applied the soil water balance method for a winter wheat (Triticum aestivum L.) crop using soil sampling campaigns for observation periods of 14-94 d. They also demonstrated good agreement between ET wb and ET EC and concluded that LE is not the major component of the energy balance gap. Alfieri et al. (2011, 2012), Evett, Kustas, et al. (2012), and Evett, Schwartz, et al. (2012) evaluated methods for quantifying ET during the growing season for an irrigated semiarid cotton (Gossypium hirsutum L.) field. They compared data from ET EC , weighable lysimeters, and a network of neutron probes and concluded that there are difficulties with all methods. However, they found that generally the EC method underestimates ET compared with values based on weighing lysimeters. Consequently, according to Evett, Kustas, et al. (2012), it is necessary to quality assure ET based on EC with ET based on water balance, as was done in this study.
Unlike our study, studies comparing ET EC with ET from weighable lysimeters often conclude that ET EC is underestimated (Alfieri et al., 2012; Chávez et al., 2009; Ding et al., 2010; Gebler et al., 2015; Mauder et al., 2018; Wohlfahrt et al., 2010). Explaining this is beyond the scope of this study, but the spatial scale is one of the major differences between studies using weighable lysimeters and this study. This study estimates ET wb on a scale that is more compatible with the EC instrument scale. Weighable lysimeters are often on a scale of <1 m² and may therefore fail to represent the properties of the field as a whole. Generally, in the above studies, good agreement is obtained between Bowen-ratio-corrected ET EC and ET from the weighable lysimeters (Chávez et al., 2009; Gebler et al., 2015; Wohlfahrt et al., 2010).

CONCLUSION
This study aims to demonstrate that EC-estimated ET agrees with water balance-estimated ET. Our hypothesis is that the imbalance of incoming and outgoing energy to the surface can mainly be ascribed to other factors than the estimation of LE (ET) by use of the EC method. In that case, the EC method produces reliable estimates of ET. The outcome of this study confirms the hypothesis.
We demonstrate that ET EC is not statistically different from ET wb , and that the discrepancy between ET EC and ET wb is mostly within 2 SD. At the same time, the water balance is closed to an acceptable degree when using ET EC . Together, this demonstrates agreement between ET EC and ET wb .
Additionally, we produce long-term (6-yr) high-quality estimates of the water balance components, enabling validation of ET EC throughout different seasons and between years. The study has carefully assessed the uncertainties for each individual component of the water balance and therefore enabled estimation of the propagated uncertainty on ET wb . This makes it possible to check whether the uncertainty bands of ET EC and ET wb overlap, and they generally do.
The objective of this study was not to determine what causes the land surface energy imbalance, but rather to compare two independent measurement techniques for estimating ET. Trusting the estimation of the available energy (R n and G) and thereby allocating the imbalance to the turbulent fluxes (LE and H) (Foken et al., 2011), our results suggest that the primary energy balance error is caused by either the estimation of H, unconsidered energy fluxes, or a missing energy storage term.
The scientific community widely agrees that one of the main reasons for the energy balance closure problem is unconsidered advective fluxes (Leuning et al., 2012; Oncley et al., 2007; Wilson et al., 2002). The available methods for estimating energy fluxes result in an energy balance residual. It is unclear how to distribute the residual (Foken et al., 2011) when closing the energy balance in, for example, land surface modeling. The widely used Bowen ratio correction method distributes the presumed underestimation of H and LE according to the Bowen ratio, but at least for some field sites, this method seems invalid (Charuchittipan et al., 2014; Foken et al., 2011; Ingwersen, Imukova, Högy, & Streck, 2015; Mauder et al., 2018). Future work should consider investigating other energy balance corrections, including the method documented in this study, in which the entire energy balance residual, or the larger part of it, is attributed to H (Charuchittipan et al., 2014; Ingwersen et al., 2011; Mauder et al., 2018).

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY
Data used for this study are available from the PANGAEA data repository (Denager, Looms, Sonnenborg, & Jensen, 2020).

ACKNOWLEDGMENTS
The Villum Foundation has funded the hydrological observatory, HOBE, and the research reported in this paper. We are very thankful for the opportunities that this donation provides. Additionally, we would like to acknowledge professor emeritus Jens Christian Refsgaard, Geological Survey of Denmark and Greenland, for inspiration and approaches for analyzing the uncertainty aspects of the water balance components. Researcher Mie Andreasen, Geological Survey of Denmark and Greenland, is acknowledged for sharing data from the CRNS method with us.