A New Post‐Hoc Method to Reduce the Energy Imbalance in Eddy Covariance Measurements

Latent and sensible heat flux observations are essential for understanding land–atmosphere interactions. Measurements from the eddy covariance technique are widely used but suffer from systematic energy imbalance problems, partly due to missing large eddies from sub‐mesoscale transport. Because available energy drives the development of large eddies, we propose an available energy based correction method for turbulent flux measurements. We apply our method to 172 flux tower sites and show that we can reduce the energy imbalance from −14.99 to −0.65 W m−2 on average, together with improved consistency between turbulent fluxes and available energy and associated increases in r2 at individual sites and across networks. Our results suggest that our method is conceptually and empirically preferable over the method implemented in the ONEFlux processing. This can contribute to the efforts in understanding and addressing the energy imbalance issue, which is crucial for the evaluation and calibration of land surface models.

• Observed systematic imbalance of energy flux (∼17%) across the network of eddy covariance sites • A theoretically motivated correction method based on available energy variations is proposed • The available energy correction method has conceptual and empirical advantages compared to the method implemented in the ONEFlux pipeline

Supporting Information:
Supporting Information may be found in the online version of this article.
Previous attempts to correct the measured TE are essentially based on assuming closure and positing that Bowen ratio is correctly measured for each timestep (Twine et al., 2000), and this approach has been used in many individual site-based studies.In the standard processing pipeline used to synthesize a data set across the site network, Pastorello et al. (2020) incorporated this Bowen ratio-based closure correction but assumes a closure within a temporal moving window.There are two potential conceptual issues with these Bowen ratio-based closure approaches: Firstly, the closure is based on an assumption that is often violated for example, by imperfect accounting of ground heat storage changes, or net radiation measurement issues which also have a different footprint compared to LE and H. Any factors not related to TE contributing to the imbalance will propagate to biases of the estimated correction factors of the turbulent fluxes.Secondly, the moving window approach assumes that the factors causing the imbalance to vary smoothly on a time-scale of days to weeks (typical moving window size), while this has no theoretical or empirical justification, and may lead to systematically biased correction factors along environmental gradients.An empirical correction method that is theoretically motivated and tied to an observed variable, which is expected to co-vary with the energy imbalance, would be an advancement over current practices.
A variety of factors that could contribute to the observed energy imbalance at flux towers have been identified and discussed (see Foken, 2008;Mauder et al., 2020 for a thorough review).It is useful to distinguish between factors that cause primarily noise, and factors that can explain the systematic bias observed across the network of sites.The former includes footprint differences between the TE and Rn as well as G, or imperfect accounting for heat storage changes of the soil and vegetation (especially for tall vegetation).Regarding systematic biases, several studies based on modeling and observations suggest that the omission of large eddies by the eddy covariance technique could play a major role (Eder et al., 2015;Inagaki et al., 2006;Mauder et al., 2008;Steinfeld et al., 2007;Zhou et al., 2023).Such large eddies develop with a sub-mesoscale circulation driven by thermal gradients in the landscape and modulated by topography (Wanner et al., 2022).Therefore, the differential heating of the land surface, boundary layer growth variations, and emergent characteristics of the non-accounted large scale eddies are directly and indirectly related to changes in AE.As such, here we propose an available energy-based correction method (hereafter, AEC) for eddy covariance based turbulent fluxes.Our method is comparatively simple and broadly applicable to the entire network of flux tower sites and has conceptual advantages compared to the Bowen ratio-based approaches by avoiding the closure assumption and by tying the estimation of correction factors to AE as a causal variable instead of to time (i.e., in the Bowen ratio based closure method (Pastorello et al., 2020;Twine et al., 2000)).We further show empirically that our method can greatly reduce the energy imbalance at flux tower sites across networks and yields improved patterns compared to the Bowen ratio-based corrections implemented in the ONEFlux processing pipeline (hereafter, OFC).

Data
We used 172 eddy covariance sites (from FLUXNET2015, ICOS-Drought2018, ICOS-WarmWinter2020, and AmeriFlux, see Table S1 in Supporting Information S1 for all sites with respective citations in the reference section) where all components of the energy balance equation are available and where there are at least 1-year measurements.All raw data was post-processed following the ONEFlux pipeline (Pastorello et al., 2020), and all half-hourly (for most sites) or hourly data were gap-filled by using the marginal distribution sampling method (Reichstein et al., 2005).Only half-hours or hours with good quality data for Rn, H, LE, and net exchange of carbon dioxide flux (used to filter out the nonturbulent periods) when the quality flag is either 0 (only measured) or 1 (measured with high-confident gap-filled), were reserved.For consistency, data were aggregated to hourly resolution because flux measurements are not available at the half-hourly scale for all sites.Hourly LE measurements (when the quality flags for Rn, H, LE, and net exchange of carbon dioxide flux being 0) were then corrected using the high relative humidity correction method (Zhang et al., 2023), where we also confirmed the small effect of G. To reconcile the complication and uncertainty of the diurnal cycle and the strong G (and storage flux) effects at the sub-daily scale, hourly data were averaged to the daily scale with discarding days when less than half the percentage of good quality within a day.In theory, the effect of G at a daily scale should be minimized and may be neglected, as the heat flux stored in the soil surface during the day should be largely released from the surface during the night, and we further diagnosed the G effect for sites with G measurements (comparing the energy balance closure when setting G to zero across the network, Figures S1 and S2 in Supporting 10.1029/2023GL107084 3 of 13 Information S1) to gain more confidence, we then gap-fill G by setting all missing values to 0 to cover as many days and sites as possible.

Implementation of the AEC
Our proposed correction needs to be applied to time periods that are homogeneous with respect to sensor setup, as data artifacts may affect the energy balance closure.Therefore, we first apply Jung's approach (Jung et al., 2023) to identify breakpoints in the time series associated with marked changes in flux dynamics presumably caused by changes in sensor setup (e.g., maintenance of the system, replacement of instruments, changes in the height of the system) and changes in ecosystem (e.g., severe disturbance).Then for each segment of at least 365 days at one site, the correction factor (FCor) is determined hierarchically by the following steps: 1) we fit a non-linear curve by performing a locally weighted scatterplot smoothing, LOWESS (from statsmodels python package, with the key parameter of "fraction" being 2/3), that de-scribes the variation of TE as a function of AE (Figure 1a): 2) based on the fitted non-linear curve, we then fit a stepwise Ordinary Least Squares regression (OLS, from sklearn python package) for TE versus AE along sorted AE based group (60 values per group), thereby the FCor for each period is 1 over the slope (Figure 1b): (3) 3) We then linearly interpolate the FCor to get the corresponding FCor for each AE (Figure 1c), and FCor at the two tails are extrapolated based on a linear model trained on the nearest 21 values.4) As FCor is sensitive to the non-linear fitting (e.g., at very low and high AE periods), we repeat the upper three steps again by switching TE and AE while fitting the non-linear curve (Figures 1a-1d, and 1e): (5) 5) The final FCor then is determined as the geometric mean of the two FCor calculations (Figure 1f): (7)

Implementation of the OFC
The standardized processing method among the FLUXNET community is the ONEFlux processing pipeline (Pastorello et al., 2020), which adopts data-processing methods that have been well-established and published by the eddy covariance community over the last decades.For TE, a Bowen ratio-based energy balance closure corrected version is provided in the standard ONEFlux processing pipeline (Pastorello et al., 2020).In detail, there are three (sub)methods (the key difference among the submethods is the time window used to determine the reference value), and the correction for each half-hour is implemented according to the three methods hierarchically.Within each time window, the underlying surface conditions are assumed to be stable and the energy balance is supposed to be closed, but in the real world these strict requirements are rarely satisfied due to, for example, vegetation altering the underlying surface conditions (especially during growing seasons) and the reference value is not often close to one (Liu & Foken, 2001;Pastorello et al., 2020;Twine et al., 2000).

Energy Balance Closure and FCor at Three Sites
Here we present three long-running sites as an example (Figures 2 and 3).In general, we can observe clear energy closure problems in the original data at these three sites (Figures 2a-2d, and 2g) from the smaller slope and large negative bias values (with the exception of the AU-Tum site, where the intercept is very high).After applying the AEC, slope, squared Pearson correlation coefficient (r 2 ), and bias (except for the AU-Tum site) are profoundly improved and the intercept and rmsd are reduced.For the AU-Tum site, the TE values are larger than the AE values at low AE conditions, causing the higher intercept and the lower bias, but the AEC can capture well the TE variations as a function of AE regardless of this type of uncertainty in AE.Comparatively, while slope, r 2 (except for the FR-Pue site) and bias are all slightly improved after applying EXC, the magnitudes of intercept are obviously higher than those in the original data, suggesting downward corrections for TE on some days and overcorrection under high AE conditions.On top of that, the rmsd is not well constrained and we can also observe an obviously deteriorated distribution of the values (Figures 2c, 2f, and 2i).
As an example, we demonstrate 1 year at these three sites to clearly present variations of FCor over time and as a function of AE in Figure 3.In general, FCor for AEC shows similar temporal variations to that for EXC, with higher FCor values being observed at higher AE periods, as also suggested by the plot of FCor versus AE.However, the FCor values for OFC exhibits pronounced temporal changes (especially some sudden changes) when the underlying surface changes drastically (e.g., during early and later of the growing seasons when vegetation dominantly alters the surface conditions) and does not follow the daily fluctuations of AE well (Figures 3a,  3c, and 3e).In addition, the FCor values for OFC are overall obviously larger than those for AEC, which may not have a significant effect on the corrected TE magnitude during low TE periods but can markedly overcorrect for TE during higher TE periods.

Overall Energy Balance Closure Across the Network
Across the network of sites, the original data shows a 17% energy imbalance, calculated as 1 minus slope (Figure 4a).The AEC can improve the overall energy balance closure from 0.83 to 0.96 with a relatively higher intercept (2.99 W m −2 compared to −0.62 W m −2 , Figure 4b), and mean bias is reduced by ∼96% (from −14.99 to −0.65 W m −2 ).In addition, r 2 is increased from 0.80 to 0.87 and the rmsd is reduced by ∼48% (from 19.27 to 10.00 W m −2 ).Note that the AEC is applied on top of the high relative humidity correction (as depicted in Figure S3 in Supporting Information S1).Conversely, the implementation of EXC, illustrated in Figure 4c, yields a slope approximating 1 and an enhanced r 2 (0.81); however, the slight increase in r 2 and slight decrease in rmsd are overshadowed by the substantial mean bias (6.27 W m −2 ) and intercept (8.09W m −2 ), indicating a pronounced overclosure.

Variations of EBR Along AE Bins Across the Network
We also evaluate the correction methods by inspecting the variation of site-mean energy balance closure (EBR, the ratio of mean TE to mean AE) across AE bins, which suggests an overall improvement of EBR after applying correction methods (Figure 5).The original EBR is ∼80% across AE bins (consistent with the slope in Figure 4a, and AEC increases the site-mean EBR closer to 100% at middle AE bins, where there are most sites and the majority of data (∼90.66%)across the network.However, applying OFC results in a roughly constant over-closure (the EBR is ∼110%) along the AE bins.In addition, the spatial variations (i.e., the inter-quantile range) of EBR for AEC are slightly lower than those for OFC along the AE bins.

Discussion
Our results demonstrate that the AEC strongly improves the energy balance closure and yields improved consistency between TE and AE after correction.This holds for individual sites (Figure 2) as well as the entire network (Figure 4) where the AEC method rectifies the TE close to the 1:1 line.It also yields larger r 2 with AE compared to the uncorrected fluxes, which indicates that it partially captures the underlying physical process.The fact that AEC works well empirically provides support for its underlying assumption that variations of the energy imbalance are primarily related to variations of AE.This is because the method can only correct the part of the  imbalance that covaries with AE and thereby does not assume or force energy balance closure (because of other factors causing the imbalance (Foken, 2008;Leuning et al., 2012;Maes et al., 2020)).Other factors in addition to AE that would contribute to a systematic imbalance are implicit in the intercept of the TE versus AE relationship, and this intercept is not used to calculate the correction factors.Such factors represented by the intercept include offsets due to calibration issues or footprint mismatches between TE and AE.
The correction factors of AEC suggest an interesting pattern where they first increase approximately linearly with AE up to approximately 100 W m −2 followed by a leveling off or even a mild decrease toward higher AE (Figure 3).The initial increase of the correction factors with AE would be consistent with stronger sub-mesoscale circulation developing due to stronger thermal heterogeneity of the landscape (Mauder et al., 2020;Wanner et al., 2022).The leveling off at high AE might imply that at a certain point of energy input the sub-mesoscale circulation has fully developed, and more energy does not generate more intense circulations, for example, when the thermal heterogeneity does not increase further.Alternatively, the fact that at high AE correction factors stabilize at a high level while being relatively insensitive to changes in AE could suggest that other factors become important that we have not accounted for.Some support for this interpretation is given by the weak trend of increasing under-closure of AEC with increasing AE (Figure 5).For example, changes in dryness (e.g., soil moisture) that impact evaporative cooling and thus spatial surface temperature patterns could be such a factor.
Clearly, we can not assume that all systematic energy balance closure issues scale only with AE (Foken, 2008;Mauder et al., 2020), while our method accounts explicitly only for this part of the energy imbalance (embedding the relative humidity dependent correction proposed in Zhang et al. (2023)).Strictly speaking this implies that our method is incomplete and it indeed shows a weak under-closure, which is more pronounced at high AE conditions.However, a clear advantage of our method is that the correction is directly tied to a theoretically motivated, and accurately measured AE and that associated corrections are therefore directly attributable.This means that the methodology can be extended in the future for second and third order effects, for example, by running the methodology recursively on the residuals with other meaningful variables.At this point we have shown that AE represents a first order effect variable for the energy imbalance, which the AEC method leverages successfully (also be supported by an evaluation on the water-carbon coupling at a daily scale across the network in Figure S4 in Supporting Information S1, where we demonstrate that the water-carbon coupling is preserved), and that the AEC method is conceptually as well as empirically preferable over the EXC.
While the AEC method is a step forward, the holy grail is still missing (to shed the light to get final corrected LE and H): the partitioning of the imbalance to contributions by H and LE.So far, we have no empirical constraints Rn− ) across the AE bins.For each bin, we only reserve sites where there are at least 100 days of data, and then we randomly sampled 100 days without replacement to ensure an equal weight for each site.The last bin, labeled as "All," depicts the site-mean EBR from all data (no sampling) from each site.Red dots and horizontal lines indicate the mean and median values of binned site means, respectively.The right y-axis indicates the cumulative frequency distribution (C.F.D.) of data across the network.The x-axis label consists of the AE bins and the number of sites within the bins.available for this for the entire site network.In the eddy covariance experiments community, promising advancements in comprehension of fundamental transport processes in the atmospheric boundary layer have been observed through complicated field experimental designs, such as multi-tower measurements (Mauder et al., 2010;Oncley et al., 2007), spatially resolved lidar (Eder et al., 2015;Higgins et al., 2013), airborne measurements (Mauder et al., 2007;Paleri et al., 2022), high-resolution large-eddy simulations (Margairaz et al., 2020;Maronga & Raasch, 2013;Zhou et al., 2023), and machine-learning methods (Xu et al., 2018).Yet it is not enough and additional experimental and methodological efforts are essentially needed.Previous modeling studies have suggested that both fluxes are about equally affected (Mauder et al., 2013;Twine et al., 2000) in relative terms (i.e., the Bowen ratio is preserved) or that the sensible heat flux is about twice as much affected (Charuchittipan et al., 2014;Roo et al., 2018), depending on modeling assumptions.Solving this puzzle will need further studies taking into account the effect of dispersive fluxes generated by sub-mesoscale circulations, either from largeeddy simulations (Roo et al., 2018;Wanner et al., 2022) or spatially resolving turbulence measurements (Mauder et al., 2010;Paleri et al., 2022).In addition, lysimeters (which typically provide estimates of evapotranspiration on scales smaller than a flux tower) and balloon soundings (which measured temperature and humidity can be used to estimate H and LE for the vertical profile), can also provide independent reference measurements if they are representative of the flux tower footprint (e.g., Gebler et al., 2015;Mauder et al., 2018;Widmoser & Wohlfahrt, 2018;Wouters et al., 2019).On the scale of the entire network, we can currently only work with assumptions and scenarios on how to partition the TE correction into correction factors for LE and H.Although we lack empirical constraints at the flux tower level, some large-scale constraints on mean LE are available from catchment water balances (Martens et al., 2020).This provides some opportunities for testing hypothesis by propagating different energy imbalance partitioning variants at site level through FLUXCOM (Jung et al., 2019) to yield large scale LE and H patterns for consistency checks with water balance derived estimates.

Conclusions
We present a generalized method to reduce the energy imbalance in eddy covariance measurements across the site network, which is proposed based on the current consensus that sub-mesoscale circulation, fueled by heterogeneity in surface available energy, is the major contributor to the energy imbalance.Our method is based on the covariation of the imbalance with available energy and yields corrections that are empirically and conceptually preferable over the current standard method implemented in the ONEFlux processing pipeline.Additional efforts are urgently needed in the FLUXNET community, as the allocation of the bias to H and LE is a major obstacle, and accurate H and LE estimates are required for land-atmosphere interaction-related studies and land surface model validation.Integrated measurements (by deploying a large number of eddy towers) and independent measurements (e.g., lysimeters and balloon soundings) could help, based on which we will be able to better understand the spatial transport mechanisms in the boundary layer and accurately quantify the dispersive flux.Moreover, we could also use machine learning algorithms and sufficient (ancillary) data to improve our understanding of the energy imbalance by involving other factors and diagnose the energy imbalance impacts on other trace gas flux measurements in the future.

Figure 1 .
Figure 1.Calculation of FCor.The energy imbalance problem in the original daily data (after applying the HRHC correction) is shown in (a).The fitted curves in blue (from Formula 1) and in orange (from Formula 4) are shown in (a), (b), and (d).Note that for the case of Formula 4, we switched the TE and AE back to y-axis and x-axis to simplify the diagram plot.The slopes are estimated by the Ordinary Least Squares regression (using sklearn python package) for every 60-value step along the AE (from Formulas 2 And 5), and the correction factor (FCor) is calculated as 1 over the slope, shown in (b) and (d), respectively.Based on the response curve of FCor to AE, shown in (c) and (e) respectively, FCor is interpolated and extrapolated for each corresponding AE observation.The final FCor is the geometric mean of the FCor from (c) and (e), shown in (f), and the comparison of corrected TE and AE is shown in (g).Note that the correction starts from the ascendingly sorted values of positive AE where there is a significantly positive relationship (p < 0.1) between AE and TE, because the correction is based on an underlying assumption that variations of the energy imbalance are primarily related to variations of AE.

Figure 2 .
Figure 2. Evaluation of the corrections at three long-running sites (AU-Tum, DE-Tha, and FR-Pue).The abbreviations (Ori, AEC, and OFC) after the site codes indicate the original data, available energy-based corrected data, and the corrected data in the ONEFlux pipeline, respectively.The black lines are the standard linear regression associated with the 95 confidence interval for the regression estimation.rmsd denotes the root mean squared difference, and bias indicates the difference between the average of TE and AE.

Figure 3 .
Figure 3. Variability of FCor in time and versus AE, along with their distributions at three sites (AU-Tum, DE-Tha, FR-Pue) in individual years as examples.(a), (c), (d) show the temporal variability of FCor (left y-axis) and AE (right y-axis), and the right panels show the variations of FCor as a function of AE. (b), (d), (f) show the distributions of AE (the top sub-plots) and FCor (the right sub-plots).

Figure 4 .
Figure 4. TE versus AE before and after correction across the network of sites.Each dot represents one site.The black lines are the OLS regression associated with the 95 confidence interval for the regression estimate.