New data‐driven method for estimation of net ecosystem carbon exchange at meteorological stations effectively increases the global carbon flux data

The eddy covariance (EC) flux stations have great limitations in the evaluation of the global net ecosystem carbon exchange (NEE) and in the uncertainty reduction due to their sparse and uneven distribution and spatial representation. If the EC stations are linked with widely distributed meteorological stations using machine learning (ML) and remote sensing, it will play a big role in effectively improving the accuracy of the global NEE assessment and reducing uncertainty. In this study, we developed a framework for estimating NEE at meteorological stations. We first optimized the hyperparameters and input variables of the ML model based on the optimization method called an adaptive genetic algorithm. Then, we developed 566 random forest (RF)‐based NEE estimation models by the strategy of spatial leave‐out‐one cross‐validation. We innovatively established the Euclidean distance‐based accuracy projection algorithm of the R square (R2), which could test the accuracy of each model to estimate the NEE of the specific flux at the weather station. Only the model with the highest R2 was selected from the models with a prediction accuracy of R2 > 0.5 for the specific meteorological stations to estimate its NEE. 4674 out of 10,289 weather stations around the world might match at least one of the 566 NEE estimation models with a projected accuracy of R2 > 0.5. The NEE estimation models we screened for the meteorological stations showed a reliable performance and a higher accuracy than the former studies. The NEE values of the most (96.9%) screened meteorological stations around the world are negative (carbon sink) and most (65.3%) of those showed an increasing trend in the mean annual NEE (carbon sink). The NEE dataset produced at the meteorological stations could be used as a supplement to the EC observations and quasi‐observation data to assess the NEE products of the global grid. The NEE dataset is publicly available via the figshare with https://doi.org/10.6084/m9.figshare.20485563.v1.


| INTRODUC TI ON
The net ecosystem carbon exchange (NEE) is the key carbon flux component within terrestrial ecosystems and plays an essential role in a better understanding of the global carbon cycle and landatmosphere interaction (Shiri et al., 2022). Accurate estimation and validation of NEE of the terrestrial ecosystems in regions or globally are of great significance in evaluating the function of the regional carbon source and sink. However, the NEE estimation has several common issues, such as the poorer modelled performance than the gross primary production (GPP). The main reason for this difference is that NEE is associated with both GPP and ecosystem respiration (RE) and remote sensing data cannot readily capture the parameters related to RE (Ichii et al., 2017;Tramontana et al., 2016).
The eddy covariance (EC) flux measurements have been providing detailed time series of the carbon fluxes, energy fluxes and atmospheric conditions across a large range of biomes and climate types. However, the EC measurements are site-scale observations which only represent the fluxes from the tower footprint up to several square kilometres (Gockede et al., 2008). Moreover, the observational constraints, such as a low number and unevenly distributed observation stations, limit the validation and extrapolation of the NEE estimation at large scales (Jung et al., 2009).
The ongoing efforts of the FLUXNET community and continuous improvement of the spatiotemporal resolution of remote sensing data have encouraged the application of the data-driven machine learning (ML) method such as the random forest (RF, Shiri et al., 2022), artificial neural networks (ANNs, Evrendilek, 2014), support vector regression (SVR, Ichii et al., 2017), cubist (Xiao et al., 2008;Xiao et al., 2011) or model trees ensemble (MTE, Jung et al., 2009 to estimate the terrestrial ecosystems' carbon dioxide, water and energy fluxes from a site scale to the regional and global scale (Xiao et al., 2019). The accuracy of the ML model is generally better than linear regression, ecosystem model, remote sensing inversion and other model methods, which has been proved in the application research of related geosciences . However, various uncertainties still exist in the ML upscaled output, for example the hyperparameters' setting of the ML, data quality, uneven spatial distribution of the EC stations and the representativeness of the training examples.
As with other NEE estimation models at a global or regional scale (e.g. process-based biophysical models and atmospheric inverse models), the evaluation and validation of the NEE estimation were only conducted at limited EC stations. When applied in regional or global extrapolation, it lacks a validation of the simulation results at regions without flux stations. More widely distributed meteorological stations have the potential to deliver more reliable NEE datasets to offset the limitation of the NEE validation in regions without flux stations .
The input variables and hyperparameter settings seem two factors that substantially improve the performance and reduce the computation time of the ML and should also be carefully considered . These two factors are critical because the choice of the feature subsets affects the appropriate hyperparameters and vice versa (Huang & Wang, 2006). Previous studies only individually optimized these two, either feature subset selection or hyperparameters, which greatly limits the potential for enhancing the ML performance (Ichii et al., 2017;Tramontana et al., 2016). A synergistic optimization of these two factors to search for every possible combination is a computationally expensive task. The genetic algorithm (GA) has been widely used in previous studies to find the most efficient and accurate model combination automatically and has been proven to be an effective method so as to solve this problem (Tao et al., 2019). In addition, due to the dual effects of natural and human activities, the observation timeseries of NEE and the environmental variables (e.g. the precipitation, the NDVI and EVI) often have complex time series characteristics such as the non-linear, nonstationary, lagged response and multi-time scale Tramontana et al., 2016) and thus the prediction of the long-term NEE dynamics appears to be a difficult modelling task (Friedlingstein et al., 2020;Jung et al., 2019). By decomposing the time series data into separate time series components representing the long-term trend variation, seasonal variation and residual variation (i.e. remaining information in the time series), the complexity of the time series might be effectively reduced while protecting a large part of the small-scale information (Horemans et al., 2020) (Table S1.1), which were collected from the FLUXNET2015 Dataset (Pastorello et al., 2020). The daily data were aggregated to 8-day values according to the following criteria: (1) The corresponding quality flags of NEE (i.e. NEE_VUT_REF_QC) should be >0.75 (Yang et al., 2020).

| Global meteorological station observations
The global meteorological station data used in this study were derived from the Global Surface Summary of the Day, which provides daily weather observations beginning in 1929. These daily observations were screened and aggregated to 8-day values according to the same criteria (2), (3), (4) and (5) with the above-mentioned EC stations. Finally, the observation data from 10,289 global meteorological stations were available, which could be used for a subsequent analysis.

| Explanatoryvariables
We obtained four types of explanatory variables (

| ME THODOLOGY
To improve the accuracy of the NEE prediction and to effectively fill the NEE data in the flux tower observation gap area, a three-step methodology framework with validation and extrapolation experiments was implemented in this study (Figure 1

| Time series decomposition
The NEE variations may be affected by the different time series components of the driver variables. Therefore, we decomposed each time series to explain the variables (i.e. 8-day continuous variables of the meteorological and remote sensing data over time in Table 1) into the major components: long-term trend variation, seasonal variation and residual variations, by using the Prophet model performed in Python. Then, these components are used as part of the explanatory variables to train the ML model in our study.
The Prophet model, which was recently developed by Facebook (Taylor & Letham, 2018), was designed for the analysis and forecasting of the time series data based on an additive model. Compared with the traditional time series decomposition methods, the Prophet model has no requirements regarding the regularity of the measurements' spacing and excellent adaptability to the change points in the data. It is extremely robust to the missing values, trend shifts and a large number of outliers and could achieve better results than the other traditional methods. The Prophet model has been widely evaluated in various research including the atmosphere and air quality assessment (Belikov et al., 2019).

| Adaptive genetic algorithm
The GA is a random global search optimization algorithm that simulates the biological evolution raw of the natural selection and genetic processes such as selection, cross-over and mutation to identify the optimal solutions (Whitley, 1994). In this paper, we first encoded the ML hyperparameters and the subset of the explanatory variables in a chromosome (also known as individual) using the binary of '0' and '1' that are analogous to the genes, in which '1' indicates that the hyperparameters or variables were selected. value (to produce a group of individuals more suitable for the environment, which makes the population evolving to a better area in TA B L E 1 List of the explanatory variables used for the ML training. The 'Type of Variability' indicates how the values of various variables change for a given pixel. '8-day' is the time step of this study. 'Static' variables mean never changing over time but can be used to illustrate some specific characteristics of NEE. The 'Monthly' and 'yearly' variables refer to a change in month and year, respectively.

F I G U R E 1
The framework for the global net ecosystem carbon exchange estimation at the weather stations.
the search space). In this way, the population continues to reproduce and evolve and finally converges into a group of individuals who are most suitable for the environment, thus obtaining a global optimal solution for the given problem ( Figure 2e). The probability of cross-over and mutation in the standard GA is a constant, while the improvement of adaptive genetic algorithms (AGA) includes the adaptive adjustment of the genetic parameters according to the fitness function, which will maintain the population diversity, improve the computational efficiency and speed up the convergence of the algorithm. In addition, we also adopted the Elitist Preservation (Leung & Liang, 2003) strategy so as to directly copy the best individual in the population evolution process to the next generation without a genetic operation, which could effectively prevent the loss of the best individuals in the next generation and improve the global convergence ability of the standard GA.    (Table S3.2). Moreover, the Taylor diagram (Taylor, 2001) was used to compare the performance of the different algorithms.

| Algorithm training
In the training process, the spatial leave-out-one cross-validation (SLOOCV) was applied so as to develop a series of spatially extrapo-

| TransferabilityevaluationandNEEprediction for the meteorological stations
The grid-based NEE data did not evaluate the accuracy of each grid due to the sparse flux stations. Therefore, the NEE estimation at the weather stations will greatly expand the number of the global EC stations to evaluate the grid products. However, not all meteorological stations can be used to predict the NEE, and it is necessary to preevaluate the precision of the weather stations. The magnitude of R 2 has been selected as a metric to measure the precision of the meteorological stations in this study. We established the R 2 prediction model of the meteorological stations according to the Euclidean distance (ED), and we selected the model corresponding to the maximum R 2 value for each meteorological station to predict the NEE of the station. The framework was constructed using the following five steps.

| Step 1-Calculation of ED at the flux stations
After applying the SLOOCV to all flux stations, each IGBP classification and climate zone, we developed a series of prediction models based on 189 flux stations. The selection of the matched model for each weather station from the prediction models (i.e. transferability evaluation) depends on the relation of the geographic similarity and R 2 between the test sets and the training sets. Here, the geographic similarity was quantified using the ED between each test set and training set in the attributed space (Yang et al., 2008).
The ED is defined as: where d is the ED, x shows the explanatory variables in the training set, and y represents the corresponding variables in the test set. N represents the sample size of the variables for each station. For a specific flux station, the R 2 value (obtained from the NEE prediction model when it is a test set in the SLOOCV) and the ED together construct the MLR database (Figure 3a).

|
Step 2-R 2 estimation model (M ~ R 2 ) According to the EC between the training set and the test set, an estimation model of R 2 was constructed using a multivariate statistical model (MLR), which is expressed as: where M ∼ R 2 demonstrates the R 2 values of the meteorological stations, the a 0 , a 1 , … , a n stand for the regression coefficients and d 1 , d 2 , … , d n illustrate the ED of the factors influencing the NEE flux factors between the training set and test set.

|
Step 3-ED database between the explanatory variables of the flux stations and the same variables at the meteorological stations The method in step 1 has been migrated to the meteorological stations so as to calculate the ED between an influencing factor in the training set and the same influencing factor at the meteorological stations. This produces a large database (Figure 3b).

| Step 4
The R 2 migrated to the meteorological stations was obtained by means of the MLR model (Figure 3b), based on the ED database.
Here, the R 2 thresholds (low: R 2 < 0.5; moderate: 0.5 ≤ R 2 < 0.75; high: R 2 ≥ 0.75) were used to assess the transferability of each meteorological station, that is which NEE prediction models could be migrated to the current station or how many available prediction models could be matched at the current station.

| Step 5
With respect to the meteorological stations that could be connected with an applicable RF model, the NEE dataset of the meteorological stations is constructed in order to analyse the mechanisms behind the carbon dynamics.

| 1Modelperformance
The Taylor diagram summarized the performance of six models through a combination of the correlation coefficient (R), root-meansquare error deviation and standard deviation between the NEE estimation and observations (Figure 4). The RF model showed a better performance than the other models and was selected for further analysis as a result.
In this study, a total of 566 NEE inversion models were constructed based on the RF model with the SLOOCV strategy. For the 8-day NEE estimation (Table 3) (189). The NEE prediction model indicated a variation in performance for each IGBP type and climate zone ( Table 3). Among the 10,829 meteorological stations, 4674 stations met the migration conditions (≥0.5). Overall, the proportion of (2) M ∼ R 2 = a 0 + a 1 d 1 + a 2 d 2 + … +a n d n ,

F I G U R E 4
Taylor diagram of the simulated net ecosystem carbon exchange computed from six independent models with the observed data. The red line indicates the reference observations. the meteorological stations satisfying R 2 ≥ 0.5 was 45.5%, which is distributed over the world, while the remaining 54.5% of the meteorological stations could not find any matching RF model to predict the NEE (i.e. R 2 < 0.5), which is mainly spread over Central America, most parts of Africa, western and eastern Europe and southern Asia (Figure 5a). 42.8% of the meteorological stations could detect more than three RF models for the NEE prediction

| Spatiotemporalpatternsofthe meteorological site NEE
The spatial pattern of the averaged annual NEE at the meteorological stations was comparable to the results of the NOAA CarbonTracker, version CT2019B (Jacobson et al., 2020), which denotes that our methods and results are reliable for the NEE estimation at the meteorological station. The NEE values of most meteorological stations around the world are negative, indicating that these stations are carbon sinks. There are four areas with significant carbon sinks including Eastern America, Central Europe, and East and South Asia.
The long-term trend of NEE was determined by means of a linear regression method (Figure 6b). In general, most of the meteorological stations showed a decreasing trend in the mean annual NEE located in North America, Europe and Northern Asia. According to the results of the spatial pattern and trend analysis, the eastern part of America is a region where carbon dynamics change dramatically.

TA B L E 3
The model performance.
In the second column, N refers to the number of flux stations. In the RMSE, R 2 and MAE columns, the evaluation and standard deviation (between brackets) of the random forest output per IGBP and climate zone were shown. The last column illustrates the number of flux stations that satisfy R 2 ≥ 0.5 and the proportion (in parentheses). The list of acronyms refers to Table S3.1.
types, NEE showed better outcomes in DBF, MF, boreal and temperate continental, which also have a high proportion of meteorological stations that could be migrated. In contrast, the relatively poor performance in EBF and tropical forest is explained by difficulties in capturing the small seasonal dynamics of NEE in the EBF and tropical regions. Moreover, the limitations of the availability of the explanatory variables caused by the large amount of cloud coverage may also lead to low performance in these ecosystems. Difficulties F I G U R E 5 Global distribution of the transferability (a) and number of the matched models (b) at the meteorological stations. In (a), the metrics for transferability are represented by the magnitude of R 2 , where R 2 ≥ 0.5 refers to the fact that the stations meet the prediction model migration criteria and could be used to predict NEE. In (b) N represents the number of the matched models. N = 0 indicates that the meteorological stations could not be linked with any type of model (i.e. the prediction accuracy of the meteorological stations is R 2 < 0.5). N = 3 indicates that the meteorological stations could be connected with three types of models (i.e. the prediction accuracy of this meteorological station is R 2 ≥ 0.5).
in data-driven estimation of CO 2 fluxes in EBF and tropical regions are also reported in previous studies (Ichii et al., 2017;Tramontana et al., 2016). An effective way to improve the prediction accuracy of NEE in these areas is to increase the number of the EC stations in these areas.
Our results showed that the NEE performance is slightly better than for previous studies, and these studies have also pointed out more challenges in reproducing the NEE magnitude and seasonality than the other carbon components (e.g. GPP; Ichii et al., 2017;Tramontana et al., 2016;Xiao et al., 2011). The main reason for these differences in the model performance is likely as follows: Firstly, the NEE in some stations experienced substantial disturbances and fire. Ueyama et al. (2013) added the fire history at the fire-disturbed stations into the SVR-based modelling and found that the disturbance history was important to estimate NEE. Other disturbance information, including both natural and anthropogenic factors, such as land use change, turnover of the soil organic matter, hurricanes, afforestation, and extreme water stress, might improve the performance of the NEE estimations.
These disturbances may reduce the carbon exchange rate, while the MODIS data are not sensitive to capturing these disturbances, which could lead to a serious overestimation of the carbon exchange rate. Secondly, NEE is difficult to estimate because it is connected to the GPP and RE. It is particularly the RE that is difficult to estimate at a global scale due to the limited understanding of the complex interactions of the physical, chemical and biological processes and the remote sensing-based data unable to capture the parameters related to the RE (Ichii et al., 2017;Jägermeyr et al., 2014).
Furthermore, our migration model has an overall accuracy of 80.4% and is reliable. We thus assessed the transferability of 10,829 meteorological stations using the migration model, of which 4674 stations met the migration conditions. The DBF, MF and boreal have a relatively high proportion of stations to be migrated. We believe that the NEE accuracy of the meteorological stations is higher than the corresponding pixels in the global grid results that are generated by other upscale methods. Firstly, the meteorological data directly observed by the meteorological stations are used as input variables TA B L E 4 Transferability evaluation by the multivariate statistical model (MLR) at all meteorological stations over PFT and climate zone. N refers to the number of meteorological stations in different groups, R 2 < 0.5 represents the low transferability, 0.5 ≤ R 2 < 0.75 refers to the medium transferability, and R 2 ≥ 0.75 refers to the high transferability. The coloured metrics showed the percentage of different groups of the coefficient of determination (R 2 ) value (the darker the blue, the more stations in the group). The acronyms in the first columns are the same as for Table 3. in the data-driven model to estimate NEE instead of using the grid data based on reanalysis. The grid data such as reanalysis data are integrated data which assimilate the meteorological prediction data generated by atmospheric dynamics process, various ground-based observation data measured by meteorological stations, and satellite remote sensing data. It is the optimal reflection of atmospheric conditions but not the true reflection. Grid data has inherent uncertainty in the production process. In contrast, the meteorological variables obtained from meteorological stations are direct measured by related instruments and represent the actual situation of the near-surface.

| CON CLUS IONS
In this study, we estimated the NEE at the meteorological stations with 8-day intervals by combining the remote sensing data and FLUXNET2015 data with ML algorithms. The main results of the study were as follows: (1) The RF-based NEE estimations showed reliable outcomes and performed better than other similar studies.
F I G U R E 6 (a) The spatial pattern of the averaged annual NEE (g C m −2 year −1 ) at the meteorological stations. Negative fluxes (blue colours) represent the CO 2 uptake by the land biosphere, whereas the positive fluxes (red colours) indicate the CO 2 release from the land biosphere to the atmosphere. (b) The trend of the average annual NEE.
(2) Among the 10,829 meteorological stations, a total of 4674

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare that they have no competing interests.

PEER R E V I E W
The peer review history for this article is available at https:// Supporting Information S1. Eddy covariance study sites used for this study.    were optimized synchronously using the AGA algorithm.

Supporting Information S4. IGBP plant functional types and
Köppen-Geiger climate zones.