A Spatially Explicit Uncertainty Analysis of the Air‐Sea CO2 Flux From Observations

In order to understand the oceans role as a global carbon sink, we must accurately quantify the amount of carbon exchanged at the air‐sea interface. A widely used machine learning neural network product, the SOM‐FFN, uses observations to reconstruct a monthly, 1° × 1° global CO2 flux estimate. However, uncertainties in neural network and interpolation techniques can be large, especially in seldom‐sampled regions. Here, we present a three‐dimensional (latitude, longitude, time) gridded product for our SOM‐FFN observational data set consisting of uncertainties (pCO2 mapping, transfer velocity, wind) and biases (pCO2 mapping). We find that polar regions are dominated by uncertainty from gas exchange transfer velocity, with an average 48.7% contribution. In contrast, for subtropical regions, wind product choice contributes an average 50.0%. Regions with fewer observations correlate with higher uncertainty and biases, illustrating the importance of maintaining and expanding existing measurements.


Introduction
As fossil fuel emissions continue to increase, carbon dioxide (CO 2 ) levels rise in our atmosphere, leading to changes in the air-sea carbon flux and alteration of the ocean carbon system (Landschützer et al., 2023;Sarmiento & Gruber, 2006).The ocean is a crucial component of global climate and climate change mitigation through its role in uptake of atmospheric CO 2 .It has absorbed roughly 30% of anthropogenic carbon emissions, totaling about 2.9 ± 0.4 GtC yr 1 in the last decade (Friedlingstein et al., 2023).The long-term ocean response to atmospheric pCO 2 varies depending on emission scenarios, with high emissions leading to growth of carbon sink and low emissions causing a decline as the ocean equilibrates with the atmosphere (Fay & McKinley, 2013;Gruber et al., 2023;Randerson et al., 2015;Ridge & McKinley, 2021).In order to make accurate predictions on the future of our global climate system and improve understanding of the oceans role as a climate regulator, we must quantify the amount of carbon exchanged at the air-sea interface, which is done using a combination of models and data-based estimates.In recent years though, we've seen a divergence between the two methods in the global carbon budget analyses (Friedlingstein et al., 2023;Hauck et al., 2020), which is a worrisome development.
In 2015, 196 countries signed the UNFCCC Paris Agreement, an international endeavor to mitigate climate change.With increased worldwide efforts to quantify the global stocktake, closing the carbon budget and constraining the uncertainty of our estimates is increasingly essential, as coastal regions could be a valuable carbon sink for many nations.However, ocean carbon is not currently a value that can be measured by satellites, so in order to span both spatial and temporal resolutions, other methods of calculation must be employed (Rödenbeck et al., 2015).Observation-based products are useful tools to statistically interpolate the partial pressure of carbon dioxide (pCO 2 ) observations in order to resolve global-scale products (Fay et al., 2021;Rödenbeck et al., 2015).One product that closes the surface pCO 2 data gaps and is used in the Global Carbon Budget (Friedlingstein et al., 2023) is based on a two-step neural network approach, the self-organizing map feed-forward neural network (SOM-FFN) (Landschützer et al., 2013).The SOM-FFN reconstructs pCO 2 at a 1°× 1°resolution and a monthly timescale.From the resulting interpolated surface pCO 2 maps, the air-sea exchange is derived.
Despite the importance of understanding the air-sea carbon flux, observations are limited, and uncertainty in airsea transfer can be high (McKinley et al., 2023;Roobaert et al., 2019;Woolf et al., 2019).While the SOM-FFN estimates have proven to provide robust estimations on mean and seasonal timescales, uncertainties and biases in interpolation techniques can be large, especially in remote or seldom sampled ocean regions and on longer timescales (Gloege et al., 2021;Hauck, Nissen, et al., 2023).These uncertainties, consisting of mapping or extrapolation uncertainties as well as those associated with the kinetic transfer of gas across the air-sea surface, have the potential to impact our ability to balance regional and global carbon budgets.Despite this, they are seldom taken into account in synthesis efforts as they vary in time and space.
Here, we develop a three-dimensional (time, longitude, and latitude) gridded uncertainty product and analyze the regional and seasonal implications of the output.This allows for evaluation of high-uncertainty regions and identification of areas in the methods that may be further constrained.The open-source product provides tools to improve computations of air-sea CO 2 fluxes and uncertainties in support of regional carbon budgets.The identification and analysis of sources of uncertainty is important for improving ocean carbon flux products, and will contribute to a more accurately calculated global stocktake of ocean carbon.

The SOM-FFN Method
The SOM-FFN uses two stages to establish a relationship between independent predictor data and observed pCO 2 to output a monthly 1°× 1°resolution for global sea surface pCO 2 and the air-sea flux.While the method and its extension have been extensively described in (Landschützer et al., 2013(Landschützer et al., , 2014)), we provide a brief summary here highlighting key features.First, a neural network clustering algorithm (i.e., a self-organizing map) defines a set of biogeochemical provinces that share common relationships.These relationships are defined by clustering known driver data of pCO 2 (sea surface temperature (SST) (Huang et al., 2020), sea surface salinity (SSS) (Good et al., 2013), mixed layer depth (MLD) (Montégut et al., 2004)) and pCO 2 climatology (Takahashi et al., 2009) to create dynamical provinces designed to follow the seasonal CO 2 cycle in the ocean (Landschützer et al., 2013).Second, a feed-forward network (FFN) derives a non-linear, continuous relationship between pCO 2 measurements and driver data (SST, SSS, MLD) as well as chlorophyll-a (GlobColour, http://globcolour.info) and atmospheric CO 2 (Dlugokencky et al., 2021) for each biogeochemical province defined by step 1.This relationship is used to fill surface ocean pCO 2 gaps where no direct measurements exist.We calculate pCO 2 from Surface Ocean CO 2 Atlas (SOCAT) fugacity of CO 2 observations (Bakker et al., 2016).
We conduct our runs from 1982 to 2022, and calculate mean values for years 1990-2022 to constrain our analysis to temporal periods during which we see more consistency between different pCO 2 products (Fay et al., 2021;Rödenbeck et al., 2015).

Air-Sea Flux Calculation
Using the methods developed in (Landschützer et al., 2013(Landschützer et al., , 2014(Landschützer et al., , 2016)), incorporating a standard quadratic bulk parameterization developed in (Garbe et al., 2014;Ho et al., 2006;Wanninkhof, 1992), we calculate the flux density (in molC m 2 yr 1 ) from our reconstructed surface ocean pCO 2 using: where S CO 2 is the gas solubility of CO 2 in seawater, f ice is fraction of sea ice cover (a dimensionless value between 0 and 1, (Rayner et al., 2003)), and pCO atm 2 is the pCO 2 at the air-sea interface.Atmospheric pCO 2 at 100% humidity was calculated following (Landschützer et al., 2013) using the atmospheric molar fraction of CO 2 (xCO2) data from the NOAA Marine Boundary Layer product (Dlugokencky et al., 2021), the NCEP sea Geophysical Research Letters 10.1029/2023GL106636 level pressure, and the water vapor correction following (Dickson et al., 2007).A positive flux is CO 2 outgassing from ocean into atmosphere, and a negative flux is ocean sink.Our baseline run adopts the mean gas transfer velocity following (Naegler, 2009), where we adjust the scaling factor to reach a global mean transfer velocity of 16.5 cmh 1 .The gas transfer velocity of CO 2 , k w , can be defined using Where Sc is the dimensionless Schmidt number, (a) represents the gas-transfer coefficient, and < U 2 > represents wind speed at 10m height (m/s).

Uncertainty Quantification and Data Incorporation
From these equations, we identify three major sources of bias and uncertainty highlighted in recent literature (Gloege et al., 2021;Hauck, Nissen, et al., 2023;Roobaert et al., 2018): (a) biases and uncertainty stemming from the mapping of unobserved ocean regions associated with the gradient in pCO 2 (mainly driven by surface ocean pCO 2 ), (b) uncertainty associated with k (the gas exchange transfer velocity), and (c) the uncertainty associated with choice of wind product used in Equation 2. Considering each source independent, we calculate overall uncertainty through: Where μ represents bias or deviation from the true model pCO 2 , and σ represents the random, unresolved uncertainty terms associated with the flux equation components.This statistical-based equation was developed following the linear model (e.g., Peterson et al., 2001), with further detail in the supplement.
To estimate σ pCO 2 we employ the approach by (Gloege et al., 2021), using the residuals from the MPIM-HAMOCC model (Ilyina et al., 2013), run in hindcast mode to represent realistic climate of the most recent decades.We subsample the MPIM-HAMOCC model along SOCAT measurement locations (Figure S1 in Supporting Information S1) and reconstruct the known model field with driver data from the model.Using the subsampled model data set, we run the SOM-FFN to interpolate a 1°× 1°surface ocean pCO 2 , and repeat this 5 times, each realisation with a different, randomly chosen training and validation data set (see (Landschützer et al., 2013)).We then calculate the standard deviation of the difference between reconstructions and SOM-FFN output to quantify σ pCO 2 .We estimate μ by adopting the approach by (Hauck, Nissen, et al., 2023), using the mean pCO 2 deviation of the 5 reconstructions from the true model pCO 2 .While this represents a small ensemble, we chose it to be comparable to the ensembles of the other σ terms.σ k is quantified using the standard deviation across 4 commonly used quadratic parameterizations of the gas transfer coefficient ( (Ho et al., 2006; Sweeney  et al., 2002), and NCEP-NCAR Reanalysis 1 (Kanlay et al., 1996).The supplement contains further specifications on data preparation and selection.
For analysis, regions are determined using the open-ocean biomes defined by Fay and McKinley (2014), which are classified using SST, spring/summer chlorophyll-a concentrations, ice fraction, and maximum MLD (Fay & McKinley, 2014).We combine their 17 biomes into 6 larger regions.While we focus on major uncertainty terms related to the extrapolation of sparse data and gas transfer formulation, there are terms we neglect here due to negligible influences on larger-scale fluxes.This includes the effect of scaling of the gas exchange in ice covered regions to the ice-free ocean area and the choice of SST product on gas solubility.

Results
Globally averaged, the μ term is negligibly small ( 0.007 molC/m 2 /yr).This is somewhat contradicting toward results in (Hauck, Nissen, et al., 2023) that show despite compensating pCO 2 errors globally, errors in higher wind speed regions cause biases in the flux as well.We explain this difference through both the use of a different model and potentially the use of a single realisation compared to the mean of several runs used here.This difference is also reflected in the σ pCO2 uncertainty below, which shows that single realizations can differ substantially from each other.At regional scales, however, we observe larger variation ±0.5 molC/m 2 /yr (Figure 1a), which are comparable to errors observed in Gloege et al., 2021, although estimated from different model types and scenarios (i.e., large ensembles vs. hindcast simulations).This gives us confidence we can faithfully pinpoint biases in our reconstruction.Figure 1b further shows zonal averages, demonstrating higher bias in regions with a lower density of observations (observation density Figure S1 in Supporting Information S1).While our reconstruction bias doesn't directly impact the uncertainty quantification globally due to compensating errors, considering the influence of data sparsity on the flux calculations is critical for understanding the variability in our calculations, particularly on regional scales.
Figure 2 shows maps of the 2022 SOM-FFN data product, which is available through the NOAA National Centers for Environmental Information (Jersild et al., 2023).We see (a) air-sea flux as well as (b) flux uncertainty and (c) percentage of mean flux, all averaged from 1990 to 2022.We observe higher flux uncertainty in polar regions and the Pacific equatorial, but find the Southern Ocean and equatorial coastal regions dominate in proportional flux uncertainty.Part of this is due to lower regional fluxes, making small amounts of uncertainty more significant.In the Southern Ocean, where we have almost 45% uncertainty throughout the region, we also have a lower number of observations contributing to increased variance in the SOM-FFN method, which may contribute to higher uncertainty.We also see higher uncertainty in polar regions and areas (seasonally) covered in sea-ice, which provides important motivation to continue to increase our observations of the harder-to-observe but climatically pivotal regions.
Many of the regional errors balance each other, leading to overall smaller uncertainty when looking at the globally integrated air-sea CO 2 flux.The global average uncertainty from 2012 to 2022 is ±0.31GtC/yr, which is lower but comparable to Global Carbon Budget 2023 (Friedlingstein et al., 2023) estimation of ±0.4 GtC/yr.That being said, while we do investigate uncertainty in the contemporary fluxes, we employ different methods and do not account for uncertainties in river fluxes or coastal regions as they do in the Global Carbon Budget.Temporally, the range of average uncertainty spans from a minimum of ±0.21 GtC/yr in 1997 and a maximum of ±0.42 GtC/yr in 2021, with seasonal variation.
Seasonally, we find average increases in uncertainty in January in both hemispheres (Figure 3), with a global average increasing from ±0.25 GtC/yr in July to ±0.37 GtC/yr in January.Although opposite seasons, we have different sources of uncertainty dominating these average values (Figure 4).We further break down into six ocean regions to compare average seasonal uncertainty at regional scales (Figure 3, Table 1).Uncertainty is greater in high latitudes for both hemispheres, with year-round averages of ±0.35 GtC/yr for the Southern Ocean and ±0.51 GtC/ yr for the northern hemisphere high latitudes (see Table 1), both averaging an uncertainty around 40% of total flux.
Using our calculation from Equation 3, we identify the contributions of each source of uncertainty as k-dominated (driven by the k w gas transfer velocity coefficient), pCO 2 -dominated, or wind-dominated.Figure 4 shows the regional and seasonal breakdown of each of these drivers on overall uncertainty.We find that regardless of season, uncertainty in polar regions (averaged poleward of 60°north and south) is driven by the uncertainty associated with k w , contributing 48.7% to total uncertainty (Table 2).In contrast, the Subtropical region centered around the 30°latitude has wind-driven uncertainty contributing about 50% to overall uncertainty (Figure 4, Table 2) However, smaller regions do experience seasonal variation.The North Atlantic, which sees a strong increase in uncertainty during wintertime, experiences a transition seasonally from mixed sources to winddominated (Figure 4).Increased wintertime wind speed drives this seasonal uncertainty and is a strong contribution to increased standard deviation in flux in this region.

Conclusions
Here, we provide an explicit spatial quantification of the uncertainty in the air-sea CO 2 exchange from a commonly used observation-based method and thus provide a step forward in our ability to (a) understand modeldata differences, (b) provide a realistic outlook regarding global and regional stocktake efforts and (c) provide explicit spatial and temporal information where efforts need to be undertaken to reduce present day uncertainties.
While previous studies have investigated uncertainties in the air-sea CO 2 exchange (Gloege et al., 2021;Hauck, Nissen, et al., 2023;Roobaert et al., 2018) we, for the first time, combine these well established terms to provide a view of current biases and uncertainties in the air-sea CO 2 exchange for an observational product on both a temporal and spatial scale.This new addition provides great opportunities for future research focusing on both temporal or regional-scale analysis.
We find that effective biases (μ) that result from our single model reconstruction by interpolation of sparse observations are small at global scale when the mean of multiple realizations are investigated, but can be substantial on regional scales.Global flux uncertainty is strongly seasonally dependent, with higher average values in January than in July.However, these regionally averaged patterns are driven by different sources of uncertainty, as we see increased austral summertime uncertainty in the gas-exchange-dominated Southern Ocean and increased boreal wintertime uncertainty in the northern hemisphere.The Northern Atlantic shift seasonally to wind-driven uncertainty drives the elevated boreal wintertime uncertainty averages, leading to the interesting global uncertainty correlations regardless of the opposite seasons.Year-round, polar regions are dominated by the gas exchange transfer velocity uncertainty, whereas the subtropics are driven by wind-driven uncertainty.pCO 2driven uncertainty dominates in the non-polar regions with lower observational availability, such as the southern Indian Ocean and the southern hemisphere open Pacific ocean.
While we include the main uncertainty and bias terms in our calculation that have been identified in recent literature (Gloege et al., 2021;Hauck, Nissen, et al., 2023;Roobaert et al., 2018), there are additional sources of uncertainty that are not considered here, for example, related to coastal systems, varying river input, the cool skin effect (Takahashi et al., 2009;Watson et al., 2020) or the optimal ensemble size for model subsampling.Most of these are either still under debate or cannot be resolved spatially and temporally at present.Furthermore, measurement uncertainty is not included, as Landschützer et al., 2014 shows that this term is small compared to the much larger uncertainty from extrapolating sparse measurements.
Our study highlights future areas where process research is needed to reduce uncertainty and biases, for example, the quantification of the influence of sea ice.We demonstrate the need to continue improving parameterizations and understanding the impact of variability in carbon flux estimations in order to improve our ability to quantify marine carbon budgets.

Figure 1 .
Figure 1.Reconstruction Bias and zonal averages averaged 1990-2022 (a) Total bias between reconstruction and model truth, shown in molC/m 2 /yr.(b) Average zonal bias, over same time period and units.Values indicate magnitude of flux bias.

Figure 2 .
Figure 2. SOM-FFNv2022 air-sea CO 2 flux output averaged from 1990 to 2022.(a) Total air-sea flux density, shown in molC/m 2 /yr, generated using the standard observation-based SOM-FFN run.(b) Average uncertainty calculated from Equation 2, shown in molC/m 2 /yr.(c) Percentage uncertainty calculated for each point as flux uncertainty divided by the absolute value of the total flux.

Figure 3 .
Figure 3. Seasonal Uncertainty, broken down by region.Note that Pacific equatorial region includes both east and west Pacific.(a-f) Shows climatology of CO 2 flux (molC/m 2 /yr) broken down by regions, defined based on (Fay & McKinley, 2014).Time series shows seasonal average with shaded uncertainty of the mean.(g) Shows mean July uncertainty averaged across 1990-2022.(h) Shows mean January uncertainty averaged across 1990-2022.

Figure 4 .
Figure 4. Breakdown of uncertainty sources, compared seasonally.(a) July uncertainty sources (averaged 1990-2022) color indicating dominating source of uncertainty (k, pCO 2 , and wind).(b) Percentage each source contributes to total uncertainty, averaged latitudinally.(c, d) Same as (a, b) but for January.

Table 2
Uncertainty Sources by Percentage of Influence on Total Uncertainty