Uncertainty sources for measurable ocean carbonate chemistry variables

The ocean carbonate system is critical to monitor because it plays a major role in regulating Earth's climate and marine ecosystems. It is monitored using a variety of measurements, and it is commonly understood that all components of seawater carbonate chemistry can be calculated when at least two carbonate system variables are measured. However, several recent studies have highlighted systematic discrepancies between calculated and directly measured carbonate chemistry variables and these discrepancies have large implications for efforts to measure and quantify the changing ocean carbon cycle. Given this, the Ocean Carbonate System Intercomparison Forum (OCSIF) was formed as a working group through the Ocean Carbon and Biogeochemistry program to coordinate and recommend research to quantify and/or reduce uncertainties and disagreements in measurable seawater carbonate system measurements and calculations, identify unknown or overlooked sources of these uncertainties, and provide recommendations for making progress on community efforts despite these uncertainties. With this paper we aim to (1) summarize recent progress toward quantifying and reducing carbonate system uncertainties; (2) advocate for research to further reduce and better quantify carbonate system measurement uncertainties; (3) present a small amount of new data, metadata, and analysis related to uncertainties in carbonate system measurements; and (4) restate and explain the rationales behind several OCSIF recommendations. We focus on open ocean carbonate chemistry, and caution that the considerations we discuss become further complicated in coastal, estuarine, and sedimentary environments.

Carbon dioxide (CO 2 ) plays a dominant role in the acid-base chemistry of the ocean through its reaction with water to form carbonic acid and through the buffering provided by the reversible dissociation reactions of that molecule to bicarbonate and carbonate ions. These reactions enable scientists to characterize seawater carbonate chemistry from multiple chemical perspectives, including: seawater acidity (quantified herein as pH on the total hydrogen ion scale, or pH T ), information from the titration of seawater with acid (total alkalinity content, A T ), the total seawater dissolved inorganic carbon content (C T ), the fugacity or partial pressure of aqueous CO 2 in seawater (i.e., fCO 2 or pCO 2 , respectively), and the amounts of various forms of inorganic carbon (e.g., the substance content of free and ion-paired carbonate in seawater, [CO 3 2− ] T ). Two of these measurements can be used alongside information about seawater composition and chemical thermodynamics to calculate many other aspects of seawater carbonate and acid-base chemistry.
The ability to calculate seawater carbonate chemistry variables from one another is challenged by the complexity of seawater acid-base chemistry (Fig. 1). In addition to two carbonate chemistry measurements, fully constraining the 18+ equations describing acid-base reactions in seawater requires information about 10+ equilibrium constants, and the total contents of the 5+ competing major (i.e., those appearing in Fig. 1) and minor acid-base pairs (Supplementary Text S1; Dickson 2011). The equilibrium constants are typically calculated from published functions of temperature, pressure, and practical salinity (S p ), while the conservative competing base pairs such as total fluoride, borate, and sulfate are usually estimated from S p (as is [Ca 2+ ], which is needed when also calculating carbonate mineral saturations). Phosphate and silicate must also be measured or estimated. Within these calculations there are many measured and calculated terms, each of which has associated uncertainties (Dickson and Riley 1978; Millero 1995; Orr et al. 2018). An additional significant uncertainty comes from the assumption that no other acid-base species significantly impact the seawater acid-base chemistry calculations. This assumption has long been known to be inaccurate in seawater environments with significant contents of organic acids that exchange protons during A T titrations such as coastal or estuarine waters (Cai et al. 1998), but the limits of this assumption in the open ocean have long been the subject of debate (e.g., Millero et al. 2002) and remain such today (e.g., Fong and Dickson 2019; Hunt 2021; Sharp and Byrne 2021). The uncertainties on these calculations are both quantitatively meaningful for many applications and complex. The uncertainties vary with seawater physical properties (e.g., temperature, S p , and pressure) and composition (Álvarez et al. 2020) and with the combination of constraints provided (i.e., which measured carbonate chemistry variables are used as inputs for calculations, see Orr et al. 2018).

Fig. 1. Schematic illustrating the measured quantities (left side) and the flow of information in seawater carbonate chemistry calculations (see Table 1 for definitions). Calculations are made using computer programs and some versions include additional sets of measurements and acid-base reactions (e.g., Xu et al. 2017; Sharp et al. 2021). Physical measurements are shown with pink-colored backgrounds, total substance contents of species that participate in seawater acid-base chemistry are in light green, thermodynamic constants are in gray, and carbonate chemistry variables are in yellow. The circular arrows on the solver reflect the iterative approach that is required to solve for (or with) A T .
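The iterative solution step described above can be illustrated with a deliberately reduced sketch. It assumes that A T contains only carbonate, borate, and water terms (the sulfate, fluoride, phosphate, silicate, and unidentified contributions are omitted), and the equilibrium constants and total boron value are rounded, illustrative placeholders rather than any published parameterization. The function names are hypothetical and are not taken from CO2SYS.

```python
def alkalinity_from_h(h, ct, bt, k1, k2, kb, kw):
    """Simplified A_T (mol/kg) implied by a hydrogen ion content h.

    Only carbonate, borate, and water terms are included; this is an
    assumption made for illustration, not a full seawater model.
    """
    denom = h * h + k1 * h + k1 * k2
    hco3 = ct * k1 * h / denom          # bicarbonate
    co3 = ct * k1 * k2 / denom          # carbonate
    boh4 = bt * kb / (kb + h)           # borate
    return hco3 + 2.0 * co3 + boh4 + kw / h - h


def solve_ph(at, ct, bt, k1, k2, kb, kw, lo=3.0, hi=11.0, tol=1e-10):
    """Bisect on pH until the modeled alkalinity matches the given A_T."""
    def residual(ph):
        return alkalinity_from_h(10.0 ** -ph, ct, bt, k1, k2, kb, kw) - at

    # The modeled alkalinity rises monotonically with pH, so the root is
    # bracketed whenever residual(lo) < 0 < residual(hi).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative placeholder constants (NOT a published parameterization).
K1, K2, KB, KW = 10 ** -5.85, 10 ** -8.97, 10 ** -8.60, 10 ** -13.22
BT = 4.3e-4  # mol/kg, illustrative total boron

if __name__ == "__main__":
    print(solve_ph(at=2300e-6, ct=2050e-6, bt=BT, k1=K1, k2=K2, kb=KB, kw=KW))
```

Bisection is used here only because it is easy to verify; production solvers typically use faster root-finding schemes and include every acid-base pair shown in Fig. 1.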
Open ocean carbonate chemistry measurements made in recent decades have revealed consistent patterns in disagreements between measured and calculated variables (e.g., Fig. 2; Supplementary Text S2; also McElligott et al. 1998; Millero et al. 2002; Carter et al. 2013; Williams et al. 2017; Carter et al. 2018; Fong and Dickson 2019; Álvarez et al. 2020). Recent research has shown that these "consistent inconsistencies" vary between cruises to a greater degree than was apparent from the collections of cruise datasets used for the earlier studies (Álvarez et al. 2020; Takeshita et al. 2021). Nevertheless, the patterns seen in "inter-consistency," that is, the concordance between measurements and calculations of carbonate chemistry variables (see Table 1), have raised new questions and challenges for the carbonate chemistry measurement community.
Given the challenges and complexity of constraining carbonate chemistry, members of the seawater carbonate chemistry community gathered in 2019 and annually thereafter as part of the Ocean Carbonate System Intercomparison Forum (OCSIF). These efforts were supported by the Ocean Carbon and Biogeochemistry (OCB) program. OCSIF discussions and activities center around identifying the remaining factors limiting carbonate chemistry inter-consistency and, wherever possible, mitigating or quantifying these sources of uncertainty with informally coordinated research among independent laboratories. In addition, OCSIF endeavored to provide recommendations regarding data product internal consistency adjustments based on carbonate chemistry intercomparisons, adjustments to seawater pH measurements used in calculations of fCO 2 and C T , community reference material (RM) needs, and the value of inter-laboratory comparison exercises. In this article, the current members of OCSIF and collaborators:
• summarize recent progress toward quantifying, understanding, and reducing uncertainties in seawater carbonate chemistry measurements;
• highlight remaining opportunities to improve concordance between measurements from differing laboratories and improve carbonate chemistry inter-consistency; and
• restate, and explain the rationales behind, the recommendations issued by OCSIF in recent years through community contacts and presentations at meetings.

Fig. 2. A two-dimensional histogram showing, in color, the number of measurements that fall within small bins for the differences between measured pH recalculated at in situ seawater conditions on the total hydrogen ion scale (pH T in situ) and values of this variable that were calculated from A T and C T (y-axis) plotted against the measured pH T (x-axis). Data are taken from the 2022 release of the Global Ocean Data Analysis Project version 2 (GLODAPv2.2022; Lauvset et al. 2022). Calculations are made using the CO2SYSv3 code for MATLAB ® (van Heuven et al. 2011; Sharp et al. 2021) with carbonic acid dissociation constant parameterizations from Lueker et al. (2000), hydrogen fluoride (HF) thermodynamic constant parameterization from Perez and Fraga (1987), and the total boron (B T ) to S p ratio (B T /S p ) from Lee et al. (2010). The slope of the black regression line highlights the disagreement between the measured and calculated pH T values and how it changes as the seawater composition changes, even when averaged across the 29 cruises with overdetermined pH T measurements made by laboratories worldwide using purified indicator dyes (see the text for discussion of indicator dye purification).
Because inter-consistency is limited by its nature to carbonate system variables that can be measured, the content of this manuscript is primarily focused on those measurable quantities (in particular, calcium carbonate saturation states are not discussed).

Uncertainty sources by measurable variable
Uncertainty is rigorously defined as "a parameter associated with the result of a measurement that permits a statement of the dispersion (interval) of reasonable values of the quantity measured, together with a statement of the confidence that the (true) value lies within the stated interval" (Ellison and Williams 2012). An example might be that one has 95% confidence that the true A T is within 2 μmol kg −1 of a given measured, calculated, or estimated value. Orr et al. (2018) provide plausible uncertainty estimates for most terms we show in Fig. 1, and we restate and very slightly update their work in Supplementary Text S3. In our companion paper (Carter et al. in prep.), we show that a recent data product composed of seawater carbonate chemistry data shows disagreements between measured and calculated values that imply that the community is not collectively achieving the climate quality measurement standards that were articulated by Newton et al. (2015) and used as uncertainty estimates by Orr et al. (2018). This is attributed to a variety of known and unknown sources of uncertainty related to the many seawater carbonate chemistry constraints needed for carbonate system calculations (Fig. 1). However, even for this focused analysis, the large number of potential sources of uncertainty challenges efforts to identify the most impactful sources of uncertainty. This is because an error in one constraint on the carbonate chemistry can appear indistinguishable from errors in others, even with measurements of four constraints (García-Ibáñez et al. 2022). We therefore examine each constraint individually here, highlighting recent advances in measurement methods and metrology and noting areas where additional research is needed.

Total scale seawater pH (pH T )
The quantity pH is a measure of the hydrogen ion (H + ) activity or the "acidity" of a solution, expressed as an H + activity or concentration on an appropriate scale. Seawater pH is commonly measured using spectrophotometric or electrometric approaches and is reported on a variety of scales that include or exclude the interactions of H + with the common conservative anions in seawater, sulfate and fluoride (Dickson 1984, 1990; Marion et al. 2011; Dickson et al. 2015).

Table 1. Definitions of terms used herein.

Content
Short for substance content of the specified molecule expressed as an amount per specified mass of solution.

Carbonate chemistry constant
A constraint for carbonate chemistry that is typically inferred from measurements of salinity and temperature using published relationships.

Intercomparison
Used herein to refer to a comparison of measurements of similar seawater made by multiple research laboratories, short for "inter-laboratory comparison experiment."

Inter-consistency
Used herein as a measure of the (dis)agreement between a measured value of a carbonate chemistry variable and the value of the same variable calculated from other carbonate chemistry variable measurements. Short for inter-carbonate-system-variable-consistency.

Internal consistency
Used herein to refer to a measure of the (dis)agreement between measurements of a single quantity made by multiple cruises at proximal locations within a data product, and is short for internal consistency within a data product.

Organic A T
The content of organic chemical species in seawater that accept protons during an A T titration.

pH T
The acidity of seawater on the total hydrogen ion scale, expressed as −log 10 ([H + T ]) where [H + T ] is a measure of the sum of the free H + and HSO 4 − substance contents in seawater expressed in mol kg −1 of seawater.

RMs
Used herein as shorthand for "CO 2 -in-seawater reference materials," which are commonly, but incorrectly, known within the marine chemistry community as "certified reference materials" or CRMs. The Dickson group's RMs (Dickson et al. 2003) are an example, but other RMs are being produced by labs throughout the world.

Total boron
The sum of the dissolved boric acid and borate contents of seawater, often estimated from salinity.

Uncertainty
"A parameter associated with the result of a measurement that permits a statement of the dispersion (interval) of reasonable values of the quantity measured, together with a statement of the confidence that the (true) value lies within the stated interval."

Unidentified A T
Contributions to A T from an unknown source, or from any source that is not independently estimated, including organic A T .
Here we refer to pH measured on the "total hydrogen ion" scale (expressed as pH T ), defined in terms of the sum of the substance contents of H + and HSO 4 − , because modern spectrophotometric and "ion-sensitive field effect-transistor" (ISFET) approaches are calibrated on this scale. Environmental ISFET approaches are often calibrated by comparison to spectrophotometric measurements (Martz et al. 2010; Bresnahan et al. 2014), which are in turn calibrated with measurements of 2-amino-2-hydroxymethyl-1,3-propanediol (Tris) and Tris-HCl buffers (Clayton and Byrne 1993; Liu et al. 2011; Müller and Rehder 2018) that have been characterized using a Harned cell potentiometric measurement (DelValls and Dickson 1998), which is the primary method of pH measurement (Buck et al. 2002). Thus, uncertainties in the Harned cell measurements and from the assumptions involved in assigning pH values to the buffers from Harned cell measurements propagate to both approaches (see also later discussion). Furthermore, uncertainties in the spectrophotometric measurements are built into most ISFET measurements via calibration.
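The relationship between the free and total hydrogen ion scales can be sketched as follows. The sketch uses the common approximation [H + T ] ≈ [H + F ](1 + S T /K S ), which holds when the free H + content is much smaller than K S . The total sulfate content and bisulfate dissociation constant in the example are rounded, illustrative values for seawater near S p = 35 and 25 °C, not values taken from any cited parameterization, and the function name is hypothetical.

```python
import math


def total_scale_from_free(ph_free, sulfate_total, k_bisulfate):
    """Convert pH on the free scale to the total hydrogen ion scale.

    Uses [H+]_T = [H+]_F * (1 + S_T / K_S), an approximation valid when
    [H+]_F is much smaller than K_S (true for seawater pH ranges).
    """
    h_free = 10.0 ** -ph_free
    h_total = h_free * (1.0 + sulfate_total / k_bisulfate)
    return -math.log10(h_total)


if __name__ == "__main__":
    # Illustrative inputs only: S_T in mol/kg and K_S in mol/kg.
    print(total_scale_from_free(ph_free=8.10, sulfate_total=0.0282, k_bisulfate=0.10))
```

Because the scale offset depends on S T and K S , any uncertainty in those two terms carries directly into a converted pH value.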
The most common modern seawater pH T measurement approach for discrete samples involves adding a small amount of pH-sensitive indicator dye to the sample, and then using the optical properties of the mixture to infer the relative amounts of the variably protonated forms of the indicator dye and, through that inference, estimate the H + content. The indicator dye constants of Liu et al. (2011) are the most commonly used set for laboratory measurements made for the open ocean S p range (20 < S p < 40) using meta-cresol purple (mCP) indicator dyes that have been purified of optical impurities. There have been several additional refinements of this method over the last decade. For example, Soli et al. (2013) determined the pressure-dependence of the mCP calibration coefficients up to 827 bar; DeGrandpre et al. (2014) reported values as a function of temperature in a 0.7 M NaCl solution; Lai et al. (2016) characterized the properties of mCP in freshwater (S = 0); Loucaides et al. (2017) extended the characterization of mCP to hypersaline and subzero conditions; and Müller and Rehder (2018) performed a metrologically traceable characterization of mCP over the S p range 5-20, directly linked to primary pH measurements. A history and more complete description of spectrophotometric pH T measurements is provided as Supplementary Text S4.
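The inference from optical properties to pH T described above can be sketched with the generic two-wavelength indicator equation, pH = pK + log 10 [(R − e 1 )/(e 2 − R e 3 )], where R is the ratio of absorbances at the wavelengths of maximum absorbance of the basic and acidic indicator forms (conventionally near 578 and 434 nm for mCP). The function name is hypothetical and the numeric coefficients in the example are placeholders of plausible magnitude, not the published constants; published characterizations may also group these terms differently.

```python
import math


def ph_from_absorbance_ratio(ratio, pk_indicator, e1, e2, e3):
    """Generic sulfonephthalein indicator equation.

    pH = pK + log10((R - e1) / (e2 - R * e3)), where e1, e2, e3 are
    ratios of molar absorptivities of the indicator forms.
    """
    numerator = ratio - e1
    denominator = e2 - ratio * e3
    if numerator <= 0.0 or denominator <= 0.0:
        raise ValueError("absorbance ratio is outside the usable range")
    return pk_indicator + math.log10(numerator / denominator)


if __name__ == "__main__":
    # Placeholder coefficients for illustration only (NOT published values).
    print(ph_from_absorbance_ratio(ratio=1.5, pk_indicator=8.0, e1=0.007, e2=2.2, e3=0.13))
```

In practice pK and the e terms are functions of temperature, salinity, and pressure, which is why the characterizations listed above matter for the resulting pH T .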

pH T uncertainties
The sources of uncertainty for the spectrophotometric pH T measurement remain an area of active research, and there has been recent progress addressing several potential sources.There are numerous potential contributors to uncertainties in measured spectrophotometric pH T including uncertainties and variations in the optical properties of the indicator dye, sample handling practice variability, equipment variability, and uncertainties inherent to the procedure for adjusting the measured values to account for the impact of the added indicator dye on the pH T of the seawater sample.
The challenges for spectrophotometric pH T measurements begin with the definition of the scale itself. The total pH scale was calibrated by DelValls and Dickson (1998) from measurements of Tris buffers in artificial seawater for nominal practical salinities of 20-40 and for temperatures from 0 to 45 °C. This range has since been extended by Müller and Rehder (2018) for S p from 5 to 40 and temperatures from 5 to 45 °C. However, it has recently been demonstrated that the extrapolations of Harned cell electromotive forces to zero buffer content used for this study do not correspond to the expected values for compositions of pure artificial seawater, and it is thought that this causes negative errors in the derived pH T of up to about 0.01 (Clegg, pers. comm.). Generally, it is difficult to estimate an uncertainty for the mCP calibration that Liu et al. (2011) provided using Tris buffers. Such buffers will not provide a well-defined pH T value because the Tris species in the solution change the activity coefficients of other acid-base species away from what they would be in seawater of the same nominal ionic strength. This effect has been estimated experimentally (by extrapolating data measured at various Tris levels to a solution without Tris) to contribute an error of 0.005 in pH T at a S p of 35 (Müller and Rehder 2018). This is comparable to the change calculated from speciation calculations by Clegg et al. (2022). It is, however, of insufficient magnitude to account for yet unexplained differences between measured and calculated pH T (Fig. 2; McElligott et al. 1998; Carter et al. 2013, 2018; Fong and Dickson 2019).
The optical properties of the indicator dye must also be both well-constrained and reproducible, and Yao et al. (2007) found that the use of different commercially-available indicator dyes yielded significantly different values of measured pH T . They further showed that the differences could be attributed to impurities in the indicator dye solutions that result in lower measured pH T . Liu et al. (2011) and Rivaro et al. (2021) used high performance liquid chromatography (HPLC) and Patsavas et al. (2013) used flash chromatography to purify indicator dyes from a variety of manufacturers and showed that a single research group could purify indicator dyes from multiple sources and produce a consistent product. More recently, Takeshita et al. (2021) organized an inter-laboratory comparison of purified mCP indicator dyes and found that seven of nine batches of purified indicator dye obtained from four suppliers across two countries produced pH T measurements that agreed within 0.003 across pH T values, while two of nine batches produced quantifiably different pH T values (attributed to remaining impurities). These findings show that indicator dye purification can be highly effective, but that it is possible to have incomplete indicator dye purification, and it is therefore desirable to further assess that an indicator dye is sufficiently pure for its intended purpose. However, these variations between different purified indicator dyes are also too small to account for differences between measured and calculated pH T (Fig. 2).
Comparisons of measurements of identical seawater samples made by different laboratories are also revealing. Takeshita et al. (2021) also reported an intercomparison where samples of a single seawater batch and pre-made stock solutions of a subset of the indicator dyes were distributed to three independent laboratories. Each laboratory measured the pH T of the sample with various distributed indicator dye stock solutions. The standard deviation, calculated from the measurements made using a subset of the indicator dye stock solutions, constitutes a form of intermediate precision of s Zi = 0.0013 (appropriate only for a small number of expert laboratories, each using well-maintained instrumentation and identical purified mCP indicator dye stock solutions). However, the reproducibility of seawater measurements across the community more broadly is the more important precision metric for many oceanographic analyses, and this can best be obtained by interlaboratory comparisons. In one such comparison, conducted in 2017, the calculated standard deviations of pH T measurements made by 19 laboratories with individually acquired purified indicator dyes used on two separate seawater samples provided by the Dickson laboratory (with different pH T values) were 0.0081 for 33 separate analyses of a sample similar to a typical CO 2 -in-seawater RM (pH T ≈ 7.91), and 0.0105 for 33 analyses of a "high-fCO 2 " sample with pH T ≈ 7.54 (unpubl. results).
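As a minimal sketch of how a reproducibility standard deviation of this kind might be computed, the example below pools the analyses reported by several laboratories for one distributed sample. The laboratory labels and pH T values are hypothetical and do not reproduce the unpublished results; the function name is also hypothetical.

```python
import statistics


def reproducibility_sd(results_by_lab):
    """Sample standard deviation of all analyses pooled across laboratories.

    results_by_lab maps a laboratory label to its list of reported values
    for the same distributed sample.
    """
    pooled = [value for values in results_by_lab.values() for value in values]
    return statistics.stdev(pooled)


if __name__ == "__main__":
    # Hypothetical pH_T results for one distributed sample.
    example = {
        "lab_a": [7.905, 7.909],
        "lab_b": [7.918],
        "lab_c": [7.899, 7.912],
    }
    print(reproducibility_sd(example))
```

Pooling every analysis weights laboratories by the number of results they submit; taking the standard deviation of per-laboratory means instead is an equally defensible choice that answers a slightly different question.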
It is unclear whether the larger reproducibility uncertainty seen from the larger interlaboratory comparison should be attributed to greater variations in the purity of the indicator dyes used or to variations in the sample handling approaches and the spectrophotometric equipment used. Therefore, sample measurement practices and equipment remain a concern. There are three main sources of additional uncertainty during a pH T analysis: sample handling, spectrophotometer performance, and data processing (including the adjustment of the measured pH T value to account for the perturbation to pH T induced by the addition of the indicator dye).
Regarding sample handling, measurements on samples with a low pH T (high fCO 2 ) may show erroneously high measured pH T due to unintentional outgassing of CO 2 . Discrepancies between the reference laboratory and the participating laboratories in the larger inter-laboratory comparison experiment were positive on average (0.004 pH T ) for the high-fCO 2 samples whereas they were nearly 0 on average for the moderate-fCO 2 samples, and the discrepancies at high fCO 2 were not well correlated with the discrepancies at moderate fCO 2 (unpubl. results). This suggests that, for some research groups, CO 2 may have been preferentially lost from the high-fCO 2 samples while handling the samples (e.g., transferring them from the bottle into the spectrophotometer cell).
There may be contributions to the measurement uncertainty that result from the spectrophotometer used, either for calibration of the optical parameters of the indicator dye and/or for measurement of pH itself. The indicator dye optical parameters depend on the spectrophotometer bandpass, such that the published values for an indicator dye may not be compatible with spectrophotometers with a significantly different bandpass (DeGrandpre et al. 2014). Although the results reported by Takeshita et al. (2021) suggest that it is reasonable for high-quality well-maintained spectrophotometers to behave very similarly to one another, there is anecdotal evidence that long-term changes in lamp intensity may compromise seawater pH T measurements (Fong 2021).
The perturbation to pH that occurs when indicator dye such as mCP or fixatives such as mercuric chloride (HgCl 2 ) is added to seawater is potentially more complex (Chierici et al. 1999) than the simple empirical, typically linear, adjustments (Clayton and Byrne 1993; Carter et al. 2013) that are used to counter the impacts of the perturbation on the analysis. Li et al. (2020) developed code to simulate the expected indicator dye perturbation for a given sample based on the chemical properties of the seawater and the indicator dye solutions and showed that these adjustments are unlikely to be a major contributor to pH T measurement uncertainties except in low-salinity environments where these uncertainties can be minimized by matching the ionic strength of the indicator dye solution to that of the sample. Using similar simulations, Fong (2021) confirmed that nonlinearities in the empirical indicator dye perturbation adjustments can result in large systematic errors at the high substance contents of indicator dye required in short pathlength (1 cm) cells, but that these errors are small when using 10 cm cells per standard operating procedures (SOPs) (Dickson et al. 2007). Perturbations to pH T with the addition of the fixative chemical HgCl 2 are a related concern. SOPs do not currently mention adding HgCl 2 to seawater pH T samples, but it is nevertheless common practice for some laboratories to poison samples as per SOPs for C T and A T sampling, particularly for samples that will not be quickly measured. Carter et al. (2013) show that HgCl 2 additions have a negligible immediate impact on the measured pH T for seawater collected from the open ocean, but HgCl 2 additions have also been found to induce readily measurable shifts in pH T in coastal and sedimentary environments with hydrogen sulfide (H 2 S) present (Millero 1991; Cai et al. 2020).
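A simple empirical, linear adjustment of the kind mentioned above can be sketched as follows. The sketch assumes that absorbance ratios are measured on a set of samples after one and after two identical dye additions, that the change in ratio caused by the second addition is a linear function of the first-addition ratio, and that the first addition perturbed the sample by the same amount. The function names and all data are hypothetical, and this is a simplified illustration rather than the procedure of any particular SOP.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dye_adjusted_ratio(ratio_single, intercept, slope):
    """Extrapolate a single-addition ratio back to zero added dye.

    Assumes the perturbation from the first addition equals the fitted
    change (intercept + slope * R) caused by the second addition.
    """
    return ratio_single - (intercept + slope * ratio_single)


if __name__ == "__main__":
    # Hypothetical calibration set: ratios after one and two dye additions.
    r_single = [0.6, 1.0, 1.4, 1.8]
    r_double = [0.6052, 1.0020, 1.3988, 1.7956]
    delta = [r2 - r1 for r1, r2 in zip(r_single, r_double)]
    intercept, slope = fit_line(r_single, delta)
    print(dye_adjusted_ratio(1.2, intercept, slope))
```

The linearity assumption is exactly what the simulations cited above call into question at high indicator dye contents, so a fit of this form should be checked against the pathlength and dye amounts actually used.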

pH T recommendations
The uncertainty that arises from how pH T is derived from Tris buffer solutions affects all seawater pH T measurements, and thus should be a priority to quantify and correct. Experimental (DelValls and Dickson 1998; Müller and Rehder 2018) and modeling (Clegg et al. 2022) approaches suggest a similar magnitude of bias. However, how this would be implemented to redefine mCP calibration in a way that is consistent with natural seawater is not clear. A consensus on the magnitude of this bias and an appropriate correction are needed. In addition, the calibration of the pH T scale for salinities below 20 should be revised.
Unlike many other carbonate chemistry measurements, the presumption with spectrophotometric pH T is that the calibration is inherent to the indicator dye batch, and that analysts do not need to regularly recalibrate their measurements. An approach is therefore needed for estimating the uncertainties in mCP calibrations and their translations to pH T , as is a demonstration that multiple laboratories can obtain consistent determinations of the optical properties of a single mCP batch (within uncertainties). Once this is in place, there will be a need for an SOP for confirming that the behavior of a particular indicator dye batch justifies using its associated calibration. In addition, more work is needed on the calibration of the pH T scale, using Tris buffers in artificial seawater, for salinities below 20, and in the extrapolation of pH T to zero buffer content at all salinities. Speciation models may be helpful here, but are still at an early stage of development (e.g., Clegg et al. 2022). Collectively, these steps would allow calibrated pH T values of known uncertainty to be assigned to the widely distributed CO 2 -in-seawater RMs at specified temperatures so that they can be used to demonstrate both accuracy and precision, thus enabling better assignment of uncertainties to sample analyses conducted at sea. Currently, these RMs do not have assigned pH T values and can only be used to assess day-to-day consistency. (Note: we use the general term RM, avoiding the commonly used acronym "CRM" because the RMs are not traceable to a primary standard and therefore do not technically qualify as "certified reference materials.")
HPLC is currently the standard method for verifying the purity of indicator dyes, but it requires expensive equipment and expertise that is not always available to research teams. Similarly, purified mCP indicator dye is expensive and not commercially available. Given these challenges, Douglas and Byrne (2017) developed a simple spectrophotometric method by which the pH T determined using an impure indicator dye can be corrected to that which would be obtained with a purified indicator dye, but questions remain regarding the accuracy of corrections derived from this approach (Supplementary Text S4). It has been suggested that this method could instead provide means to assess indicator dye purity (Takeshita et al. 2021). Given the outstanding questions for, and lack of uniformity in, purification approaches, OCSIF advocates for more work to develop SOPs for verifying indicator dye purity and to improve the accessibility of high-quality pH T measurements.
Commercially available RMs exist for both spectrophotometer wavelength accuracy and absorbance, and day-to-day system consistency can be tracked using routine measurements of either seawater RMs or Tris buffer (Paulsen and Dickson 2020). However, given that pH T values of Tris buffers are strongly temperature sensitive and the pH T values of RMs are uncalibrated and sensitive to gas exchange (i.e., sample handling), strong buffers that are only weakly temperature dependent (e.g., phosphate salts) have also been proposed as a potential approach for checking equipment consistency over time and between groups, but this has not yet been attempted in an inter-laboratory comparison experiment.
We suggest that the pH T measurement temperature might also be rethought. Initially, the optical properties of mCP indicator dye were only well characterized at 25 °C (Clayton and Byrne 1993) so this became the default, but not universally used, temperature for these measurements. However, this is a comparatively high temperature relative to the mean ocean temperature and raising seawater temperature during analysis can lead to the formation of bubbles that can interfere with spectrophotometric measurements. Now that indicator dye characteristics have been extended for a range of temperatures, it may be sensible to measure and report pH T routinely at lower temperatures, for example, at 20 °C (some groups already do this). Woosley (2021) showed that the uncertainties in conversion to in situ pH T tend to be lower the closer the measured temperature is to in situ temperature. As an added advantage for measuring at 20 °C, this would unify the default reporting temperatures for pH T and fCO 2 .
As a summary of several recommendations, there is a general need to update the SOPs for spectrophotometric pH T analyses (SOP 6b in Dickson et al. 2007). The updated SOP should include guidance on the purity specifications for the indicator dye, preparation and storage of indicator dye solutions, spectrophotometer specifications and assessment, use of automated systems (e.g., Carter et al. 2013), sample handling and storage practices, data processing (i.e., calculation of pH T from absorbance measurements and indicator dye perturbation adjustments), the use of unpurified indicator dyes, and data quality control (QC) procedures. The uncertainties related to CO 2 exchange with samples that are suggested by the results of the inter-laboratory comparison experiments imply that further effort should be devoted specifically to developing and testing SOPs that minimize the opportunity for gas exchange.

Seawater alkalinity (A T )
Seawater A T is operationally defined as the excess of proton acceptors over proton donors. A T is a conservative quantity with respect to water mass mixing and does not change with temperature or pressure. Uniquely among the measurable carbonate chemistry variables, it is unaffected by the air-sea exchange of CO 2 (see Supplementary Text S1 and Dickson (1981) for a formal definition of total titration seawater A T ). It is measured by adding a solution of known HCl content to a sample, monitoring pH throughout the addition, and analyzing the response to infer the A T . Within the oceanographic community, titration approaches vary between open and closed cells and single-step and multi-step titrations. There are also a variety of approaches for analyzing the pH response during titration (see Sharp and Byrne 2020).
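One of the simpler ways to analyze a titration response is a Gran-type linearization, sketched below under strong simplifying assumptions. Past the equivalence point the quantity F = (m 0 + m) × 10^(−pH) is treated as linear in the titrant mass m, so its x-intercept gives the equivalence mass, and A T follows from the acid content and sample mass. The sketch treats pH as a free H + content and ignores bisulfate and hydrogen fluoride formation, so it is an illustration of the idea rather than the procedure of any SOP. The function name and all inputs are hypothetical.

```python
def gran_alkalinity(sample_mass, acid_content, titrant_masses, ph_values):
    """Estimate A_T (mol/kg) from titration points past the equivalence point.

    sample_mass and titrant_masses are in kg, acid_content is in mol per kg
    of titrant, and ph_values are -log10 of the H+ content (mol/kg) in the
    mixture. Assumes excess acid is the only source of H+ (no HSO4-/HF).
    """
    gran = [(sample_mass + m) * 10.0 ** -ph
            for m, ph in zip(titrant_masses, ph_values)]
    n = len(gran)
    mean_m = sum(titrant_masses) / n
    mean_f = sum(gran) / n
    sxx = sum((m - mean_m) ** 2 for m in titrant_masses)
    sxy = sum((m - mean_m) * (f - mean_f)
              for m, f in zip(titrant_masses, gran))
    slope = sxy / sxx
    intercept = mean_f - slope * mean_m
    equivalence_mass = -intercept / slope  # x-intercept of the Gran line
    return acid_content * equivalence_mass / sample_mass


if __name__ == "__main__":
    import math
    # Synthetic, idealized titration of a hypothetical 0.1 kg sample.
    m0, acid, true_at = 0.1, 0.1, 2300e-6
    masses = [0.0026, 0.0028, 0.0030, 0.0032, 0.0034]
    ph = [-math.log10((acid * m - true_at * m0) / (m0 + m)) for m in masses]
    print(gran_alkalinity(m0, acid, masses, ph))
```

Because the fit uses only points at low pH, any unidentified proton acceptors that react over the titrated pH range would shift the intercept, which is one reason the choice of analysis approach affects the measured A T .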

A T uncertainties
Bockmon and Dickson (2015) showed that A T measurements of a single seawater batch had less variability between laboratories than measurements of the other carbonate chemistry variables. Importantly, however, they also note that only 20% of laboratories returned values within the "climate quality" limits of ≈ ±2 μmol kg⁻¹ A T content (Newton et al. 2015) of the reference laboratory measurements.
Despite A T titrations being an established method with an SOP that has not much evolved in 15 years (Dickson et al. 2007), a RM, and a record of comparative consistency between laboratories, seawater A T measurements are a current focal point for carbonate chemistry inter-consistency discussions due to growing attention to the impacts on A T measurements of dissolved alkaline organic chemical species and other unidentified alkaline chemical species prevalent in coastal waters and potentially in open ocean seawater (Cai et al. 1998; Kim et al. 2006; Hernández-Ayon et al. 2007; Muller and Bleie 2008; Kim and Lee 2009; Kuliński et al. 2014; Yang et al. 2015; Ko et al. 2016; Fong and Dickson 2019; Song et al. 2020; Kerr et al. 2021). It has been long understood that organic species or other unidentified proton acceptors in seawater contribute to A T when they participate in proton exchange reactions during the course of an A T titration from seawater conditions (pH T ≈ 7.4-8.2) down to the titration endpoints (commonly pH T ≈ 3.0-4.5) (Brewer et al. 1986; Cai et al. 1998). While the operational definition of A T is referenced to a pH T of 4.5, many common approaches continue to add acid below this pH T and fit the full pH T response to acid additions to improve measurement precision. The impact of unidentified A T on the measured A T varies somewhat with the pH range titrated and the approach used to calculate the A T (Sharp and Byrne 2020). Accounting for unidentified A T in carbonate chemistry calculations requires knowing the total substance contents of all of the acid-base pairs as well as their acid dissociation behaviors and how these behaviors vary with solution chemistry (Ulfsbo et al. 2015; Sharp and Byrne 2020). However, unlike the other non-carbonate contributions to A T (Fig. 1), the contribution to A T from organic bases and other unidentified acid-base systems in seawater is neither routinely measured nor able to be estimated from seawater S P . The acid dissociation behaviors of the unidentified chemical species are also unknown. These uncertainties in the unidentified A T therefore become uncertainties in calculated A T and in carbonate chemistry variables calculated from A T .
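The dependence of the unidentified A T signal on the titrated pH range can be illustrated with a deliberately simplified sketch. It assumes a single monoprotic acid-base pair and simply counts the protons that pair accepts between the starting pH and the titration endpoint. It is not the titration model of Sharp and Byrne (2020), and it ignores how curve-fitting approaches process the titration data. The function names are hypothetical; the example content of 10 μmol kg⁻¹ and pK A of 4 mirror the illustrative values discussed later in this section.

```python
def base_fraction(ph, pk_a):
    """Fraction of a monoprotic acid-base pair present as the proton-accepting base."""
    return 1.0 / (1.0 + 10.0 ** (pk_a - ph))


def protons_accepted(total_content, pk_a, ph_start, ph_end):
    """Protons taken up by the pair when titrating from ph_start down to ph_end.

    The result has the same units as total_content (e.g., umol/kg).
    """
    return total_content * (base_fraction(ph_start, pk_a) - base_fraction(ph_end, pk_a))


# Illustrative (assumed) values: 10 umol/kg of an unidentified base with pK_A = 4
single_endpoint = protons_accepted(10.0, 4.0, ph_start=8.0, ph_end=4.5)  # about 2.4
full_titration = protons_accepted(10.0, 4.0, ph_start=8.0, ph_end=3.0)   # about 9.1
print(round(single_endpoint, 2), round(full_titration, 2))
```

Under these assumptions, a titration stopped near pH 4.5 registers only about a quarter of the unidentified base, whereas a titration continued to pH 3 registers most of it, which is the qualitative reason the endpoint choice matters.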
Within the literature, unknown contributions to A T are often referred to simplistically as the "organic component of A T " because organic chemical species often have pK A values in the range of the seawater A T titration and are found in measurable amounts in seawater systems, particularly in coastal environments and inland water bodies (Cai et al. 1998; Hansell et al. 2009). We use the term "unidentified A T " to reflect that these chemical species are, as yet, poorly understood and may include an inorganic component. Nevertheless, it seems likely that the majority of unidentified A T comes from organic chemical species and/or uncertainty in total boron content calculations from S P (Sharp and Byrne 2021).
There are two main approaches to quantifying unidentified A T contributions. First, the influence of unidentified A T can be measured using "back titration" approaches (Cai et al. 1998; Hernández-Ayon et al. 2007; Muller and Bleie 2008; Yang et al. 2015; Song et al. 2020), which acidify and sparge the sample with CO 2 -free air to remove all carbonate A T , return the seawater to the original pH, and then titrate the sample as usual. However, these analyses are work-intensive, require specialized setups, are difficult to perform with the precision required to constrain the modest unidentified A T contents believed to reside in the open ocean (by contrast, these methods can have a much better signal-to-uncertainty ratio in the predominantly coastal environments with large unidentified A T contents), and are sensitive to uncertainties in the other known noncarbonate chemical species that also participate in acid-base reactions during the back titrations, borate/boric acid in particular (Sharp and Byrne 2021). Second, some have relied on over-determining the carbonate chemistry in seawater and then calculating the unidentified A T as the difference between the measured and calculated A T . These estimates are of course complicated by the numerous other uncertainties that limit carbonate chemistry inter-consistency (see Fong and Dickson 2019; Álvarez et al. 2020). With the second approach, researchers have estimated the likely impact of unidentified A T on A T measurements in the open ocean as ranging between ≈ 2 and 10 μmol kg⁻¹ (Millero et al. 2002; Patsavas et al. 2015a; Fong and Dickson 2019).
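The second (over-determination) approach amounts to a simple difference, and a short sketch shows why its signal is hard to resolve in the open ocean. The function name and the numerical values are hypothetical, and combining the two uncertainties in quadrature assumes they are independent, which may not hold in practice.

```python
import math


def unidentified_alkalinity(at_measured, at_calculated, u_measured, u_calculated):
    """Return (excess A_T, combined standard uncertainty), all in umol/kg.

    Assumes the uncertainties in measured and calculated A_T are independent.
    """
    excess = at_measured - at_calculated
    combined_uncertainty = math.hypot(u_measured, u_calculated)
    return excess, combined_uncertainty


# Hypothetical example: measured A_T vs. A_T calculated from another measured pair
excess, uncertainty = unidentified_alkalinity(2300.0, 2296.0, 2.0, 4.0)
print(f"excess A_T = {excess:.1f} +/- {uncertainty:.1f} umol/kg")
```

In this made-up case the apparent excess is smaller than its combined uncertainty, so it could not be distinguished from zero without many replicate samples or smaller input uncertainties.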
The use of RMs has certainly enhanced the lab-to-lab intercomparability of the global oceanographic A T record (Dickson et al. 2003). However, it remains an open question whether RMs have specifically mitigated the impacts of unidentified A T on this record because it is unknown whether unidentified A T is uniform in the open ocean and to what degree it is present in RMs (Sharp and Byrne 2021). Also, it is neither recommended nor universal that research groups adjust their measurements to reflect offsets between their measurements of RMs and the assigned RM values (see Supplementary Data S1). We note that there is some evidence that research groups that measure A T using endpoint titrations tend to generate lower A T measurements than groups that use equivalence point titrations, as has been seen in inter-laboratory comparison exercises (Bockmon and Dickson 2015) and data product inter-consistency exercises (Olsen et al. 2019). To quantify the latter, the average adjustment applied during GLODAPv2.2022 secondary QC for the 37 cruises with adjusted A T that used a single-endpoint A T titration (to pH ≈ 4.5) is 3.5 μmol kg⁻¹ greater (upward) than the average adjustment applied to the 19 adjusted cruises that used a full titration to a lower (≈ 3) pH. This pattern is consistent with expectations if there is some unidentified A T with a pK A near or between the pH values of the titration endpoints in seawater (Sharp and Byrne 2020), which is superimposed against large cruise-to-cruise variability from other sources of uncertainty. Multiple groups have attempted to quantify the unidentified A T in RMs composed of seawater collected off of the coast of Southern California (Dickson et al. 2003) and found values ranging from −5 to 20 μmol kg⁻¹ (Sharp and Byrne 2021). Negative unidentified A T values are not physically possible and are attributed to the large uncertainties in the approaches for quantifying unidentified A T . Recently, Hunt (2021) used two titration endpoints, each consistent with common A T measurement approaches within the community, and found differences in A T measurements on seawater RMs that were greater than 8 μmol kg⁻¹, suggesting the impact of unidentified A T on measured A T is meaningful and meaningfully different between different titration approaches even for RMs.
Unidentified A T has long been assumed to be a small contribution to A T in the open ocean, yet, given the large uncertainties in organic contents and pK A values, it could have a meaningful impact on carbonate chemistry calculations (Fong and Dickson 2019). Furthermore, it could explain different patterns in inter-consistency (here represented as measured pH T − calculated pH T ) observed in the measurements from research groups using A T titration approaches with different endpoints. To support this claim, we use code produced and distributed by Sharp and Byrne (2020) that simulates the impact of alkaline organic chemical species on A T titrations using a variety of titration approaches. We use this model to explore the potential impact of organic A T and methodological variations on the inter-consistency of all cruises with overdetermined carbonate chemistry measurements within GLODAPv2.2022. For this comparison, we also rely on a new metadata product that tracks the methods used for carbonate chemistry measurements on cruises within GLODAPv2.2022 with overdetermined carbonate chemistry systems (provided as Supplementary Data S1). We further use the conceptual framework of Fong and Dickson (2019) whereby the carbonate chemistry inter-consistency of an entire research cruise is quantified by plotting the mean disagreement between measured pH T and pH T calculated (here: both at in situ conditions) from A T and C T vs. the slope of a line (e.g., the black line in Fig. 2) fit between such offsets and the measured pH T (plotted for many cruises in Fig. 3). Ideally, all cruises would fall upon the origin of Fig. 3, suggesting that the measurements and the calculations are consistent in their means and that the average consistencies do not vary as the carbonate chemistry varies. Instead, it can be seen first that the various overdetermined cruises in GLODAPv2.2022 fall into two general populations: a large number of cruises measured globally using methods that conduct full titrations of seawater to low pH (≈ 3, including those labeled herein as "closed" cell nonlinear fit and "open" cell Gran fit titrations) and another large population of cruises with A T measured by titrating only to the approximate endpoint of the A T at a pH of ≈ 4.5 ("single-endpoint" titrations). The second population is primarily, but not entirely, collected in the northwest Pacific. These populations have distinct mean values when plotted using the conceptual framework of Fong and Dickson (2019), and the calculations of Sharp and Byrne (2020) show that these populations could be meaningfully more coherent if we were to remove the estimated impacts of 10 μmol kg⁻¹ of unidentified A T with a pK A of 4 (i.e., moving the means according to the arrows in Fig. 3). Fong and Dickson (2019) showed that many sources of uncertainty can effectively "move" cruises on this figure, but very few sources of uncertainty can move cruises relative to one another, and thus bring collections of cruises closer together. While these assumptions regarding organic A T are plausible, they are also ad hoc and intended only to motivate additional research into this topic. We do not propose any adjustments to data based on these calculations. We also caution that the more focused geographic range of the "single-endpoint" titration cruises could perhaps result in a difference in the compositions of the water considered, on average, by these two populations of measurements.
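The two per-cruise summary statistics used in this framework (the mean offset and the slope of the offset against measured pH T ) can be sketched as follows. This is a minimal ordinary least-squares illustration with a hypothetical function name and made-up input values; it is not the code used to produce Fig. 3 and it omits any weighting or quality control.

```python
def cruise_consistency(ph_measured, ph_calculated):
    """Return (mean offset, slope of offset vs. measured pH) for one cruise.

    Offset is defined as measured pH_T minus calculated pH_T. The slope is an
    ordinary least-squares fit of the offsets against the measured pH_T.
    """
    n = len(ph_measured)
    offsets = [m - c for m, c in zip(ph_measured, ph_calculated)]
    mean_ph = sum(ph_measured) / n
    mean_offset = sum(offsets) / n
    sxx = sum((x - mean_ph) ** 2 for x in ph_measured)
    sxy = sum((x - mean_ph) * (d - mean_offset) for x, d in zip(ph_measured, offsets))
    return mean_offset, sxy / sxx


# Made-up values for illustration only
measured = [7.60, 7.80, 8.00]
calculated = [7.60, 7.79, 7.98]
mean_offset, slope = cruise_consistency(measured, calculated)
print(f"mean offset = {mean_offset:.3f}, slope = {slope:.3f}")
```

A cruise whose measurements and calculations agreed perfectly would return zero for both statistics, which corresponds to the origin of the figure described above.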

A T recommendations
As the previous discussion shows, further research into the contents, compositions, chemical properties (e.g., pK A values), and biogeochemical behaviors of organic and other unidentified A T constituents is urgently needed. This source of uncertainty is currently the largest "known unknown" for carbonate chemistry inter-consistency.
The accessibility of the measurement techniques outlined by SOPs for A T and other carbonate chemistry variables is an important concern. Huang et al. (2012) and Mos et al. (2021) examined alternative A T sample preservation strategies besides the SOP approach of storage in comparatively expensive and difficult-to-ship Pyrex Corning borosilicate glass with a linear coefficient of expansion of 32.5 × 10⁻⁷ K⁻¹ (Dickson et al. 2007). This study confirmed that some deviations from SOPs can result in significant A T (and other variable) variability during storage and that following SOPs regarding HgCl 2 addition might not eliminate all biological activity for some coastal samples. However, biological activity is not thought to be a significant issue for A T samples collected according to SOPs and measured within ≈ 1 d of collection, which represents the majority of open ocean A T measurements.
Many laboratories currently purchase pre-prepared and calibrated acid titrant (HCl) from the Dickson Laboratory. However, the COVID-19 pandemic highlighted that it is desirable to have multiple laboratories producing critical reagents for carbonate chemistry analyses. The preparation of HCl for use as an A T titrant is simple, aside perhaps from the complication that it is important to prepare a solution with an ionic strength that approximately matches that of the target seawater. However, climate quality A T measurements require that the HCl content of the titrant be well-constrained and consistent over time. SOPs for calibrating A T titrant and for checking consistency of the titrant over time would therefore be beneficial. An approach for acid calibration is given in the appendix from Paulsen and Dickson (2020), though in practice some laboratories currently prepare their own acid titrant and then calibrate it using measurements of RMs. SOPs for this practice would therefore be helpful as well.
As RMs are expensive and it is economical to use RMs for A T after they have been used for a C T or pH T analysis, a common question is "how long are RMs stable for A T after they have been opened, provided care is taken to eliminate evaporation?" Unpublished work from multiple OCSIF groups has produced evidence of contamination of carbonate chemistry measurement systems with HgCl 2 -resistant microorganisms. These organisms are believed to be responsible for generating measurable deviations in A T after 1-2 d from first analysis except when care is taken to routinely replace all components of an analytical system that come into contact with seawater. This is mentioned only as a concern for A T because RMs become unsuitable for analysis for pH T or C T much more rapidly after opening the sample due to potential gas exchange.

Total dissolved inorganic carbon (C T )
Seawater C T is the sum of all dissolved inorganic carbon species contents: [CO 2(aq) ], [H 2 CO 3 ], [HCO₃⁻], and [CO₃²⁻]. C T is most frequently measured by acidifying a seawater sample and quantifying the CO 2 that exits the sample under sparging using coulometric, infrared, or cavity ring detectors (e.g., Goyet and Snover 1993; Johnson et al. 1993; Smith et al. 2017).

C T uncertainties
C T content measurements are believed to have fewer complications than measurements of other carbonate chemistry variables because C T is physically and unambiguously defined, is a quasi-conservative seawater property that does not change with temperature or pressure, has an associated RM, and has been measured using similar SOPs for many decades.
C T measurements also tend to show strong consistency between research groups, and Bockmon and Dickson (2015) found that more than a quarter of laboratories participating in an inter-laboratory comparison exercise managed to match the reference laboratory measurements to within ±2 μmol kg⁻¹ (i.e., the "climate quality" threshold of Newton et al. 2015) for the subset of measurements of samples that were equilibrated with atmospheric fCO 2 at the time of bottling. However, Bockmon and Dickson (2015) and the unpublished follow-on results from 2017 showed that many laboratories return C T content measurements that are lower than the reference laboratory measurements for samples with high fCO 2 , such that the average disagreement (participating laboratory minus reference laboratory) for all laboratories was −4.95 μmol kg⁻¹ in 2015. As with the similar comment made for pH T , this is consistent with a loss of CO 2 during sample handling and suggests that sample handling remains an issue for some laboratories. Given this apparent loss of CO 2 in high-fCO 2 samples, it is prudent to evaluate the potential for CO 2 loss during routine C T analysis.
There are three main opportunities for CO 2 loss from a high-fCO 2 seawater sample (and we note that fCO 2 is frequently elevated after cold deep samples are brought to the surface and warmed to laboratory conditions) during a standard C T shipboard analysis: (1) while the sample is in the rosette bottle and being transferred to the sample bottle, (2) while the sample is in the sample bottle, and (3) after the sample has been opened but before it has been taken into the stripping chamber. In Supplementary Text S5, we investigate these opportunities for CO 2 loss in the C T analysis SOPs (Dickson et al. 2007) by comparing measurements of many samples collected in sequence from a single deep (i.e., high-fCO 2 ) rosette bottle over 15 min and by comparing samples collected according to SOPs to samples collected in syringes with and without headspace. We do not find strong evidence to suggest that (1) creates a significant problem provided the sample is taken within the first several liters of seawater removed from the rosette bottle, though we do see evidence of modest C T loss as samples are drawn in sequence over multiple minutes and clear C T loss if the sample is among the last seawater collected from the bottle. A stronger and statistically robust statement is impossible without more replicates of this experiment. Regarding (2) and (3), we find that the syringes without headspace contained more C T than bottle samples with headspace when the seawater fCO 2 greatly exceeded atmospheric values, and that the difference agreed with what would be expected from an initially ≈ 1-2% headspace which collapses and increases its pressure as the seawater warms and expands (Supplementary Fig. S2). Repeating the experiment with a 2.5% headspace intentionally added to the syringes led to the syringes approximately agreeing with the bottles (within uncertainty after being adjusted for HgCl 2 dilution, Supplementary Fig. S3), implying that the source of the disagreement is related to sample storage (i.e., (2) above, rather than (3)). The offset between syringe and bottle measurements grew to > 2 μmol kg⁻¹ for the seawater samples with the highest fCO 2 .
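The order of magnitude of C T loss to a bottle headspace can be checked with a first-order ideal gas estimate. This sketch is not the calculation presented in Supplementary Text S5. It assumes that fugacity can be treated as partial pressure, that the headspace volume and pressure stay fixed (the collapse and pressurization of the headspace during warming are ignored), and that the seawater fCO 2 is not appreciably lowered by the transfer. The function name, default density, and example values are all assumptions.

```python
R_L_ATM = 0.082057  # ideal gas constant in L atm mol^-1 K^-1


def ct_loss_to_headspace(fco2_seawater_uatm, fco2_headspace_uatm,
                         headspace_fraction, temperature_k,
                         density_kg_per_l=1.025):
    """Approximate C_T lost to the headspace, in umol per kg of seawater.

    headspace_fraction is headspace volume divided by seawater volume.
    The headspace starts at fco2_headspace_uatm and is assumed to end at
    the seawater fCO2; ideal gas behavior is assumed throughout.
    """
    headspace_l_per_kg = headspace_fraction / density_kg_per_l
    delta_uatm = fco2_seawater_uatm - fco2_headspace_uatm
    # uatm * L / (L atm mol^-1 K^-1 * K) gives umol
    return delta_uatm * headspace_l_per_kg / (R_L_ATM * temperature_k)


# Assumed example: 2% headspace, seawater at 2000 uatm, air at 420 uatm, 25 degC
loss = ct_loss_to_headspace(2000.0, 420.0, 0.02, 298.15)
print(f"approximate C_T loss: {loss:.2f} umol/kg")
```

Under these assumptions the loss is on the order of 1 μmol kg⁻¹ and scales linearly with both the headspace fraction and the fCO 2 excess over the initial headspace value.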
The implications of this apparent CO 2 loss for carbonate chemistry lab-to-lab inter-comparability and variable-to-variable inter-consistency are unknown at present because many, but not all, pH T and fCO 2 measurements use similar sample storage approaches with headspaces that have been shown to have similar apparent changes during storage (e.g., Carter et al. 2013). Thus, small CO 2 losses in the various measurable carbonate chemistry variables may compensate for one another in inter-consistency comparisons between discrete carbonate chemistry measurements. Also, the consistency of C T measurement practices between groups and over time implies that our findings are not a significant problem for most analyses that have been performed using the existing body of C T measurements made following SOPs: analyses which are primarily aimed at quantifying spatiotemporal changes in inventories. This potential loss of C T to the headspace should nevertheless be borne in mind as a concern for comparisons between discrete measurements made at laboratory conditions and sensors returning measurements in situ, particularly when those sensors are calibrated at depth against pH T measured from bottle samples or calculated from A T and C T . These findings also cannot explain the tendency for CO 2 loss during the intercomparison experiment because the seawater samples analyzed by all groups were stored according to the same methods. This implies that gas exchange opportunity (3) also remains a concern for the comparability of measurements between different laboratories.

C T recommendations
As discussed in the previous section, there are several opportunities for CO 2 loss between sample collection and the end of an analysis. Measurements at sea and from intercomparison experiments suggest that CO 2 loss remains a modest source of uncertainty for C T measurements broadly. We therefore advocate renewed attention to developing and testing sample handling and collection practices to minimize this loss. Routine access to RMs with high C T could help address or better quantify the component of lab-to-lab variability stemming from sample handling during analysis. We further note that the comparatively few samples available for the headspace tests discussed in detail in Supplementary Text S5 were measured on one setup over only two research cruises, and further replication would help confirm whether the signal is robust. There is more diversity in how pH T samples are stored following collection (with various laboratories relying on syringes, cuvettes, or bottles), but we note that CO 2 loss during sample storage could equally be a concern for pH T and discrete fCO 2 measurements.
There is a great need for faster, cheaper, autonomous, pressure-tolerant, small-sample-volume, and reagent-free approaches for measuring C T . These approaches will be especially valuable as CO 2 removal technologies are deployed, and small changes in C T will need to be tracked over large spatial areas to verify the effectiveness of carbon drawdown and sequestration. As new C T measurement approaches are developed to address these needs (e.g., Fassbender et al. 2015; Wang et al. 2015; Steininger et al. 2021; Ringham 2022), it will remain important that the community continues to ensure the new approaches produce comparable results to the historical approaches and that, when improvements are made, it is understood how the improvements have affected the measured values.

Fugacity of CO 2 (fCO 2 )
Seawater fCO 2 is a measure of the effective partial pressure, or fugacity, of CO 2(aq) after accounting for the nonideal behavior of the CO 2 molecule. It is typically measured by equilibrating a gas phase headspace with a seawater sample and then measuring the partial pressure of CO 2 in the headspace. Seawater fCO 2 is unique among the carbonate chemistry variables (perhaps excepting [CO₃²⁻], which is often considered in the context of carbonate mineral super/undersaturation) in that the dominant application of fCO 2 measurements to date, air-sea CO 2 flux calculations, requires precise knowledge of the difference between the measured seawater value and a reference value, specifically the value in the overlying atmosphere.
The dominant modality of fCO 2 measurement is from autonomous or underway sampling systems where abundant seawater is available to pass through equilibration chambers. By contrast, discrete fCO 2 measurement is limited by the amount of water in a sample bottle, is only routinely measured by a small number of research teams, and has had less attention paid to developing and standardizing the methods of measurement. Thus, there are far fewer fCO 2 observations in the GLODAPv2.2022 product (and none of these new additions to the product have yet been subjected to secondary QC due to the lack of historical crossover information).

fCO 2 uncertainties
The focus on air-sea fluxes has led to a disproportionate fraction of fCO 2 measurements being made at or near the ocean surface, with comparatively few measurement groups providing measurements of interior ocean fCO 2 . Thus earlier efforts have examined underway or discrete fCO 2 inter-consistency with discrete samples (Chierici et al. 2004; Ribas-Ribas et al. 2014; Chen et al. 2015; Salt et al. 2016; Woosley et al. 2017; Sulpis et al. 2020; Wanninkhof et al. 2022). Nevertheless, continued work by a small number of laboratories has produced a much greater quantity of discrete interior fCO 2 data over the last two decades than was available for earlier studies examining discrete fCO 2 inter-consistency (Lee et al. 1997, 2000; McElligott et al. 1998; Wanninkhof et al. 1999; Millero et al. 2002; Raimondi et al. 2019). García-Ibáñez et al. (2022) took advantage of the data in the GLODAPv2.2022 data product to assess discrete fCO 2 inter-consistency for > 19,000 fCO 2 seawater samples collected globally from cruises with overdetermined carbonate chemistry. They found that > 94% of the measurement sets were inter-comparable within 3% of the fCO 2 value calculated from pH T and C T , and 88% were within 4% of the value calculated from A T and C T . While García-Ibáñez et al. (2022) concluded that inter-consistency has improved in recent years, they show that the climate goal of 0.5% uncertainty (Newton et al. 2015) is not currently achievable for calculations of fCO 2 from other measured variables when using the adjustment limits of the GLODAPv2 data product (Olsen et al. 2019) as assumed measurement uncertainties. Similarly, they argue that the weather quality goal of 2.5% is only currently achievable with these uncertainties when using the C T and pH T measurement pair. The GLODAPv2 adjustment limits are not equivalent to uncertainty estimates (Lauvset et al. 2022), yet these estimates are not inconsistent with the range of discrepancies reported in inter-laboratory comparison experiments (Bockmon and Dickson 2015).
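Statistics of the form "X% of samples agree within Y%" can be computed as in the sketch below. The function name and sample values are hypothetical, and expressing the difference relative to the calculated value is an assumption based on the wording above rather than a confirmed detail of the García-Ibáñez et al. (2022) analysis.

```python
def fraction_within(measured, calculated, tolerance_percent):
    """Fraction of samples whose measured value lies within tolerance_percent
    of the calculated value (difference expressed relative to the calculated value)."""
    within = 0
    for m, c in zip(measured, calculated):
        if abs(m - c) / c * 100.0 <= tolerance_percent:
            within += 1
    return within / len(measured)


# Made-up fCO2 values in uatm, for illustration only
measured_fco2 = [400.0, 410.0, 800.0, 1000.0]
calculated_fco2 = [400.0, 400.0, 840.0, 1000.0]
print(fraction_within(measured_fco2, calculated_fco2, 3.0))
```

The same function can be applied with a 0.5% or 2.5% tolerance to compare a data set against the climate and weather quality goals mentioned above.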
OCSIF echoes suggestions by García-Ibáñez et al. (2022) that efforts should be directed at generating additional discrete fCO 2 observations to allow inter-comparison to be tested in the modern era with spectrophotometric pH T measurements with purified indicator dyes (not available for most historical discrete fCO 2 comparisons). We reiterate that it could be useful to unify the measurement temperature of discrete seawater pH T and fCO 2 analyses to allow direct inter-comparison without selectively calculating temperature impacts.

[CO₃²⁻] T is a comparatively new addition to the measurable carbonate chemistry variables (Byrne and Yao 2008). As such, it lacks a commonly accepted SOP for the measurement or RM for its calibration, and measurement practices based on spectrophotometric methods have rapidly evolved since its development. Furthermore, equations for converting spectrophotometric absorbances to [CO₃²⁻] T are calibrated using [CO₃²⁻] T values calculated with C T and A T that, by their nature, contain the uncertainties and inconsistencies inherent to carbonate chemistry calculations.
[CO₃²⁻] T measurements in seawater rely upon the complexation of an added lead titrant with dissolved chloride and carbonate and subsequent quantification of that complexation using ultraviolet spectrophotometry (Byrne and Yao 2008). Over the course of methodological development, lead chloride (Easley et al. 2013) and lead perchlorate (Patsavas et al. 2015b) titrant have been tested, remedies for instrumental inconsistencies have been proposed (Sharp et al. 2017), and the range of valid measurement conditions has been widened (Sharp and Byrne 2019) 2022) call for the development of an SOP and RM for these measurements to move toward methodological consistency and ensure broader adoption of this method by the observational and experimental ocean acidification community. We echo this call. Indeed, the spectrophotometric-based method for [CO₃²⁻] T can be adapted to autonomous and unattended systems as was done for pH T (Ma et al. 2019).

Carbonate chemistry constants
Carbonate chemistry constants describe the reactions (and sometimes contents) of various acid-base pairs in seawater, including the carbonate chemistry species (CO 2 , HCO₃⁻, and CO₃²⁻), boron species (boric acid and borate), phosphate species, sulfate species (HSO₄⁻), fluoride (F⁻), and others. These various constants are essential for inter-converting between the measurable and unmeasurable (e.g., saturation states and often the substance contents at in situ conditions) aspects of seawater carbonate chemistry (Fig. 1). The carbonic acid dissociation constants (K 1 , K 2 ) have been variously quantified by numerous research groups over the years for a variety of solutions including natural seawater (e.g., Lueker et al. 2000) and artificial seawater (e.g., Goyet and Poisson 1989) of varying salinities. This research topic remains active. For example, Schockman and Byrne (2021) recently re-measured the product K 1 K 2 using spectrophotometric seawater pH T measurements to benefit from the high precision of spectrophotometric measurements. There are recommendations of several other dissociation constants (K HSO4 , K HF ) and a more in-depth description of these constants given by Woosley (2021).
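For orientation, the stoichiometric carbonic acid dissociation constants referred to here are conventionally written as below. This is the standard textbook form rather than an equation reproduced from this paper; the symbol [CO2*] (the sum of [CO 2(aq) ] and [H 2 CO 3 ]) is introduced here for convenience, and the hydrogen ion content is understood to be expressed on whichever pH scale the constants were determined on. Formal definitions for this paper are given in Supplementary Text S1.

```latex
K_1 = \frac{[\mathrm{H^+}]\,[\mathrm{HCO_3^-}]}{[\mathrm{CO_2^*}]},
\qquad
K_2 = \frac{[\mathrm{H^+}]\,[\mathrm{CO_3^{2-}}]}{[\mathrm{HCO_3^-}]},
\qquad
K_1 K_2 = \frac{[\mathrm{H^+}]^2\,[\mathrm{CO_3^{2-}}]}{[\mathrm{CO_2^*}]}
```

The product K 1 K 2 does not involve the bicarbonate content, so it can be determined separately from either constant alone.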

Carbonate chemistry constant uncertainties
There are uncertainties remaining in the thermodynamic carbonate chemistry constants that have a non-negligible impact on carbonate chemistry calculations, as recently quantified by Orr et al. (2018). We here only briefly discuss uncertainties in several constants that have been much discussed in recent literature including the carbonic acid dissociation constants (K 1 and K 2 ) and the total boron-to-S P ratio (B T /S P ), and we note that the unknown equilibrium constants of minor acid-base systems are also relevant to our earlier discussion of unidentified A T (see Supplementary Text S1 for equations for and definitions of these terms).
Several authors have recently argued for additional research refining carbonate chemistry coefficients, and we echo these calls. For example, Álvarez et al. (2020) and García-Ibáñez et al. (2022) made the case for the need to refine K values, especially K 2 and particularly for high-fCO 2 waters. Sulpis et al. (2020) demonstrated large inter-consistency issues for measurements in low temperature surface seawater and suggested empirical adjustments to K 2 intended to reduce these issues between underway fCO 2 and GLODAP discrete C T and A T . In addition, we note that the pressure and temperature dependencies of the constants have not been quantified or evaluated in many years (Culberson and Pytkowicz 1968), and these relationships are important to re-examine (Raimondi et al. 2019) now that in situ pH T measurements are being calibrated and validated against discrete seawater pH T measurements made at 20-25 °C and standard pressure (101,325 Pa).
Boron participates in seawater acid-base chemistry through the borate/boric acid acid-base pair, and uncertainties in the total boron content of seawater B T , which is typically estimated from S P , contribute to uncertainties in carbonate chemistry calculations. To date, there have been two independent determinations of the total B T /S P ratio, both using the spectrophotometric curcumin method (Uppström 1974; Lee et al. 2010). Lee et al. (2010) improved on the original method of Uppström (1974) and their group has demonstrated that the B T /S P ratio is remarkably stable in the ocean and somewhat higher than the values of Uppström (1974) even when considering low-salinity samples that are heavily influenced by riverine inputs (Olafsson et al. 2020). Sharp and Byrne (2021) attempted to measure the B T /S P ratio using a back titration method. However, their efforts were hindered by the potential presence of excess A T in their seawater samples, and they further demonstrated that the use of back titration methods to quantify low levels of unidentified A T in open ocean seawater samples will be difficult without a well-constrained B T /S P ratio.
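The practical size of the difference between the two B T /S P determinations can be illustrated with a short sketch. The total boron contents at S P = 35 used below (415.7 and 432.6 μmol kg⁻¹) are the values commonly implemented in carbonate system software for Uppström (1974) and Lee et al. (2010), respectively. They are treated here as assumptions and should be checked against the original publications before use. The dictionary and function names are hypothetical.

```python
# Assumed total boron contents (umol/kg) at practical salinity 35; verify
# against the original publications before relying on these numbers.
BT_AT_S35 = {
    "uppstrom_1974": 415.7,
    "lee_2010": 432.6,
}


def total_boron(practical_salinity, ratio="lee_2010"):
    """Estimate total boron content (umol/kg) by scaling linearly with salinity."""
    return BT_AT_S35[ratio] * practical_salinity / 35.0


salinity = 35.0
difference = total_boron(salinity, "lee_2010") - total_boron(salinity, "uppstrom_1974")
print(f"B_T difference at S_P = {salinity}: {difference:.1f} umol/kg")
```

With these assumed values the two ratios differ by roughly 4% in total boron, which is large compared with the unidentified A T contents that back titration methods attempt to resolve in open ocean seawater.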
It has been suggested that carbonate chemistry coefficients will need to be recalculated if the accepted B T /S P ratio is updated from the earlier values of Uppström (1974), which were in use when most carbonate chemistry constants were originally determined (Orr et al. 2018).However, though the choice of the B T /S P ratio has a significant impact on carbonate chemistry calculations, Woosley (2021) showed that the choice of the ratios used by various investigators when determining the carbonate chemistry constants K 1 and K 2 has a negligible influence on the values of those constants and that the ratio of Lee et al. (2010) produced a better agreement between measured and calculated pH T in two batches of surface seawater of similar compositions.Wanninkhof et al. (2022) found greater inter-consistency between surface fCO 2 measurements and calculations from A T and C T using the values from Uppström (1974), while Schockman and Byrne (2021) suggested the inter-consistency with fCO 2 was better with Uppström (1974) at lower fCO 2 and better with Lee et al.

Carbonate chemistry constant recommendations
The choice of carbonate chemistry equilibrium constants has a significant impact on calculated carbonate chemistry values (Woosley 2021), yet it is difficult to evaluate relative uncertainties in the constant sets because deficiencies in sets of constants could be countered by other deficiencies in our understanding of seawater carbonate chemistry. Raimondi et al. (2019, and several references therein) find that the K 1 and K 2 values of Lueker et al. (2000; for measurements between 2 and 35 °C and S P of 19-43) show a comparatively high degree of inter-consistency for overdetermined measurements when assuming no impacts from unidentified A T and when compared to many alternatives. However, the uncertainties in inter-consistency calculations are large relative to the differences between some constant sets. Furthermore, inter-consistency does not imply accuracy. OCSIF therefore refrains from issuing new recommendations regarding the best choice of constants and instead notes that the subsets of constant sets that have been recommended in recent literature (by, e.g., Jiang et al. 2022; Sutton et al. 2022; Woosley and Moon 2023) are among those that we show in our companion paper to be comparably inter-consistent within uncertainties over the specified S P and T ranges. We further note that research is mixed regarding the optimal set of constants for various conditions (Woosley 2021; García-Ibáñez et al. 2022; Woosley and Moon 2023) and does not always assess the most recently developed constants (e.g., Schockman and Byrne 2021). OCSIF therefore recommends that criteria be established for the selection of ideal carbonate constant sets over given property ranges and that research be continued to compare new and existing constant sets according to these criteria. OCSIF also reiterates the recommendation that uncertainties should be incorporated into any analysis that uses these calculations (see Orr et al. 2018). It is also important that all constants used, including non-carbonate chemistry constants, are clearly stated. The mixed recent findings related to the optimal choice of the B T /S P ratio highlight the need for reduced uncertainties in the B T /S P ratio, ideally obtained using an independent method. We also echo the call by García-Ibáñez et al. (2022) and others to refine K 2 in particular to improve fCO 2 inter-consistency.
Modern understanding of seawater carbonate chemistry comes primarily from measurements that yield key quantities such as the equilibrium constants K 1 and K 2 as functions of S P and temperature. However, a first-principles understanding of seawater, and of other natural waters containing the salts present in seawater, can be obtained from chemical modeling (Millero and Roy 1997; Pierrot and Millero 2016; Clegg et al. 2022). Such models, which are based upon the calculation of activity coefficients of individual solute species as functions of solution composition, are not yet sophisticated enough to replace empirically derived carbonate chemistry equilibrium constants. However, they have the potential to predict how carbonate chemistry evolves when shifting between various compositions of seawater and also for low-S P environments. These models, especially when integrated with similar models being developed to provide a theoretical grounding for the definition of seawater pH T, will become useful tools as oceanographers and limnologists increasingly focus on the chemistry of highly variable nearshore environments.

Uncertainties
Given the many known and possibly unknown sources of uncertainty for carbonate chemistry inter-consistency, OCSIF recommends: (1) quantifying uncertainties regularly using inter-laboratory comparison studies and over-determined carbonate chemistry measurements, (2) considering these uncertainties during analyses, (3) reporting uncertainties whenever possible, and (4) remaining cognizant that relatively modest well-quantified uncertainty contributions (known unknowns) are not evidence that presently unquantified uncertainty contributions (unknown unknowns) are negligible (Thompson and Ellison 2011). This final caution should be emphasized for any application that combines different variables and uses carbonate chemistry calculations to convert the observations to a single variable type (e.g., Carter et al. 2018; Bushinsky et al. 2019). We next give several concrete examples of recommendations that go beyond individual variables and their measurement practices and that come from this appreciation of the uncertainties that remain within seawater carbonate chemistry.

Data QC and synthesis efforts
OCSIF recommends avoiding the use of carbonate chemistry measurement inter-consistency as a basis for data adjustments for all but the largest inter-consistency disagreements, defined as those that exceed the combined uncertainties of both the measured and calculated values being compared (with measured and calculated value uncertainties to be quantified in the companion paper). More context is supplied for this recommendation in Supplementary Text S6. In light of this, these adjustments have not been used for new cruises added to the GLODAPv2 products since 2020 (Olsen et al. 2020). It is also recommended that any data product that is released with adjusted values should also provide a version without adjustments, or the means to readily generate an adjustment-free product. This will allow users to quickly quantify the impact of the proposed adjustments on their analyses and thereby gain a greater understanding of the full set of uncertainties in the collected measurements.
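The threshold described above can be sketched as a simple check. This is a minimal illustration, not the procedure used by any data product: it assumes the measured and calculated uncertainties are independent standard uncertainties that combine in quadrature, the function name and the coverage factor `k` are hypothetical, and the numbers in the usage lines are invented for illustration.

```python
import math


def exceeds_combined_uncertainty(measured, calculated,
                                 u_measured, u_calculated, k=1.0):
    """Return True if |measured - calculated| exceeds k times the
    combined uncertainty.

    Assumption of this sketch: the two uncertainties are independent
    standard uncertainties, so they are combined in quadrature.
    """
    combined = math.hypot(u_measured, u_calculated)
    return abs(measured - calculated) > k * combined


# Invented pH_T values and uncertainties, for illustration only.
small_gap = exceeds_combined_uncertainty(8.050, 8.040, 0.010, 0.015)
large_gap = exceeds_combined_uncertainty(8.050, 8.010, 0.010, 0.015)
print(small_gap, large_gap)
```

With these invented numbers the combined uncertainty is about 0.018, so only the second disagreement would qualify as large under the stated definition.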

Adjustments applied to data prior to calculation of other parameters
Given our lack of understanding of the reasons behind offsets between measured and calculated values (e.g., Figs. 2, 3), OCSIF currently recommends against the use of adjustments intended to counter these offsets when using measured constraints to calculate other carbonate chemistry parameters. OCSIF instead recommends that uncertainties revealed by the offsets be included in the uncertainty estimation strategy employed for the calculated values (see Supplementary Text S7 for an example and additional reasoning for this recommendation). Furthermore, as is currently standard practice among most data-providing communities, if calculated or otherwise derived values are provided, then all information needed to reproduce these values should be given, including all adjustments and carbonate chemistry constant assumptions.

RMs for CO 2 -in-seawater measurements
For reasons detailed in Supplementary Text S8, OCSIF members joined the call, issued in Spring 2022 by the International Ocean Carbon Coordination Project, for diversification of seawater carbonate chemistry RM production. Key challenges for new RM programs are that the RMs must have a known and well-quantified uncertainty in their assigned values, RM stability must be demonstrated for a known and specified shelf life, and novel cross-calibration measures between RM production centers would need to be developed and implemented. Quantifying the uncertainty, ensuring the values are consistent between production centers, and verifying RM stability are all time- and labor-intensive steps, so it is likely that additional dedicated funding will be needed to develop and sustain these programs. Some additional RM-related recommendations are also provided in Supplementary Text S8.

Coastal ocean measurement SOP development
For the reasons given in Supplementary Text S9, OCSIF recommends coastal analytical carbonate chemistry be a focus for future research and intends to make coastal issues a priority in future discussions.General recommendations for progress toward addressing carbonate system uncertainties in the coastal ocean include: isolating and identifying contributors to unidentified A T in coastal environments and describing their proton binding behavior; investigating the effects of sample collection and treatment techniques (e.g., filtration, preservation with HgCl 2 , indicator dye addition for pH T measurement) on coastal seawater carbonate chemistry; providing tools to understand and compute carbonate chemistry speciation in anoxic environments; and defining carbonate chemistry constant values in environments that differ significantly (in salinity or composition) from the open ocean.

Measurement documentation and SOPs
Understanding and quantifying uncertainties begins with clear documentation of measurement methods and use of SOPs. SOPs have long been developed for the four main measurement variables (Dickson et al. 2007), but for some variables (notably pH T) the methods have evolved significantly since their last updates and should be revisited. For others (A T), new methods have been developed and their comparability to the standard method may need to be assessed and documented. Thus, we recommend always providing clear documentation of exact methods used and estimates of uncertainties where possible. Method details important to document include the instrumentation used, calibration methods, indicator calibration equations, exact methods of end point detection, and adherence to or deviation from SOPs. For spectrophotometric measurements, archiving of raw absorbances is important because it allows for recalculation if improved indicator dye calibrations become available. We also strongly encourage use of FAIR (findable, accessible, interoperable, and reusable) data principles (Wilkinson et al. 2016).
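The value of archiving raw absorbances can be illustrated with a short sketch. It assumes the general form used for sulfonephthalein indicator dyes, in which pH T is computed from a baseline-corrected absorbance ratio R and a set of dye calibration coefficients. Both calibration sets below are hypothetical placeholders (they are not published values), and the function names and absorbance values are invented for this example.

```python
import math

# Two HYPOTHETICAL calibration sets; every coefficient is an illustrative
# placeholder, not a published indicator dye calibration.
CALIBRATIONS = {
    "original": {"pk": 8.000, "e1": 0.0069, "e2": 2.222, "e3": 0.133},
    "revised": {"pk": 8.005, "e1": 0.0070, "e2": 2.220, "e3": 0.134},
}


def absorbance_ratio(a_acid, a_base, a_ref):
    """Baseline-corrected ratio of absorbances at the base-form and
    acid-form peak wavelengths, using a non-absorbing reference wavelength."""
    return (a_base - a_ref) / (a_acid - a_ref)


def ph_from_ratio(r, cal):
    """pH from an absorbance ratio: pK + log10((R - e1) / (e2 - R * e3))."""
    return cal["pk"] + math.log10((r - cal["e1"]) / (cal["e2"] - r * cal["e3"]))


# Invented archived absorbances for one sample.
archived = {"a_acid": 0.40, "a_base": 0.60, "a_ref": 0.00}
r = absorbance_ratio(**archived)

# The archived absorbances are reused unchanged with each calibration.
ph_by_calibration = {name: ph_from_ratio(r, cal)
                     for name, cal in CALIBRATIONS.items()}
for name, ph in ph_by_calibration.items():
    print(f"{name}: pH_T = {ph:.4f}")
```

If only the final pH T had been archived, applying the revised coefficients would require back-calculating R from the reported value, which is possible only when the original calibration was documented exactly.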

Inter-laboratory consistency assessments
Inter-laboratory comparison experiments provide one of the most insightful ways for individual laboratories to gauge how their measurements compare to the measurements produced by the community, and for the community to gauge the consistency of the measurements that are available for global analyses.We contend that these checks should be performed regularly with broad participation.
We summarize our recommendations in Table 2.

Fig. 3. The overdetermined cruises in a variant of the GLODAPv2.2022 product (without any adjustments applied) with at least A T, C T, and pH T measurements plotted using the conceptual framework introduced by Fong and Dickson (2019). The y-axis shows the average disagreement between measured and calculated pH T (at in situ conditions) for collections of data. The x-axis shows the slope of a line fit between this same discrepancy and the measured pH T for each cruise. The area of each open dot is proportional to the number of measurements on the cruise, with the size of the symbols in the legend corresponding to 1250 measurements. The color of the dots indicates the methods used to measure A T on each cruise. The filled dots are the averages of the same quantities for collections of cruises, though the dot size has no special meaning. The arrows indicate how these filled dots would move on the plot if the impact of 10 μmol kg⁻¹ of an unidentified acid-base species with a pK A of 4 on A T were removed. The arrow for the yellow-filled circle is shorter than the other arrows due to the higher pH titration endpoint (Sharp and Byrne 2020).
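The two per-cruise quantities plotted in Fig. 3 can be computed as in the following minimal sketch. It assumes the discrepancy is defined as measured minus calculated pH T (the sign convention is not stated in the caption) and uses an ordinary least-squares line fit; the function name and the synthetic data are hypothetical.

```python
def cruise_summary(ph_measured, ph_calculated):
    """Return (mean discrepancy, slope of discrepancy versus measured pH_T).

    Assumption of this sketch: discrepancy = measured - calculated, and the
    slope comes from an ordinary least-squares fit.
    """
    n = len(ph_measured)
    disc = [m - c for m, c in zip(ph_measured, ph_calculated)]
    mean_x = sum(ph_measured) / n
    mean_d = sum(disc) / n
    sxx = sum((x - mean_x) ** 2 for x in ph_measured)
    sxy = sum((x - mean_x) * (d - mean_d)
              for x, d in zip(ph_measured, disc))
    return mean_d, sxy / sxx


# Synthetic cruise: discrepancy grows linearly with measured pH_T.
measured = [7.6, 7.8, 8.0, 8.2]
calculated = [m - (0.01 + 0.02 * (m - 7.9)) for m in measured]
mean_disc, slope = cruise_summary(measured, calculated)
print(f"mean discrepancy = {mean_disc:.4f}, slope = {slope:.4f}")
```

Each cruise would contribute one (slope, mean discrepancy) point to a plot of this kind.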
Carbonate ion content
The amount of total carbonate ion content in seawater ([CO 3 2−] T) has a strong response to ocean acidification (Feely et al. 2009), and the related carbonate mineral saturation states are thought to be closely linked to certain species outcomes (Doney et al. 2020) and climate feedbacks (Ilyina et al. 2009). [CO 3 2−]

Table 1.
… total boron divided by the salinity, with a ratio value often taken as given by Uppström (1974) or Lee et al. (2010)
… of seawater chemistry that can be used to constrain the substance contents of various dissolved carbonate chemical species. This term is used herein to refer to the limited subset comprised of pH T, C T, A T, fCO 2, …

Table 2. Summary of OCSIF recommendations and areas of needed research. See the text for further justification or clarification.

pH T
  Quantify the bias in pH T from determinations in Tris-buffer solution
  Develop approach for quantifying overall pH T uncertainty
  Revise calibration of pH T scale below 20 S P
  Develop methods and establish SOPs for verifying dye purity and calibrations
  Assign pH T (at specified temperature) to distributed seawater RMs
  Develop temperature-insensitive buffers for use in indicator dye measurement comparisons between laboratories
  Measure and report pH T at 20 °C
  Update existing SOPs for pH T measurements
A T
  Further investigate the composition and chemical properties of unidentified A T
  Develop and validate more transportable, cheaper, and affordable storage solutions for A T samples
  Refine and expand SOPs for quantifying A T acid titrant chemical properties
C T
  Consider CO 2 loss during sample storage when comparing in situ measurements to discrete bottle measurements
  Develop and validate new C T measurement strategies
fCO 2
  Obtain more discrete seawater fCO 2 measurements for inter-comparison calculations
  Report and measure fCO 2 and pH T at the same temperature (recommended: 20 °C)
…
  … into carbonate chemistry calculation uncertainties
  Continue to assess inter-consistency and accuracy of new and published constant sets
  Report all constants used in calculations
  Reduce the uncertainty in the B T /S P ratio
  Reduce uncertainty in K 2
  Assess pressure dependencies of carbonate system constants
  Develop, refine, and validate chemical speciation models for carbonate system calculations in non-traditional seawater compositions
General
  Quantify carbonate chemistry calculation and measurement uncertainties
  Assess the impacts of uncertainties in carbonate chemistry measurements or calculations
  Report uncertainties in carbonate chemistry measurements and calculations
  Quantify all elements of uncertainty
Data adjustments
  Avoid using inter-consistency as the basis for cruise adjustments except when disagreements are large
  Avoid data adjustments to counter apparent offsets between measured and calculated values of unknown origin
  Ensure that offsets of unknown origin are reflected in uncertainty estimations
RMs
  Diversify RM production centers
  Develop a certified seawater pH T RM
  Develop RMs with varied carbonate chemistry compositions
Coastal oceans
  Explore and validate alternative seawater preservation strategies (to HgCl 2) that could improve seawater RM accessibility
  Assess carbonate chemistry uncertainty specifically in coastal environments
  Develop and update SOPs for carbonate chemistry measurements
  Conduct and engage in more inter-laboratory comparison studies