An international intercomparison of stable carbon isotope composition measurements of dissolved inorganic carbon in seawater

We report results of an intercomparison of stable carbon isotope ratio measurements in seawater dissolved inorganic carbon (δ13C-DIC) that involved 16 participating laboratories from various parts of the world. The intercomparison involved distribution of samples of a Certified Reference Material for seawater DIC concentration and alkalinity and a preserved sample of deep seawater collected at 4000 m in the northeastern Atlantic Ocean. The between-lab standard deviation of reported uncorrected values measured with diverse analytical, detection, and calibration methods was 0.11‰ (1σ). The multi-lab average δ13C-DIC value reported for the deep seawater sample was consistent within 0.1‰ with historical measured values for the same water mass. Application of a correction procedure based on a consensus value for the distributed reference material improved the between-lab standard deviation to 0.06‰. The magnitude of the corrections was similar to that of corrections applied to independent data sets using crossover comparisons, where deep water analyses from different cruises are compared at nearby locations. Our results demonstrate that the accuracy/uncertainty target proposed by the Global Ocean Observing System (±0.05‰) is attainable, but only if an aqueous-phase reference material for δ13C-DIC is made available and used by the measurement community. Our results imply that existing Certified Reference Materials used for seawater DIC and alkalinity quality control are suitable for this purpose, if a “Certified” or internally consistent “consensus” value for δ13C-DIC can be assigned to various batches.

Over the past 200 years, the oceanic uptake of fossil-fuel-derived CO2, with depleted values of δ13C, has caused time-dependent depletion of seawater δ13C-DIC. This "13C Suess-effect" signal (Keeling 1979) is a particularly useful tracer for estimation of the anthropogenic carbon (C_ant) accumulation in ocean waters, either through examining the vertical or along-isopycnal distribution of δ13C-DIC or through assessments of the air-sea δ13C-DIC disequilibrium (e.g., Tans et al. 1993; Gruber and Keeling 2001; Körtzinger et al. 2003; Olsen et al. 2006; Quay et al. 2007). Also, δ13C-DIC has been used as a tracer to quantify ocean biological processes such as net community production (e.g., Emerson et al. 1997; Gruber et al. 2002; Quay et al. 2009). Because of its utility as a tracer, δ13C-DIC is listed as an Essential Ocean Variable (EOV) by the Global Ocean Observing System (GOOS). Use of the tracer, however, depends on the ability to compare measurements made in different locations at different times by multiple measurement groups. This requires measurements to be very accurate as well as precise, with between-lab data consistency being particularly important for resolving the small temporal changes associated with the 13C Suess effect. According to δ13C-DIC data compilations (e.g., Schmittner et al. 2013; Eide et al. 2017), δ13C-DIC values in ocean waters typically range from −6.56‰ to +3.10‰, and based on the experience and recommendation of leading researchers, an "accuracy/uncertainty" goal of ±0.05‰ in δ13C-DIC measurements has been called for in the GOOS EOV specification. Although not stated explicitly in the EOV specification, we take this goal to refer to "reproducibility," that is, the closeness of agreement between independent results obtained on identical test material but under different conditions (e.g., different operators, different apparatus, different laboratories, and/or after different intervals of time).
Reproducibility can be estimated from the between-lab standard deviation in collaborative studies (e.g., Belouafa et al. 2017). Unlike the measurement of DIC, for which most measurement groups follow standard operating procedures (SOPs; Dickson et al. 2007), there is currently no SOP for δ13C-DIC analyses. There are also no agreed-upon standardization procedures, and liquid or soluble Certified Reference Materials (CRMs) that can be distributed among measurement groups are not available. A variety of analytical methods are in use, including detection by mass spectrometry and, in recent years, laser-based optical spectroscopy. The result is that the reproducibility of measurements made by different groups worldwide, or by the same group over time, is not well known. Becker et al. (2016) compiled and examined historical data for δ13C-DIC collected from the North Atlantic Ocean over the years 1981-2014 and used "crossover analysis" (Tanhua et al. 2010) of measurements reported from nearby locations at different times to assess offsets between data sets. Offsets between individual data sets ranged from −0.39‰ to +0.17‰, and likely provide a rough estimate of the reproducibility of historical data sets.
A more direct assessment of measurement reproducibility can be derived from intercomparison exercises in which identical replicate samples are sent to multiple labs for analysis (a so-called "ring test" or "round robin test"). Intercomparisons of this type have been conducted for oxygen and hydrogen stable isotope compositions of water (δ2H-H2O and δ18O-H2O; e.g., Walker et al. 2015; Wassenaar et al. 2018; Verma et al. 2018), seawater nutrient concentrations (Aoyama et al. 2016), and seawater DIC and alkalinity concentrations (Bockmon and Dickson 2015; Verma et al. 2015). However, to the authors' knowledge, there has been only one published interlaboratory comparison study of δ13C-DIC measurements on natural waters (van Geldern et al. 2013). In their study, five groups measured a wide variety of natural water samples, including replicate samples of seawater. Results from four groups agreed to within ±0.07‰, with one group's result showing a larger discrepancy (1σ standard deviation of ±0.47‰ when results from all five groups were included).
As has occurred for other stable isotopic systems (e.g., 18O/16O in water; Walker et al. 2015), the introduction of methods based on optical spectroscopy is likely to lead to a rapid increase in the number of groups measuring δ13C-DIC on seawater samples. Given this, and the status of δ13C-DIC as an EOV, there is an urgent need to evaluate the reproducibility and traceability of current measurements as a basis for recommendations concerning future data quality control (QC) and assurance procedures. In this study, we present results of a worldwide seawater δ13C-DIC intercomparison exercise involving 16 participating laboratories. The results are used to assess the likely reproducibility of historical and current data and provide a basis for proposing steps that would lead to future improvements in seawater δ13C-DIC data quality.

Test waters and their suitability
Two supplies of seawater were used for the intercomparison study: (1) "Certified Reference Material for oceanic CO2 measurements (Batch 157)" supplied by the University of California, San Diego, Scripps Institution of Oceanography; (2) samples of deep ocean seawater (DSW) collected in May 2017 from the northeastern basin of the Atlantic Ocean at depths of >4000 m during the 2017 GO-SHIP A02 trans-Atlantic cruise (McGovern et al. 2017; GO-SHIP; http://www.go-ship.org/). One sample of "Certified Reference Material for oceanic CO2 measurements (Batch 157)" and four samples of DSW (two for some groups) were distributed to 16 laboratories in the United States, Canada, Germany, France, Norway, Australia, Japan, and Russia for δ13C-DIC analysis. The participating groups were provided with the batch number of the Certified Reference Material but were not informed of the sampling location and depths at which the DSW was collected.
The Certified Reference Material is certified, prepared, and distributed for the QC and assessment of accuracy of seawater DIC and alkalinity measurements (Dickson et al. 2003; Humphreys et al. 2016). However, it has not been certified for its δ13C-DIC value, so we will refer to it from now on as "RM." The preparation and storage procedures of the RM have been tested extensively for DIC and alkalinity (Dickson et al. 2003), so it can be expected that the RM's δ13C-DIC value should also show high bottle-to-bottle reproducibility within each batch (A.G. Dickson, personal communication, 9 August 2016; Humphreys et al. 2016). On this basis, we regard RM from the same batch contained in different bottles as identical replicate samples suitable for use in a ring test. This RM preparation and preservation procedure is very similar to the sampling procedure for δ13C-DIC recommended by McNichol et al. (2010). Based on the time of bottling, the RM samples were typically stored for 16-22 months prior to analysis.
The DSW samples were collected and stored in accordance with slightly different protocols, as follows: after rinsing precleaned, 160 mL borosilicate serum bottles 3 times, water samples were introduced into the bottles from the bottom through Tygon tubing, allowing for overflow prior to closure. Care was taken to avoid introduction of airborne CO2 during the filling procedure. The bottles were capped immediately with flat butyl septa with PTFE coating, crimped with aluminum seals, and 1.6 mL of the water sample was removed using a syringe and replaced with CO2-free air, which had been passed through a sodium hydroxide trap. Finally, 0.1 mL of saturated mercuric chloride solution was injected into each bottle for preservation of the samples, which were stored in the dark at room temperature (20-23°C) prior to distribution. The storage time for the DSW samples following collection on R.V. Celtic Explorer ranged from 4 to 10 months.
In total, 52 DSW samples were collected from 3 × 10-L Niskin bottles at two nearby stations. Information about these DSW samples is presented in Table 1. Saunders (1986) had previously noted "remarkable uniformity" of the temperature-salinity relationship in waters below ca. 3000 m in the northeastern Atlantic Ocean where "the Mid-Atlantic Ridge, European and African continental rise, Sierra Leone rise and the Rockall Plateau enclose the deep water in the sampling region, permitting significant exchange only south of 15°N." Saunders also noted high uniformity of dissolved oxygen concentrations and used this as the basis for assessment of the accuracy of historical salinity and oxygen measurements. For our purposes, it is sufficient that the 52 samples collected from three separate Niskin bottles are representative of one homogeneous water sample. This is supported strongly by the identical values of salinity, temperature, and dissolved oxygen concentration corresponding to the three Niskin bottles (Table 1). It is also supported by t-tests of the δ13C-DIC results reported by the participating groups, which are discussed below.
The deep water below 4000 m contained near-zero concentrations (0.03 pmol/kg) of the anthropogenic compound CFC-12 (CCl2F2) (T. Tanhua, personal communication, 31 March 2018), and earlier measurements from further south (ca. 38°N; Tanhua et al. 2007) also showed near-zero concentrations of CCl4. The latter is an anthropogenic compound that was introduced into the environment around 1910. Taken together, these findings imply that this deep water has not been impacted significantly by C_ant, has high spatial and temporal uniformity, and has likely been stable in terms of its δ13C-DIC composition for at least hundreds of years. This not only allows us to use the DSW samples for the ring test, but also allows us to compare the δ13C-DIC measurement results from this intercomparison exercise with historical (and future) δ13C-DIC data from the same region and water mass.

Participating laboratories and methods
In most oceanographic and hydrogeological studies, δ13C-DIC of water samples is measured by isotope-ratio mass spectrometry (IRMS) coupled with various front-end peripherals (e.g., Salata et al. 2000; Torres et al. 2005; Assayag et al. 2006; Waldron et al. 2014). In recent years, laser-based optical spectroscopy techniques such as isotope ratio infrared spectrometry (IRIS) and cavity ring-down spectroscopy (CRDS) have also been used for detection (e.g., Bass et al. 2012; Call et al. 2017). A brief summary of the methods used by the participating groups is presented in Appendix S1 (Supporting Information). All the methods for δ13C-DIC measurement applied by the different groups are based on the traditional CO2 conversion technique, in which DIC in seawater is converted to CO2 by adding H3PO4, after which the extracted and equilibrated gaseous CO2 is introduced into a detector for δ13C-CO2 analysis. Several laboratories corrected for the CO2 partitioning between the seawater and headspace during gas extraction and equilibration (e.g., Gillikin and Bouillon 2007). In this study, 14 groups used IRMS systems for δ13C-CO2 analysis, 1 group measured δ13C-CO2 using CRDS, and 1 group used both IRMS and IRIS for δ13C-CO2 determination. Supporting Information Appendix S1 shows that a wide variety of internal reference materials such as NaHCO3, Na2CO3, and so forth, as well as international calibration materials in both solid and gas phase, were used by the different groups to standardize their results to the Vienna Pee Dee Belemnite (VPDB) scale, and also for internal data QC. The reported measurement precisions of participating laboratories ranged from 0.03‰ to 0.40‰ (±1σ).

Results and assessment
The δ13C-DIC results reported by the participating laboratories are shown in Table 2. Both lab 1 and lab 12 reported procedural problems during their analyses (e.g., exposure to laboratory air during sample transfer), so that their results for RM and DSW are likely not representative of their normal operations. The raw (uncorrected) δ13C-DIC results reported by the participating laboratories are plotted in Fig. 1a.
Use of a Shapiro-Wilk test (Shapiro and Wilk 1965) with these results showed that δ13C-DIC values of RM are normally distributed, whereas those of DSW are not (W = 0.90889, p = 0.1121, n = 16 for RM δ13C-DIC results; W = 0.34614, p = 2.218 × 10^−14, n = 56 for DSW δ13C-DIC results). We calculated the first quartile (Q1), third quartile (Q3), and the interquartile range (IQR) for RM and DSW δ13C-DIC results for outlier detection. A single δ13C-DIC value for DSW sample 101107-a determined by lab 1 and two DSW results for samples 101155-a and 101155-b determined by lab 3 lie outside the interval [Q1 − 1.5 IQR, Q3 + 1.5 IQR] and were treated as outliers (Rousseeuw and Hubert 2011). After elimination of these three outliers from the data set, the Shapiro-Wilk test for the DSW results showed that the remaining δ13C-DIC data for DSW are also normally distributed (W = 0.97588, p = 0.3564, n = 53).
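The interquartile-range screening step can be illustrated with a short Python sketch. It covers only the outlier fences, not the Shapiro-Wilk test, and it assumes the "inclusive" quartile convention of the standard library; the text does not state which quartile convention was used, and the input values are hypothetical rather than data from this study.

```python
import statistics

def iqr_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5*IQR rule."""
    # statistics.quantiles with n=4 returns [Q1, median, Q3].
    # The "inclusive" method is an assumption; other conventions exist.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

# Hypothetical delta13C-DIC values in per mill (NOT the study's data).
example = [0.85, 0.88, 0.90, 0.86, 0.91, 0.87, 0.89, 0.35]
lower, upper, flagged = iqr_outliers(example)
print(f"fences: [{lower:.3f}, {upper:.3f}], outliers: {flagged}")
```

Values flagged this way would be removed before the normality test is repeated on the remaining data.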
In order to test whether all DSW samples can be considered representative of the same water, three t-tests were conducted between the δ13C-DIC results of DSW samples taken from the three Niskin bottles. p-Values of 0.29, 0.79, and 0.32, evaluated at the 99% confidence level, indicate that there is no significant difference in the means of δ13C-DIC results of DSW samples taken from the three Niskin bottles. Subsequently, all δ13C-DIC results of DSW taken from the three Niskin bottles will be considered together.
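A pairwise comparison of this kind can be sketched in Python using only the standard library. The sketch below uses Welch's unequal-variance t-test, which is an assumption because the text does not specify the t-test variant, and it obtains the two-sided p-value by numerically integrating the Student's t density. The bottle values are hypothetical.

```python
import math
import statistics

def t_pdf(x, dof):
    """Probability density of Student's t distribution."""
    coeff = math.gamma((dof + 1) / 2) / (
        math.sqrt(dof * math.pi) * math.gamma(dof / 2)
    )
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)

def two_sided_p(t, dof, steps=2000):
    """Two-sided p-value via Simpson integration of the pdf over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_area = total * h / 3
    return max(0.0, 1 - 2 * central_area)

def welch_t_test(a, b):
    """Return (t, degrees of freedom, two-sided p) for samples a and b."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof, two_sided_p(t, dof)

# Hypothetical per-bottle results in per mill (NOT the study's data).
bottle_a = [0.86, 0.88, 0.87, 0.90, 0.85]
bottle_b = [0.87, 0.89, 0.86, 0.88, 0.91]
t_stat, dof, p_value = welch_t_test(bottle_a, bottle_b)
print(f"t = {t_stat:.3f}, dof = {dof:.1f}, p = {p_value:.3f}")
```

A p-value above the chosen significance threshold (0.01 for a 99% confidence level) would give no grounds to treat the two bottles as different waters.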
Statistical properties for the δ13C-DIC results for RM and DSW are reported in Table 3. Figure 1a shows that there are systematic between-lab differences of δ13C_meas values that are reflected in the results of both RM and DSW analyses. This is evident in the significant correlation between residuals (i.e., δ13C_meas − δ13C_ave) of RM with those for DSW, as shown in Fig. 2. Here, δ13C_meas refers to the δ13C-DIC result for a particular sample as reported by an individual participating laboratory and δ13C_ave denotes the average of all δ13C-DIC results for that sample reported by the participating laboratories (i.e., the "all-lab average"). Based on this empirical relationship between residuals for RM and DSW, we corrected reported δ13C-DIC values for DSW samples using the following equation:
$$\delta^{13}\mathrm{C}_{\text{DSW-corr}} = \delta^{13}\mathrm{C}_{\text{DSW-meas}} - \left(\delta^{13}\mathrm{C}_{\text{RM-meas}} - \delta^{13}\mathrm{C}_{\text{RM-ave}}\right) \qquad (1)$$
where δ13C_DSW-corr denotes the corrected DSW δ13C-DIC values, δ13C_DSW-meas is the reported DSW δ13C-DIC result from a participating laboratory, δ13C_RM-meas is the reported RM δ13C-DIC result from the same participating laboratory, and δ13C_RM-ave is the "all-lab average" of RM δ13C-DIC results (i.e., from all participating laboratories).
Table 2. Raw δ13C-DIC results reported by participating laboratories for RM and DSW samples, and corrected DSW δ13C-DIC results, where "corrected" refers to values that have been corrected using Eq. 1 based on RM analyses. All δ13C-DIC results are reported in per mill (‰) vs. VPDB and have been rounded to two decimal places.
Here, we emphasize that although δ13C_RM-ave can be considered as a "consensus" value, it does not represent a "Certified" value, and hence the correction procedure does not necessarily make the results more accurate. The resulting δ13C_DSW-corr values are shown in Table 2 and visualized in Fig. 1b. The correction based on RM results reduces the standard deviation for the DSW δ13C-DIC results from 0.10‰ to 0.06‰, which we take as our estimate of between-lab reproducibility. Furthermore, if we were to remove the results for which analytical problems were reported by lab 1 and lab 12, the average value and standard deviation of all corrected DSW results are +0.88‰ and 0.05‰, respectively (Table 3).
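The effect of an RM-based correction on between-lab scatter can be illustrated with a minimal Python sketch. It assumes the correction subtracts each lab's RM residual (its RM result minus the all-lab RM average) from that lab's DSW result; the lab identifiers and values are hypothetical and are not taken from Table 2.

```python
import statistics

def rm_correct(dsw_meas, rm_meas):
    """Correct each lab's DSW value by that lab's RM residual.

    Both arguments map a lab identifier to a value in per mill.
    Assumed form: corrected = DSW_meas - (RM_meas - RM_ave).
    """
    rm_ave = statistics.mean(list(rm_meas.values()))  # consensus, not certified
    return {lab: dsw_meas[lab] - (rm_meas[lab] - rm_ave) for lab in dsw_meas}

# Hypothetical lab results in per mill (NOT the study's data).
rm = {"A": 1.20, "B": 1.30, "C": 1.10, "D": 1.24}
dsw = {"A": 0.86, "B": 0.97, "C": 0.78, "D": 0.91}

corrected = rm_correct(dsw, rm)
raw_sd = statistics.stdev(list(dsw.values()))
corr_sd = statistics.stdev(list(corrected.values()))
print(f"between-lab SD: raw {raw_sd:.3f}, corrected {corr_sd:.3f}")
```

Because the RM residuals sum to zero across labs, a correction of this form leaves the all-lab DSW average unchanged and only reduces the scatter, which is why it improves consistency without necessarily improving accuracy.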

Comparison of within-lab and between-lab precision with prior estimates
The 16 participating laboratories reported within-lab analytical precision (±1σ) ranging from 0.03‰ to 0.40‰ with a median value of 0.10‰. These levels of precision are comparable with previous reports in the literature that range from 0.03‰ to 0.23‰ (e.g., Olsen et al. 2006; Quay et al. 2007; McNichol et al. 2010; Humphreys et al. 2016). In this study, the between-lab reproducibility (δ13C_stdev) for both RM and DSW, before correction based on RMs but after outlier removal, was 0.11‰. The only previously published assessment by van Geldern et al. (2013) involved only five groups and reported between-lab reproducibility (δ13C_stdev) as low as 0.07‰ for results from four laboratories but reaching 0.47‰ when results from all five laboratories were included. During this ring test, the absolute maximum between-lab differences for uncorrected RM and DSW values were 0.32‰ and 0.48‰, respectively, which is comparable to the typical ±2σ precision for seawater δ13C-DIC measurements in most oceanographic studies (e.g., Humphreys et al. 2016). A report of an unpublished intercomparison exercise of seawater δ13C-DIC conducted in the 1990s stated that "if results from two laboratories were excluded, the remaining 10 laboratories showed between-lab differences up to 0.3 per mill" (http://unesdoc.unesco.org/images/0012/001206/120608Eo.pdf). This is comparable to the maximum between-lab differences in our study and suggests that there may have been little or no improvement in the overall quality of seawater δ13C-DIC data over the past two decades. Significantly, the 0.11‰ between-lab reproducibility is a factor of 2 worse than the target uncertainty level of ±0.05‰ proposed for GOOS, which implies that current data QC procedures, based on individual laboratories' calibration of aqueous samples using solid and/or gas-phase standards, are inadequate and must be improved (for methods, see Supporting Information Appendix S1).

Comparison of deep seawater sample analyses with historical data
The A02 hydrographic section across the North Atlantic Ocean has been occupied several times since the 1990s, and samples from the same locations have been collected and analyzed for δ13C-DIC in several years, including 1994 and 1999 (Koltermann et al. 1996; Schott et al. 1999; Friis et al. 2003; Rhein 2005), and most recently with this GO-SHIP data set collected in 2017. The mean and between-lab standard deviation of the raw (i.e., uncorrected) previously reported deep seawater data collected at approximately the same location and depth as our samples are +1.00‰ and 0.10‰, respectively. After application of secondary QC (2nd QC) procedures and adjustment of the historical data based on "crossover analysis" (Tanhua et al. 2010; Becker et al. 2016), the mean value for deep seawater from this location was reported to be +0.95‰. The 2nd QC procedure is based on the assumption that the ocean's deep water values are not changing, and recommendation of adjustments to data sets requires selection of a reference or "core" cruise (see Becker et al. 2016). In the case of δ13C-DIC, the choice of a "core" or reference data set is generally subjective rather than based on use of a specific calibration or measurement procedure. The level of agreement between our all-lab average of uncorrected values for DSW (+0.87 ± 0.10‰) sampled in 2017 and the average of adjusted historical data from the same location collected on multiple cruises (+0.95‰) also suggests that the reproducibility of δ13C-DIC data using approaches currently employed by experienced laboratories is of order 0.1‰ and a factor of 2 poorer than the GOOS specification.

Effect of sampling, sample storage, and analysis methods
Table 4 shows the average offset of each lab's reported DSW δ13C-DIC results (without RM-based correction) from the all-lab average (+0.87‰). With the exception of lab 13, each lab's offset was smaller than its 2σ within-lab precision, indicating that systematic biases are small relative to measurement precision. We could detect no consistent link or pattern that connected the methodology used by the participating laboratories and the magnitude or sign of their offsets. The general interlaboratory agreement therefore demonstrates that, for a typical seawater sample with DIC concentration of 2050-2200 μM, the use of a wide range of δ13C-DIC determination and standardization methods (e.g., different front-end peripherals; various equilibration times after CO2 extraction; a variety of calibration procedures; different internal reference materials and international calibration materials in solid and gas phase) does not necessarily lead to major interlaboratory differences. Specifically, we note that the average (uncorrected) DSW results from the two labs that used CRDS and IRIS for quantification were both within 0.01‰ of the all-lab average, which was determined largely using IRMS. This suggests that there is no fundamental challenge in terms of accuracy involved with the use of these newer techniques. The intercomparison study did not specifically address the impact of sampling and sample storage on reproducibility; however, as noted above, the RM and DSW samples were subject to different sampling protocols and storage periods. The correlation and similar magnitudes of residuals observed for the two types of samples (Fig. 2) suggest that sampling and sample storage procedures are not a dominant cause of between-lab offsets. However, a controlled study to examine the potential impact of sample collection, storage, and transfer procedures on data quality would be useful to clarify this issue.
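The comparison of each lab's offset with its own 2σ within-lab precision, as described in this section, can be sketched as follows. The function name, lab identifiers, and values are hypothetical illustrations and do not reproduce Table 4.

```python
def flag_biased_labs(lab_means, lab_precisions):
    """Flag labs whose offset from the all-lab average exceeds 2 sigma.

    lab_means: lab id -> mean reported value (per mill).
    lab_precisions: lab id -> reported within-lab precision (1 sigma).
    Returns (all_lab_average, {lab id: offset}) for the flagged labs.
    """
    all_lab_ave = sum(lab_means.values()) / len(lab_means)
    flagged = {}
    for lab, mean in lab_means.items():
        offset = mean - all_lab_ave
        if abs(offset) > 2 * lab_precisions[lab]:
            flagged[lab] = offset
    return all_lab_ave, flagged

# Hypothetical values in per mill (NOT the study's data).
means = {"A": 0.88, "B": 0.85, "C": 0.99, "D": 0.84}
precisions = {"A": 0.05, "B": 0.10, "C": 0.03, "D": 0.04}

all_lab_ave, flagged = flag_biased_labs(means, precisions)
print(f"all-lab average: {all_lab_ave:.2f}, flagged labs: {sorted(flagged)}")
```

A lab with very tight reported precision can be flagged even for a modest offset, so a flag of this kind indicates a bias that is resolvable by that lab rather than a large absolute error.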

Value and importance of a reference material
Figure 1a shows that there were systematic between-lab differences of the order of 0.10‰ in the results for both RM and DSW analyses. Until now, systematic analysis-related differences in δ13C-DIC data measured by different groups and/or collected on different cruises have been identified and corrected using 2nd QC based on crossover analysis (Tanhua et al. 2010; Lauvset and Tanhua 2015; Becker et al. 2016) and/or using some form of correction based on simultaneously measured parameters such as dissolved oxygen, DIC, or temperature. This 2nd QC procedure can reportedly achieve internal consistency of carbon stable isotope data from different cruises of the order of 0.02‰ (Becker et al. 2016). However, the approach is effectively a "consensus" approach based on forcing agreement between data collected in the same geographic region, rather than a more "absolute" approach based on a certified technique or widely accepted reference material.
The approach cannot be used effectively in locations where there are strong spatial gradients of δ13C-DIC or where temporal changes are expected, which is why Lauvset and Tanhua (2015) recommend the "use of CRMs if at all possible." This step is presently not possible for δ13C-DIC analyses.
Our study demonstrates that use of a CRM-based data correction procedure could improve between-lab reproducibility to match the target level (±0.05‰) proposed by GOOS. The result is encouraging, as the target uncertainty was attained despite the use of a wide variety of analytical and standardization approaches, and despite the samples themselves being subject to several of the variables that can compromise their integrity, including overseas transportation. Application of the RM-based correction procedure also brought the DSW δ13C-DIC results from all groups involved in this study into reasonably close agreement (to within ca. 0.1‰) with historical δ13C-DIC values from the same water mass after application of 2nd QC (Becker et al. 2016).
We therefore conclude that provision of CRMs in the form of aqueous-phase samples would make a significant contribution to data quality, and that it is essential if the GOOS accuracy/uncertainty specification for δ13C-DIC is to be met in the future. The existing "Certified Reference Material for oceanic CO2 measurements" produced by Scripps Institution of Oceanography would be appropriate if it could be certified for δ13C-DIC. We note that our intercomparison was for deep waters only, and that it remains to be investigated whether at least two CRMs covering the upper and lower range of typical oceanic δ13C-DIC values would be required to assure that the GOOS goal can be attained throughout the water column.

Comparison of RM-based corrections and 2nd QC
The magnitude of offsets of DSW δ13C-DIC from the consensus mean in this study (with and without RM correction) can be compared with the between-cruise offsets identified during the 2nd QC of historical North Atlantic data (Becker et al. 2016). The latter compared deep ocean data (>1500 m) reported by 5-6 different labs at "crossover" locations from 14 different cruises. The between-cruise offsets for deep water samples ranged from −0.39‰ to +0.17‰, with an average offset of +0.14‰, and the final recommended corrections to the δ13C-DIC results from individual cruises ranged from −0.30‰ to +0.25‰, with an average of +0.11‰. In our study, the deviations of individual labs' uncorrected DSW results from the all-lab average value ranged from −0.18‰ to +0.18‰, with an average of +0.08‰. The individual lab deviations from the all-lab average for the RM samples ranged from −0.20‰ to +0.12‰, with an average of 0.00‰. Student's t-tests indicate no significant difference between the mean DSW offset in this study and the mean crossover offsets reported by Becker et al. (2016). Similarly, there was no significant difference between the mean of our RM-based corrections for individual groups and the mean of the recommended corrections reported by Becker et al. (2016). Hence, the overall level of data quality and between-lab bias in our intercomparison study is consistent with the systematic offsets identified through comparisons of in situ data collected on different cruises. However, correction of data based on measurement of a well-characterized reference material is clearly a more reliable approach to improving consistency of data than reliance on the assumption that deep waters have constant, unchanging δ13C-DIC values.

Conclusions and recommendations
Our δ13C-DIC intercomparison study involving 16 groups worldwide showed between-lab reproducibility of uncorrected raw values (±0.11‰) comparable to that reported from the only previous published interlaboratory comparison of seawater analyses (van Geldern et al. 2013), which was limited to only five groups. The level of between-lab reproducibility was also not statistically different from the magnitude and variability of offsets between historical cruise data sets detected by crossover analysis. Reports of an unpublished study conducted in the 1990s suggest that the level of reproducibility may not have changed significantly over the past two decades. The average δ13C-DIC value for the deep seawater samples collected during this study is also consistent (within 0.1‰) with historical data from the same location after adjustment by 2nd QC based on crossover analysis (Becker et al. 2016).
Our results imply that the use of different sampling and analytical methods, and/or standardization procedures, including the use of new optical spectroscopy detection methods, does not necessarily lead to major systematic differences between laboratories. However, our results also show that the accuracy/uncertainty goal proposed by GOOS (±0.05‰) is not being met with current approaches.
Correction of our raw data based on common measurement of an RM demonstrates that provision of an aqueous-phase reference material for δ13C-DIC would result in significant improvement in reproducibility and make the GOOS goal attainable. Our results confirm earlier suggestions that the existing Certified Reference Materials used for seawater DIC and alkalinity QC are suitable for this purpose, but only if these can be assigned "Certified" or consensus δ13C-DIC values that are reproducible between batches. An alternative to use of an aqueous-phase CRM would be centralized distribution of a readily soluble carbon-containing compound, coupled with an SOP for its introduction into the aqueous phase. The latter approach could have the advantage that large quantities could be produced, reducing the difficulty of assuring batch-to-batch reproducibility. However, the effectiveness of this approach has not yet been tested.
The number of groups measuring δ13C-DIC in ocean waters is likely to increase significantly as more accessible, lower-cost, and more portable instrumentation becomes available. Hence, although the between-lab agreement reported here might be considered encouraging, or at least consistent with the data quality of past decades, there is no guarantee that this will be propagated into the future, and the situation could even worsen as less experienced groups enter the field. Even the current between-lab reproducibility limits key applications of the δ13C-DIC tracer signal. Therefore, we recommend strongly that the δ13C measurement community work together rapidly to establish a procedure for the preparation and distribution of liquid or soluble CRMs for δ13C-DIC. In the meantime, we recommend that stable isotope analysis ring tests of the type described here be repeated periodically and extended (e.g., to evaluate effects of sampling and sample storage and the potential for use of a solid-phase RM).