Sub-Permil Interlaboratory Consistency for Solution-Based Boron Isotope Analyses on Marine Carbonates

Boron isotopes in marine carbonates are increasingly used to reconstruct seawater pH and atmospheric pCO₂ through Earth's history. While isotope ratio measurements from individual laboratories are often of high quality, it is important that records generated in different laboratories can also be compared directly. Within this Boron Isotope Intercomparison Project (BIIP), we characterised the boron isotopic composition (commonly expressed as δ¹¹B) of two marine carbonates: Geological Survey of Japan carbonate reference materials JCp-1 (coral Porites) and JCt-1 (giant clam Tridacna gigas). Our study has three foci: (i) to assess the extent to which oxidative pre-treatment, aimed at removing organic material from carbonate, can influence the resulting δ¹¹B; (ii) to determine to what degree the chosen analytical approach may affect the resultant δ¹¹B; and (iii) to provide well-constrained consensus δ¹¹B values for JCp-1 and JCt-1. The resultant robust mean and associated robust standard deviation (s*) for un-oxidised JCp-1 is 24.36 ± 0.45‰ (2s*), compared with 24.25 ± 0.22‰ (2s*) for the same oxidised material. For JCt-1, respective compositions are 16.39 ± 0.60‰ (2s*; un-oxidised) and 16.24 ± 0.38‰ (2s*; oxidised). The consistency between laboratories is generally better if carbonate powders were oxidatively cleaned prior to analysis.

(d) z-score associated with the mean laboratory δ¹¹B compositions shown in (b). An absolute z-score below or equal to 2 is considered acceptable, absolute z-score values between 2 and 3 are of likely questionable quality, and absolute values beyond 3 suggest that results are outside the satisfactory range, as indicated by the stippled horizontal lines.

This article is protected by copyright. All rights reserved.

The boron isotope system is a non-traditional light stable isotope system with only two isotopes, ¹⁰B and ¹¹B. The boron isotope ratio of any substrate is usually presented in delta notation (in ‰) relative to an isotope reference material distributed by the National Institute of Standards and Technology:

$$\delta^{11}\mathrm{B} = \left[\frac{\left(^{11}\mathrm{B}/^{10}\mathrm{B}\right)_{\mathrm{sample}}}{\left(^{11}\mathrm{B}/^{10}\mathrm{B}\right)_{\mathrm{NIST\,SRM\,951}}} - 1\right] \times 1000 \qquad (1)$$

where NIST SRM 951 (or NIST SRM 951a) represents a boric acid isotopic reference material powder. NIST SRM 951 and NIST SRM 951a are essentially isotopically identical, with a certified ¹⁰B/¹¹B of 0.2473 ± 0.0002. In recent decades, boron isotope ratios measured in biogenic carbonates have emerged as a valuable tool to determine past seawater pH, a key variable to reconstruct atmospheric CO₂ concentrations and other marine carbonate system parameters (Vengosh et al. 1991, Hemming and Hanson 1992). Boron isotope ratios in marine carbonates can be used as a pH indicator because of several key characteristics. First, boron behaves conservatively in seawater with a residence time of ~14 Ma (Lemarchand et al. 2000) and a resultant homogeneous bulk seawater δ¹¹B of 39.61 ± 0.04‰ (Foster et al. 2010). Boron in seawater occurs as two aqueous species, boric acid (B(OH)₃) and borate ion (B(OH)₄⁻) (Dickson 1990). The relative abundance of each species is pH-dependent (Vengosh et al. 1991, Hemming and Hanson 1992), with the proportion of borate ion increasing with increasing pH. Importantly, borate ion is depleted in ¹¹B relative to boric acid as a result of equilibrium isotope fractionation between the two species (Zeebe 2005, Klochko et al. 2006). The ratio of borate ion to boric acid increases significantly in the pH range of modern and palaeo-seawater (ca. 7.7 to 8.3 on the total pH scale).
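As a minimal sketch of the delta notation relative to NIST SRM 951 described above, the following Python snippet converts between a measured ¹¹B/¹⁰B ratio and δ¹¹B. The function names are hypothetical and chosen for illustration only; the reference ratio is simply the inverse of the certified ¹⁰B/¹¹B of 0.2473 quoted in the text, without propagating its uncertainty.

```python
# Reference 11B/10B ratio, taken as the inverse of the certified
# 10B/11B = 0.2473 quoted in the text (uncertainty not propagated).
R_SRM951 = 1.0 / 0.2473


def delta11b(ratio_sample, ratio_reference=R_SRM951):
    """Return delta-11B in permil for a sample 11B/10B ratio."""
    return (ratio_sample / ratio_reference - 1.0) * 1000.0


def ratio_from_delta(delta, ratio_reference=R_SRM951):
    """Return the 11B/10B ratio corresponding to a delta-11B in permil."""
    return ratio_reference * (1.0 + delta / 1000.0)


if __name__ == "__main__":
    # Bulk seawater delta-11B quoted in the text (Foster et al. 2010).
    seawater_ratio = ratio_from_delta(39.61)
    print(f"seawater 11B/10B ~ {seawater_ratio:.4f}")
    print(f"round trip: {delta11b(seawater_ratio):.2f} permil")
```

By construction, the reference material itself has δ¹¹B = 0‰, and the two functions are inverses of one another.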
Since most marine calcifiers only incorporate borate ion into biogenic carbonates, it follows that their boron isotopic ratio provides direct information on ambient seawater pH (Vengosh et al. 1991, Hemming and Hanson 1992) or internal calcifying fluid pH (Rollion-Bard et al. 2003, Allison et al. 2010, McCulloch et al. 2012). While commonly applied to foraminifera (Hönisch and Hemming 2005, Foster 2008, Rae et al. 2011), in recent years the pH sensitivity of the boron isotope system has been explored in a variety of marine biogenic carbonates, including brachiopods (Lecuyer et al. 2002, Penman et al. 2013, Jurikova et al. 2019), corals (Hönisch et al. 2004, Reynaud et al. 2004, Wall et al. 2016, […] et al. 2017, Wu et al. 2018), molluscs (Heinemann et al. 2012) and coralline algae (Cornwall et al. 2017, Donald et al. 2017, Anagnostou et al. 2019).
The first investigations into the pH-dependent fractionation of ¹¹B/¹⁰B during incorporation into CaCO₃ were published in the late 1980s (e.g., Balz et al. 1986, Oi et al. 1991). However, for many years significant offsets between individual laboratories (on the order of 2 to 11‰) permitted only limited comparability of δ¹¹B data between institutions (e.g., Aggarwal et al. 2009). This disagreement is not surprising, since boron is a contamination-prone light stable isotope system that requires clean reagents and careful sample handling during purification and analysis, as well as a boron-free air handling system (e.g., Rosner et al. 2005). The most recent study comparing the interlaboratory reproducibility of boron isotopic data reported good agreement for solutions of dilute boric acids and seawater samples, yet also revealed elevated interlaboratory isotopic offsets of up to ~1.5‰ (2s) for identical carbonate sample materials (Foster et al. 2013). Only four laboratories participated in that study, and since then considerably more research groups have begun publishing carbonate-derived boron isotope data. For this reason, we present a timely update on the interlaboratory comparability of boron isotope data in commonly used marine carbonate reference materials.
Besides comparing different sample handling and mass spectrometric approaches, a further sample preparation step in the measurement procedure was tested within the frame of BIIP. We assessed the impact of oxidative cleaning techniques on biogenic carbonates, such as those frequently performed for other geochemical analyses (Boyle 1981, Barker et al. 2003). We present boron isotope results generated in ten individual laboratories, which reveal an unprecedented level of consistency of carbonate δ¹¹B results between laboratories. Given the comparison of cleaned and uncleaned material, we also identify potential pitfalls during the processing of carbonate samples for boron isotopic analysis that could compromise the high level of analytical agreement that is emerging between laboratories.

Accepted Article
Two powdered and homogenised biogenic carbonates originally produced by the Geological Survey of Japan were analysed in ten different laboratories for our boron isotope interlaboratory comparison study (Table 1). The first carbonate used is JCp-1, a modern Porites sp. coral colony sampled 2 metres below mean sea level on the northeast coast of Ishigaki Island, Ryukyu Islands, Japan (24°33'N, 124°20'E). JCp-1 is entirely aragonitic and all surfaces of the corals in contact with the biological tissue were removed prior to processing (Okai et al. 2002). As outlined in the original publication, crushed coral material was washed with deionised water and dried prior to further grinding and homogenisation. The grain size fraction < 250 µm of JCp-1 material was sieved and distributed by the Geological Survey of Japan.
The second reference material used was also prepared by the Geological Survey of Japan (Inoue et al. 2004). Reference material JCt-1 is derived from a fossil mid-Holocene giant clam Tridacna gigas sampled near Kume Island (26°N, 126°E) in the central Ryukyu Islands, Japan. It is also entirely aragonitic. Further details on powder preparation of JCt-1 were not provided in Inoue et al. (2004).
None of the powders were bleached prior to packing (Hathorne et al. 2013). Previously published trace elemental ratios presented by Hathorne et al. (2013) are reported for comparison in Table 1.
Notably, Sr, Mg, Ba, B and Li all have higher ratios relative to Ca in JCp-1 than in JCt-1, and U/Ca in JCp-1 is approximately fifty times higher than in JCt-1. At 460 µmol mol⁻¹, the molar B/Ca ratio in JCp-1 is ~2.4 times higher than in JCt-1, which has 191 µmol mol⁻¹ (Hathorne et al. 2013).
Due to changes in CITES regulations (i.e., Convention on International Trade in Endangered Species of Wild Fauna and Flora; www.cites.org), neither of these biogenic carbonate materials is currently available for international distribution by the Geological Survey of Japan, but they remain common reference materials in many laboratories (e.g., Farmer et al. 2016, Lazareth et al. 2016, Raddatz et al. 2016, Stewart et al. 2016, Jurikova et al. 2019). Efforts are on-going to find suitable replacements, and two isotope standard solutions artificially produced with carbonate matrices (NIST RM 8301 (Coral) and NIST RM 8301 (Foram)) will soon become available as consistency reference materials for boron isotopic and trace metal isotope studies (Stewart et al. in press).


Analytical and mass spectrometric approaches
Nine out of the ten laboratories participating in this study used an MC-ICP-MS-based approach to determine the δ¹¹B of the JCp-1 and JCt-1 reference materials; one used N-TIMS. With the exception of the N-TIMS approach, for which boron was not separated from the aragonitic matrix, elemental purification was carried out in all laboratories (Table 2). In eight laboratories, boron was purified using Amberlite™ IRA743 exchange resin on microcolumns (Gonfiantini et al. 2003, Foster 2008, Aggarwal et al. 2009, Paris et al. 2010, Louvat et al. 2011, Rae et al. 2011, Voinot et al. 2013, McCulloch et al. 2014, Roux et al. 2015) or using a batch method (Douville et al. 2010, Wu et al. 2018), and one laboratory employed the sublimation technique for boron purification (Wang et al. 2010). The boron total procedural blank ranged from below 8 pg to about 3000 pg between laboratories (Table 2). Sample ionisation during N-TIMS spectrometric measurement is achieved via heating of Re-metal filaments in a high-vacuum source chamber. For the MC-ICP-MS approaches, sample introduction was achieved using either (i) a quartz spray chamber (Gonfiantini et al. 2003, Aggarwal et al. 2009, Douville et al. 2010, Wang et al. 2010, Voinot et al. 2013, McCulloch et al. 2014), (ii) a PFA spray chamber (Foster 2008, Rae et al. 2011) or (iii) direct injection (d-DIHEN) (Paris et al. 2010, Louvat et al. 2011, Louvat et al. 2014). Some of the laboratories used ammonia, introduced via a second gas inlet into the spray chamber, as an additional gas to aid washout between individual measurements (e.g., Foster 2008). None of the laboratories in this study used hydrofluoric acid to aid boron washout, although recent studies have shown this to be an effective alternative to an ammonia add gas (Misra et al. 2014, Rae et al. 2018). All laboratories used an (isotope-) calibrator-sample bracketing technique to derive δ¹¹B.
Except for the MC-ICP-MS method with direct injection as introduction system, on-peak zeros were subtracted from the respective ion beams in all MC-ICP-MS-based approaches. This approach is necessary because of the typically poor washout of boron compared with other isotope systems and the relatively small signal sizes, requiring tight control over memory effects during sample introduction.

In contrast to an earlier interlaboratory comparison study (Gonfiantini et al. 2003), participating laboratories were required to have a demonstrable record of producing high-quality δ¹¹B data.
Every participating laboratory was sent 2 g of powder of each of the two reference materials. A minimum of six test portions of each reference material, weighing at least 5 mg each, were analysed in each laboratory. Three of these test portions were processed without any oxidative cleaning, and the other three underwent oxidative cleaning using either NaClO or H₂O₂ in dilute NH₄OH (Table 2). Each laboratory reported 2-10 results for each test portion digest, either as individual filament analyses (e.g., N-TIMS) or simply as repeat measurements of the same powder preparation (e.g., MC-ICP-MS). The key aim of our study was to assess consistencies and potential discrepancies between techniques, with particular focus on analytical problems that could be improved in future studies. Therefore, the reported δ¹¹B data were compiled and statistically analysed by the first author, while the origin of each data set was kept anonymous as far as feasible.

Statistical data treatment
First, the average δ¹¹B value of each laboratory for the four individual sample sets (presenting either previously oxidised or un-oxidised JCp-1 or JCt-1 boron isotope results) was determined.
This provides a total of ten independent laboratory mean δ¹¹B values for un-oxidised JCp-1 and nine mean δ¹¹B values for oxidised JCp-1 (Table 3). For JCt-1 reference material powders, nine δ¹¹B means from both un-oxidised and oxidised powders were reported (Table 4). Subsequently, the robust mean and associated robust standard deviation were calculated for each of the four data sets. To do so, we followed the ISO 13528:2015 data treatment procedure for normally distributed data sets as outlined in approach 2 of Srnková and Zbíral (2009). The procedure of deriving the robust mean and robust standard deviation is iterative, and the statistical analysis is repeated until no change in the calculated robust mean X* and its robust standard deviation s* is observed. The approach is outlined below.
An initial robust average X* is calculated as the median of the laboratory mean δ¹¹B values (hence n is either 9 or 10). The associated initial robust standard deviation s* (based on the Median Absolute Deviation, MAD) is derived by multiplying the median of the absolute offsets of all laboratory means from X* by 1.483. The factor 1.483 in s* = 1.483 × MAD is a robust scaling factor applied in statistical applications for normally distributed data sets, following the argument that the median absolute deviation covers 50% (between ¼ and ¾) of the standard normal cumulative distribution function (see ISO 13528:2015). Next, a value φ is calculated by multiplying the initial robust standard deviation by a factor of 1.5. Then, (X* − φ) and (X* + φ) are calculated. If any laboratory's mean δ¹¹B falls below (X* − φ), the actual laboratory mean δ¹¹B is replaced with (X* − φ). If any laboratory's mean δ¹¹B falls above (X* + φ), the actual laboratory mean δ¹¹B is replaced with (X* + φ). Laboratory mean δ¹¹B values larger than (X* − φ) and smaller than (X* + φ) are kept; these represent the vast majority of δ¹¹B values presented here. This exercise led to exclusion of the following mean δ¹¹B values: un-oxidised JCp-1 powders from laboratories 1 and 4 (laboratory numbers refer to corresponding numbers shown in Figures 1 and 2), un-oxidised JCt-1 powders from laboratories 1, 4 and 10, oxidised JCp-1 powders from laboratories 1 and 6, and oxidised JCt-1 powders from laboratories 6 and 10. For all four data sets (i.e., un-oxidised and oxidised JCp-1 and JCt-1), an updated X* and s* were then calculated and the above screening procedure repeated, resulting in no further exclusion of data. The resultant robust means and robust standard deviations discussed in the text and shown in Table 5, as well as in Figures 1 to 4, have been derived in this manner. We reiterate that the robust standard deviation is calculated using only the mean δ¹¹B per laboratory for each of the four data sets.
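The iterative procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' actual script: the function name, the variable `phi` for the cut-off distance 1.5 s*, and the input values are hypothetical. The text does not spell out how the updated X* and s* are computed; the sketch assumes the update step of ISO 13528 Algorithm A (mean of the clipped values, and 1.134 times their standard deviation).

```python
from statistics import mean, median, stdev


def robust_mean_sd(values, tol=1e-9, max_iter=1000):
    """Iterative robust mean X* and robust standard deviation s*."""
    x = list(values)
    # Initial estimates: median and scaled median absolute deviation.
    x_star = median(x)
    s_star = 1.483 * median([abs(v - x_star) for v in x])
    for _ in range(max_iter):
        phi = 1.5 * s_star  # cut-off distance around X*
        lo, hi = x_star - phi, x_star + phi
        # Replace values outside [X* - phi, X* + phi] by the nearest limit.
        clipped = [min(max(v, lo), hi) for v in x]
        # Assumed update step (ISO 13528 Algorithm A).
        new_x = mean(clipped)
        new_s = 1.134 * stdev(clipped)
        converged = abs(new_x - x_star) < tol and abs(new_s - s_star) < tol
        x_star, s_star = new_x, new_s
        if converged:
            break
    return x_star, s_star


if __name__ == "__main__":
    # Hypothetical laboratory means in permil, NOT the BIIP data.
    lab_means = [24.1, 24.2, 24.3, 24.3, 24.4, 24.5, 20.0]
    x_star, s_star = robust_mean_sd(lab_means)
    print(f"X* = {x_star:.2f} permil, 2s* = {2 * s_star:.2f} permil")
```

Because outlying values are pulled in to the cut-off limits rather than discarded, a single strongly deviating laboratory mean shifts X* far less than it would shift the arithmetic mean.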
As a measure of the integrity of the reported average δ¹¹B from each laboratory we used a z-score:

$$z_i = \frac{x_i - X^*}{s^*}$$

in which x_i represents the individual laboratory average δ¹¹B, X* the robust mean, and s* the robust standard deviation. An absolute z-score below or equal to 2 is considered to be acceptable; absolute z-score values between 2 and 3 are of likely questionable quality or, in the case of laboratory 10, reflect a carbonate-specific constant offset between N-TIMS and MC-ICP-MS (see also Foster et al. 2013). An absolute z-score value beyond 3 suggests that results are outside the satisfactory range. Given that s* is used for determining the z-score for each laboratory mean, this approach may systematically exclude certain laboratory results as outliers (i.e. those with the most distinct δ¹¹B relative to X*). However, given the distribution of our data sets, those mean laboratory δ¹¹B values that fell beyond a z-score of 3 are relatively clear cases of questionable quality (Figures 1, 2).
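A short Python sketch of the z-score screening described above is given here for illustration. The function names and the laboratory means in the usage example are hypothetical; only the robust mean and 2s* for oxidised JCp-1 (24.25 ± 0.22‰) are taken from the text. An absolute z-score of exactly 3 is assigned to the "questionable" class, which is an assumption because the text only specifies "beyond 3" as unsatisfactory.

```python
def z_score(lab_mean, robust_mean, robust_sd):
    """z-score of a laboratory mean relative to X* and s*."""
    return (lab_mean - robust_mean) / robust_sd


def classify(z):
    """Map a z-score to the three quality classes used in the text."""
    a = abs(z)
    if a <= 2:
        return "acceptable"
    if a <= 3:  # assumption: |z| == 3 counted as questionable
        return "questionable"
    return "unsatisfactory"


if __name__ == "__main__":
    # Oxidised JCp-1: X* = 24.25 permil, 2s* = 0.22 permil (from the text).
    x_star, s_star = 24.25, 0.22 / 2
    # Hypothetical laboratory means, NOT the BIIP data.
    for lab_mean in (24.30, 24.00, 23.80):
        z = z_score(lab_mean, x_star, s_star)
        print(f"{lab_mean:.2f} permil -> z = {z:+.2f} ({classify(z)})")
```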

Results
Throughout Figures 1 to 3, the order of laboratories is kept the same in the respective panels, chosen so that Figure […] We note that the few obvious outliers (identified via |z| > 3) in our interlaboratory comparison were all shifted towards lower reported δ¹¹B (Figures 1 and 2). Laboratory 4 only reported δ¹¹B for un-oxidised JCp-1 and JCt-1 reference material powders, and the submitted ratios fall outside the z-score reliability threshold. Laboratory 6 provided δ¹¹B results for oxidised JCp-1 and JCt-1 that also fail this data screening criterion. Although the δ¹¹B values from Laboratory 6 for oxidised reference materials can be flagged as outliers, the un-oxidised mean δ¹¹B values of Laboratory 6 for both reference materials agree well with the range of δ¹¹B reported by the majority of other laboratories.

The two mass spectrometric approaches (i.e., N-TIMS vs. MC-ICP-MS) did not lead to any clear isotopic shift in reported δ¹¹B for JCp-1, although N-TIMS yielded potentially slightly higher δ¹¹B for JCt-1 (Figures 1-3, Tables 3 and 4) (cf. Farmer et al. 2016). The choice of sample introduction system (i.e., quartz vs. PFA spray chamber, or alternatively direct injection) and purification method for the nine MC-ICP-MS-based data sets also did not lead to resolvable differences in results (Figure 3). Both ion exchange purification and the sublimation technique led to δ¹¹B with z-scores close to zero (not shown). Some laboratories reported elevated boron blank levels; however, these did not result in clearly distinct final δ¹¹B values (not shown).
Overall, the resultant robust mean and associated robust standard deviation for un-oxidised JCp-1 is 24.36 ± 0.45‰ (2s*), compared with 24.25 ± 0.22‰ (2s*) for the same material subjected to oxidative cleaning (Figure 4). For JCt-1, respective compositions are 16.39 ± 0.60‰ (2s*) for un-oxidised and 16.24 ± 0.38‰ (2s*) for oxidised material. Hence, the robust means of cleaned and uncleaned powders agree within uncertainty for both reference materials, with the oxidised results only marginally lower than the un-oxidised results. A two-sided Student's t-test comparing laboratory means (screened for outliers during the robust mean and robust standard deviation assessment; see methods above) provides a p-value of 0.12 for the comparison of oxidised and un-oxidised JCp-1, and 0.17 for the comparison of oxidised and un-oxidised JCt-1, indicating that the populations are not significantly different at the 95% level of confidence. The difference in the mean values for the two reference materials in the respective laboratories (i.e., Δ¹¹B = mean δ¹¹B (JCp-1) − mean δ¹¹B (JCt-1)) is 7.98‰ for un-oxidised and 8.01‰ for oxidised reference materials. This difference between the two reference materials is hence identical (within measurement precision) with and without cleaning, which suggests that neither ¹¹B nor ¹⁰B is preferentially removed from either reference material.
In order to set the above reported robust means and robust standard deviations for JCp-1 and JCt-1 in context with alternative data handling approaches, we also report the results of two simpler statistical approaches. In the first alternative, we calculated the median of each of the four data sets using the respective individual mean δ¹¹B results of the individual laboratories for each approach (un-oxidised or oxidised) and material (JCp-1 or JCt-1) (n = 9 or 10). While the resultant median for each data set is either very close or even identical to the robust mean, the resultant mean average deviation (not to be mistaken with the median absolute deviation, MAD) is significantly smaller than our calculated robust standard deviation 2s*. The effect is most drastic for un-oxidised JCt-1 (Table 5). Repeating this exercise in a second alternative data treatment approach, now considering every replicate result for each of the four data sets (n = 27 or 30), provides comparable median δ¹¹B values and slightly larger mean average deviations.
Given that these mean average deviations are very close to or below the reported intermediate precision (2s) of individual laboratory results (see Tables 2 and 3), they are deemed unrealistically small, not reflecting realistic δ¹¹B discrepancies between individual laboratories, while the robust standard deviation better illustrates the scatter in the data sets (Figure 4). Table 5 summarises the various statistical results and Table 6 provides a list of the laboratories that submitted data.
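The contrast between the two simpler spread measures and the scaled MAD can be illustrated with a short Python sketch. It assumes that "mean average deviation" denotes the mean absolute deviation about the median, which the text does not state explicitly; the function names and input values are hypothetical.

```python
from statistics import mean, median


def mean_abs_deviation(values):
    """Mean absolute deviation about the median (assumed definition)."""
    centre = median(values)
    return mean([abs(v - centre) for v in values])


def scaled_mad(values):
    """Initial robust standard deviation: 1.483 x median absolute deviation."""
    centre = median(values)
    return 1.483 * median([abs(v - centre) for v in values])


if __name__ == "__main__":
    # Hypothetical laboratory means in permil, NOT the BIIP data.
    lab_means = [16.1, 16.2, 16.3, 16.4, 16.5]
    print(f"median              = {median(lab_means):.2f}")
    print(f"mean abs. deviation = {mean_abs_deviation(lab_means):.3f}")
    print(f"1.483 x MAD         = {scaled_mad(lab_means):.3f}")
```

For evenly spread values such as these, the mean absolute deviation is smaller than the scaled MAD, in line with the direction of the difference reported in the text.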

Discussion
Overall, the agreement in δ¹¹B values reported here is very encouraging. Our BIIP dataset demonstrates that differences between the individual laboratories taking part in this study are orders of magnitude smaller than in earlier interlaboratory comparison efforts (Gonfiantini et al. […]

Although detailed information on the behaviour of the two reference material powders during micro-sublimation purification is not available, results presented from the laboratory using this […]

[…] majority of laboratories, we recommend including this step (i.e., short exposure to buffered H₂O₂ or NaClO) for boron isotope analysis of biogenic carbonates.

Conclusions
Two biogenic marine carbonate reference materials from the Geological Survey of Japan (JCp-1 and JCt-1) were analysed for their boron isotopic ratio in ten laboratories with a documented record of prior boron isotope analyses. The compiled results reveal encouragingly good agreement between laboratory means, with a spread that is close to commonly reported in-house intermediate precisions.
Since the vast majority of research groups participating in this study employed inductively coupled plasma-mass spectrometric approaches, the analytical assessment is somewhat biased towards these MC-ICP-MS approaches. Nevertheless, several general key conclusions can be drawn that also apply to thermal ionisation mass spectrometric approaches.
More consistent boron isotope results are obtained if carbonate materials are exposed to moderate oxidative treatment prior to sample dissolution. While utmost care in sample handling for boron isotopic studies is always required, the analytical approach taken for extracting boron from the carbonate matrix, as well as the sample introduction system used for MC-ICP-MS approaches, does not lead to resolvable isotope offsets. Following the oxidative cleaning approach, reported δ¹¹B agrees to within ±0.38‰ for JCt-1 and ±0.22‰ for JCp-1 (2s*).
Given that future research efforts will tend to focus on smaller sample sizes and/or carbonates with low B/Ca, one of the most pressing prerequisites for generating accurate δ¹¹B will be sustaining boron isotopic data that are comparable between different laboratories, even for small sample sizes of a few nanograms of boron. Finally, we note that despite the increasing levels of interlaboratory consistency, boron isotope measurements remain challenging, even for those laboratories that have been making these measurements for many years. However, our study highlights that with care and commitment, it is possible to achieve a very encouraging level of consistency within the community.

Figure 2. Boron isotope results for Tridacna gigas JCt-1 reference material, presented in delta notation relative to NIST SRM 951, showing the mean δ¹¹B for each laboratory with resultant 2s (see also Table 4).