Evaluating the Formation Pressure of Diamond‐Hosted Majoritic Garnets: A Machine Learning Majorite Barometer

Diamond‐hosted majoritic garnet inclusions provide unique insights into the Earth's deep, and otherwise inaccessible, mantle. Compared with other types of mineral inclusions found in sub‐lithospheric diamonds, majoritic garnets can provide the most accurate estimates of diamond formation pressures because laboratory experiments have shown that garnet chemistry varies strongly as a function of pressure. However, evaluation using a compilation of experimental data demonstrates that none of the available empirical barometers are reliable for predicting the formation pressure of many experimental majoritic garnets and cannot be applied with confidence to diamond‐hosted garnet inclusions. On the basis of the full experimental data set, we develop a novel type of majorite barometer using machine learning algorithms. Cross validation demonstrates that Random Forest Regression allows accurate prediction of the formation pressure across the full range of experimental majoritic garnet compositions found in the literature. Applying this new barometer to the global database of diamond‐hosted inclusions reveals that their formation occurs in specific pressure modes. However, exsolved clinopyroxene components that are often observed within garnet inclusions are not included in this analysis. Reconstruction of inclusions, in the 8 cases where this is currently possible, reveals that ignoring small exsolved components can lead to underestimating inclusion pressures by up to 7 GPa (∼210 km). The predicted formation pressures of majoritic garnet inclusions are consistent with crystallization of carbon‐rich slab‐derived melts in Earth's deep upper mantle and transition zone.

mostly sitting on the B site (Huggins et al., 1977; Waychunas, 1987), and is commonly charge balanced by monovalent cations on the A site. Any phosphorus within garnet is believed to sit on the silicon site (Haggerty et al., 1994). At pressures above ∼6-8 GPa (depths of ∼180-240 km), the number of silicon cations in equilibrium garnet compositions increases, exceeding the capacity of the tetrahedral sites. This excess silicon (calculated throughout this study as Si + P − 3) has octahedral coordination and is a consequence of the increased solubility of the pyroxene component in garnet with increasing pressure. Two principal substitution mechanisms introduce octahedral silicon into garnet at high pressures:

$$2B^{3+} = M^{2+} + \mathrm{Si}^{4+} \tag{1}$$

$$A^{2+} + B^{3+} = X^{+} + \mathrm{Si}^{4+} \tag{2}$$

The final product of substitution (1) is A²⁺₃[M²⁺Si⁴⁺]Si₃O₁₂. In simplified chemical systems, this is the majorite (Mj) end-member (MgSiO₃), and garnets containing more than 3 silicon atoms per formula unit (pfu) are commonly referred to as majoritic garnets. Under equilibrium conditions, the extent of the Mj substitution in the MgO-Al₂O₃-SiO₂ system is known to be a function of pressure (Akaogi & Akimoto, 1977; Irifune, 1987). Mechanism (2), referred to throughout this study as the Na-majorite (NaMj) substitution, most commonly involves Na⁺ as the monovalent cation. The final product of this substitution is the Na-majorite end-member, [X⁺₂A²⁺][Si⁴⁺₂]Si₃O₁₂ (e.g., Dymshits et al., 2013). As for Mj, the extent of the NaMj substitution appears to be a function of pressure, at least when investigated within the Mg₃Al₂Si₃O₁₂-Na₂MgSi₅O₁₂ system (Dymshits et al., 2013).
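As a small worked illustration of the bookkeeping described above, the following sketch computes the excess silicon (Si + P − 3) for a garnet formula and converts a pressure into an approximate depth. The function names are hypothetical, and the 30 km per GPa factor is an assumption inferred from the paired values quoted in the text (for example, 6-8 GPa alongside 180-240 km); it is not a calibrated pressure-depth model.

```python
def excess_silicon(si_pfu: float, p_pfu: float = 0.0) -> float:
    """Excess (octahedral) silicon per 12-oxygen formula unit: Si + P - 3."""
    return si_pfu + p_pfu - 3.0


def approx_depth_km(pressure_gpa: float, km_per_gpa: float = 30.0) -> float:
    """Rough pressure-to-depth conversion (assumed linear, ~30 km per GPa)."""
    return pressure_gpa * km_per_gpa


if __name__ == "__main__":
    # Hypothetical garnet with 3.25 Si and 0.01 P per formula unit.
    print(round(excess_silicon(3.25, 0.01), 3))
    # Approximate depth corresponding to 7 GPa under the assumed gradient.
    print(approx_depth_km(7.0))
```

A positive result from `excess_silicon` indicates silicon beyond the three tetrahedral sites, which is the sense in which a garnet is described as majoritic here.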
Natural sub-lithospheric diamonds, those that form at depths beneath the lithospheric mantle, are commonly observed to contain inclusions of garnet possessing measurable Mj and/or NaMj components (e.g., Harte & Cayzer, 2007; Moore et al., 1991; Stachel, 2001). These diamond samples are important because they are amongst the most pristine materials exhumed from depths greater than 250 km to the surface. Placing accurate constraints on their formation depth is vital for interpreting their chemical and isotopic signatures within a relevant depth context and for developing models for their origin, and for that of other inclusions co-trapped within the same diamonds. Of all the sub-lithospheric minerals so far observed as diamond-hosted inclusions, majoritic garnets are the only numerous inclusion population that largely retains its primary structure and chemistry without complete retrograde re-equilibration, and so have become important indicators for the formation depth of diamond suites.
Estimates of the formation pressures of diamond-hosted majoritic garnets have commonly been determined following one of two strategies. Formerly, many studies simply applied the experimentally derived relationship between the garnet Si content and pressure as determined by Akaogi and Akimoto (1977) and Irifune (1987) in simplified pyrolitic bulk compositions (Bulanova et al., 2010; Stachel, 2001). Unfortunately, it is now clear that this simple correlation does not hold when more chemical complexity is present, especially for systems containing significant Fe³⁺, Cr, and Na (e.g., Dymshits et al., 2013; Tao et al., 2018; Wijbrans et al., 2016).
An alternative approach, available for the last decade, has been to apply an empirical single-mineral barometer calibrated using majoritic garnet compositions recovered from a selected data set of experimental mineral compositions. This strategy was initially developed by Collerson et al. (2010), and subsequent studies have provided updated calibrations using additional data from specifically targeted experiments to broaden the chemical diversity within the calibration data set (Beyer & Frost, 2017; Tao et al., 2018; Wijbrans et al., 2016). All four majorite barometers utilize an empirical linear barometric equation based only on major element chemistry, and each can reproduce the synthesis pressure of majoritic garnets within their calibration datasets at accuracies better than a few GPa. Thus, application of any of these barometers would appear to allow the formation depth of sub-lithospheric diamonds containing inclusions of majoritic garnet to be reliably inferred. However, these barometers come with the important caveats that (i) none are applicable to garnets that formed in equilibrium with calcium silicate perovskite (CaPv), and (ii) they explicitly require that garnet formed in equilibrium with clinopyroxene (Beyer & Frost, 2017; Collerson et al., 2010; Tao et al., 2018; Wijbrans et al., 2016). Neither of these requirements can be guaranteed when studying natural diamond-hosted garnet inclusions because the co-equilibrating minerals for any isolated garnet inclusion are unknown. Additionally, it is well known that majoritic garnet inclusions are commonly
THOMSON ET AL.
In this study we interrogate the current majorite barometers by evaluating their performance against a large and compositionally diverse literature data set of experimental majoritic garnet compositions. As we demonstrate below, our evaluation highlights that all four of the published empirical barometers fail to reproduce many experimental garnet compositions to within 5 GPa. We conclude that none of the existing barometers can be applied with confidence for estimating the pressures of garnet minerals of unknown paragenesis, that is, those found as diamond-hosted inclusions. We address this problem by adopting machine learning (ML) techniques to generate a novel single mineral majorite barometer that is applicable to the full range of experimental majoritic garnet compositions and that is easily expandable, and subsequently apply this new barometer to a global data set of diamond-hosted majoritic garnet inclusions to calculate their formation pressures.

Compilation and Processing of Literature Data
The chemical compositions of majoritic garnets from experimental studies performed in simple and complex chemical systems were collated from the literature. Compositions in the data set comprise garnets from experiments performed on bulk compositions including peridotites, basalts, and sediments, with and without volatile components such as water and CO₂. Across all types of bulk composition, equilibrium garnet compositions from pressures between 6 and at least 25 GPa and coexisting with a range of minerals are represented.
Reported garnet compositions, as measured by EPMA, were converted into atomic proportions per formula unit (pfu) assuming a garnet stoichiometry with 12 oxygens. Analyses with total cations (pfu) outside the range 7.9-8.1 were discarded. The concentrations of Mj and NaMj in each garnet were calculated following Okamoto and Maruyama (2004) and Beyer et al. (2017) (Equations 3 and 4). If either component was calculated to be negative it was set to zero. Any compositions containing less than 0.5% Mj and NaMj components combined were discarded. After applying these filters the literature data set comprises 752 majoritic garnet compositions, which are plotted in terms of their major element chemistry in Figure 1. Experimental garnets from similar bulk compositions can be identified using symbol color (see caption), while data from individual references can be identified using the legend from Figure 1. The various plots demonstrate the continuous variation of Mg content with other chemical components within the experimental data set, from highly magnesian garnets found in peridotitic assemblages to those with the lowest Mg content stable in sedimentary compositions. For barometer evaluation and calibration, the experimental data set was further filtered to remove garnets from experiments in simplified chemical systems, retaining only those with "natural-like" (i.e., chemically complex) compositions, defined here as those where Si, Al, Fe, Mg, and Ca are reported alongside at least one of Ti, Cr, Mn, and Na. The final "natural" composition data set contains 519 experimental majoritic garnets.
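The data-processing steps described above can be sketched as follows: oxide weight percentages are converted to cations per 12 oxygens, analyses are retained only when the cation total lies near the ideal garnet value of 8 pfu (taken here as the 7.9-8.1 window quoted above), and a composition is flagged as "natural-like" using the element list given in the text. This is a minimal sketch, not the authors' code; the function names, the dictionary layout, and the treatment of all iron as FeO are assumptions made for illustration.

```python
# oxide: (molar mass in g/mol, cations per oxide, oxygens per oxide, cation label)
OXIDES = {
    "SiO2": (60.084, 1, 2, "Si"),
    "TiO2": (79.866, 1, 2, "Ti"),
    "Al2O3": (101.961, 2, 3, "Al"),
    "Cr2O3": (151.990, 2, 3, "Cr"),
    "FeO": (71.844, 1, 1, "Fe"),   # assumption: all iron reported as FeO
    "MnO": (70.937, 1, 1, "Mn"),
    "MgO": (40.304, 1, 1, "Mg"),
    "CaO": (56.077, 1, 1, "Ca"),
    "Na2O": (61.979, 2, 1, "Na"),
}


def cations_per_12_oxygens(wt_pct):
    """Convert oxide wt.% (dict keyed by oxide) to cations per 12 oxygens."""
    moles = {ox: wt_pct.get(ox, 0.0) / m for ox, (m, _, _, _) in OXIDES.items()}
    oxygen = sum(moles[ox] * n_o for ox, (_, _, n_o, _) in OXIDES.items())
    scale = 12.0 / oxygen
    return {el: moles[ox] * n_cat * scale for ox, (_, n_cat, _, el) in OXIDES.items()}


def passes_cation_total(cations, lo=7.9, hi=8.1):
    """Keep analyses whose cation total is close to the ideal 8 pfu."""
    return lo < sum(cations.values()) < hi


def is_natural_like(reported_elements):
    """Si, Al, Fe, Mg and Ca reported, plus at least one of Ti, Cr, Mn, Na."""
    reported = set(reported_elements)
    required = {"Si", "Al", "Fe", "Mg", "Ca"}
    optional = {"Ti", "Cr", "Mn", "Na"}
    return required <= reported and bool(optional & reported)


if __name__ == "__main__":
    # Ideal pyrope (Mg3Al2Si3O12) expressed as oxide masses, used as a sanity check.
    pyrope = {"SiO2": 3 * 60.084, "Al2O3": 101.961, "MgO": 3 * 40.304}
    formula = cations_per_12_oxygens(pyrope)
    print({el: round(v, 3) for el, v in formula.items() if v > 0})
    print(passes_cation_total(formula))
```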
Most literature garnet compositions do not report Fe²⁺ and Fe³⁺ separately, but these two components are expected to behave differently in garnet (sitting on the A and B sites, respectively). Thus, for each garnet composition the amounts of ferric and ferrous iron were estimated assuming charge balance based on 12 oxygens pfu. This approach is unlikely to have correctly assigned all Fe³⁺ and Fe²⁺, but should have reduced the offsets in Figure 2 caused by erroneously assigned di-/trivalent cations. The pressures predicted by each of the published barometers (Beyer & Frost, 2017; Collerson et al., 2010; Tao et al., 2018; Wijbrans et al., 2016) for each garnet composition were evaluated and are provided alongside the full data set in the data repository detailed at the end of this paper.
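The text states only that ferric and ferrous iron were estimated by charge balance on 12 oxygens; the exact recipe is not given. One common way to do this, shown below as an assumption rather than as the authors' procedure, is a stoichiometric estimate of the Droop (1987) type: cations calculated with all iron as Fe²⁺ are renormalized to the ideal 8 cations, and the resulting charge deficit is assigned to Fe³⁺. The function name and input layout are hypothetical.

```python
def split_iron(cations, n_oxygen=12.0, ideal_cations=8.0):
    """Estimate (Fe2+, Fe3+) pfu from a formula calculated with all Fe as Fe2+.

    `cations` holds cations per `n_oxygen` oxygens, with total iron under "Fe".
    Assumes an ideal garnet cation total of `ideal_cations` (a sketch, not the
    authors' stated method).
    """
    total = sum(cations.values())
    # Total iron after renormalizing the formula to the ideal cation sum.
    fe_total = cations.get("Fe", 0.0) * ideal_cations / total
    # Charge deficit relative to 2 * n_oxygen, attributed to ferric iron.
    fe3 = 2.0 * n_oxygen * (1.0 - ideal_cations / total)
    # Ferric iron cannot be negative or exceed the iron actually present.
    fe3 = min(max(fe3, 0.0), fe_total)
    return fe_total - fe3, fe3


if __name__ == "__main__":
    # Andradite (Ca3Fe2Si3O12) recalculated with all Fe as FeO gives 11 oxygens
    # per 8 cations, so the 12-oxygen formula carries 8 * 12 / 11 cations.
    s = 12.0 / 11.0
    fe2, fe3 = split_iron({"Ca": 3 * s, "Fe": 2 * s, "Si": 3 * s})
    print(round(fe2, 3), round(fe3, 3))
```

Because the estimate depends on the measured cation total, small analytical errors in Si or Al propagate directly into the Fe³⁺ value, which is consistent with the caution expressed above that not all iron will be correctly assigned.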
In addition to experimental garnets, the compositions of diamond-hosted majoritic garnet inclusions associated with sub-lithospheric origins were also collated from the literature. We assume that all collated compositions correspond to un-retrogressed garnet cores only, that is, we did not attempt to reincorporate the exsolved clinopyroxene components observed in many of these inclusions in the reported inclusion data set. The effects of clinopyroxene reincorporation are discussed later. Inclusion data were treated in a similar manner to the experimental compositions. This data set consists of 221 diamond-hosted majoritic garnet inclusions and is also provided in the data repository. The compositions of garnet inclusions, plotted as yellow diamonds in Figure 1, fall within the continuum of compositions defined by the experimental garnet data set, although many fall in the central region, which is relatively underrepresented. It is assumed that a well-calibrated majorite barometer should accurately predict the formation pressure of the entire experimental garnet spectrum and, therefore, will be able to constrain the formation conditions of diamond-hosted garnet inclusions.

Compositional Systematics of Majoritic Garnets
The systematics of experimental majoritic garnet compositions are plotted to allow comparison with idealized Mj (solid) and NaMj (dashed) substitutions in Figure 2. The silica excess in each garnet is plotted against monovalent cations contributing to the NaMj substitution, divalent cations, and B-site cations in Figures 2a-2c. Additionally, Figure 2d plots the relationship between the abundance of divalent and B-site cations. In general, most of the majoritic garnet compositions predominantly follow one of the two substitution trends. Deviations that cannot obviously be explained by a mixture of both components, that is, those that lie outside the range explained by the two plotted substitutions, mostly appear to possess too few trivalent cations (Figures 2c and 2d) and might be explained by under-assignment of ferric iron.
10.1029/2020JB020604

Figure 2 demonstrates that, as identified by Kiseeva et al. (2013), majoritic garnets that formed in peridotitic bulk compositions almost exclusively fall along, or close to, the Mj substitution. Conversely, those from eclogitic (basaltic) compositions predominantly follow the NaMj substitution. These figures clearly demonstrate that the extent of silica excess that forms along the Mj substitution is larger than that created via the NaMj mechanism. This relationship does not result from a lack of high-pressure samples for either substitution mechanism, as the experimental database contains garnets from all pressures up to ∼25 GPa for peridotitic, eclogitic and sedimentary compositions. Instead, Figure 2 graphically demonstrates why the strategy of using only the silica excess as an indicator of formation pressure, for example, the correlations of Akaogi and Akimoto (1977) and Irifune (1987), is unreliable.

Figure 3 shows plots of the chemistry of experimental garnets versus synthesis pressure, and clearly demonstrates the non-trivial relationship between composition and pressure. Inspection of Figures 3a and 3b reveals that different types of bulk composition, for example, basalts (navy) and sediments (green), appear to follow different pressure-composition trends. These differences occur despite both "families" predominantly following the NaMj substitution mechanism in Figure 2. Within peridotite-like bulk compositions, which are generally expected to follow the "simple" Mj substitution, there is also a wide range of variation in divalent or trivalent cation content at each pressure. Pressure-dependent variations remain apparent when the experimental majorite compositions are viewed by their proportion of Mj or NaMj content against pressure (Figures 3c and 3d), strongly indicating that neither substitution is a simple function of pressure.
Thus, we find it inevitable that a single mineral barometer calibrated using a selective subset of the observed data, or one in which only substitutions (1) and (2) are considered, would fail to successfully predict the formation pressure of all types of majoritic garnet.
The influence of temperature is deliberately omitted in this study because the formation temperature of a diamond-hosted inclusion cannot be independently identified and so cannot be used as a constraint on its formation pressure. This omission is supported by analyses from Collerson et al. (2010) and Wijbrans et al. (2016), which indicated that the temperature dependencies of garnet compositions were either negligible, or uncertain, within available experimental datasets.

Evaluation of Existing Majoritic Garnet Barometers
Empirical barometers attempt to parameterize the chemical variation observed in majoritic garnets, including the extent of Mj and NaMj substitutions, such that they may be used to predict the formation pressure of an unknown sample. Each of the four available empirical majorite barometers adopts a different parameterization scheme. None of these barometers have a rigorous thermodynamic basis, but instead are all based on concepts of the crystal chemistry and substitutions occurring in garnet at high pressure. Figure 4 evaluates the performance of the barometers of Collerson et al. (2010), Wijbrans et al. (2016), Beyer and Frost (2017), and Tao et al. (2018) in predicting the synthesis pressure of "natural" composition garnets from our experimental database. For the Wijbrans et al. (2016) comparison, we employ the peridotitic barometer on all volatile-free and volatile-bearing experiments using peridotite bulk compositions, but use the eclogite barometer (as directed by the paper) for all other chemistries. Whilst all four barometers clearly predict the synthesis pressure of some majoritic garnet compositions within reasonable uncertainties, it is equally apparent that all of them fail for many more. All mis-predict a large proportion of the experimental data set by up to 10 GPa or more. These failures are undoubtedly caused by the complex compositions and the interplay of the substitutions present in majoritic garnets, as well as by the barometers being deliberately applied to compositions outside their calibrated range.
Amongst the four, the barometer of Beyer and Frost (2017) produces the strongest statistical correlation between the experimental run conditions ("measured" pressure) and the predicted pressures, with an R² value (the coefficient of determination) of 0.39 (RMSE = 3.191; RMSE is the root mean squared error), which we do not consider a successful outcome. The remaining three barometers produce predicted pressures with R² values of 0.30, 0.28, and 0.35 (RMSE of 3.614, 3.929, and 3.597) when compared with the original synthesis pressures. For reference, when all experimental garnets are fitted, including those from experiments using simplified chemical systems, the R² values become poorer. Whichever barometer is chosen, all appear to have an uncertainty of at least ±5 GPa across all pressures, and all have a particular propensity to underestimate the pressures of garnets synthesized above ∼20 GPa. This underestimation is not surprising since compositions from >20 GPa are never in equilibrium with pyroxene, and are therefore beyond any of the empirical calibrations.

Figure 4 indicates that there are potentially very large uncertainties, on the order of 10 GPa or greater, when using any of the four current barometers to predict the pressure of formation of diamond-hosted garnet inclusions. Such large uncertainties result from using the barometers for garnet compositions outside their calibrated range whilst simultaneously disregarding their requirement that garnets must be in equilibrium with clinopyroxene and must not have been in equilibrium with calcium perovskite. However, because diamond-hosted inclusions may also not fulfill these requirements, it is impossible to know the fidelity of any given pressure estimate for an inclusion. As already noted, diamond-hosted inclusions are isolated minerals, often making it impossible to know what assemblage, if any, the mineral equilibrated with.
Also, in those rare instances where multiple inclusions from a single sub-lithospheric diamond have been analyzed, there is no certainty that these represent an equilibrium assemblage. Finally, it can never be guaranteed that garnet was in equilibrium with clinopyroxene at the time of entrapment; indeed, this is unlikely to be the case because clinopyroxene is rarely found together with majoritic garnet in sub-lithospheric diamonds. Thus, what is required is a barometer that is calibrated against, and can accurately predict, the full range of experimental compositions in the literature data compilation with no a priori requirements.
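The two scores used to compare barometers in this section can be written out explicitly. The sketch below uses the standard definitions, R² = 1 − SS_res/SS_tot and RMSE = sqrt(mean squared residual), with "measured" meaning the experimental synthesis pressure; the function names and the four-point data set are purely illustrative.

```python
import numpy as np


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((measured - predicted) ** 2)
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(measured, predicted):
    """Root mean squared error, in the same units as the pressures (GPa)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((measured - predicted) ** 2)))


if __name__ == "__main__":
    # Hypothetical synthesis pressures and barometer predictions (GPa).
    measured = [10.0, 12.0, 14.0, 16.0]
    predicted = [11.0, 12.0, 13.0, 17.0]
    print(r_squared(measured, predicted), rmse(measured, predicted))
```

Note that with this definition R² can be negative when a barometer performs worse than simply predicting the mean pressure, so low values such as those reported above indicate limited predictive skill rather than merely scatter.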
Figure 4. Experimental ("measured") versus predicted pressures for the published majorite barometers, including Beyer and Frost (2017) and (d) Tao et al. (2018). The performance of each barometer is evaluated by calculating the R² and RMSE statistical scores for the correlation between the predicted and measured pressures. RMSE, root mean squared error.

A ML Barometer
We adopt ML approaches to calibrate a more accurate barometer for use on diamond-hosted garnet inclusion compositions. Whilst this is the first use of ML as a majorite barometer, similar approaches have previously been applied to other petrological and mineralogical problems (e.g., Caricchi et al., 2020; Hazen et al., 2019; Petrelli et al., 2020; Petrelli & Perugini, 2016). ML provides the advantage of not requiring a fixed empirical expression, which may or may not have the most appropriate crystal chemical basis. Instead, ML utilizes statistical algorithms that are potentially capable of revealing previously unrecognized correlations within the chemical complexity of the experimental garnet population; it is effectively a hyper-empirical approach. While ML algorithms may identify sensible correlations, for example, those that reflect the known majoritic substitutions, a drawback is that they do not provide an explicit formula linking pressure to composition. Appropriate care combined with suitable cross validation tests must be taken to provide confidence in results and to ensure that any predictions using ML are robust. We are convinced from our analysis that the approach adopted here and described below produces an ML barometer that can be applied to any diamond-hosted garnet compositions with a high level of confidence.
The ML majorite barometer calibrated in this study uses Random Forest Regression (RFR), which is an ensemble method consisting of multiple decision trees that are used to provide collective prediction capability. Predictions are made not by considering results from individual decision trees but by aggregating results from the entire forest using a weighting scheme optimized with the training data set provided. Within RFR, the number of trees and the complexity of each individual tree are user defined. We note that RFR is not the only ML algorithm that can be applied as a barometer; we have also tested Principal Component Regression (PCR), Partial Least Squares Regression (PLS), Neural Network Regression (NNR) and Support Vector Machines (SVM). All ML techniques tested outperform published barometers (see Table 1 for a summary of results achieved using all methods), but the RFR barometer is preferred because of its simplicity, high training speed and an output performance that is more robust and superior (in a statistical sense) to the alternative algorithms.
The RFR approach was implemented using the scikit-learn Python package (Pedregosa et al., 2011), where pressure (the target variable) was regressed using Si, Ti, Al, Cr, Fe total (i.e., Fe²⁺ + Fe³⁺), Mn, Mg, Ca, and Na as independent (predictor) variables. To ensure, so far as possible, that the models created are applicable to wide-ranging garnet chemistries, we adopted cross validation approaches typical of ML, validating trained models using randomly selected test datasets. In this procedure, all "natural" compositions (excluding simple system experiments) within the literature compilation were randomly assigned to either training or testing datasets using a 70:30 split or a leave-one-out strategy. RFR models were calibrated using training data and their performance evaluated using R² and RMSE scores for the held-out test data across 1,000 Monte Carlo (MC) iterations (Figure 5a). None of these 1,000 iterative models predict any extreme outliers (those with misfits > 10 GPa). RFR performance using a leave-one-out cross validation approach was extremely similar, with R² and RMSE values of 0.81 and 2.07, respectively. The consistency throughout these validation tests suggests that overfitting is not dominant.
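The repeated 70:30 validation procedure described above can be sketched with scikit-learn as follows. This is an illustrative outline only: the data are synthetic (generated by a hypothetical pressure law in which Si and Na carry the signal), the hyperparameters are arbitrary defaults rather than those used in the study, and the function names are assumptions. Real use would replace `make_synthetic_garnets` with the experimental compilation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

FEATURES = ["Si", "Ti", "Al", "Cr", "Fe_total", "Mn", "Mg", "Ca", "Na"]


def make_synthetic_garnets(n=300, seed=0):
    """Synthetic stand-in data: NOT real garnet compositions."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, len(FEATURES)))
    X[:, 0] = 3.0 + 0.8 * X[:, 0]  # Si between 3.0 and 3.8 pfu
    # Hypothetical pressure law (GPa) with 1 GPa Gaussian noise.
    y = 6.0 + 20.0 * (X[:, 0] - 3.0) + 5.0 * X[:, 8] + rng.normal(0.0, 1.0, n)
    return X, y


def repeated_split_scores(X, y, n_iter=10, test_size=0.3, seed=0):
    """Train on a random 70% and score on the remaining 30%, n_iter times."""
    scores = []
    for i in range(n_iter):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed + i
        )
        model = RandomForestRegressor(n_estimators=100, random_state=seed + i)
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        scores.append(
            (r2_score(y_te, pred), np.sqrt(mean_squared_error(y_te, pred)))
        )
    return np.array(scores)  # columns: R2, RMSE


if __name__ == "__main__":
    X, y = make_synthetic_garnets()
    scores = repeated_split_scores(X, y, n_iter=10)
    print("mean R2:", scores[:, 0].mean(), "mean RMSE:", scores[:, 1].mean())
```

Scoring only on data withheld from training is what allows the spread of R² and RMSE across iterations to be read as a check on overfitting.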
The RFR models can be further evaluated by examining the relative importance of each input parameter to the solution determined by each model (Figure 5b). This confirms, across all models, that the two most important compositional parameters are Si and Na, with Al identified as the third most important component. Si and Na are the two components expected to be the strongest indicators of the Mj and NaMj substitutions (Equations 1 and 2), with Al also involved in both pressure-dependent mechanisms. Thus, despite being provided no crystal chemical constraints or physically guided empirical expression, RFR has clearly identified the known substitution mechanisms as important contributors to pressure predictions. Furthermore, and of critical importance for barometric applications, RFR describes the pressure dependencies of additional chemical components, allowing a robust fit to the global experimental data set. While we cannot identify thermodynamic reasons for the role of the remaining components, we note that Ca and Cr, which most obviously distinguish between mafic and ultramafic majoritic garnets, appear to be the next most important inputs.

Figure 5c plots the mean pressure predicted across 1,000 MC models, alongside uncertainties of 2σ in the mean, compared with the reported experimental pressures. Similarly, Figure 5d plots the predicted majoritic garnet pressures using leave-one-out cross validation. Comparison of Figures 5c and 5d with the results using empirical barometers (Figure 4) illustrates the vastly improved performance of the RFR ML approach for predicting the synthesis pressures of experimental garnet compositions. Despite this much-improved performance, the RFR approach clearly does not perform perfectly, with Figures 5c and 5d demonstrating a remaining tendency for the formation pressures of certain garnet compositions, especially those from >20 GPa, to be under- or overestimated by 5-10 GPa. These poorly described compositions are associated with the larger error bars in Figure 5c and are those that plot away from the 1:1 correspondence curve in Figure 5d. These outliers might be explained by a combination of poor analyses, unequilibrated experiments, or underrepresented compositions within the data set, or may simply demonstrate the limitations of ML barometry. It is clear, however, that despite these outliers, RFR barometry far outperforms all empirical approaches to date.
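The importance ranking discussed above corresponds to what scikit-learn exposes as `feature_importances_` on a fitted random forest (an impurity-based measure). The sketch below shows how such a ranking might be extracted; the data are synthetic, built from a hypothetical pressure law dominated by Si with a smaller Na term, so the ranking it prints says nothing about real garnets. Function names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["Si", "Ti", "Al", "Cr", "Fe_total", "Mn", "Mg", "Ca", "Na"]


def synthetic_data(n=300, seed=0):
    """Synthetic stand-in data: NOT real garnet compositions."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, len(FEATURES)))
    X[:, 0] = 3.0 + 0.8 * X[:, 0]  # Si between 3.0 and 3.8 pfu
    # Hypothetical pressure law (GPa): strong Si term, weaker Na term, noise.
    y = 6.0 + 20.0 * (X[:, 0] - 3.0) + 5.0 * X[:, 8] + rng.normal(0.0, 1.0, n)
    return X, y


def ranked_importances(X, y, n_estimators=200, seed=0):
    """Fit a forest and return (feature, importance) pairs, largest first."""
    model = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
    model.fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]
    return [(FEATURES[i], float(model.feature_importances_[i])) for i in order]


if __name__ == "__main__":
    for name, value in ranked_importances(*synthetic_data()):
        print(f"{name:9s} {value:.3f}")
```

Impurity-based importances sum to one and can be biased when inputs are strongly correlated, so a ranking of this kind is best read as a qualitative guide to which components the model relies on.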
Further validation of the RFR model approach was achieved via independent 10-fold, leave-one-out and bootstrap cross validation strategies using the Caret package in R (Kuhn, 2008). Across all three cross validation strategies the average R² and RMSE statistics associated with RFR barometers trained using all nine input variables range from 0.81 to 0.84 and 2.37 to 2.14, respectively (Table 2; validation code is provided in the linked repository). The observed consistency and high statistical performance of all RFR models, irrespective of the training slice or validation approach used, suggests that model outputs are insensitive to the specific training data used, demonstrating the stability of RFR barometers applied to majoritic garnet. Thus, for application in predicting the equilibration conditions of true unknowns, for example diamond-hosted inclusions, it is justified to combine all available literature data into a single, and complete, training data set. This final predictive RFR majorite barometer is provided in the form of Jupyter notebooks and/or Python scripts with example input files at the listed research data repository.
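The independent validation above was run with Caret in R. For readers working in Python, a roughly analogous check can be sketched with scikit-learn, which is a swapped-in tool here and not the code used in the study. The example runs 10-fold and leave-one-out cross validation on synthetic data and pools all held-out predictions before scoring, which may differ slightly from averaging per-fold statistics. Names and settings are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_predict


def synthetic_data(n=80, n_features=9, seed=0):
    """Synthetic stand-in data: NOT real garnet compositions."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, n_features))
    # Hypothetical pressure law (GPa) driven by the first and last columns.
    y = 6.0 + 16.0 * X[:, 0] + 5.0 * X[:, -1] + rng.normal(0.0, 1.0, n)
    return X, y


def cv_scores(X, y, cv, n_estimators=30, seed=0):
    """Pooled out-of-sample R2 and RMSE for a given cross validation scheme."""
    model = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
    # Each sample is predicted by a model that never saw it during training.
    pred = cross_val_predict(model, X, y, cv=cv)
    return r2_score(y, pred), float(np.sqrt(mean_squared_error(y, pred)))


if __name__ == "__main__":
    X, y = synthetic_data()
    schemes = {
        "10-fold": KFold(n_splits=10, shuffle=True, random_state=0),
        "leave-one-out": LeaveOneOut(),
    }
    for name, cv in schemes.items():
        r2, err = cv_scores(X, y, cv)
        print(f"{name}: R2 = {r2:.2f}, RMSE = {err:.2f} GPa")
```

Agreement between the schemes, as reported in the text, is the property of interest: if scores changed markedly with the validation strategy, the model would be sensitive to which samples happened to be in the training set.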
A final concern with using a ML approach is the potential for the RFR barometer to make erroneous predictions when applied to garnet compositions that are unlike any in the calibration data compilation; that is, RFR regressions may not be reliable in extrapolation. To some extent this is demonstrated in Figure 5d, where "unique" compositions are poorly reproduced when omitted from the training data set. Figures 1 and 2
Abbreviations: RFR, random forest regression; RMSE, root mean squared error.

Global Distribution of Diamond Inclusion Formation Pressures
The results of the application of the newly calibrated RFR majorite barometer to the global sub-lithospheric diamond-hosted inclusion data set are plotted in Figure 6. Predicted inclusion pressures were calculated over 100 MC iterations, where normally distributed uncertainties (with σ = 0.2% for Si, 1% for Al and Ca, 3% for Ti, Fe total, Mg, and Na, 5% for Cr and 10% for Mn) were randomly added to each chemical component, simulating analytical uncertainties. These uncertainties are representative of those from individual analyses of majoritic garnets in a recent experimental study. Mean predicted pressures are recovered from the 100 MC iterations alongside estimated uncertainties, which are reported as 2σ in the mean. The predicted inclusion pressures using RFR are plotted as a histogram, probability density estimate and a Gaussian Mixture Model in Figure 6a, following the approach of Rudge (2008). These are directly compared with predictions from literature barometers, plotted only as Gaussian Mixture Models, in Figure 6b. The peridotitic barometer of Wijbrans et al. (2016) is used to predict pressures for inclusions containing >1 wt.% Cr₂O₃, whereas the eclogitic barometer was used for all other inclusions. Figure 6b shows that all four literature barometers predict that majorite inclusions were formed at pressures concentrated in two pressure intervals: one centered around 6-9 GPa (∼180-270 km depth) and the second between 10 and 15 GPa (∼300-450 km). The empirical barometers predict that very few diamond-hosted inclusions originate from greater depths, although the barometers of Beyer and Frost (2017) and Tao et al. (2018) suggest a handful originated at >18 GPa.
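The Monte Carlo treatment of analytical uncertainty described above can be sketched as follows. The relative σ values are those quoted in the text; everything else is an assumption made for illustration: `predict` stands for any callable that maps an array of compositions to pressures (for example a trained model's `predict` method), the stand-in linear function in the demo is hypothetical, and "2σ in the mean" is interpreted here as twice the standard error of the MC mean.

```python
import numpy as np

FEATURES = ["Si", "Ti", "Al", "Cr", "Fe_total", "Mn", "Mg", "Ca", "Na"]

# Relative 1-sigma analytical uncertainties quoted in the text.
REL_SIGMA = {
    "Si": 0.002, "Al": 0.01, "Ca": 0.01,
    "Ti": 0.03, "Fe_total": 0.03, "Mg": 0.03, "Na": 0.03,
    "Cr": 0.05, "Mn": 0.10,
}


def mc_pressure(predict, composition, n_iter=100, seed=0):
    """Mean predicted pressure and 2-sigma-in-the-mean over n_iter MC draws."""
    rng = np.random.default_rng(seed)
    base = np.array([composition[f] for f in FEATURES], dtype=float)
    sigma = np.array([REL_SIGMA[f] for f in FEATURES]) * base
    # Each row is one perturbed copy of the composition.
    draws = base + rng.normal(size=(n_iter, len(FEATURES))) * sigma
    pressures = np.asarray(predict(draws), dtype=float)
    mean = float(pressures.mean())
    # Assumption: "2 sigma in the mean" = 2 * standard error of the mean.
    two_sigma_mean = float(2.0 * pressures.std(ddof=1) / np.sqrt(n_iter))
    return mean, two_sigma_mean


if __name__ == "__main__":
    # Hypothetical inclusion composition (cations pfu) and stand-in barometer.
    inclusion = {"Si": 3.30, "Ti": 0.05, "Al": 1.40, "Cr": 0.02, "Fe_total": 0.60,
                 "Mn": 0.02, "Mg": 1.80, "Ca": 0.70, "Na": 0.11}
    stand_in = lambda X: 6.0 + 20.0 * (X[:, 0] - 3.0)  # not a real calibration
    print(mc_pressure(stand_in, inclusion))
```

Because the reported uncertainty reflects only propagated analytical scatter, it does not include the calibration misfit of the barometer itself, which the cross validation RMSE describes separately.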
In comparison with these bimodal distributions using literature barometers, the RFR barometer (Figures 6a and 6b) identifies that majoritic garnet inclusions appear to have formation pressures concentrated in three or four intervals. The low pressure mode predicted by the RFR barometer is centered at ∼9 GPa. Additional higher pressure modes are apparent at pressures centered around ∼12.5 GPa, ∼14.5 GPa, and ∼18 GPa, with some inclusions yielding pressures in excess of 22 GPa. Comparing RFR results with literature predictions (Table 3) suggests all previous barometers generally underestimate the pressure of inclusion formation, with mean shifts of between 1.1 and 2.0 GPa. Formation pressures for individual inclusions differ by up to −2.9 or +9 GPa (Table 3). Except for Wijbrans et al. (2016), the literature barometers typically underestimate the formation pressure of the lower-pressure inclusions by 1.5-3 GPa (∼15%-30%). At the higher pressure end of the spectrum, comparison with literature barometers demonstrates that Collerson et al. (2010), Wijbrans et al. (2016), and Tao et al. (2018) all underpredict the entrapment pressure of many inclusion compositions. In contrast, the Beyer and Frost (2017) calibration produces a very similar distribution of inclusion pressures between 11 and 18 GPa. None of the literature barometers replicate the tail of inclusions identified by RFR that appear to come from pressures higher than 18 GPa. As already noted, this is not surprising as they were only calibrated using pyroxene-bearing assemblages, so are not expected to successfully identify higher-pressure minerals.
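The identification of pressure modes discussed above relies on Gaussian mixture modeling. As a rough illustration, and not a reproduction of the Rudge (2008) approach followed in the study, the sketch below fits one-dimensional Gaussian mixtures with scikit-learn and selects the number of components by the Bayesian information criterion. The pressures are synthetic, and the function name and the BIC selection rule are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_pressure_modes(pressures, max_components=4, seed=0):
    """Fit 1..max_components Gaussians and keep the fit with the lowest BIC.

    Returns the component means (GPa) and weights, sorted by increasing mean.
    """
    X = np.asarray(pressures, dtype=float).reshape(-1, 1)
    fits = [
        GaussianMixture(n_components=k, random_state=seed).fit(X)
        for k in range(1, max_components + 1)
    ]
    best = min(fits, key=lambda g: g.bic(X))
    order = np.argsort(best.means_.ravel())
    return best.means_.ravel()[order], best.weights_[order]


if __name__ == "__main__":
    # Synthetic bimodal pressures (GPa): NOT the inclusion data set.
    rng = np.random.default_rng(0)
    pressures = np.concatenate(
        [rng.normal(9.0, 0.7, 120), rng.normal(14.0, 0.7, 100)]
    )
    means, weights = fit_pressure_modes(pressures)
    for m, w in zip(means, weights):
        print(f"mode at {m:.1f} GPa, weight {w:.2f}")
```

With a few hundred samples, closely spaced modes such as those near 12.5 and 14.5 GPa may or may not be resolved as separate components, so the number of modes returned by any automated criterion deserves cautious interpretation.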

Regional Diamond Formation Pressures
Application of the RFR barometer also allows any regional differences in diamond formation pressures to be examined. All 221 available inclusion compositions were divided into regional datasets, consisting of diamonds sourced from Southern Africa, South America, Western Africa, China, North America, and Russia. Southern African diamonds originate from major South African mines including Jagersfontein, Monastery and Cullinan and also incorporate inclusion compositions from Botswana, Tanzania, and Zimbabwe (Kaminsky et al., 1997; Korolev et al., 2018; Moore et al., 1991; Moore & Gurney, 1985, 1989; Motsamai et al., 2018; Pokhilenko et al., 2001, 2004; Shatskii et al., 2010; Smith et al., 2009; Tappert et al., 2005; Tsai et al., 1979). Those from South America are predominantly from the kimberlite and alluvial sources in the Juina region (Bulanova et al., 2010; Burnham et al., 2015, 2016; Harte & Cayzer, 2007; Kaminsky et al., 2001; Meyer & Svisero, 1975; Smith et al., 2016; Thomson et al., 2014; Wilding, 1990; Zedgenizov et al., 2014). Diamond inclusions from Western Africa are mostly from Kankan, but also include additional samples from Guinea and Ghana (Stachel et al., 2000; Stachel & Harris, 1997). The remaining inclusion compositions, from China, North America and Russia, are relatively few in number and are collated from a range of localities (Banas et al., 2007; Davies et al., 2004a, 2004b; Kaminsky et al., 1997; Pokhilenko et al., 2001, 2004; Shatskii et al., 2010; Shatsky et al., 2015; Schulze et al., 2008; Sobolev et al., 1977, 1997, 1999). The formation pressures of inclusions from each geographic region are plotted in Figure 7 as colored histograms and probability density estimates.
Comparisons should be mindful that each geographical region contains a different number of inclusions (Southern Africa = 101, South America = 65, Western Africa = 14, China = 14, North America = 14, and Russia = 13). The distributions reveal that geographically different localities have apparently sampled diamonds from distinct pressure environments. Looking first at Southern African and South American inclusions, where the bulk of global sub-lithospheric diamonds have been sampled, it can be seen that a significantly greater percentage of Southern African majoritic garnet inclusions formed at pressures lower than ∼10 GPa (depths shallower than ∼300 km), whereas very few such inclusions are found in South American samples (Figure 7). The majority of both Southern African and South American diamonds appear to have trapped their inclusions at depths in and around the upper part of the transition zone, ∼13-15 GPa (∼380-440 km), with the estimated entrapment pressures, for South American diamonds in particular, extending throughout the upper mantle. The differences between these diamond populations might be explained by sampling bias; however, it is also feasible that they reflect paragenetic differences otherwise identifiable using chemical differences between the two geographical populations. Certainly, a larger proportion of the majoritic inclusions sourced from South America are "eclogitic" (depleted in Mg and Cr but enriched in Ca, Na, and Ti) compared with a larger proportion of "peridotitic" inclusions from Southern Africa. However, whether such chemical and depth differences reflect multiple diamond formation mechanisms or different exhumation behavior, or are simply a reflection of limited sampling, remains unclear.
Individually there are far fewer majoritic garnet inclusions reported from Western Africa, China, North America, and Russia (Figure 7). China has only produced diamonds exhumed from pressures of 6-15 GPa, whereas samples from Western Africa, North America, and Russia appear to span upper mantle pressures. There is no peak in any of these distributions corresponding to increased diamond crystallization at ∼14-15 GPa, as observed for African and South American samples. Until further samples are analyzed it is unclear whether these distributions are simply caused by poor sampling statistics, unaccounted for effects of retrograde exsolution as described below, or whether they reflect competing mechanisms of diamond formation.

Table 3 Summary of the Difference Between Predicted Inclusion Pressures Using RFR (This Study) and Published Barometers
Notes. Calculated as RFR pressure minus literature pressure. Mean absolute ΔP is the mean of the magnitude of pressure differences (i.e., all ΔP's are positive). We note the inclusions with the large +ve pressure deviations are different for each barometer. Abbreviation: RFR, random forest regression.

Retrograde Exsolution Within Diamond-Hosted Majoritic Garnet Inclusions
As mentioned briefly above, some (if not many) natural diamond-hosted majorite inclusions are not single-phase garnets. Several inclusions have been reported to also possess small exsolved portions of clinopyroxene and/or albite, most often located around the inclusion rims (e.g., see Burnham et al., 2015; Harte & Cayzer, 2007; Thomson et al., 2014; Zedgenizov et al., 2014 for exemplar SEM images of such inclusions). Consistent with previous interpretations, we ascribe these rim features to partial retrogression and un-mixing of what was once a single-phase majoritic garnet inclusion. Thus, if the original entrapment pressure of an inclusion is to be accurately estimated, any exsolved rim material must be proportionately reincorporated into the inclusion's bulk composition (Harte & Cayzer, 2007; Thomson et al., 2014). Whilst such rim exsolution features have only been explicitly reported and described in a handful of studies, our personal experience suggests that they may be present in most if not all majoritic inclusions (at least those from South American localities) where a clean and complete analysis surface is recovered during sample preparation (i.e., material is not plucked from the inclusion edges). D. Zedgenizov also reports observing exsolved components in all but the smallest majoritic garnet inclusions they have examined (D. Zedgenizov, personal communication, 2020). The small exposed area of exsolved cpx in some cases means that its composition cannot be accurately analyzed; for example, several exsolved rims were imaged and described, but not analyzed (Bulanova et al., 2010; Zedgenizov et al., 2014). Additionally, it is only in relatively recent studies that diamonds have regularly been prepared by polishing using a scaif; earlier studies typically broke the host diamonds using a "nutcracker" to release trapped inclusions, an approach with which recovery of exsolved components may be less likely.
We find it probable that small, exsolved components have been overlooked in many if not all studies of diamond-hosted majoritic garnet inclusions and, therefore, that pressures determined using the analyzed majorite compositions, as above, must be regarded as minima.
It is only possible to assess the impact on the inclusion formation pressures for those inclusions where the garnet core and exsolved rim compositions have been reported alongside images allowing the bulk composition to be estimated by reconstruction. To our knowledge there are only eight such inclusions, all of which are from South America. Of the eight inclusions that can be reconstructed on the assumption that exposed area ratios are proportional to volumetric ones, four are from Sao Luiz (Harte & Cayzer, 2007), three are from Juina-5 (Thomson et al., 2014) and one inclusion is from a Machado River diamond. Even for these examples it has to be assumed that the two-dimensional slice is representative of the three-dimensional distribution of the exsolved retrograde component.
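The reconstruction step described above can be illustrated with a minimal sketch. It assumes that the exposed area fraction of the rim equals its volume fraction (as stated in the text) and, as a further simplification introduced here, that core and rim have equal densities so that volume fractions can be used directly as mixing weights. The function name `reconstruct_bulk` and all compositions are hypothetical and are not taken from any of the eight reconstructed inclusions.

```python
def reconstruct_bulk(core, rim, rim_area_fraction):
    """Mix core and rim oxide compositions (wt%) by exposed-area fraction.

    Assumptions (illustrative only): area fraction == volume fraction,
    and equal densities, so volume fraction == mass fraction.
    """
    if not 0.0 <= rim_area_fraction <= 1.0:
        raise ValueError("rim_area_fraction must lie between 0 and 1")
    oxides = sorted(set(core) | set(rim))
    return {
        ox: (1.0 - rim_area_fraction) * core.get(ox, 0.0)
        + rim_area_fraction * rim.get(ox, 0.0)
        for ox in oxides
    }


# Hypothetical garnet core and exsolved clinopyroxene rim (wt% oxides).
core = {"SiO2": 42.0, "Al2O3": 20.0, "MgO": 12.0,
        "CaO": 10.0, "FeO": 15.0, "Na2O": 1.0}
rim = {"SiO2": 55.0, "Al2O3": 10.0, "MgO": 10.0,
       "CaO": 15.0, "FeO": 5.0, "Na2O": 5.0}

# A rim occupying 10% of the exposed inclusion area.
bulk = reconstruct_bulk(core, rim, 0.10)
for oxide, wt in bulk.items():
    print(f"{oxide}: {wt:.2f} wt%")
```

Even a small rim fraction raises the bulk SiO2 and Na2O contents, which is why reincorporation shifts the predicted pressure upward. A density-weighted version would be needed where core and rim densities differ appreciably.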
Using our preferred RFR model, we predict inclusion pressures before and after reincorporation of the exsolved rim components, where the calculated pressures are determined over 100 MC iterations with randomly assigned compositional uncertainties (solid); uncertainties for garnet cores and bulk compositions were normally distributed as described above. Figure 8 plots the magnitude of the observed pressure correction (calculated as the difference between the mean bulk inclusion pressure and the mean core pressure with 2σ uncertainties) against the pressure deduced from the garnet cores alone. On the basis of the eight inclusions available there appears to be a negative correlation between the pressure predicted for the garnet core and the magnitude of the pressure correction; the lowest pressure cores are apparently associated with the largest pressure corrections. This correlation is suggestive of a record of upward movement of the diamonds and their inclusions in the mantle (Bulanova et al., 2010; Harte & Cayzer, 2007; Thomson et al., 2014; Zedgenizov et al., 2014). However, we recognize that the accuracy of this correlation relies on the fidelity of the modal and chemical reconstruction, both of which potentially have large associated errors, and that, being based on only eight data points, it may be completely spurious. This observation requires further examination in future studies, which should employ careful analyses of inclusion core and rim chemistries coupled to 3D imaging of inclusions if garnet compositions are to be accurately reconstructed. Throughout the remainder of this discussion we therefore adopt a weighted mean pressure correction of 4.1 ± 1.4 GPa for all inclusions (where the uncertainty is calculated as the weighted standard deviation).
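The procedure in this paragraph (Monte Carlo propagation of compositional uncertainty, a per-inclusion correction, then a weighted mean with weighted standard deviation) can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this study: `toy_barometer` is a made-up placeholder standing in for the trained RFR model, the relative uncertainty of 2% is arbitrary, and inverse-variance weighting is assumed because the weighting scheme is not specified in the text. All numerical inputs are hypothetical.

```python
import math
import random
import statistics


def toy_barometer(comp):
    # Placeholder ONLY: a made-up linear relation, not the RFR barometer.
    return 6.0 + 20.0 * comp["excess_Si"]


def mc_pressure(comp, rel_sigma, predict, n_iter=100, seed=0):
    """Mean and standard deviation of predicted pressure over n_iter
    draws with normally distributed compositional perturbations."""
    rng = random.Random(seed)
    pressures = []
    for _ in range(n_iter):
        perturbed = {k: rng.gauss(v, rel_sigma * abs(v))
                     for k, v in comp.items()}
        pressures.append(predict(perturbed))
    return statistics.mean(pressures), statistics.stdev(pressures)


def weighted_mean_and_std(values, sigmas):
    """Inverse-variance weighted mean and weighted standard deviation
    (the weighting choice is an assumption of this sketch)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - mean) ** 2
                   for w, v in zip(weights, values)) / total
    return mean, math.sqrt(variance)


# Hypothetical core and reconstructed bulk compositions for one inclusion.
core_mean, core_sd = mc_pressure({"excess_Si": 0.20}, 0.02, toy_barometer, seed=1)
bulk_mean, bulk_sd = mc_pressure({"excess_Si": 0.40}, 0.02, toy_barometer, seed=2)

# Correction = mean bulk pressure minus mean core pressure;
# uncertainties combined in quadrature (an assumption).
correction = bulk_mean - core_mean
correction_sd = math.hypot(core_sd, bulk_sd)

# Hypothetical corrections (GPa) for several inclusions, equal uncertainties.
demo_mean, demo_std = weighted_mean_and_std([6.0, 5.0, 4.0, 3.0],
                                            [1.0, 1.0, 1.0, 1.0])
print(f"single-inclusion correction: {correction:.2f} GPa")
print(f"weighted mean correction: {demo_mean:.2f} +/- {demo_std:.2f} GPa")
```

With equal uncertainties the weighted mean reduces to the arithmetic mean; unequal uncertainties pull the mean toward the better-constrained inclusions.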
While it is unclear how applicable any "reconstruction" pressure correction (Figure 8) is to all majoritic inclusions, it is of interest to assess the potential impact that a correction of this magnitude would have on the pressures of additional inclusions where exsolution compositions have not been reported. Given that all samples containing quantifiable exsolution are from South America, many studied by us, we have assumed that the weighted mean pressure correction of 4.1 GPa (Figure 8) can be applied to all South American inclusions and have recalculated the distribution of diamond formation pressures for this region as a case study (Figure 9). The corrected distribution of diamond formation depths is shifted significantly deeper, as expected, with the bulk of diamond-hosted majorites originating from pressures between 15 and 23 GPa (∼450-660 km depth), that is, throughout the mantle transition zone.
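As a minimal sketch of how such a corrected distribution might be constructed, the example below shifts a set of pressures by the constant 4.1 GPa correction and evaluates a Gaussian kernel density estimate whose bandwidth is fixed to the 1.4 GPa uncertainty in the correction, following the description in the Figure 9 caption. The input pressures are hypothetical and the function name `gaussian_kde_fixed` is introduced here for illustration only.

```python
import math


def gaussian_kde_fixed(samples, bandwidth, grid):
    """Gaussian kernel density estimate with a fixed bandwidth,
    evaluated at each point of grid."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in samples)
        for x in grid
    ]


CORRECTION_GPA = 4.1   # weighted mean correction adopted in the text
BANDWIDTH_GPA = 1.4    # uncertainty in the correction

# Hypothetical uncorrected inclusion pressures (GPa).
uncorrected = [9.5, 12.0, 13.5, 14.0, 14.2, 14.8, 15.5, 17.0]
corrected = [p + CORRECTION_GPA for p in uncorrected]

step = 0.1
grid = [i * step for i in range(401)]          # 0 to 40 GPa
density = gaussian_kde_fixed(corrected, BANDWIDTH_GPA, grid)

area = sum(density) * step                     # should be close to 1
mode = grid[density.index(max(density))]
print(f"integrated density: {area:.3f}, mode near {mode:.1f} GPa")
```

Fixing the bandwidth to the correction uncertainty means that each inclusion contributes a Gaussian of the same width as the uncertainty on its shift, rather than a width chosen by a data-driven rule.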

A Petrological Model for South American Diamond Formation
Examination of both the corrected and uncorrected distributions of inclusion formation pressures from South America is useful for assessing the geodynamic context of majorite garnet-bearing diamond formation.
Multiple studies have linked the major element, trace element, and isotopic properties of these sub-lithospheric diamonds and their inclusion cargo to the crystallization of carbon-bearing, subduction-derived components (e.g., Bulanova et al., 2010; Harte, 2010; Harte et al., 1999; Moore et al., 1991; Walter et al., 2008, 2011). Additionally, on the basis of experimental melting phase relations and observed melt-rock reactions, a model has previously been presented whereby the intermediate compositions of majoritic garnet inclusions, from South America but also other locations, are explicable in terms of a reaction between slab-derived carbonatitic melts and mantle peridotite, a reaction that also precipitates the host diamond. It is instructive to consider the depth relationships of majoritic inclusions in the context of this "subduction-metasomatism" model.

10.1029/2020JB020604
Figure 8. The relationship between the pressure estimated from the majoritic inclusion core and the correction required when the exsolved rim is reincorporated. A linear negative correlation (dashed line), whose equation is displayed above the plot, can be fitted or, as utilized throughout the remainder of this study, a constant correction of 4.1 ± 1.4 GPa can be applied to all inclusions.

Figure 9. The depth distribution of estimated formation pressure for diamond-hosted majoritic garnet inclusions from South America before (blue) and after (red) correction by 4.1 GPa for exsolved components. The uncorrected inclusion distribution is plotted as a histogram and Gaussian Mixture Model, whereas the corrected distribution is plotted as a kernel density estimate with the bandwidth fixed to the uncertainty in the correction (1.4 GPa). These distributions are compared with the melting temperature of carbonated MORB compositions from K13 (Kiseeva et al., 2013), T16 and Z20 (Y. Zhang et al., 2020) and with adiabats for hot (red arrow), warm (orange) and cold (blue) subducting slabs.

Figure 9 shows the distribution of pressures for South American majoritic garnet inclusions relative to the melting curves of carbonated basalt from the T16 study, Kiseeva et al. (2013), and Y. Zhang et al. (2020), which span probable compositions of deeply subducted, carbonated oceanic crust. Also shown are idealized profiles for the temperatures at the top of subducted oceanic crust that are based on finite element modeling of modern subduction zones.
The overall depth distribution of the garnet inclusions is potentially explicable in terms of melting of down-going carbonated oceanic crust. The dominant feature of the South American majorite inclusion spectrum is a mode at ∼14 GPa in the uncorrected data and ∼18 GPa in the corrected data. It is notable that carbonated oceanic crust reaches its melting point at pressures ranging from ∼10 to >20 GPa for the range of modern slab temperatures. The dominant higher-pressure mode in the South American data correlates well with the depths where "warm" and "cold" slab thermal profiles intersect the depressed melting curves at these pressures. It is noteworthy that cooler slabs are also those that should carry the largest amount of carbonated basalt, as carbonate is much more efficiently removed in "hot" slabs than in colder ones, an expectation supported by experimental data and modeling of fluid and melt extraction from subducted oceanic crust (Ague & Nicolescu, 2014; Gorman et al., 2006; Kelemen & Manning, 2015; Kerrick & Connolly, 2001; Plank & Manning, 2019; Poli, 2015; Tsuno & Dasgupta, 2011). In subduction zones with temperatures sufficient to cause dehydration of altered ultramafic mantle lithosphere, most, if not all, of the carbon will be lost in fluids as they percolate through the slab assemblage (Gorman et al., 2006). Only in "warm" and "cold" slabs, where the altered ultramafic portions do not undergo serpentine breakdown, will significant amounts of carbon be transported to sub-lithospheric diamond formation regions. Therefore, a consequence of the "subduction-metasomatism" model is that diamond and majorite garnet formation should be more prevalent at higher pressures (e.g., >15 GPa), as is observed in the South American garnet inclusion data set. We further note that, provided the corrected distribution is generally accurate, diamond formation under South America predominantly occurs in the mantle transition zone.
The extent to which this conclusion can be applied to other sub-lithospheric garnets of similar intermediate composition from other continents requires further investigation.

Concluding Remarks
By evaluating the chemical variations observed within experimental majoritic garnets, we have demonstrated that none of the available literature barometers reliably reproduces the experimental pressure of all garnet varieties. Any pressure estimates based on these empirical expressions could be incorrect by up to ∼±10 GPa based on the observed performance against experimental data. This erodes confidence in their application to the study of natural diamond-hosted majorite inclusions. We have adopted an alternative approach, using ML algorithms and the experimental data set to train a majoritic garnet barometer based on Random Forest Regression. Cross validation demonstrates that this new barometer provides a much improved fit to the experimental data and, therefore, in the continued absence of a rigorous thermodynamic barometer, this ML approach provides significantly more reliable estimates of the formation pressures of diamond-hosted majoritic inclusions. Despite the far superior performance of the ML barometer when tested against all literature experiments, compositional gaps remain in the experimental data set, especially for compositions intermediate between peridotitic and eclogitic (Figure 1), where the barometer's reliability cannot be assessed.
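For readers unfamiliar with cross validation, the generic k-fold procedure can be sketched as below. This is not the barometer of this study: a simple least-squares line stands in for Random Forest Regression, the data are synthetic (generated from an arbitrary linear relation plus noise), and the choice of five folds is an assumption. The point is only to show how every sample is predicted by a model that was never trained on it.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Root-mean-square error of held-out predictions over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        squared_errors.extend(
            (ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out
        )
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic stand-in data: one compositional variable vs. pressure (GPa).
rng = random.Random(1)
xs = [rng.uniform(0.0, 0.6) for _ in range(60)]
ys = [6.0 + 20.0 * x + rng.gauss(0.0, 0.5) for x in xs]

rmse = k_fold_rmse(xs, ys)
print(f"5-fold cross-validated RMSE: {rmse:.2f} GPa")
```

Because the error is computed only on held-out samples, it estimates how the model performs on compositions it has not seen, which is the relevant measure when a barometer is applied to natural inclusions.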
We used the RFR barometer to determine the pressure distribution of the global database of majoritic garnet inclusions and probed the geographical distribution of pressures. However, the presence of exsolved clinopyroxene (±rare plagioclase), which has been reported in many diamond-hosted majoritic garnets, indicates that the calculated pressures are minima. It is our experience that these exsolved components are extremely widespread amongst diamond-hosted inclusions, but further study is necessary to quantify this observation, especially in inclusions sourced outside South America. For those inclusions where sufficient information is available, we have reincorporated exsolved components and found that a mean correction of ∼4 GPa (∼120 km) is required. It is currently impossible to assess the general applicability of this correction, and future studies should aim to critically evaluate this overlooked aspect of diamond-hosted majoritic garnet inclusions.
Interpretation of the South American depth distribution of majoritic garnet inclusions in a petrological and geophysical context links sub-lithospheric diamonds to the melting behavior of oceanic crust in subducting slabs in the deep upper mantle, concentrated in the transition zone. This result supports the previously proposed model of sub-lithospheric diamond formation during interaction of carbonated slab melts and overlying mantle material. It is additionally suggested that this mechanism may apply more generally to majorite-bearing sub-lithospheric diamonds, independent of their geographical locality; however, this hypothesis requires specific examination in future studies.