Visualisation and outlier detection for probability density function ensembles

Exploratory data analysis (EDA) for functional data—data objects where observations are entire functions—is a difficult problem that has seen significant attention in recent literature. This surge in interest is motivated by the ubiquitous nature of functional data, which are prevalent in applications across fields such as meteorology, biology, medicine and engineering. Empirical probability density functions (PDFs) can be viewed as constrained functional data objects that must integrate to one and be nonnegative. They show up in contexts such as yearly income distributions, zooplankton size structure in oceanography and in connectivity patterns in the brain, among others. While PDF data are certainly common in modern research, little attention has been given to EDA specifically for PDFs. In this paper, we extend several methods for EDA on functional data for PDFs and compare them on simulated data that exhibit different types of variation, designed to mimic that seen in real‐world applications. We then use our new methods to perform EDA on the breakthrough curves observed in gas transport simulations for underground fracture networks.

EDA on such an ensemble is included in this paper; it is meant to provide an example to users on how to apply these tools on real data, as well as highlighting their practical utility.
Much of the early research on data analysis for functional data focused on modelling, clustering and forecasting, with little attention given to EDA tasks such as visualisation and outlier detection (Ferraty & Vieu, 2006;Fitzenberger et al., 2002;Ramsay & Silverman, 2005).A stronger emphasis on visualisation came at the turn of the century, with the phase-plane plot (Ramsay & Ramsey, 2002), the rug plot (Hyde et al., 2006), the singular value decomposition plot (Zhang et al., 2007), the rainbow plot, the functional bagplot and the functional highest density region boxplot (Hyndman & Shang, 2010).Of these methods, only the functional bagplot and the functional highest density region boxplot detect outliers, which they do by ordering the curves using the first two principal components from a robust principal component analysis (PCA) on the entire dataset, mapping this ordering back to the functional space, then applying an interquartile range (IQR) rule.Outlier detection for functional data has been studied without visualisation using likelihood ratio tests and smoothed bootstrapping (Febrero et al., 2007(Febrero et al., , 2008)).
Since the developments in Hyndman and Shang (2010), boxplots have been applied several times to functional data, with much success.In Xie et al. (2014), they were used to highlight regions of reasonable daily wind speeds in West Texas in 2008.In Boschi et al. (2021), the functional boxplot reveals in what ways COVID mortality rates vary across regions in Italy, highlighting the 'usual' rates and visualising the significant differences observed in the outlier regions.If previous work is any indication, the functional boxplot has strong potential for visualising ensembles of PDFs.However, as pointed out in an experiment in Lei et al. (2023b), the curve-ordering methods suggested in Sun and Genton (2011) applied directly to PDF ensembles may lead to nonsensical results.In this experiment, the methods in Sun and Genton (2011) chose a median PDF curve that was a clear outlier.
Much like in the univariate setting, determining the mean, first and third quartiles and outliers for functional data using a boxplot requires some method of ordering the data.Such an ordering for multivariate or functional data is typically done using data depth, which numerically quantifies how representative of-or 'central' to-the data as a whole a single observation is (Whitaker et al., 2013).There exist several notions of data depth, many of which are reviewed and compared in Febrero et al. (2008).Given a choice of depth, the 'most central' point, defined as the point with the highest depth, is regarded as the higher-dimensional analogue of the median point in the univariate setting, and functions are then ordered from the median outward according to their depth.In the case of the bagplot and the highest density region boxplot, Tukey depth (Tukey, 1975) and the value of a kernel density estimator were used on the first two principal components to order the curves, although these plots could be produced by alternative ordering methods.One complaint on these ordering methods is that they require mapping to a lowerdimensional space, rather than working directly on the functional space.This concern is addressed in Sun and Genton (2011), where boxplots are constructed according to the band depth (BD) and modified BD measures developed in Lopez-Pintado and Romo (2009).
Extensions of the functional boxplot from Sun and Genton (2011) have been applied and developed for special types of object-oriented functional data.These typically rely on modifying the notion of BD to properly account for the structure of the functions being studied.In Whitaker et al. (2013), BD is generalised to the contour BD, which is used to calculate depth for isocontours.An analogous development is made in Mirzargar et al. (2014) for parameterised curves.Yao et al. (2020) create visualisations for trajectory curves (multivariate processes mapped to a space without the time axis) using the simplicial BD from L opez-Pintado et al. (2014).Numerous geometric extensions for functions which exhibit both phase and amplitude variability have been proposed; Weiyi Xie and Sun (2017) perform optimisation to obtain separate simultaneous boxplots for phase and amplitude.A more explicit use of the functional shape distribution via the notion of elastic depth is developed by Trevor Harris and Shand (2021).Finally, Xie (2020) extends geometrically obtained functional boxplots to construct shape and orientation boxplots for open and closed curves.
While visualising several types of functional data and detecting outliers have been heavily researched topics in the past two decades, there has been relatively little attention given to ensembles of PDFs, even though they are extremely common functional data objects showing up in, for instance, yearly income distributions Kneip and Utikal (2001), zooplankton size structure in oceanography Nerini and Ghattas (2007), connectivity patterns in the brain Petersen and Müller (2016b), in structural health monitoring data Lei et al. (2023a) and in many geoscience simulation applications (Stansberry et al., 2023;Strait et al., 2023;Stauffer & Lu, 2012).Petersen et al. (2022) discuss methods for EDA for ensembles of PDFs but focus on different ways to calculate a Fréchet mean and do not address outliers.They visualise the first and second modes of variation of a functional principal component analysis (FPCA) on an ensemble of PDFs as a tool to analyse their variance.
As far as the authors are aware, the only paper to give direct treatment to outliers in PDF ensembles is Lei et al. (2023a).Lei et al. (2023a) classify outliers under several different types of data orderings, often transforming the PDF to an unconstrained data type (such as the log quantile density (LQD) of Petersen and Müller, 2016b), applying an L 2 distance from the estimated median as the method for determining depth and then applying an IQR rule.This work gives special attention to two different types of anomalous behaviour a PDF might exhibit: a horizontal-shift outlier and a shape outlier (see Figure 1 for examples of each).The several methods for discovering outliers suggested in Lei et al. (2023a) are each designed to catch one of these two types of outliers.This paper addresses two gaps in the literature on EDA for PDF ensembles.First, while Lei et al. (2023a) discuss several methods to determine data depth for ensembles of PDFs, only one of these methods uses a metric defined directly on the space of probability distributions.Thus, we apply several such metrics to determine data depth and compare them to two methods from Lei et al. (2023a) (the PDF metric and one other method, described in Section 2.1) via a simulation study.Second, the authors have not identified any instances where the functional boxplot has been applied to PDF ensembles, perhaps partly due to the issues highlighted in the experiment in Lei et al. (2023b).We update the functional boxplot method in Sun and Genton (2011) using the several notions of data depth studied in this paper.We study this visual tool's ability to accentuate outliers of various types.Taking what we learn from this simulation study, we apply these tools to analyse an ensemble of PDFs produced from gas transport simulations at Los Alamos National Laboratory.

| PDF DISTANCES AND FUNCTIONAL DEPTH
Several of the methods considered here for determining data depth require a preprocessing of the PDF data to some other functional data type.
When using a metric to order these functions, we first estimate the median function mðtÞ then use the metric to order the functions from the median outwards.Let Φ ¼ fϕ i ðtÞg i¼1,…,n be a set of functions with compact unit support indexed by a variable t I & ℝ, which in this context will either be a set of PDFs or some transformation of a set of PDFs.When the functions in Φ and the metric chosen forms a Hilbert space, the median can be estimated according to its cross-sections (i.e. the point-wise median), where the median operator chooses the median value of the set of functions at a fixed t.The cross-sectional median function mðtÞ will also belong to the Hilbert space and is thus treated as the overall median for the remainder of this manuscript.
When the elements in Φ and the metric do not form a Hilbert space, the function mðtÞ may not be representative of this set.For example, if Φ describes a set of PDFs, the function mðtÞ need not also be a PDF.When mðtÞ still belongs to the metric space defined by the metric on I , the overall median is taken to be the element of Φ with the smallest distance from mðtÞ.When mðtÞ does not belong to this space, we propose two different methods for determining a median: finding the element in Φ that minimises the L 2 distance from mðtÞ and treating this element as the overall median or selecting the overall median to be the function from Φ with the minimal summed distance to all other functions in Φ.The former method, which is suggested in Lei et al. (2023a), approximates a (perhaps more appropriate) metric on the functional space with the L 2 metric on I , which may or may not be a suitable approximation.We will further assess the use of this approximation in Section 3. The latter method corresponds to the geometric median of Φ, which is a strong choice for the median value, but comes with the additional computational burden of calculating every pairwise distance.Calculating this many distances is on the order of Oðn 2 Þ; this is in contrast to the cross-section median, which is only on the order of OðnÞ.
In addition to the several metrics we consider, we apply the BD and modified BD from Lopez-Pintado and Romo ( 2009) on the space of LQDs in Section 2.3 to determine data depth directly.For this scenario, the overall median is chosen to be the function in Φ with the largest depth value.
For some of the outlier detection methods discussed in Lei et al. (2023a), an initial horizontal alignment step is performed on the PDFs.In this step, some centrality feature of each PDF (such as the mean, median, or mode) in an ensemble is aligned along the horizontal axis.This is meant to help expose shape outliers by removing horizontal variability.Thus, in addition to assessing the use of the two median calculations, we assess Examples of horizontal-shift and shape outliers, in black.
the use of this horizontal alignment operation in Sections 3 and 4. For this simulation study and for the subsequent application, we perform this operation by aligning the median values to the midpoint of the collective support of the PDF ensemble and then renormalising the PDFs.

| Nonprobability metrics
Petersen and Müller (2016b) argue that since the space of PDFs is not a linear space, meaning it is not closed under addition and scalar multiplication, it may be more convenient to map a set of PDFs to a linear functional space when performing distance calculations.In Lei et al. (2023a), two such transformations are considered as a preprocessing step before ordering and detecting outliers in PDF ensembles.In this section, we review and assess these two methods.

| L 2 Distance on the LQD Space
The LQD transformation, introduced by Petersen and Müller (2016b), is a commonly used transformation for analysing PDFs (Chen et al., 2019;Kokoszka et al., 2019;Petersen & Muller, 2016a).Let f be a PDF and Q be its corresponding quantile function (Q ¼ F À1 , where FðxÞ ¼ Ð x 0 fðxÞdx is the cumulative distribution function [CDF]).The LQD is defined as follows: where q ¼ d dx Q.Since the space of LQDs is linear, L 2 distance is a natural choice for performing the median and depth calculations.Lei et al. (2023a) point out that using L 2 distance on the LQD space, which we refer to as simply 'the LQD method' for the subsequence, naturally centres the PDFs and is thus better for identifying shape outliers than horizontal-shift outliers.For instance, let c ½0,1 and note that the horizontally shifted PDF fðx þ cÞ ð Þmaps to a vertically shifted quantile function QðxÞ þ c ð Þ .Thus, both the shifted and nonshifted PDF map to the same quantile function, qðxÞ Due to this invariance property, the LQD method is recommended for finding shape outliers over horizontal-shift outliers.
It is recommended in Lei et al. (2023a) to perform a normalisation on the LQD space, ψ norm ðxÞ ¼ ψðxÞ= Ð 1 0 ψðxÞdx, which makes the outlying curves more distinguishable.This normalisation works for the cross-section median approach, since this will estimate an overall median that is part of the original set.However, this does not work when using the geometric median, since it is not immediately clear how to reverse this transformation for a curve that is created directly on the LQD space.For this reason, we normalise the LQDs only when using the cross-section median.

| Bayes distance
The Bayes space on a compact interval I, BðIÞ, is defined as the equivalence class of positive functions with support I that are equivalent under scalar multiplication.This space is a Hilbert space under the linear operations, Consider the centred log ratio (CLR) map on a function f B 2 ðIÞ, where c is defined as above.As argued in van den Boogaart et al. (2014), νðxÞ is an isometry to the L 2 space of square-integrable functions.Thus, for computational ease, calculations on Bayes space are done using equivalence d B ðf, It is recommended in Lei et al. (2023a) to shift the ensemble of PDFs horizontally until some feature of each PDF (such as the mode) are aligned in the horizontal direction.As mentioned previously, we will study the merits of using or not using this alignment operation in later sections.

| Probability metrics
In this section, we review four distances defined directly on the space of probability distributions, which we propose as tools to determine the median value and detect outliers for a PDF ensemble.Note that since these metrics are defined on the space of the PDF data, the cross-section median from (1) calculated directly on this space need not be a PDF.As such, we apply both of the methods discussed above to determine a median from the set of PDFs.For all of the following metrics, let f, g be PDFs with some compact support I, and let F, G be their corresponding CDFs.The metrics considered are as follows: 1 where total-variation distance is equivalent to one half of the L 1 distance between the PDFs as a consequence of Scheffe's theorem (Tsybakov, 2009).The Wasserstein distance is computed between the quantile functions (the inverse CDFs), hence the integral being with respect to quantiles q from 0 to 1.
The first two distances, Wasserstein and total-variation, each compare the location of the area under the curves between two PDFs.The Wasserstein distance can be thought of as the minimum 'cost' of moving the area under one curve until it aligns completely with the area under the second curve (Ollivier et al., 2014).Similarly, the total-variation distance is equivalent to one half of the absolute area between the two curves.
We suspect that these metrics should each be able to identify both horizontal-shift outliers as well as shape outliers, since each of these outlier types may lead to abnormal differences in the area under PDFs in the ensemble.
The last two distances, Hellinger and Fisher-Rao, are, respectively, the extrinsic and intrinsic distances between the PDFs when mapped to the positive orthant of a unit sphere on Hilbert space (Miyamoto et al., 2023;Srivastava et al., 2007).This map is simply the square-root operation, and Hellinger is the L 2 distance under this operation (i.e. the chord distance), while Fisher-Rao is the geodesic distance under this operation.

| BD and modified BD
The visualisation used in this paper, the functional boxplot, was initially developed in Sun and Genton (2011) using the BD and modified band depth (MBD) from Lopez-Pintado and Romo (2009) to order the curves.As pointed out earlier, these visualiations can be produced using any notion of depth or distance between the curves.
Loosely speaking, the BD and MBD of a curve f each measure the proportion of curve pairs in the ensemble which 'surround' f.Thus, the curve with the largest depth value is deemed the most 'central' in the ensemble.Let I be a finite set of discrete points in the compact interval I, where I is the support of f.Define the graph of f as the pairs GðfÞ ¼ fðt, fðtÞÞ : t Ig: the values of the function in ℝ 2 evaluated at the points in I.
The band between two curves g 1 , g 2 is defined as The sample BD for f is the fraction of pairs of curves in the entire ensemble whose band fully contains f.So, for an ensemble of curves fg 1 , …, g n g, A more flexible definition of BD is also developed in Lopez-Pintado and Romo (2009), called the MBD, that measures the overall proportion of time that a given curve exists within the bands created by the curves in the ensemble.Define the set The MBD is simply the averaged Lebesgue measure of this set for every pair of functions in the ensemble over the Lebesgue measure on the entire set Note that when a function is not inside a band B 2 at any time, there is no contribution to BD.The MBD allows for partial coverage by including the proportion of times a graph of a function is covered by a given band.Note also that the definitions for BD and MDB above focus on sets created by pairs of curves, B 2 and A 2 .The general definition for each of these quantities defines these sets for an arbitrary subset of curves, B J and A J for 2 ≤ J ≤ n.This extension comes with a great computational burden, so we (and several other authors who use this depth measure) focus only on J ¼ 2 (Sun & Genton, 2011).
While these approaches have become one of the most popular methods of calculating depth, they are inadequate measures when directly applied to PDFs.For instance, in an experiment in Lei et al. (2023b), BD applied directly to a PDF ensemble is shown to select a shape outlier as the median curve.To examine the merits of BD and MBD in this study, we first map the PDFs to their corresponding LQDs before calculating their depth.As we will see in Section 3, this is a strong method for detecting shape outliers but shows poor performance for horizontal-shift outliers.

| COMPARISON OF METHODS
We compare each of the methods in Section 2 by way of two simulation studies.In each study, we investigate a different type of outlier: the horizontal-shift outlier or the shape outlier.Each simulation repeats the following steps 100 times: 1. Simulate 190 'central' Gumbel PDFs using location parameter ð0:25 þ a i Þ and scale parameter ð0:05 þ b i Þ and 10 outlier curves.The outlier curves are drawn in one of two ways: (a) horizontal-outliers are drawn from Gumbel PDFs using location parameter ð0:32 þ a i Þ and scale parameter ð0:05 þ b i Þ; (b) shape outliers are drawn from the mixture αf 1 þ ð1 À αÞf 2 , where f 1 ,f 2 are Gumbel PDFs with scale parameters both set to ð0:05 þ b i Þ and location parameters set to ð0:2 þ a i Þ and ð0:3 þ a i Þ, respectively.
In the above, the values a i and b i are each independent draws from the normal distribution N ð0,0:01Þ for i f1, …, 200g and the mixture scale α is drawn from a Beta ð3,3Þ distribution.
The results from a single iteration of this simulation study are visualised in Figure 2. In this single iteration, we already see validation for some of the comments made on the different methods in Section 2. Horizontal-shift outliers are consistently captured by the PDF metrics for both median calculations when the PDFs are not centred, with Wasserstein distance having the lowest false positive rate.The L 2 distance on the LQD space and the BD & MBD methods perform poorly for horizontal-shift outliers.These three methods perform better when identifying shape outliers, which is expected since these measures are all calculated on the LQD space, which removes horizontal variability.The LQD distance seems to perform better using the geometric median when detecting shape outliers, which is different than the cross-section median estimate used in Lei et al. (2023a).
We validate these initial observations over the entire simulation study in Figures 3 and 4. For the horizontal-shift outliers in Figure 3, we see that the probability metrics all perform strongly.The best average accuracy, best true positive rate and lowest average false positive rate are all enjoyed by the Wasserstein distance using the geometric median without centring the PDFs.While the BD approach does have a lower false positive rate, it would appear to mostly achieve this by not flagging any curves as outliers.Wasserstein using the cross-section median seems to have around the same average performance, with a slightly wider spread.Based on these results, we would recommend using the Wasserstein distance with the geometric median without centring the PDFs when it is computationally feasible and substituting the geometric median with the cross-section median whenever data size is an issue.
In Figure 3, we examine the overall performance of each method in detecting shape outliers.There are less clear distinctions on which method is best for this type of outliers.The Wasserstein distance continues to enjoy a strong performance on shape outliers whenever the PDFs are centred.Somewhat surprisingly, the MBD with either median has the best overall average accuracy whenever the PDFs are not centred.This may be because the MBD is calculated after transforming the PDFs into LQDs, which is already equivalent to aligning the PDFs horizontally.
The 200 probability density functions (PDFs) from Figure 1, using the methods from Section 2 to classify outliers.The last 10 observations correspond to the black curves in Figure 1: the outliers.These curves are separated by a dashed red line.A filled-in cell indicates that the particular method detected an outlier, with the colour depending on which median estimate led to that PDF being detected as an outlier.
A strong method marks (by colouring in) the 10 curves to the right of the dotted line and does not mark the 190 curves to the left.
F I G U R E 3 Boxplots of classification metrics across the simulation study for the horizontal-shift outlier data.Higher values are better for accuracy and true positive rate; lower values are better for false positive rate.For every classification metric, we consider the cross-sectional and geometric medians and whether to centre the probability density functions (PDFs).
Across all methods, the ability to classify shape outliers is less accurate, suggesting that these outlier detection tools should be used in conjunction with subject-matter expertise to ensure no data are erroneously treated as outliers.
Based on these results, our recommendation is to use both Wasserstein distance with the geometric median and the MBD method using the cross-section median, each without centring.The former should identify horizontal-shift outliers and the most pronounced shape outliers, while the latter should be used to identify more subtle shape outliers.We further recommend to try Wasserstein distance with the geometric median and centring, since this method also seems to have strong performance for shape outliers and can be seen as an intermediary between the two aforementioned approaches.
With the three recommended methods, we assess the functional boxplot of Sun and Genton (2011) as a method for visualising PDF ensembles, using the same curves that are summarised in Figure 2. In Figure 5, we draw the functional boxplot for each outlier type and for each of the three recommended methods.In the functional boxplot, the median curve is in black, the centre 50% of data is coloured yellow, the data outside of this centre 50% that are not outliers are shaded light blue and the outlier curves are visualised with dotted vermilion lines.Note that the colours used here are different than the original colours used in Sun and Genton (2011); this was done to make the visualisation more colour-blind friendly.
We identify both strengths and weaknesses of the functional boxplot applied to PDF ensembles.A strength of this visualisation is that it permits a user to see how significantly different an outlier is to the central curves and in what way.For instance, for the data with horizontal-shifted outliers, in the functional boxplot using the geometric median under the Wasserstein distance without centring (the top-left plot in Figure 5), most of the outlier PDFs identified are those which are horizontally shifted to the right.There is one PDF classified as an outlier since it is heavier-tailed (i.e. less prominent mode) compared with the more central PDFs.A researcher using this visual may focus on investigating the significant shift to the right rather than the curve that is just barely classified as an outlier due to heavier tails.
A weakness of this visualisation is that the outlines of the shaded regions need not be valid PDFs.This issue is especially pronounced in the functional boxplot for Wasserstein distance using the geometric median with centring on the data with horizontal-shift outliers (the middle-left and and bottom-left plots in Figure 5).The outline of the large shaded yellow regions in each do not describe PDF curves and may be misleading when trying to visualise a region of valid PDFs.
The functional boxplots applied to PDF data highlight the effects of focusing on shape outliers with a method that is better designed to detect shape outliers.For the MBD and Wasserstein distance without centring methods (the middle-left and bottom-left plots in Figure 5), horizontal variation is not a significant factor in driving outlier detection, so these regions are significantly enlarged in the horizontal direction.

| APPLICATION
Modelling the flow of fluids (liquids and gases) and associated solutes through low-permeability rock is the focus of numerous civil and industrial engineering applications, such as in aquifer management, hydrocarbon extraction and the long-term storage of spent nuclear fuel (Follin et al., Boxplots of several classification metrics across the simulation study for the shape outlier data.For every classification metric, we consider the cross-sectional and geometric medians, and whether to centre the probability density functions (PDFs).
2014; Hyman et al., 2016;Jenkins et al., 2015;Kueper & McWhorter, 1991;Middleton et al., 2017;National Research Council, 1996;Neuman, 2005;Selroos et al., 2002;VanderKwaak & Sudicky, 1996).Studies have shown that this subsurface transport primarily occurs within interconnected networks of fractures (Viswanathan et al., 2022).Since direct observation of such fracture networks in subsurface rock is generally infeasible, large-scale, stochastic simulations are generally used for analysis (Neuman, 2005;Zhang, 2002).In these simulations, a set amount of particles are walked through the system and the time each takes to exit the system is recorded.A PDF is a natural data type to represent the distribution of particle breakthrough times.
Several studies have acknowledged the numerous layers of uncertainty involved in simulating flow through fractured media using the DFN simulators (Bonnet et al., 2001;Berkowitz, 2002;The National Academies of Sciences, 2021;Hyman et al., 2019;Strait et al., 2023;Osthus et al., 2020;O'Malley et al., 2018;Viswanathan et al., 2022).Visualising breakthrough curves and classifying outliers are major interests for researchers looking to understand the variation observed in breakthrough PDF data for ensembles of DFN simulation runs.
We analyse an ensemble of PDFs collected over 300 runs of a DFN simulator.For our simulations, we use the DFNWORKS computational suite to resolve flow and gas transport through semigeneric fracture networks and record breakthrough times.These simulations are all run at identical input parameter values but with different random seeds (Hyman et al., 2014(Hyman et al., , 2015)).
The DFN used to produce the PDFs involves three orthogonal fracture families in a 300LÂ100WÂ100H 3D environment.The simulation places fractures into this environment until a P 32 value of 0.05 is achieved.The value P 32 is the sum of surface area of each fracture divided by the volume of the domain.Variation between the hydraulic properties of the fractures is introduced via the following relationship between the fracture radius with the transmissivity: This equation is referred to as a correlated relationship Hyman et al. (2016).For each of the three families, α is set to 10 À5 and β is set to 0.5.
The radius of each fracture is drawn from a truncated power law distribution, with upper and lower cutoffs (r u ; r 0 ) and exponent γ: For this application, γ is set to 1.8, r u is set to 30, and r 0 is set to 10.These parameter choices were deemed suitable by subject matter experts.
F I G U R E 5 The functional boxplots of Sun and Genton (2011) applied to the simulation study iteration visualised in Figure 2 for the three recommended methods.The median curve is in black, the centre 50% of data are coloured yellow, the data outside of this centre 50% that are not outliers are shaded light blue and the outlier curves are visualised with dotted vermilion lines.
Using the findings from our simulation study, we consider three methods for visualising these data and classifying outliers.To identify anomalous horizontal-shifts, we use Wasserstein distance directly on the space of PDFs and identify the central point using the geometric median.Since we observed similar performance for this method with and without centring the PDFs, we apply both methods here.To identify anomalous behaviour in the shape of our curves, we transform the curves to the LQD space and calculate the MBD of each curve using as the observed PDF with the smallest L 2 distance from the cross-section median in (1) as the central point.The PDFs were not centred for this third method.
Three functional boxplots, each according to one of the three methods used for this application, are shown in Figure 6.The top plot, which uses the Wasserstein distance method without centring, isolates several horizontal-shift outliers.In the centre plot, this same method is applied with the set of centred PDFs, which appears to isolate several shape outliers.The bottom plot, which uses the MBD method without centring the PDFs, isolates fewer shape outliers than the previous method.
As expected from the simulation study, the Wasserstein distance method without centring (in the top plot) is the best at identifying potential horizontal-shift outliers.This same method with the centred PDFs (in the middle plot) also appears to be a strong method for identifying potential shape outliers.The MBD method without centring (in the bottom plot) is also a strong method for identifying potential shape outliers, but it appears to be a more conservative method as it identifies far fewer outliers than the method in the middle plot.
Once a set of outliers has been identified, detailed probing of the particular realisation in the ensemble can be performed to link these observables with geostructural information, such as network topology, geometry or hydrological properties.This assessment may also lead to further analyses and visualisations of the curves depending on the initial findings.For instance, if a small set of strong outliers are identified with MBD and understood in the context of geostructural information, a further study might attempt the method in the middle plot of Figure 6 to determine if there are additional outliers that can be understood in this context.

| CONCLUSIONS
In this paper, we have assessed several different methods for ordering ensembles of PDF data and determining outliers.This analysis compares these methods to two of the approaches outlined in Lei et al. (2023a) (which are discussed in Section 2.1) and applies these methods to the functional boxplot of Sun and Genton (2011) as means to study the usefulness of this visualisation for PDF ensembles.The methods discussed in this paper consider several distance metrics defined directly on the space of PDFs to order the data, which have seen surprisingly little use in modern F I G U R E 6 Data application on DFNWORKS simulations using the best-performing methods from the simulation study.The functional boxplots on the left are created according to the method outlined in Sun and Genton (2011).The median curve is in black, the centre 50% of data are coloured yellow, the data outside of this centre 50% that are not outliers are shaded light blue and the outlier curves are visualised with dotted vermilion lines.On the right, we show all the PDF curves for the ensemble, with the outlier curves coloured in black.
literature for outlier detection with PDF ensembles.We also assess two different approaches for determining the most central point and whether or not the PDFs should be initially centred, as recommended in Lei et al. (2023a).
Our findings show that Wasserstein distance with the geometric median is a strong choice for both horizontal-shift and shape outlier detection, centring or not centring the PDFs depending on which type of outlier on which one wishes to focus.For a parsimonious approach to shape outlier detection, our findings indicate that the MBD approach with the cross-section median without centring the PDFs is a strong approach.
These recommendations are used to analyse an ensemble of PDF breakthrough curves calculated from DFN simulations at Los Alamos National Lab.
As a part of this work, we have developed an R package to order an arbitrary ensemble of PDFs, determine outliers and visualise these PDFs via a functional boxplot for every method seen in Figures 3 and 4.This package, called DEBOINR (DEnsity BOxplots IN R), is currently available on the Comprehensive R Archive Network (CRAN) (Murph & Strait 2023).
r¼1, 2 g r ðtÞ ≤ xðtÞ ≤ max r¼1, 2 g r ðtÞg: Egozcue et al. (2006)bset of functions in BðIÞ that have square-integrable logarithms.As shown inEgozcue et al. (2006), B 2 ðIÞ is a metric space according to the Bayes metric, Lei et al., 2023b) ðÀ1 K gÞjj B ,where jjfjj B ¼ ffiffiffiffiffiffiffiffiffiffiffiffi ffi hf, fi B p .An ensemble of PDFs can be trivially embedded in B 2 ðIÞ (seeLei et al., 2023b), and thus, the Bayes metric is a natural notion of distance between PDFs.