Probability bound analysis: A novel approach for quantifying parameter uncertainty in decision-analytic modeling and cost-effectiveness analysis

Decisions about health interventions are often made using limited evidence. Mathematical models used to inform such decisions often include uncertainty analysis to account for the effect of uncertainty in the current evidence base on decision-relevant quantities. However, current uncertainty quantification methodologies, including probabilistic sensitivity analysis (PSA), require modelers to specify a precise probability distribution to represent the uncertainty of a model parameter. This study introduces a novel approach for propagating parameter uncertainty, probability bounds analysis (PBA), where the uncertainty about the unknown probability distribution of a model parameter is expressed in terms of an interval bounded by lower and upper bounds on the unknown cumulative distribution function (p-box) and without assuming a particular form of the distribution function. We give the formulas of the p-boxes for common situations (given combinations of data on minimum, maximum, median, mean, or standard deviation), describe an approach to propagate p-boxes into a black-box mathematical model, and introduce an approach for decision-making based on the results of PBA. We demonstrate the characteristics and utility of PBA versus PSA using two case studies. In sum, this study provides modelers with practical tools to conduct parameter uncertainty quantification given the constraints of available data and with the fewest assumptions.


Introduction
Decision-analytic models (DAMs) have been used in numerous applications, from clinical decision-making to cost-effectiveness analysis (CEA). A DAM integrates evidence within a coherent and explicit mathematical structure used to link evidence to decision-relevant outcomes. [1] There are situations where the evidence required for informing the values of model parameters that govern the behavior of DAMs is incomplete or non-existence, for example, health economic modeling at the early stages of a product's life cycle [2,3] and lack of resources to obtain the required data. [4] The importance of explicitly accounting for incomplete knowledge about model parameters (parameter uncertainty) and propagating its effect through a decisional process is underscored in numerous guidance documents in health, including, but not limited to, the guidelines by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR)-Society for Medical Decision Making (SMDM), [5] the Agency for Healthcare Research and Quality (AHRQ), [6] the 2nd panel for Cost-Effectiveness Analysis (CEA) in Health and Medicine, [7] and beyond. [8] At its most basic characterization, parameter uncertainty means that we do not know the exact value of a parameter as several different (potentially uncountable) values may be possible for reasons such as the amount (the size of the arXiv:2011.10398v2 [stat.AP] 11 Aug 2021 available samples of observations) and quality (measurement error or accuracy of the observations) of the available information. [9] In many situations, the only information we have about a parameter is that it belongs to an interval bounded by a lower bound and an upper bound. In addition to knowing the interval, we may have some information about the relative plausibility of different values of θ in the interval. In situations where we have data or previous knowledge about a parameter, we can leverage standard statistical techniques to represent uncertainty in the form of a probability distribution. However, when data and knowledge are limited, we may have only partial or no information about the probability distribution, i.e., we can not assign the relative plausibilities of different parameter values. In some cases, we only know the measures of central tendency (mean or median) from published papers, while, in more extreme cases, only the minimum and maximum values are known to the researchers. To handle such data sparsity situations, it is necessary to have an approach for quantifying parameter uncertainty using the fewest number of assumptions and without the need for assuming precise probability distributions.
Despite the emphasis on its importance, the ISPOR-SMDM best-practice [5] recommends only two analytical tools for evaluating the effect of incomplete knowledge of model parameters on decisional-relevant outcomes despite the wealth of available methods in the engineering literature. [10] First, the best practice prescribes a set of default probability distributions that are mainly driven by the consideration of the parameter's support. For example, a beta distribution is used for characterizing the uncertainty of a parameter with a support [0, 1]. As a result, modelers tend to rely on "off-the-shelf" probability distributions to portray uncertainty "realistically over the theoretical range of the parameter." [5] The use of "default distributions" is, in fact, a matter of convenience because there is no sure way to verify that our choice of the distribution and its parameters is indeed valid. Furthermore, forcing the modelers to commit to a particular distribution implicitly assumes that the modelers have more information (e.g., knowing the shape of a distribution) than they actually possess and the uncertainty is known and quantifiable by a probability distribution. Secondly, the best practice proposes the use of expert knowledge elicitation [11] if no prior data is available. However, the proposed approach also hinges on a rather unverifiable assumption: the precise form of the probability distribution. The lack of methodological guidance is due to the lack of an available approach for representing and propagating parameter uncertainty in situations where it is impossible to assume a precise probability distribution.
An ideal approach to parameter uncertainty characterization is one that requires minimal assumptions. Specifically, in the absence of individual patient data, such an approach should require only information on statistics that are typically accessible to practitioners, such as mean, median, quantiles, minimum, and maximum (hereinafter collectively termed as minimal data). Additionally, the ideal method does not require information on or assumptions about the precise form of a probability distribution. Probability bounds analysis (PBA), [12] a combination of interval analysis and probability theory, is one such method and has been applied in risk engineering and management studies [13,14] and many other fields. [15,16,17] In a PBA, the uncertainty about the probability distribution for each model parameter is expressed in terms of upper and lower bounds on the cumulative distribution function (CDF). These CDF bounds form a probability box and are sufficient for circumscribing the unknown CDF of the model parameter given some minimal data about it. The goal of this paper is to introduce the PBA method for representing and propagating parameter uncertainty in situations where knowledge or data about the parameter is limited and a probability distribution can not be specified precisely or the practitioners are not willing to commit to a particular form. In this study, we assume that the model parameters are mutually independent. This paper is organized into five main parts. First, we review the concept of parameter uncertainty quantification. Second, we formally describe an approach for representing parameter uncertainty in PBA using a probability box. We focus on free probability boxes that is a generalization of parametric probability boxes. Then, we describe an approach for propagating probability boxes into a mathematical model. Fourthly, we introduce an approach for decision-making using PBA results. Then, we demonstrate two applications of PBA in modeling using Markov cohort models and a cost-effectiveness analysis of novel health technology. Lastly, we conclude with a discussion on the limitations and directions for future research. Throughout this exposition, we try to strike a balance between mathematical rigor and accessibility to practitioners.

Preliminaries
We begin by briefly introducing the concept of parameter uncertainty quantification and the status quo approach of probability sensitivity analyses.

Parameter uncertainty quantification
Let M denote a mathematical model (e.g., a cost-effectiveness model [7] or a decision-analytic model [18]) that maps a set of k inputs θ i ∈ θ (i = 1, 2, . . . , k) to a set of quantities of interest Y, i.e., M : θ → Y. We treat M as a black-box model, i.e., only θ and the corresponding Y, after "running" M at particular values of or a realization of θ, are accessible. We assume that the values of θ cannot be determined precisely due to lack of knowledge or data (epistemic uncertainty). Our uncertainty about each parameter in θ may vary according to the extent of what is known. To quantify the effect of not knowing the values of parameters precisely on decision-relevant outcomes Y (parameter uncertainty quantification), we proceed with the following tasks. First, we specify a mathematical framework to encode the degree of uncertainty in the model parameters (parameter uncertainty representation). Then, we prescribe an approach to propagate parameter uncertainty, given a representation from the previous step, into our health economic model (parameter uncertainty propagation). Lastly, we set an approach to interpret the resulting uncertainty in the model outcomes for use in further analyses.

Probabilistic sensitivity analyses
If we adopt the standard approach for parameter uncertainty quantification, i.e., probabilistic sensitivity analysis (PSA), [19] we proceed with the following steps. For parameter uncertainty representation, we treat each parameter in θ as a random variable that is endowed with a cumulative distribution function (CDF), F (θ), which is a monotonically increasing function from a sample space (e.g., the real number line R) onto [0, 1], zero at negative infinity, and one at positive infinity. In situations where the availability of data informing the estimation of the parameters is limited or non-existent, practitioners tend to select a type of CDF whose support matches with the model parameter's support (e.g., gamma distribution for non-negative parameters). Hence, this common practice implicitly assumes that the form of F (θ) can be precisely specified. The location and ancillary parameters of the chosen distributions are typically estimated using a moment matching approach. After the CDF has been assigned to each uncertain parameter, the uncertainty propagation follows an iterative Monte Carlo sampling approach. For each iteration, parameter values are sampled independently from the precisely specified CDFs, and the model is evaluated using these values to generate model outcomes. After a prespecified number of samples, an empirical CDF of the model outcome is obtained. Given the empirical distribution of an outcome, we can calculate its expected value and use it as an input to other analytical tasks (e.g., decision rule and value of information analysis).

Parameter uncertainty representation
This section introduces the parameter uncertainty representation step of PBA. First, we describe the concept of a probability box. Then, we introduce the formulas for a probability box given varying levels of available minimal data.

Probability box
As above, we suppose that the imperfect or lack of knowledge about a parameter (θ) can be characterized by a random variable endowed with a CDF F (θ). Instead of being restrictive in the context of limited data, PBA assumes that F (θ) is unknown or cannot be precisely specified and introduces the concept of a probability box or p-box. A probability box of a continuous random variable θ with an unknown CDF F (θ) is an interval, P D = F (θ), F (θ) , which consists of all CDFs, including the unknown F (θ), that are: 1) bounded by a pair of bounding functions, i.e., a lower-bounding function (LBF) F (θ) and an upper-bounding function (UBF) F (θ) and 2) consistent with a minimal data D (where D denotes the set of available data or information on the statistics of the unknown CDF). [12,20] The UBF and LBF have the following properties: 3. F (θ) and F (θ) form the "tightest" bounds 4. F (θ) and F (θ) are consistent with D We say that a CDF of θ is consistent with the minimal data D if each element in D can be equated to a statistic that is derivable from the CDF. Under the PBA framework, the epistemic uncertainty is given by: for every possible realization of θ, we can only assign an interval of CDF values, F (θ), F (θ) , in contrast to a single CDF value. As we accumulate more and more data on the parameter, the epistemic uncertainty is reduced, and the interval will eventually shrink to a single CDF.

P-box formulas for different D
We consider commons situations of data availability where a modeler can identify and specify a combination of different summary statistics of and/or information on θ that constitutes a particular minimal data D. We show the derivation of one formula (Equation 7) as an exemplar in Appendix A.1. We also show one proof of a p-box providing the tightest bounds on the unknown CDF, among all other pairs of bounding functions, given D (Appendix A.2).
The first situation involves knowing the smallest (minimum) and largest (maximum) values of a parameter. For some parameters, one can infer the range from the theoretical limits, such as zero to one for probability or utility parameters. In some cases, a modeler may ask domain experts to specify a range from their knowledge about the quantity in question. In both cases, the task will set a minimum a and a maximum b such that the value of a parameter lies in the interval [a, b]. The p-box P a,b is given by: for LBF, and, for UBF.
If, in addition to knowing a, b, the median m of θ is also known, then the p-box will be tighter than P a,b . The p-box P a,b,m is given by: for LBF, and, for UBF.
If, in addition to knowing a, b, the mean µ = E[θ] is also known, then the p-box P a,b,µ is given by: for LBF, and, for UBF.
If, in addition to a, b and µ , we have data on the standard deviation σ, then the p-box P a,b,µ,σ is given by: for LBF, and, for UBF, where In principle, as we have additional summary statistics on θ or more information about the unknown F (θ), the p-box becomes tighter ( Figure 1). Explicit formulas for other Ds have a complex form (see Appendix A.3 for an example where D = {a, b, m, µ}) and are generally difficult to derive. [21] In general, we can derive further cases by intersecting the p-boxes of different Ds described above (termed as primitive p-boxes) by "tracing" the minimum (or maximum) of the intersection of the corresponding UBFs (or LBFs). More formally, for each D d (where d indexes each combination of available data), the LBF and UBF are given by: and respectively.

Parameter uncertainty propagation
In a modeling study, we typically have heterogeneity in the amount of and indirectness or imprecision in the available data used to estimate θ. In principle, each θ i ∈ θ, based on the data availability and the chosen representation of its uncertainty, falls into of the following subsets of θ: (1) θ f for parameters with fixed values (no uncertainty), (2) θ c for parameters known up to their precise CDFs, and (3) θ b for parameters known up to their Ds. In some cases, practitioners may have access to information that is sufficient for specifying probability distributions of parameters (θ c ). This section presents the parameter uncertainty propagation step of PBA in the context of θ b only and a mix of θ c and θ b . First, we describe the intuition behind the propagation and proceed with an algorithm.

Propagating p-boxes
We recall that uncertainty propagation in PSA works as follows: each parameter value is sampled from its precise CDF, typically using the inverse transform sampling method if the inverse of the CDF is explicitly known. For PBA, we use the same idea with one modification, i.e., we sample an interval of values instead of a single value. The sampling scheme loosely mimics the inverse transform sampling. For each p ∈ [0, 1] (the image of the CDF), we sample an interval by using the inverses of the LBF and UBF. To mitigate the computational burden, instead of sampling the intervals for all values in [0, 1], we partition the image of the CDF into finite sub-intervals. For each sub-interval and its endpoints, we calculate the corresponding interval of parameter values using the inverse p-box. The choice of how to evaluate the endpoints of the sub-interval, e.g., using the LBF (UBF) for the upper (lower) endpoint, determines the accuracy of the approximation due to discretization. We assign a probability to the interval based on the length of the sub-interval. We then repeat the process for each parameter. Since there are multiple possible realizations (equal to the number of sub-intervals) for each parameter, we need to consider all possible combinations of sub-intervals across all parameters, e.g., using a Cartesian product. The probability of each possible combination (henceforth termed as a hyperrectangle) is computed by multiplying the probabilities assigned to the sub-intervals comprising the combination because of our independence among parameters assumption. After specifying an approach for sampling hyperrectangles from p-boxes, we need to prescribe a method to evaluate our model using the sampled intervals. One approach is based on optimization, where the goal is to find a pair of optima, i.e., the minimum and maximum values of the model outcome for each possible hyperrectangle. The probability of observing a pair of optimum values is equal to the sampling probability of the corresponding hyperrectangle. To obtain the p-box of the model outcome, we cumulate the probabilities of each minimum (maximum) to derive the UBF (LBF).
Formally, to propagate the uncertainty of the set θ b into M, we proceed with the following steps. First, for each θ i ∈ θ b , we derive its p-box given D i . Next, we specify the approach for generating interval-valued samples. To build intuitions about how the sampling works, we use a full-factorial-design approach, i.e., the slicing algorithm with outer discretization [22]. For each θ i , the interval [0, 1] is partitioned into n θi sub-intervals with its corresponding probability mass m j (j ∈ {1, 2, . . . , n θi }) such that For the j-th sub-interval, we identify its lower and upper boundary points, i.e., c j i and d j i , respectively. Given each pair of boundary points, we calculate the boundary points in the θ i domain by using the inverse or quasi-inverse of the p-box: These inverse or quasi-inverse functions are derived from their corresponding LBF and UBF (Appendix A.4), where the quasi-inverses are due to the fact that some LBFs and UBFs are not strictly injective functions. Equation 11 corresponds to a particular choice of a discretization, i.e., an outer discretization approach (Figure 2). The intervals [a j i , b j i ] and their associated m j collectively form the p-box of θ i .
We denote K as the set of multi-indices where each multi-index corresponds to a particular combination of sub-intervals of all θs in θ b : . . , n θi }} (12) where n is the number of parameters in θ b and k i indexes the sub-intervals for θ i . For each k ∈ K, we denote H k as a hyperrectangle (i.e., a Cartesian product of intervals) which is given by: For each H k we calculate its probability mass as follows: 14) where m ki corresponds to the probability mass associated with the k i -th subinterval of θ i . Equation 14 represents our assumption about the independence among model parameters since the dependence structure depends on how the probabilities in the Cartesian product are computed. In our case, we multiply the marginal probabilities to get the probabilities of each H k , which is akin to assuming random set independence. [12] For each H k , we associate two optimization problems whose solutions provide the bounds on a quantity of interest (model outcome) Y : for a particular set of values of θs in θ f and θ c . The existence of a maximum and a minimum is guaranteed by the Weierstrass Extreme Value Theorem [23] since, in decision-analytic modeling, the model M is typically smooth and H k is closed and bounded (compact). The p-box of Y is therefore characterized by a collection of Y k , Y k and its corresponding probability mass P (H k ). The empirical p-box of Y can be calculated as: where k i indexes all elements in K, |K| denotes the number of elements in K, and I Y k i ≤Y is an indicator function.

Propagating p-boxes and precise CDFs
To propagate uncertainty from both sets θ c and θ b into M, we proceed in two steps. First, since the uncertainty of each parameter in θ c can be characterized by a precise CDF, the uncertainty propagation reduces to a Monte Carlo approach, [24] a repeated sampling from a joint distribution of parameters in θ c (if their dependencies are known). Let f (θ c ) be the joint distribution. Repeated samplings from f (θ c ) will generate a sequence of samples of where N is the total number of Monte Carlo samples. Second, for each sample θ l c (l indexes the parameter in θ c ) and each H k (Equation 13), we solve the following optimization problems: and derive F l (Y ) and F l (Y ) using Equation 16. The p-box of Y is then calculated by averaging over the N samples: Alternatively, if M is relatively linear, we can fix the values of θ in θ c at their mean values. This approach avoids the use of repeated sampling and reduces the computational time.

Application of PBA
This section describes how practitioners can utilize the results of uncertainty propagation using PBA (Equations 16 and 18). First, we introduce notations to fix ideas. Then, we describe an application in decision analysis.

Formalism of a decisional problem
A typical decision-making problem in health domains consists of: 1) m competing interventions (e.g., new drug vs. usual care), a r (r = 1, . . . , m); 2) n decision-relevant outcomes (e.g., life expectancy or lifetime cost), Y j (j = 1, . . . , n); 3) a mathematical model to evaluate the effect of a on Y as in Section 2.1, Y = M(a r |θ); 4) k model parameters, θ i (i = 1, . . . , k); 5) measures of knowledge or uncertainty about each parameter (e.g., precise CDFs or p-boxes) and their dependencies; 6) a value (or utility) function, U (M(a r |θ)) := U (a r |θ), that integrates the evaluation of each intervention on all Y j s; and 7) a choice function capturing a decision rule for choosing the (or set of) optimal intervention(s), G(U (a|θ)) = A, where A is the set of optimal interventions. For ease of exposition and without loss of generality, we assume that M is deterministic. Hence, the states of the world are completely determined by our knowledge about θ.

Decision analysis with PBA
We recall that the most commonly used decision rule, i.e., Expected Value Maximization, requires the specification of CDFs in the context of parameter uncertainty. [25] Under this choice function, if we can specify all the CDFs F i (θ i ), then an intervention a * is chosen if Since the propagation of uncertainty in θ results in uncertainty in Y (F (Y)), we write a * = arg max ar Y U (a r |Y)dF (Y) . We note that the calculation of the expected value of Y over its p-box results in an interval of expected values. The interval includes all expected values that correspond to CDFs enclosed by the p-box. This is true because the p-box of Y is guaranteed to enclose all CDFs of Y (assuming that the p-boxes of the θs are properly specified). Therefore, the expected value of U (a r |Y) for each CDF in the p-box must lie in the interval that is given by [µ(a r ), µ(a r )] where µ(a r ) and µ(a r ) are given by Y U (a r |Y)dF (Y) and Y U (a r |Y)dF (Y), respectively. Given what we know and assume about the uncertainty of the model parameters, the expected utilities can not be larger (smaller) than µ(a r )( µ(a r )). Furthermore, [µ(a r ), µ(a r )] is not endowed with an uncertainty measure, i.e., we cannot say the relative plausibilities of each value in the interval. Therefore, we cannot use the Expected Value Maximization for PBA. Instead, the decision rule is based on finding the optimal intervention by comparing the intervals [µ(a r ), µ(a r )] of all interventions (a r ).

Case studies
We conduct two case studies. The first case study uses a hypothetical Markov cohort model to examine the characteristics of PBA and demonstrate the difference between PBA and PSA. The second case study is based on a published early assessment of the cost-effectiveness of a computer-assisted total knee replacement in the absence of clinical trial data. [26] The models are coded in R [27] and available under a GNU GPL license and can be found at https://github.com/rowaniskandar/PBA.

Case study 1
We consider a generic four-state stochastic (Markov) cohort model as our M, which is commonly used in DAM and CEA studies, [1] with the following health states S = {S 1 , S 2 , S 3 , S 4 } (Figure 3), where S 4 is an absorbing state. We assume that the probability distributions for the rates of transitions S 1 → S 3 (c 2 ), S 2 → S 3 (c 4 ), and S 2 → S 4 (c 5 ) are  (Figure 4). For uncertainty propagation with precise CDFs, we use the support point method [28] for sampling from the gamma and uniform distributions with N = 50. For uncertainty propagation using p-boxes (Equation 16), we apply a deterministic search algorithm based on systematic divisions of the domain (Equation 13) into smaller hyperrectangles [29] and use the implementation of the nlopt library [30] in the R program. [31] The first comparison shows the difference between the results of a parameter uncertainty propagation into a model outcome using precise CDFs (gamma distribution) vs. p-box. A PBA results in a p-box enclosing the unknown CDF of the model outcome instead of a precise CDF ( Figure 4A). The p-box gives additional information: (1) the amount of uncertainty in the model outcome due to our imperfect or complete lack of knowledge about some model parameters, which is indicated by the area enclosed by the p-box and (2) the plausible values of the model outcome, which is indicated by the model outcome values with non-zero probabilities. The latter suggests the minimum and maximum achievable values of the model outcome. We also note that the accuracy of the empirical LBF and UBF increases with the number of sub-intervals of each parameter (n θi ). The second comparison showcases the implications of how uncertainty due to a severe lack of data about parameter values is modeled ( Figure 4B). Uncertainty propagation with a uniform distribution results in a model outcome's CDF that gravitates towards a central tendency and, essentially, "eliminates" our ignorance. In contrast, the result of PBA preserves our ignorance. Furthermore, we observe that the plausible values of the model outcome under uniform distributions are concentrated in the leftmost region of the support, thereby discounting the possibility of having high values. Conversely, PBA produces bounds on the model outcome while maintaining the plausibility of a wide range of values. This observation highlights the potential peril of assuming a precise form of a CDF, particularly when the model outcome represents an undesirable outcome.

Case study 2
We replicate a published cost-effectiveness analysis of a computer-assisted total knee replacement (CA-TKR) versus a conventional TKR. [26] We develop a Markov model with the following states: TKR operation for knee problem, normal health after primary TKR, TKR with minor complications, TKR with serious complications, simple revision operation for treating complications, complex revision operation for treating complications, other non-revision treatments for complications, normal health after TKR revision, and death. The analytical period is ten years with a monthly time-step. For the transition probabilities which could not be estimated from available data, i.e., transitions to serious complication from minor complication or other treatment, transitions to minor complication from other treatment or serious complication, and transitions to simple revision from other treatment or vice versa, the authors assumed that their values are identical to the estimated mean values for the same transitions from other states. We relax these assumptions and, instead, subject the six transition probabilities and the efficacy of CA-TKR to an uncertainty analysis using PBA and PSA with data only on the mean, minimum, and maximum values. For probabilities and the efficacy parameters, we use beta and gamma distributions, respectively. Since the study does not report the variances, we assume that the standard deviation is 20% of the mean value. We conduct two uncertainty analyses in which we vary the minimum and maximum values (ranges) of the seven parameters of interest, i.e., the ranges reported in the study and wider ranges of values (ten times the original ranges). For the other parameters, we fix them at their mean values (see Table 2 in [26]). For the cost-effectiveness measure, we calculate the incremental net monetary benefit (INMB) and estimate its empirical CDF (PSA) and p-box (PBA), given a willingness-to-pay threshold of £30000 per quality-adjusted life year. The cost-effectiveness analysis is conducted from the National Health Services' perspective and uses 3.5% as the discount rate. We use all the data and assumptions that are reported in the study and make reasonable assumptions whenever the data is not available in the published paper. For more details on the model structure and estimation and their assumptions, we refer the readers to the original study [26].
Using the PSA approach where we assume precise specifications of the CDF and use the published ranges, the CDF of the INMB lies entirely to the right of zero in Figure 6, i.e., CA-TKR is always cost-effective at the given willingness-to-pay threshold. In contrast, the PBA approach using the same ranges results in a p-box of the INMB, a marginal part of which is to the left of zero line, i.e., CA-TKR is not cost-effective at the given willingness-to-pay threshold. If we consider a wider range of possible values for each of the seven parameters, the p-box stretches to, at least, £−600000. This observation indicates that the uncertainty in the INMB is sensitive to the assumed ranges of values. Therefore, the cost-effectiveness of CA-TKR is overestimated when we assume rather narrow ranges for model parameters for which we lack reliable data.

Discussion
This study introduces the probability bound analysis method for quantifying the effect of parameter uncertainty on decision-relevant outcomes that is distribution-free. This paper is the first study that examines the utility of PBA in DAM and CEA studies. Although our contribution focuses mainly on medical decision-making and economic evaluation fields, the methodologies apply to many studies using mathematical models to inform policy decisions. [32] To assist practitioners, we provide p-box formulas for the most common situations of data availability. We show an approach for propagating p-boxes into a black-box model where the uncertainty of the model parameters is characterized by a combination of p-boxes and precise distribution functions. We conduct two case studies to demonstrate the methodological characteristics and practical application of PBA.

Advantages of PBA
The novel approach allows practitioners to conduct probabilistic assessments even when extremely little reliable empirical information is available about the distributions of model parameters. In PBA, parameter uncertainties are characterized by p-boxes that provide the maximum area of uncertainty (tightest bounds) containing the unknown distribution function, given knowledge about the summary statistics of the parameters. For basic binary operations (addition, subtraction, multiplication, and division), [33,20] the derived p-box of a model outcome is optimal in the following sense: one can not find other tighter bounds without excluding some of the plausible CDFs. However, p-box computations using basic operations cannot be easily extended to black-box models. [34] Nevertheless, the uncertainty propagation of p-boxes into a black-box model, using optimization (Equation 15), generates bounds that are guaranteed to enclose all possible CDFs of the model outcome provided that the parameter p-boxes enclose their respective distributions, without the assurance of the optimality of the bounds [14] . PBA is based on two existing approaches. First, we believe that a parameter value can be bounded in some intervals without specifying the relative plausibility over the interval (interval analysis [35]). Secondly, we assert that the parameter uncertainty can be represented by a probability (probability theory [36]). When taken together, PBA models the uncertainty using a CDF, but the CDF is not precisely specified and assumed to be located within an interval containing all possible CDFs. In a way, a PBA gives an identical answer as an interval analysis whenever the range is the only accessible information. If the lower and upper bounds of a CDF coincide for every element in the support, then a p-box degenerates to a CDF: a situation where Monte Carlo simulation is the standard approach. Therefore, a PBA is a generalization of the two standard approaches for representing parameter uncertainty. In addition, a PBA is an improvement over both approaches for situations where one approach is not sufficient by itself.
One decision-relevant information from the results of a PBA, as demonstrated in our case studies, is the bounds on the plausible values of a model outcome. This information is particularly useful when the model outcome represents a negative outcome (or a catastrophic event). [37] The p-box of a model outcome suggests that the outcome will not be smaller (or larger) than a minimum (or maximum), which can be identified by the infimum (supremum) of the support of the UBF (LBF). The standard approach in DAM and CEA studies, i.e., the (over-)reliance on using "off-the-shelf" probability distributions for characterizing uncertainties about model parameters, may potentially lead to an underestimation of the probability of observing extreme values of the model outcome. Using probability distributions may also assume more information about uncertainty than that is supported by the current evidence base. These errors in estimating probabilities in the context of insufficient data or a complete lack of knowledge may contribute to overconfidence and lead to a failure to insure ourselves against highly consequential risks. [38] Our first case study also highlights the consequence of using a uniform distribution, the most common approach for modeling ignorance about a parameter. Although using a uniform distribution may be justifiable as the embodiment of the principle of indifference [39,40], this "all are equally likely" assumption significantly discounts the possibility of the extremes. On the other hand, PBA can, loosely speaking, transfer our ignorance about parameter values to ignorance about a model outcome.
The second case study [26] represents a real-world setting where we lack the data to inform some of the key parameters, including the efficacy of the novel technology and probabilities of adverse events. The authors of the original study failed to adequately represent the uncertainties in these parameters by prescribing narrow ranges. In our re-analysis, our PBA approach yields a wider p-box of the INMB (more uncertainties) when assuming wider ranges of values. Moreover, the PBA does not require any assumptions about the standard deviations (cf. Equations 5 and 6). Although we are not able to exactly replicate the published results due to missing information on the variances, our qualitative result is still valid. Regardless of the value of the variances, the conclusion on whether CA-TKR is cost-effective is sensitive to the assumed ranges of values.

Computational costs
PBA is computationally intensive for the following reasons. First, an implementation of a PBA requires an optimization step over the p-box. In this study, we use a full factorial design that transforms the problem of propagating a p-box into propagation of a large number of intervals. The higher the required level of accuracy is, the higher number of intervals n θi is needed. Furthermore, an increase in the number of p-box parameters will lead to a higher-dimensional optimization problem. Second, the computational burden is further exacerbated if the black-box model (M) is "expensive" to evaluate for a given θ. Thirdly, if, in addition to p-box parameters, some parameters are characterized by their CDFs, the optimization step is embedded in a Monte Carlo sampling loop (Section 4.2); thereby increasing the number of optimizations by a factor of N (the total number of Monte Carlo samples). To mitigate the computational burden, users of PBA may opt to use less conservative p-box propagation approaches [41], more efficient optimization methods [42], and fast-to-evaluate approximations of the original model or meta-models. [43] Nevertheless, we expect a higher computational burden since a PBA imposes fewer restrictions (i.e., we do not assume a functional form), leading to a larger region of uncertainty over which a model needs to be evaluated.

Relation to other methods
PBA is generally regarded as one of the uncertainty quantification approaches related to the theory of imprecise probability. [44,14] In particular, a p-box is closely connected to Dempster-Shafer's theory of evidence. [45,46,12] The LBF and UBF can be interpreted as belief and plausibility measures for the event θ taking values than a particular value {θ ≤ x}. [12] In Dempster-Shafer's theory, the belief function describes the minimum amount of probability that must be associated with the event, whereas the plausibility function describes the maximum amount of probability that might be associated with the same event. The PBA framework is also related to Bayesian sensitivity analysis (or robust Bayesian analysis). [47] In this approach, an analyst's uncertainty about which prior distribution and likelihood function should be used is characterized by an entire class of prior distributions and likelihood functions. The analysis proceeds by studying various outcomes for each possible combination of prior distribution and likelihood function. Another distribution-free approach is the Chebyshev inequality [48] that can be used to compute bounds on the CDF of a random variable, given the mean and standard deviation of the random variable. However, the inequality cannot produce a tighter bound even if we have more data (e.g., median). Kolmogorov-Smirnov (KS) confidence limits [48] also provide distribution-free bounds on an empirical CDF. The calculation of KS limits requires, however, requires access to sample data.

Limitations
Our study has limitations in the following context. First, we assume independence among the model parameters. To the extent of our knowledge, how to model dependencies among the parameters in the context of uncertainty propagation using PBA and black-box models is an open problem. One potential approach for modeling dependencies among parameters is to use a copula to represent the joint uncertainty of all parameters. [49] A copula approach factors the joint CDF into a product of independent marginal CDFs and a copula that capture the dependencies. In this formulation of bounds using a joint CDF, the overall bound is a function of the bounds on CDFs for some parameters represented by their p-boxes and the bounds of the copula. The potentially promising approach using copula warrants further study and is, however, beyond the scope of our study. Secondly, our study does not address the question of when one should consider using p-box vs. assuming a particular CDF to characterize uncertainty. Instead of being prescriptive, we defer such decisions to the analysts because the level of uncertainty at which a p-box is the preferred approach is problem-dependent. For example, a parameter may be highly uncertain due to the lack of empirical data and/or previous knowledge and, at the same time, non-influential, i.e., the model outcome is not sensitive to variations in the parameter values. Thirdly, we provide a rudimentary treatment on how to make decisions using the results of a PBA. In situations where best-case/worst-case results are the basis for decision making, the analytical interval approach is preferred to assuming a distribution (e.g., uniform) and performing a simulation, particularly when that distribution may not correctly describe the parameters. A more comprehensive treatment of decision-making based on interval values or bounds on probability distributions is needed, and it should be a focus of future studies on uncertainty quantification in decision-analytic modeling and cost-effectiveness analysis.

Concluding remarks
This study addresses limitations in current methodologies for characterizing uncertainty in data and knowledge used to inform mathematical models. The novel methodology maximizes the use of existing limited information with the fewest number of assumptions, and provides a way to honestly characterize the uncertainty in the model parameters distributions used in decision-analytic modeling and cost-effectiveness analysis studies.
acknowledgements I would like to thank Shaun Forbes and Thomas Trikalinos for introducing me to the world of imprecise probabilities or Knightian uncertainty. I would like to acknowledge the COMED consortium for their excellent support. I am so thankful to Cassandra Berns for her help in deriving some of the inverses and plotting the figures. I am grateful to Kosta Shatrov and Vishahan Suntharam for their reviews.

conflict of interest
We have no conflict of interest to declare.

Funding information
This project received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement 779306 (COMED-Pushing the Boundaries of Cost and Outcome Analysis of Medical Technologies).

A.1 Derivation of p-box formulas
In this section, we show a heuristic approach for deriving a p-box for D = {a, b, µ, σ} as an exemplar of other D. We use the same notations and definitions as described in the main text.
The LBF of the p-box,P a,b,µ,σ is given by: Let F denote the set of all CDFs (F(θ)), that are consistent with the minimal data: {a, b, µ, σ}. From the definitions of a CDF, any F(θ) ∈ F satisfies: 0 ≤F(θ) ≤ 1, F (a) = 0 , and F (b) = 1 (denoted collectively as Assumption (*)). Let f t be the probability density function. Using the formula for a mean, we obtain the following equation: Since we know the standard deviation σ, we have: Since the derivative of S(θ) is equal F(θ), then: To derive the LBF, we consider three cases depending on the value of θ. The first case is a ≤ θ ≤ ξ 1 . In this interval, F(θ) has a minimum of 0. We want to (1) find F(θ) ∈ F that (i) satisfies Equations A.2, A.3,and A.4 as well as (*) and (ii) takes the value of 0 ∀θ ≤ ξ 1 and (2) determine the maximum value of ξ 1 . For ξ 1 to be a maximum, the following has to be true: the maximum area under S(θ) is concentrated in ξ 1 < θ ≤ b. Since S (θ) =F(θ) is non-decreasing, the maximum area will be obtained if F(θ) is constant (or S(θ) is linear) over ξ 1 < θ ≤ b. Using Equation A.3 to compute the area under S(θ), we obtain the following equality: Solving for ξ 1 , we have: Therefore, for x ≤ ξ 1 and x > ξ 1 , The second case is ξ 1 < θ ≤ ξ 2 . Since θ > ξ 1 , then F(θ) = 0. To find the minimum possible F(θ), we need to concentrate the maximum area under F(θ) to the left of θ. This is only possible when F(θ) is constant (similar to the above) and takes different (constant) values, depending on whether the interval is to the left or right of θ. Geometrically, the shape of F (θ) follows a step function. Using Equation A.3 to compute the area under S(θ), we obtain the following equality: Hence, we have: Thus, for a < θ ≤ θ: and, for θ < θ < b, using Equation A.2, we have: Since F (θ) must be non-decreasing, Equation A.7 must be no less than Equation A.6: otherwise, Equations A.2, A.3,and A.4 as well as (*) are not satisfied. We also have another condition, i.e., Equation A.7 must not exceed 1, or: In sum, for ξ 1 < θ ≤ ξ 2 , we obtain: The third case is ξ 2 < θ ≤ b. Since θ > ξ 2 , F (θ ) must be equal to 1 for θ > θ. In contrast to the second case, for θ to the left of θ, F (θ ) can not be strictly constant. To obtain the minimum value of F (θ ), we split the interval into θ ≤ ξ 0 and θ > ξ 0 for some ξ 0 : F (θ ) = 0 for θ ≤ ξ 0 and F (θ ) is constant for θ ≤ ξ 0 . As above, using Equation A.3 to compute the area under S(θ), we obtain the following equality: Therefore, the minimum value of F (θ) is given by: Combining the results of the three cases, we obtain Equation 7. The derivation of the UBF follows identical steps: instead of a minimum, we find the maximum of F(θ) and start the derivation from the maximum value of θ (b) . The derivations of p-boxes for other D follow the same principles, i.e., by ensuring that certain constraints are respected.

A.2 P-box as the tightest bound
In this section,we show that a p-box gives the tightest bounds on the unknown CDF, given D = {a, b, m} as an exemplar of other D. We consider only the LBF as the procedure for the UBF follows the same steps. The LBF for D = {a, b, m} is given by: for θ < m 0.5 for m ≤ θ <b 1 for b≤ θ Let F denote the set of all CDFs (F(θ)), that have {a, b, m}. From the definitions of a CDF, any F(θ) ∈ F satisfies: 0 ≤F(θ) ≤ 1, F (a) = 0 , and F (b) = 1 (denoted collectively as Assumption (*)). We use a proof by contradiction. Suppose that there exist an LBF G(θ) (different from F (θ)) and a UBF G(θ) (different from F (θ)) such that the following is true: Consequently, there exist θ 1 ∈ [a, b] such that the following holds true: F(θ 1 ) < G(θ 1 ) ≤ F (θ 1 ) (A.12) We consider two cases: θ < m and θ ≥ m. From Equation 3, we know F (θ) = 0 for θ < m. Therefore, we can find an θ 1 ∈ [a, m) such that: We pick an F (θ) ∈ F with the following form: for a ≤ θ < θ 1 1 2 G(θ 1 ) + 1−G(θ1) 2(m−θ1) (θ − θ 1 ) for θ 1 ≤ θ < m (A.14) The form is chosen specifically for ensuring F (θ 1 ) = 1 2 G(θ 1 ). Therefore, we have G(θ 1 ) > F(θ 1 ) which contradicts Equation A. 13. In sum, we are able to find F (θ) ∈ F that does not satisfy the inequality in Equation A.11 for the proposed lower bound G(θ) in the interval θ < m. Furthermore, Equation A.11 is violated for any value of G(θ) > 0. Hence, the lower boundary for θ < m can not take positive values which contradicts our assumption about the existence of G(θ) that is different than F (θ).
We follow the same approach for θ ≥ m. From Equation 3,F (θ) = 1 2 . Therefore, we can find an θ 1 ∈ [m, b] such that: The inverse functions of the p-box P a,b,µ are given by: