Scenarios for valuing sample information in natural resources

Uncertainty is ubiquitous in natural resource systems, science and management. Sample data are obtained in order to reduce uncertainty, thereby increasing knowledge and improving resource management, but sampling always comes at a cost of some sort. Is that cost worthwhile? Analysis of the value of sample information (VSI) addresses this question. In this paper we develop the valuation of sample information in terms of five elements: (a) a system whose attributes are the focus of analysis; (b) a range of management actions that affect the system's status; (c) uncertainty about system status or structure, as characterized by initial (prior) probabilities of possible system states or structures; (d) an experiment or other information source that produces new data points and updated (posterior) probabilities; and (e) a value measure that is a function of the management action taken, conditional on either the system state or structure. We describe five scenarios for analysing the VSI under uncertainty about system structure and state. Scenarios 1–3 comprise analyses of conditional, expected and optimal expected values of sample information. They focus primarily on choice of management adaptations with new information. Scenarios 4 and 5 involve pre‐selected management actions, and are useful for comparing designs of data collection rather than for choosing a management action. These last scenarios expand the framework for VSI to include actions that have been selected independently of the updating of uncertainty. We discuss other extensions of VSI analysis, which include spatial applications, hybrid scenarios, applications involving dynamic systems, and a focus on costs rather than net benefits. 
Value of sample information analysis holds promise in emerging areas of ecology such as ecological forecasting and the use of remote sensing in conservation, where potential new data from models and satellites can be evaluated in advance, thereby allowing more efficient prioritization of scientific efforts. More generally, VSI can contribute to better ecological understanding and more effective management in a wide range of ecological situations.


| INTRODUCTION
Analysis of the value of information can be an important aid in determining whether further collection of data is warranted (Canessa et al., 2015), given the time and financial resources required. Costs can be contained by well-designed data collection efforts and field expertise. Benefits can be enhanced by a clear understanding of how the data will aid management decisions. Nevertheless, the issue of potential net benefit from new information remains. This issue is captured in the concept of value of information.
The phrase 'value of information' has been around in one form or another since the early 1960s. In their landmark book Raiffa and Schlaifer (1961) provided one of the first treatments of the value of information, coining the name and developing many of its key expressions. Analysis of the value of information occurs frequently in medicine, economics, and other disciplines (Keisler et al., 2014), and its use is growing in conservation decision-making (Bolam et al., 2019).
Several variants are now well recognized (Yokota & Thompson, 2004), including both analyses of data that are already known, and prospective analyses of the potential worth of new information.
Most treatments of information value in the environmental or ecological literature concern the expected value of perfect information, rather than the value of sample information (see, e.g. Bolam et al., 2019; Keisler et al., 2014). The value of perfect information is useful in providing an upper bound on the value of information (e.g. Moore et al., 2011). However, perfect information (or even perfect partial information) about ecological systems is never achieved, due to the ubiquitous presence of uncertainty about structure, processes and status of a resource system, no matter how extensive the data collection. In this situation, valuation of sample (or imperfect; Clemen & Reilly, 2014) information has great potential for assessing the value of actual data.
In this paper, we focus on the value of sample information, with approaches that range from simpler to more complex. We first describe different forms of the value of information, in terms of perfect information, partial information and sample information. We then present a general framework for the value of sample information, discuss key components and variants of the valuation process, and offer some potential extensions.

| MEASURES OF THE VALUE OF INFORMATION
The value of information measures the potential for new information (monitoring, research, etc.) to improve understanding of a resource system and thereby change its perceived value.
Information value is essentially a comparison of resource value before and after accounting for new information (Howard & Abbas, 2015). Several forms are well established in the literature (Yokota & Thompson, 2004).
The expected value of perfect information represents the value added by the elimination of uncertainty. It measures the difference between an average of optimal values over their likelihoods of occurrence, and optimal value in the presence of uncertainty. The expected value of perfect information is the most prevalent form of the value of information in the environmental literature (Bolam et al., 2019).
The expected value of partial perfect information concerns the value added by information that eliminates one of the several sources of uncertainty. There are some examples of the expected value of partial perfect information in the ecological and environmental literature, but not many (e.g. Davis et al., 2019; Maxwell et al., 2015; Moore & Runge, 2012; Williams & Johnson, 2015a).
The expected value of sample information expresses the potential gain in value resulting from the collection of less-than-perfect information. It is a comparison of the optimal values expected with and without additional information. Though analyses of sample information are fairly common in medical science (Ades et al., 2004), to date there has been relatively little use in the environmental field.
Our emphasis in this paper is on the value of sample information (VSI). Key valuation elements include:
1. a resource system characterized by attributes of either the state of the system or its structure and processes. Examples of system attributes include population size or density, population vital rates, spatial distribution, biodiversity and habitat features;
2. actions that affect the system's status or condition. Examples include selection of hunting limits, introduction or removal of species, habitat manipulation, contaminant clean-up, adaptations to climate change and regulatory actions;
3. a value measure by which to assign value to the outcomes of actions. For example, value could be measured in terms of population survival rate, number of animals, increase in biodiversity, risk abatement, economic profit and opportunity cost;
4. uncertainty about resource status or structure, as characterized by the likelihoods of occurrence of possible system states or system structures and processes. Likelihoods can be identified by statistical assessment, elicitation from experts and other means (Johnson et al., 2017; Runge et al., 2011). Before being updated with new data, likelihoods express uncertainty as prior probabilities in a prior distribution;
5. an 'experiment' that produces data to update uncertainty and improve understanding of the resource. We use the term 'experiment' broadly to include any source of new information.
Experimentation typically produces a new probability distribution for potential data points, which can be combined with the prior distribution to produce posterior probabilities in a posterior distribution.
We note that if there is more than one experiment under consideration, decision-making will include not only the choice of management action a, but also the choice of an experiment e. The addition of experimental alternatives factors into the calculation of VSI (see Section 5.3 below).
In what follows, we assume initially that potential states, actions and hypotheses are finite in number. Later we will discuss the challenges posed by continuous variation and approaches to deal with it. We use the following notation:
• resource status is signified by $x$, with $x_1, \ldots, x_N$ the set of possible resource states;
• a hypothesis about resource structure and function is indexed by $k$ for a set of $K$ alternative hypotheses;
• management action is denoted by $a$, with $a_1, \ldots, a_M$ the set of actions that could potentially be taken.

| UNCERTAINTY
Value of sample information builds on the presence of uncertainty, as expressed by a distribution of prior probabilities, and the potential for its reduction with new information via posterior updating.

| Sources of uncertainty
Two sources of uncertainty are considered here: uncertainty about the system's structure or processes (e.g. additive vs. compensatory mortality, density-dependent mortality vs. density-independent mortality); and uncertainty about the system's state (e.g. population size, density, distribution). Uncertainty about resource structure or function is represented by the distribution $q = q(k)$ of hypothesis-specific probabilities $q(k)$. Uncertainty about the resource state is represented by hypothesis-specific probabilities $p(x|k)$ and the marginal distribution $p(x)$ with probabilities $p(x) = \sum_k p(x|k)\,q(k)$. Uncertainty is reduced through experimentation and sampling, which can identify the system's state more precisely and improve understanding of its process or structure. We use the term 'experiment' generically to describe data resulting from experimental trials, field sampling or access to independent data sources. Let $e$ represent an experiment that produces data about the resource, with $e_1, \ldots, e_n$ a set of potential experiments (possibly consisting of a single experiment) and $f_e(z|x, k)$ the distribution for data $z$ from experiment $e$ with resource status $x$ under hypothesis $k$. The data distribution $f_e(z|x)$ accounts only for resource state $x$, and $f_e(z|k)$ accounts only for hypothesis $k$.
We note that other sources of uncertainty, apart from partial observability of the system state and structural uncertainty, can be readily identified. These include partial controllability and environmental variation (Williams, 2011b), as well as other non-physical sources (e.g. linguistic imprecision, subjective judgment and disagreement (Morgan & Henrion, 1990)). We focus here on partial observability and structural uncertainty because of the ubiquity of these uncertainty sources in ecology and natural resources, and the need to address them in management situations.

| Posterior updating of uncertainty
For uncertainty about the system's state, the prior distribution $p(x)$ of probabilities of resource states can be combined with data probabilities $f_e(z|x)$ from experimentation to produce new posterior probabilities. A well-known approach is Bayesian updating (Lee, 1989), with posterior probabilities calculated by Bayes' theorem:
$$p(x|z) = \frac{f_e(z|x)\,p(x)}{\sum_{x'} f_e(z|x')\,p(x')}.$$
Under uncertainty about the system's structure or process, prior hypothesis probabilities $q(k)$ can be combined with the data probabilities $f_e(z|k)$ to produce the posterior probabilities
$$q(k|z) = \frac{f_e(z|k)\,q(k)}{\sum_{k'} f_e(z|k')\,q(k')}.$$
We note that other methods of posterior updating are possible, including maximum likelihood estimation (Mood et al., 1974) or even the simple replacement of a prior distribution with another distribution from an independent data source. Different updating approaches produce different posterior distributions, and therefore different valuations.
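The discrete form of Bayesian updating can be illustrated with a minimal sketch. The two-state prior and the data probabilities below are hypothetical numbers chosen only for illustration, and the function name `bayes_update` is an assumption rather than notation from the paper.

```python
def bayes_update(prior, likelihood, z):
    """Return posterior probabilities p(x|z) for an observed data point z.

    prior[x] is p(x); likelihood[x][z] is f_e(z|x).
    """
    joint = [p_x * likelihood[x][z] for x, p_x in enumerate(prior)]
    marginal = sum(joint)  # f_e(z) = sum_x f_e(z|x) p(x)
    return [j / marginal for j in joint]


# Hypothetical example: x = 0 (low abundance), x = 1 (high abundance).
prior = [0.7, 0.3]
likelihood = [[0.8, 0.2],   # f_e(z|x = 0) for z = 0, 1
              [0.2, 0.8]]   # f_e(z|x = 1) for z = 0, 1
posterior = bayes_update(prior, likelihood, z=1)
print(posterior)
```

The same function applies to structural uncertainty if `prior` holds the hypothesis probabilities q(k) and `likelihood[k][z]` holds f_e(z|k).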
Although updating with discrete distributions can often be handled by summation over discrete states and hypotheses, as above, many natural resource problems involve continuous distributions.
Obvious examples include resource states like animal density and vital rates such as survival and reproduction, which can vary across a continuous range of potential values. Continuous distributions present computational challenges. With Bayesian updating, an important simplification that greatly reduces these challenges involves conjugate distributions of the data and prior probabilities (Hobbs & Hooten, 2015), which produce a posterior distribution differing from the prior only in terms of its distribution parameters.
To illustrate, assume a normal distribution for the system state, $x \sim N(\mu, \sigma^2)$, and a lognormal distribution for the data, $z \sim \text{lognormal}(x, \sigma_0^2)$. Because the lognormal and normal distributions are conjugate (Hobbs & Hooten, 2015), Bayesian updating produces another normal distribution, $(x|z) \sim N(\mu_1, \sigma_1^2)$, with
$$\mu_1 = \frac{\mu/\sigma^2 + \log(z)/\sigma_0^2}{1/\sigma^2 + 1/\sigma_0^2}$$
and
$$\sigma_1^2 = \left(\frac{1}{\sigma^2} + \frac{1}{\sigma_0^2}\right)^{-1}.$$
The conjugate relationship thus allows one to avoid more burdensome analytical or numerical methods that otherwise would be needed to determine a posterior distribution (see Appendix). Exact expressions of information value for this uncertainty structure are found in, for example, Bickel (2008) and Bhattacharjya et al. (2013).
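As a minimal sketch of the conjugate update, the function below assumes a single data point z, a known log-scale data variance, and the standard precision-weighted form of the normal posterior. The function name and the numerical inputs are hypothetical.

```python
import math


def normal_lognormal_update(mu, var, z, var0):
    """Posterior mean and variance of x for x ~ N(mu, var), z ~ lognormal(x, var0).

    Assumes a single observation z > 0 and a known log-scale variance var0.
    """
    precision = 1.0 / var + 1.0 / var0   # posterior precision is the sum of precisions
    var1 = 1.0 / precision
    mu1 = var1 * (mu / var + math.log(z) / var0)  # precision-weighted mean
    return mu1, var1


# Hypothetical inputs: prior N(2, 1), one observation with log(z) = 3, var0 = 1.
mu1, var1 = normal_lognormal_update(mu=2.0, var=1.0, z=math.exp(3.0), var0=1.0)
print(mu1, var1)
```

With equal prior and data variances, the posterior mean falls midway between the prior mean and log(z), and the posterior variance is half the prior variance.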

| VALUATION
A value function representing resource valuation includes arguments that reflect the factors influencing value. In particular, value can be expressed in terms of management actions as well as attributes of the resource system like abundance, diversity and spatial extent. In addition, it may reflect hypothesized structures and processes of the resource system, such as species interactions, community structure, environmental influences and movement patterns. A generic expression of value accounts for the costs of experimentation as well as the net benefits of management:
$$V_k(a, x, e, z) = R_k(a, x) - C_k(a, x) - C(e, z) = B_k(a, x) - C(e, z),$$
with returns $R_k(a, x)$, costs $C_k(a, x)$ and net benefits $B_k(a, x) = R_k(a, x) - C_k(a, x)$ for action $a$, along with costs $C(e, z)$ for experiment $e$. Because environmental returns and costs often are not expressed in the same units of value, it may be necessary to express returns in units of economic value. For example, it may be possible to describe the biological benefit of population augmentation in terms of dollars, for comparison with the dollar cost of the enhancement. More generally, returns and costs often can be described in terms of 'utility' (Clemen & Reilly, 2014).
Several variations of valuation can be distinguished:
• The resource structure is known but its state is uncertain. Then the structural index $k$ in $V_k(a, x, e, z)$ can be treated as a fixed parameter, and value can be represented by $V(a, x, e, z) = B(a, x) - C(e, z)$.
• The resource state is known but structure is uncertain. Then state $x$ can be treated as a fixed parameter, and value can be represented by $V_k(a, e, z) = B_k(a) - C(e, z)$.
• The value function may omit costs and record only returns, $V_k(a, x, e, z) = R_k(a, x)$. Then the focus of decision-making is on increasing the returns to management actions. This is the usual way of measuring value in an analysis of the value of information; overall costs are typically considered only after determining marginal value based on returns (see Section 7.3 for further discussion).
• The value function may omit returns and record only costs, $V_k(a, x, e, z) = -[C_k(a, x) + C(e, z)]$. In this case the decision focus is on choices that can reduce costs.
Because the value function just described is influenced both by actions as well as uncertainty factors (state x and structure k), it may be thought of as a bivariate function, in which the action variable and uncertainty distribution are at least partially independent. In this context one can hold either one or the other variable constant, and investigate not only how average values vary with actions for a given uncertainty distribution, but also how average values vary with uncertainty distributions for a particular action. The latter approach offers an opportunity to expand the concept of value of information itself, as discussed in Sections 5.4 and 5.5.

| SCENARIOS FOR THE VALUE OF SAMPLE INFORMATION
In this section, we describe five scenarios for the value of sample information that cover a range of complexity. Each scenario is described in terms of the common elements listed in Section 2, that is, a resource system characterized by attributes of its state or structure, uncertainty about the resource system, management action(s), a value function and new 'experimental' data. To simplify notation, we initially assume uncertainty about the resource system state, as represented by a prior distribution p = p(x) and a value function V(a, x).
We discuss uncertainty about system structure in a later section.
The first three scenarios (1-3) represent standard metrics of the value of sample information-conditional value, expected value and maximum expected value of sample information, as described in, for example, Raiffa and Schlaifer (1961). The last two scenarios (4-5) represent logical extensions of the standard mathematical framework.
In scenarios 1-3, we consider analyses of values that involve the collection of data to inform future management actions. This is the purview of traditional VSI, where decision-making is directly influenced by the data that are collected (Figure 1). The posterior values, which depend on actions that are directly influenced by new data, are compared to prior values that are determined in the absence of new data. As shown below, one option is to collect the data first, determine optimal actions with the data in hand, and do a posterior assessment of value (scenario 1). Another is to do a pre-posterior assessment that accounts for stochastic data results, before any data are actually collected (scenarios 2-3). With either option, the point is to use data, whether actual or potential, as a basis for future decision-making.
Alternatively, a retrospective valuation involves management actions that have been pre-selected, independent of any new data collection (scenarios 4-5). Though valuation again compares posterior values based on new data to prior values in the absence of new data, the approach differs from a prospective assessment in that management actions are not influenced by the data (Figure 3).
Instead, the perspective shifts to experimental design. A retrospective assessment looks back at decisions already made, and assesses value based on the impact the new data have on the posterior probabilities. The difference between posterior values based on new data and prior values in the absence of new data is useful for guiding the design of data collection, rather than for selecting management actions. Although this approach is uncommon in ecology (but see Bal et al., 2018), it occurs more frequently in medical science (e.g. Willan & Eckermann, 2010; Willan & Pinto, 2005).

| Scenario 1-Conditional value of sample information
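In scenario 1, the prior distribution $p(x)$ is used to average the value function over system states, $V(a) = \sum_x V(a, x)\,p(x)$, and the maximum average value over actions is $V^* = \max_a V(a) = V(a^*)$. A data point $z$ is then used to update the prior to the posterior distribution $p(x|z)$, which gives the average value $V_z(a) = \sum_x V(a, x)\,p(x|z)$ and the maximum $V^*_z = \max_a V_z(a) = V_z(a^*_z)$, where the subscript refers to the data point $z$ upon which uncertainty updating is based. The difference between these two maximum average values, $CVSI_z = V^*_z - V^*$, is the conditional value of sample information (conditional on the particular data point $z$ (Raiffa & Schlaifer, 1961)).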
This measure is in effect a comparison of the highest value achievable with the additional information z, versus the highest value achievable without it.
A simple example of scenario 1 is the initiation of a new survey of the extent and severity of climate change in a region, followed by the use of survey results to refine a climate adaptation strategy that was previously thought to be optimal on the basis of earlier information.
The difference between the optimal valuations expresses the gain in value from the new survey information.
Scenario 1 is an example of a posterior assessment, in that it is based on specific data z to calculate posterior probabilities. The reference to specific data makes the resulting conditional value of sample information useful for valuation with data known already. However, such a conditional valuation is not particularly useful for determining the benefit of new information before any data are collected. To make the latter determination, we need an analysis that considers the full set of possible data points, as described in scenarios 2 and 3.
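A minimal sketch of the scenario 1 calculation follows, assuming a discrete problem with a value table `value_table[a][x]` and data probabilities `likelihood[x][z]`. All numbers and function names are hypothetical and serve only to show the order of operations: average over the prior, maximize, update, average over the posterior, maximize again, and take the difference.

```python
def expected_value(values, probs):
    """Average value of one action: sum_x V(a, x) p(x)."""
    return sum(v * p for v, p in zip(values, probs))


def best_value(value_table, probs):
    """Maximum average value over actions: max_a sum_x V(a, x) p(x)."""
    return max(expected_value(row, probs) for row in value_table)


def posterior(prior, likelihood, z):
    """Bayes' theorem for discrete states: p(x|z) is proportional to f(z|x) p(x)."""
    joint = [p * likelihood[x][z] for x, p in enumerate(prior)]
    total = sum(joint)
    return [j / total for j in joint]


def cvsi(value_table, prior, likelihood, z):
    """Conditional value of sample information for the observed data point z."""
    return (best_value(value_table, posterior(prior, likelihood, z))
            - best_value(value_table, prior))


# Hypothetical problem: action 0 is a safe option, action 1 pays off only in state 1.
value_table = [[4.0, 4.0],
               [0.0, 10.0]]
prior = [0.7, 0.3]
likelihood = [[0.8, 0.2],
              [0.2, 0.8]]
cvsi_z0 = cvsi(value_table, prior, likelihood, z=0)
cvsi_z1 = cvsi(value_table, prior, likelihood, z=1)
print(cvsi_z0, cvsi_z1)
```

In this example the data point z = 0 leaves the optimal action unchanged, so it adds essentially no value, whereas z = 1 switches the optimal action and yields a positive conditional value.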

| Scenario 2-Expected value of sample information
Scenario 2 builds directly on scenario 1, by recognizing a range of data-specific CVSIs produced in scenario 1 and calculating an average CVSI over the data values (Figure 1). The result is the expected value of sample information,
$$EVSI_e = E_e[CVSI_z] = \sum_z CVSI_z\, f_e(z),$$
where the subscript $e$ on the expectation indexes the experiment producing $f_e(z)$. The sequence of calculations for $EVSI_e$ is highlighted in Figure 2 in terms of decision trees (Clemen & Reilly, 2014), for a problem with two states, two decisions and two experimental outcomes.
An example of scenario 2 might involve a wildlife monitoring plan, and corresponding data distribution, designed to reduce uncertainty about population growth, so as to enhance population sustainability while at the same time allowing hunting. $EVSI_e$ measures the enhancement that is expected to result from the monitoring effort.
In contrast to scenario 1, scenario 2 is a pre-posterior assessment that does not require data to be known in advance. By averaging across potential data points, an expected value of sample information can be calculated before any new data are actually collected. Thus, $EVSI_e$ allows one to consider the potential benefit of new information in deciding whether to proceed with data collection.
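The pre-posterior averaging can be sketched as follows, assuming discrete states and data and a marginal data distribution f_e(z) = sum_x f_e(z|x) p(x). The value table, prior and data probabilities are hypothetical, as are the function names.

```python
def best_value(value_table, probs):
    """Maximum average value over actions: max_a sum_x V(a, x) p(x)."""
    return max(sum(v * p for v, p in zip(row, probs)) for row in value_table)


def evsi(value_table, prior, likelihood):
    """Expected value of sample information: sum_z CVSI_z f_e(z)."""
    prior_best = best_value(value_table, prior)
    total = 0.0
    for z in range(len(likelihood[0])):
        joint = [p * likelihood[x][z] for x, p in enumerate(prior)]
        f_z = sum(joint)              # marginal probability of data point z
        if f_z == 0.0:
            continue                  # impossible data points contribute nothing
        post = [j / f_z for j in joint]
        total += f_z * (best_value(value_table, post) - prior_best)
    return total


# Hypothetical two-state, two-action, two-outcome problem.
value_table = [[4.0, 4.0],
               [0.0, 10.0]]
prior = [0.7, 0.3]
likelihood = [[0.8, 0.2],
              [0.2, 0.8]]
evsi_e = evsi(value_table, prior, likelihood)
print(evsi_e)
```

No data point is supplied to `evsi`; the calculation uses only the prior and the data distribution, which is what makes the assessment possible before any sampling takes place.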

| Scenario 3-Maximum expected value of sample information
Scenario 3 is also a pre-posterior valuation, which builds directly on scenario 2 by considering the expected value of sample information for multiple experiments, each generating its own data distribution $f_e(z)$ (Figure 1). A comparison across experiments allows one to identify the experiment with maximum expected value of sample information:
$$EVSI_{e^*} = \max_e EVSI_e,$$
with the optimal experiment
$$e^* = \arg\max_e EVSI_e.$$
It is through such a comparative evaluation of different experiments that the expected value of sample information can play a role in sampling and experimental design.
An example of scenario 3 is the selection of a sampling design for climate change mitigation involving several potential activities. Each sampling design and mitigation action is expected to produce a distinct distribution of carbon dioxide emissions, and a comparison of $EVSI_e$ values across designs allows one to choose the design that on average will yield the optimal reduction of carbon dioxide emissions.
Each of the foregoing scenarios (1-3) involves the comparative evaluation of actions in determining optimal valuations, with optimal actions $a^*$ and $a^*_z$ that are based on the distributions $p(x)$ and $p(x|z)$. That is, the actions selected are influenced by prior and posterior uncertainties. A further topic for consideration is the influence of uncertainty when actions are selected independently, as in scenarios 4 and 5.
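A comparison of experiments in the sense of scenario 3 can be sketched by computing the expected value of sample information for each candidate and taking the maximum. The two experiments below ("A" and "B") are hypothetical and differ only in the accuracy of their data; experimental costs are left out of this sketch, which is a simplifying assumption.

```python
def best_value(value_table, probs):
    """Maximum average value over actions: max_a sum_x V(a, x) p(x)."""
    return max(sum(v * p for v, p in zip(row, probs)) for row in value_table)


def evsi(value_table, prior, likelihood):
    """Expected value of sample information for one experiment."""
    prior_best = best_value(value_table, prior)
    total = 0.0
    for z in range(len(likelihood[0])):
        joint = [p * likelihood[x][z] for x, p in enumerate(prior)]
        f_z = sum(joint)
        if f_z > 0.0:
            post = [j / f_z for j in joint]
            total += f_z * (best_value(value_table, post) - prior_best)
    return total


value_table = [[4.0, 4.0],
               [0.0, 10.0]]
prior = [0.7, 0.3]
# Hypothetical experiments, each with its own data probabilities f_e(z|x).
experiments = {
    "A": [[0.8, 0.2], [0.2, 0.8]],
    "B": [[0.9, 0.1], [0.1, 0.9]],
}
evsi_by_experiment = {e: evsi(value_table, prior, f) for e, f in experiments.items()}
best_experiment = max(evsi_by_experiment, key=evsi_by_experiment.get)
print(evsi_by_experiment, best_experiment)
```

If experiment-specific costs C(e, z) were included in the value function, the more accurate experiment would not necessarily be the optimal one.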

| Scenario 4-Value of sample information with pre-selected actions
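In scenarios 4 and 5 the emphasis shifts from management to data collection. With scenarios 1-3, the focus was primarily on management actions, and the adaptation of management with new information. In the next two scenarios, the emphasis is on data collection design. Thus, scenarios 4 and 5 involve pre-selected management actions that are not influenced by the data (Figure 3). With a pre-selected prior action $a_1$ and a pre-selected posterior action $a_2$, the prior value is $V(a_1) = \sum_x V(a_1, x)\,p(x)$, the posterior value is $V_z(a_2) = \sum_x V(a_2, x)\,p(x|z)$, and the conditional value of sample information is $CVSI_z(a_1, a_2) = V_z(a_2) - V(a_1)$. Valuation again compares posterior values based on new data to prior values in the absence of new data, but the difference between values is useful for guiding the design of data collection rather than selecting management actions. Averaging $CVSI_z(a_1, a_2)$ over the distribution $f_e(z)$ produces a pre-posterior assessment of value for experiment $e$, and maximizing over experiments identifies an optimal data collection design $e^*$ (Figure 3). The value function for scenario 4 is the same function $V(a, x)$ used in scenarios 1-3. The only difference is that the action drivers, which can differ between pre- and post-valuations just as they do in scenarios 1-3, are given a priori and are not subject to optimal selection.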
Obviating the selection of actions affects the actual value produced, but does not change the meaning of the value function itself. The same value functions that apply in scenarios 1-3 also can apply to scenario 4.
The use of two pre-selected actions in scenario 4 means that $CVSI_z(a_1, a_2)$ expresses not only the influence of new data, but also the influence of a change in management. To highlight the marginal value of the data alone, one can use identical prior and posterior actions.
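A minimal sketch of the conditional calculation with pre-selected actions is given below. It assumes that the prior value is the average of V(a_1, x) under p(x) and the posterior value is the average of V(a_2, x) under p(x|z), with no maximization over actions. The numbers and function names are hypothetical.

```python
def expected_value(values, probs):
    """Average value of one action under a probability distribution."""
    return sum(v * p for v, p in zip(values, probs))


def posterior(prior, likelihood, z):
    """Bayes' theorem for discrete states."""
    joint = [p * likelihood[x][z] for x, p in enumerate(prior)]
    total = sum(joint)
    return [j / total for j in joint]


def cvsi_fixed(value_table, prior, likelihood, z, a1, a2):
    """Posterior value of fixed action a2 minus prior value of fixed action a1."""
    prior_value = expected_value(value_table[a1], prior)
    posterior_value = expected_value(value_table[a2], posterior(prior, likelihood, z))
    return posterior_value - prior_value


value_table = [[4.0, 4.0],
               [0.0, 10.0]]
prior = [0.7, 0.3]
likelihood = [[0.8, 0.2],
              [0.2, 0.8]]
# Different prior and posterior actions: data effect plus management change.
changed = cvsi_fixed(value_table, prior, likelihood, z=1, a1=0, a2=1)
# Identical actions: the difference reflects the data alone.
same = cvsi_fixed(value_table, prior, likelihood, z=1, a1=1, a2=1)
print(changed, same)
```

Because the actions are given rather than optimized, the result can be negative for some data points, for example when the data make the pre-selected posterior action look worse than it did under the prior.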

| Scenario 5-Value of sample information with a single pre-selected action
Scenario 5 involves a system under fixed (or no) management, in which new information about the system's status alters the measurement of its economic or other value, without the influence of a change in management. In this context, $a_1 = a_2 = a$ and the value of sample information represents a change in value occasioned by auxiliary information alone. Because new information (but not new management) is used to change the value, this scenario is essentially an efficiency analysis (Back et al., 2007). It may also be thought of as an assessment of the loss in value if fewer data were available.
[Figure 3 caption (fragment): ... $(a_1, a_2)$, $EVSI_e(a_1, a_2)$ and $EVSI_{e^*}(a_1, a_2)$. Scenario 5: Computations are identical with those for scenario 4, except that prior and posterior actions are identical: $a_1 = a_2 = a$.]
As before, scenario 5 involves updating a distribution of prior probabilities with new data to produce a distribution of posterior probabilities (Figure 3). [...] an optimal sampling design for investigating contaminated lands that are subject to remediation, and measured value in terms of estimator precision (see Table 1). In medical applications, Willan and Pinto (2005) and Willan and Eckermann (2010) used this type [...]
Scenario 5 is the simplest of the five scenarios in our spectrum, in that it focuses exclusively on the effect of data in reducing uncertainty and changing value. The contrast of values nonetheless produces a value of sample information that is useful for designing a data-collection effort, even though both prior and posterior values involve the same (or no) management action. Though valuation in this scenario focuses on data collection per se, we emphasize that in ecological applications, the value function is usually tied to conservation objectives rather than to statistical performance as such.

| UNCERTAINTY ABOUT SYSTEM STRUCTURE
Thus far we have considered VSI in terms of uncertainty about system state. The treatment of uncertainty about system structure follows the same logic as with system state, except the reference to state x in the foregoing development is replaced by a reference to hypothesis k. The following steps produce scenarios for the five variants of sample information under uncertainty about system structure rather than system state.
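A sketch of how the substitution of hypothesis $k$ for state $x$ might read for scenarios 1-3 is given below. It assumes a value function $V_k(a)$ with the state treated as fixed, prior hypothesis probabilities $q(k)$ and data probabilities $f_e(z|k)$; the expressions are obtained by analogy with the state-based scenarios and are not quoted from the original.

```latex
\begin{align*}
V(a) &= \sum_k V_k(a)\, q(k), & V^* &= \max_a V(a), \\
q(k \mid z) &= \frac{f_e(z \mid k)\, q(k)}{\sum_{k'} f_e(z \mid k')\, q(k')}, &
f_e(z) &= \sum_k f_e(z \mid k)\, q(k), \\
V_z(a) &= \sum_k V_k(a)\, q(k \mid z), & V^*_z &= \max_a V_z(a), \\
CVSI_z &= V^*_z - V^*, & EVSI_e &= \sum_z CVSI_z\, f_e(z), \\
EVSI_{e^*} &= \max_e EVSI_e.
\end{align*}
```

Under the same assumptions, the pre-selected-action variants replace the maximizations with fixed actions $a_1$ and $a_2$, giving $CVSI_z(a_1, a_2) = V_z(a_2) - V(a_1)$.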

| EXTENSIONS OF THE FRAMEWORK
There are a number of extensions for the scenario framework we described earlier, including hybrids that combine features of different scenarios; spatial applications; the incorporation of system dynamics; valuation based on costs rather than net benefits; trade-offs among system attributes; and other extensions.

| Hybrid scenarios
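There are hybrid scenarios that combine optimal and pre-selected actions. In HYBRID($a_1$), a pre-selected prior action $a_1$ with valuation $V(a_1)$ is combined with optimal posterior management, so that $VSI_{HYBRID(a_1)} = V^*_z - V(a_1)$. In HYBRID($a_2$), optimal prior management with valuation $V^*$ is combined with a pre-selected posterior action $a_2$ with valuation $V_z(a_2)$, so that $VSI_{HYBRID(a_2)} = V_z(a_2) - V^*$. Though less useful, this situation might apply when long-standing management that was previously regarded as optimum is adjusted at the direction of an oversight authority, which also mandates follow-up monitoring.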
It is straightforward to show that $VSI_{HYBRID(a_2)} < CVSI_z < VSI_{HYBRID(a_1)}$.
The inequality $CVSI_z < VSI_{HYBRID(a_1)}$ indicates that the replacement of optimal with fixed (suboptimal) prior actions produces greater information value than $CVSI_z$. On the other hand, the replacement of optimal with fixed posterior management reduces the information value, $VSI_{HYBRID(a_2)} < CVSI_z$. In particular, management corresponding to HYBRID($a_2$) is inferior to HYBRID($a_1$), in that $VSI_{HYBRID(a_2)} < VSI_{HYBRID(a_1)}$.
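The ordering of the hybrid values can be checked numerically with a small sketch. It assumes that HYBRID(a_1) pairs a fixed prior action with optimal posterior management, and that HYBRID(a_2) pairs optimal prior management with a fixed posterior action; the problem data are hypothetical, and the fixed actions are deliberately chosen to be suboptimal.

```python
def expected_value(values, probs):
    """Average value of one action under a probability distribution."""
    return sum(v * p for v, p in zip(values, probs))


def best_value(value_table, probs):
    """Maximum average value over actions."""
    return max(expected_value(row, probs) for row in value_table)


def posterior(prior, likelihood, z):
    """Bayes' theorem for discrete states."""
    joint = [p * likelihood[x][z] for x, p in enumerate(prior)]
    total = sum(joint)
    return [j / total for j in joint]


value_table = [[4.0, 4.0],
               [0.0, 10.0]]
prior = [0.7, 0.3]
likelihood = [[0.8, 0.2],
              [0.2, 0.8]]
post = posterior(prior, likelihood, z=1)

v_star = best_value(value_table, prior)      # optimal prior value
v_star_z = best_value(value_table, post)     # optimal posterior value
cvsi_z = v_star_z - v_star

a1 = 1  # fixed prior action, suboptimal under the prior
a2 = 0  # fixed posterior action, suboptimal under the posterior for z = 1
vsi_hybrid_a1 = v_star_z - expected_value(value_table[a1], prior)
vsi_hybrid_a2 = expected_value(value_table[a2], post) - v_star
print(vsi_hybrid_a2, cvsi_z, vsi_hybrid_a1)
```

If a fixed action happened to coincide with the optimal one, the corresponding inequality would hold with equality rather than strictly.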

| Spatial applications
In our development thus far we have considered uncertain states with a prior probability distribution p(x), and data z related to the state with conditional distribution f(z|x). The state and data distributions can be combined via Bayes' theorem to produce a posterior distribution p(x|z) showing the influence of the data on the state distribution. The value of sample information in the scenarios described earlier is calculated as a comparison of value averaged over the posterior and prior distributions.
An extension of VSI in a spatial context involves additional structure imposed on the states, data and their distributions. Here we consider a spatial region that is divided into areas $s_i$, $i = 1, \ldots, n$, such that the state $x_i = x(s_i)$ for area $s_i$ is uncertain with distribution $p(x_i)$.
With $n$ area-specific states, the system state of the region is represented collectively by a vector $\mathbf{x}' = (x_1, \ldots, x_n)$. Spatial structure is characterized by a joint probability distribution $p(\mathbf{x})$ with a multivariate mean $\boldsymbol{\mu}' = (\mu_1, \ldots, \mu_n)$, and a covariance matrix $\Sigma$ of variances $\sigma_i^2$ for state $x_i$ and covariances between states $x_i$ and $x_j$ that may vary with the distance between the areas. Such a framework is especially relevant to remote sensing of landscape features, where multispectral data are recorded over a pixelated area and pixel data exhibit patterns of correlation depending on their proximate locations in the area.
A specific example might consist of a forested region that is gridded into areas $s_i$ containing an uncertain amount of woody biomass $x_i$.
A survey of the region produces data $z_i$ for some (perhaps all) of the areas, which can be used to update the distribution for the region to $p(\mathbf{x}|\mathbf{z})$. A value function $V(a, \mathbf{x}) = \sum_i V(a, x_i)$ aggregates the net profit from timber harvest over the areas, with harvest actions that include clear cutting ($a_1$), moderate thinning ($a_2$), or heavy thinning ($a_3$).
Computing the value of sample information for this situation follows the same general pattern as shown in Figure 1, except that averaging to produce $V(a)$ and $V_z(a)$ is complicated by the multivariate nature of the states and data over the region (Bhattacharjya et al., 2013; Bickel, 2008). Assume for example a prior distribution for states over the region that is multivariate normal, $\mathbf{x} \sim N(\boldsymbol{\mu}, \Sigma)$, with data that are linked to the states by $z_i = x_i + \varepsilon_i$, $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 I)$.
A design $F$ for data collection produces a marginal distribution $\mathbf{z} \sim N(F\boldsymbol{\mu}, F \Sigma F' + \sigma^2 I)$ for the data, along with the multivariate normal posterior distribution
$$\mathbf{x}|\mathbf{z} \sim N\left(\boldsymbol{\mu} + \Sigma F'(F \Sigma F' + \sigma^2 I)^{-1}(\mathbf{z} - F\boldsymbol{\mu}),\; \Sigma - \Sigma F'(F \Sigma F' + \sigma^2 I)^{-1} F \Sigma\right)$$
for the states (Graybill, 1976). The prior and posterior state distributions are used in $V(a)$ and $V_z(a)$ to produce $CVSI_z$, and averaging the $CVSI_z$ values over the marginal distribution of $\mathbf{z}$ yields $EVSI_e$ for the spatial problem.
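The effect of spatial covariance on updating can be seen in a minimal two-area sketch, in which only area 1 is surveyed. It assumes a bivariate normal prior and an additive normal measurement error, and uses the standard Gaussian conditioning formulas written out in scalar form; all numbers and names are hypothetical.

```python
def update_two_areas(mu, var, cov, z1, noise_var):
    """Posterior for two correlated areas when only area 1 is observed.

    mu = (mu1, mu2) and var = (var1, var2) are prior means and variances,
    cov is the prior covariance, and z1 = x1 + error with variance noise_var.
    """
    data_var = var[0] + noise_var            # marginal variance of z1
    gain1 = var[0] / data_var                # weight of the data for area 1
    gain2 = cov / data_var                   # weight of the data for area 2
    innovation = z1 - mu[0]
    post_mu = (mu[0] + gain1 * innovation, mu[1] + gain2 * innovation)
    post_var = (var[0] - gain1 * var[0], var[1] - gain2 * cov)
    post_cov = cov - gain1 * cov
    return post_mu, post_var, post_cov


# Hypothetical woody biomass example with positively correlated neighbouring areas.
post_mu, post_var, post_cov = update_two_areas(
    mu=(10.0, 12.0), var=(4.0, 4.0), cov=2.0, z1=14.0, noise_var=4.0)
print(post_mu, post_var, post_cov)
```

Although area 2 is not surveyed, its mean shifts and its variance shrinks because of the prior covariance with area 1, which is why the choice of design matters for information value in spatial problems.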
This formulation includes a number of simplifying assumptions about the value function, state and data distributions, and management alternatives. Even with the simplifications, computing EVSI can be challenging depending on the structure of the covariance matrix and the design used for data collection. There are only a limited number of ecological and environmental applications of the value of sample information in a spatial context. Examples that incorporate spatial covariance structure include weed management (Wiles, 2004), salmon farming (Forsberg & Guttormsen, 2006) and forest information (Kangas, 2010). Eidsvik et al. (2015) provide a detailed technical review of spatial VSI, aimed primarily at the geosciences. Again, we note the pertinence to VSI of spatial patterns of covariation in multispectral data in remotely sensed images of landscapes.

| Dynamic systems
Value of sample information can be extended to include fluctuating environmental conditions and actions over time, in which value is based on system responses to a management strategy rather than to single actions (Williams, 2011a). With uncertainty about system structure, the optimum average value is equivalent to active adaptive optimization (Williams & Johnson, 2018). The computation of VSI requires an experiment to get the distribution of data needed for $q(k|z)$ (Williams, 2015). With uncertainty about system state under Markovian state transitions (Puterman, 1994), optimal average value is equivalent to analysis of partially observable Markov decision processes (POMDP) (Kaelbling et al., 1998; Williams, 2009). The comparison of prior and posterior values is then a comparison of POMDP solutions under $p(x)$ and $p(x|z)$. Examples in fish and wildlife biology that allow for value of information under partial observability include fishermen's decision-making (Lane, 1989), seabird habitat management (Tomberlin, 2010) and resolution of structural uncertainty (Williams, 2011a). Again, the computation of VSI requires an experiment to get the distribution of data needed for $p(x|z)$ (Williams, 2015).
A somewhat simpler framework for iterative decision-making involves sequential experimentation and a stopping rule for its termination (James & Gorelick, 1994). In this situation an initial prior probability distribution and set of potential experiments are used to generate $\mathrm{EVSI}_{e^*}$. A positive $\mathrm{EVSI}_{e^*}$ suggests that the gain in knowledge from experimentation will produce a positive net benefit, so a decision is made to conduct the optimal experiment. The resulting experimental data z are used to update uncertainty about system structure or status, which then becomes a new prior probability distribution for use in calculating a new $\mathrm{EVSI}_{e^*}$. This process is repeated as long as $\mathrm{EVSI}_{e^*}$ is positive. When a negative $\mathrm{EVSI}_{e^*}$ is produced (e.g. the data z are statistical outliers that obscure the actual system state and lower the average posterior value), further experimentation is stopped.
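The stopping rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of James & Gorelick (1994): there is a single candidate experiment, the states and data are discrete, the experiment has a fixed cost that is subtracted from EVSI, and the data are simulated from a known true state. The function names (`net_evsi`, `sequential_sampling`) and all numbers are hypothetical.

```python
import random

def action_values(p, V):
    """V(a) = sum_x p(x) V(a, x) for each action a (rows of V)."""
    return [sum(px * v for px, v in zip(p, row)) for row in V]

def posterior(p, lik_z):
    """Bayes update p(x|z), with lik_z[x] = P(z | x)."""
    joint = [l * px for l, px in zip(lik_z, p)]
    total = sum(joint)
    return [j / total for j in joint]

def net_evsi(p, V, lik, cost):
    """E_z[max_a V_z(a)] - max_a V(a) - cost (cost term is an assumption)."""
    prior_value = max(action_values(p, V))
    posterior_value = 0.0
    for lik_z in lik:                          # lik[z][x] = P(z | x)
        pz = sum(l * px for l, px in zip(lik_z, p))
        if pz > 0:
            posterior_value += pz * max(action_values(posterior(p, lik_z), V))
    return posterior_value - prior_value - cost

def sequential_sampling(p, V, lik, cost, true_state, rng, max_rounds=50):
    """Repeat the experiment while net EVSI is positive."""
    rounds = 0
    while rounds < max_rounds and net_evsi(p, V, lik, cost) > 0:
        # Simulate one datum z from the (hypothetical) true state.
        weights = [lik_z[true_state] for lik_z in lik]
        z = rng.choices(range(len(lik)), weights=weights)[0]
        p = posterior(p, lik[z])               # posterior becomes the new prior
        rounds += 1
    return p, rounds

# Hypothetical problem: 2 actions (rows) x 2 states (columns), binary datum.
V = [[10.0, 0.0], [4.0, 6.0]]
LIK = [[0.8, 0.2], [0.2, 0.8]]
belief, n = sequential_sampling([0.5, 0.5], V, LIK, 0.2, 0, random.Random(1))
print(n, [round(b, 3) for b in belief])
```

The `max_rounds` guard is a practical safeguard added for the sketch; the rule described in the text stops only on the sign of EVSI.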

| Cost analysis
In many instances it is useful to reframe VSI in terms of minimizing costs rather than maximizing benefits. For example, Bennett et al. (2018) and Bal et al. (2018) considered sampling costs versus the information value of different monitoring schemes under a limited budget.
In Section 4 we described a generic value function in terms of costs and benefits, which includes costs C(e, z) for experiment e as well as net management benefits $B_k(a, x)$, that is, returns $R_k(a, x)$ less management costs $C_k(a, x)$. By omitting returns $R_k(a, x)$, the value function records only the costs of management and experimentation, and the decision focus is on actions and experiments that can reduce costs.
To illustrate, consider the expected value of sample information in scenario 2, in which CVSIs are averaged over the distribution of data values z. As shown in Figure 2, the optimal actions $a^*$ and $a_z^*$ in $\mathrm{CVSI}_z$ are obtained by maximizing V(a) and $V_z(a)$ over the potential actions to get the net benefits $V^* = V(a^*)$ and $V_z^* = V_z(a_z^*)$. If the value function accounts only for the costs $C_k(a, x)$ and C(e, z), optimal decision-making seeks to minimize these costs: $V^* = \min_a V(a)$ and $V_z^* = \min_a V_z(a)$. All other calculations for $\mathrm{EVSI}_z$ are the same as described in Section 5.2 and shown in Figure 1.
Thus, a focus on costs follows the same logic as with net benefits, except that costs are minimized, whereas net benefits are maximized. Because minimizing a cost is equivalent to maximizing its negative, both approaches can be described with maximization as in Section 5.1.
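The equivalence of the two framings can be checked numerically. The sketch below assumes a discrete two-state, two-action problem with hypothetical expected costs and likelihoods. It also assumes a sign convention in which the cost-based value of information is reported as an expected cost reduction. The function names are illustrative only.

```python
def expected(p, table):
    """Expected entry of each row of table[a][x] under probabilities p(x)."""
    return [sum(px * t for px, t in zip(p, row)) for row in table]

def preposterior(p, table, lik, pick):
    """Average over data z of the best (per `pick`) posterior expectation."""
    total = 0.0
    for lik_z in lik:                          # lik[z][x] = P(z | x)
        joint = [l * px for l, px in zip(lik_z, p)]
        pz = sum(joint)
        if pz > 0:
            post = [j / pz for j in joint]     # Bayes update p(x|z)
            total += pz * pick(expected(post, table))
    return total

def evsi_from_costs(p, C, lik):
    """Expected cost reduction: min_a C(a) - E_z[min_a C_z(a)]."""
    return min(expected(p, C)) - preposterior(p, C, lik, min)

def evsi_from_values(p, V, lik):
    """Maximization form: E_z[max_a V_z(a)] - max_a V(a)."""
    return preposterior(p, V, lik, max) - max(expected(p, V))

# Hypothetical costs C[a][x]; the value function is their negative.
C = [[2.0, 9.0], [5.0, 4.0]]
V = [[-c for c in row] for row in C]
LIK = [[0.9, 0.3], [0.1, 0.7]]
P = [0.6, 0.4]

print(evsi_from_costs(P, C, LIK), evsi_from_values(P, V, LIK))
```

Both calls return the same number, because the minimizing action for a cost table is the maximizing action for its negative.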
There are at least two ways in the literature that costs are handled with VSI. One way is to include the costs (as above), with net benefits and experimental costs actually included in the value function itself. In this case, VSI measures net returns (or losses) that accrue to new information, indicating the relative gains for data collection (Raiffa & Schlaifer, 1961). The other, and more traditional, way is to use a value function that includes only returns, so that VSI highlights the marginal gain in returns from new information, without any consideration of the costs of acquiring and using the information. In a typical traditional application, the VSI value would then be compared to the associated costs, and data collection could be justified by a VSI value greater than the costs of its acquisition. Alternatively, the VSI value itself could be identified as an upper limit of the cost one should be willing to incur for purchase of the information (Maxwell et al., 2015).
It is straightforward to show that these two approaches do not necessarily yield the same results. Consider a situation in which returns for an action $a_1$ are large but only marginally exceed its costs; and returns for action $a_2$ are smaller than for $a_1$, but nevertheless substantially greater than its costs. Maximum valuation based on net returns $R(a) - C(a)$ identifies optimal action $a_2$ for a value of $R(a_2) - C(a_2)$. However, maximum valuation based on returns alone identifies optimal action $a_1$ for a value of $R(a_1)$, producing a net value of $R(a_1) - C(a_1)$. This difference in overall valuation reflects the difference in value functions, and suggests caution in deciding which approach to take.
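A small numerical case makes the contrast concrete. The returns and costs below are hypothetical values chosen to match the verbal description (large returns with a thin margin for $a_1$, smaller returns with a wide margin for $a_2$). They are not taken from any study.

```python
# Hypothetical returns R(a) and costs C(a) for two actions.
returns = {"a1": 100.0, "a2": 60.0}
costs = {"a1": 95.0, "a2": 20.0}

def best_action(score):
    """Return the action with the largest score."""
    return max(score, key=score.get)

# Value function 1: net returns R(a) - C(a).
net = {a: returns[a] - costs[a] for a in returns}
by_net = best_action(net)

# Value function 2: returns only, with costs considered afterwards.
by_returns = best_action(returns)

print("net-returns criterion:", by_net, "net value", net[by_net])
print("returns-only criterion:", by_returns, "net value", net[by_returns])
```

With these numbers the net-returns criterion selects $a_2$ and the returns-only criterion selects $a_1$, so the realized net values differ even though both analyses use the same returns and costs.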

| Analysis issues
There are several analysis issues concerning patterns in VSI values.
For example, consider variation in the prior probabilities that serve as the starting point for the conditional value of information in scenario 1, and assume state uncertainty and Bayesian posterior probabilities p(x|z). As prior uncertainty declines (i.e. p(x) approaches 1 for some state $x'$), so does the corresponding posterior uncertainty in p(x|z), irrespective of the data z. The result is that $V(a) = \sum_x p(x) V(a, x)$ and $V_z(a) = \sum_x p(x|z) V(a, x)$ both converge to $V(a, x')$, and $\mathrm{CVSI}_z$ converges to $\max_a V(a, x') - \max_a V(a, x') = 0$. In contrast, the magnitude of CVSI grows as uncertainty increases from 0, up to a maximum value in the interior of the probability space for p. This mirrors a similar pattern for expected values of perfect and partial perfect information (Williams & Johnson, 2015). As with EVPI, the use of maximization in determining EVSI induces a partition of the probability space, wherein the same action is used for each p in a given partition. Among other things, this implies that a positive value of EVSI does not by itself suggest a change in action.
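The convergence of the conditional value can be illustrated by sweeping the prior toward certainty. The following sketch assumes two states, two actions and a single observed datum, with hypothetical payoffs and likelihoods. The function `cvsi` is an illustrative name for the difference $\max_a V_z(a) - \max_a V(a)$.

```python
def expected(p, V):
    """V(a) = sum_x p(x) V(a, x) for each action a (rows of V)."""
    return [sum(px * v for px, v in zip(p, row)) for row in V]

def posterior(p, lik_z):
    """Bayes update p(x|z), with lik_z[x] = P(z | x)."""
    joint = [l * px for l, px in zip(lik_z, p)]
    total = sum(joint)
    return [j / total for j in joint]

def cvsi(p, V, lik_z):
    """Conditional value of one datum z: max_a V_z(a) - max_a V(a)."""
    return max(expected(posterior(p, lik_z), V)) - max(expected(p, V))

# Hypothetical payoffs V[a][x]; first column is the state x'.
V = [[10.0, 0.0], [4.0, 6.0]]
# Likelihoods P(z | x) for a datum that supports x' and one that contradicts it.
SUPPORTS = [0.8, 0.2]
CONTRADICTS = [0.2, 0.8]

for p1 in (0.5, 0.7, 0.9, 0.99, 0.999999):
    prior = [p1, 1.0 - p1]
    print(p1, round(cvsi(prior, V, SUPPORTS), 4),
          round(cvsi(prior, V, CONTRADICTS), 4))
```

In this example the conditional value for the contradicting datum is negative at intermediate priors, and both columns shrink toward zero as the prior probability of $x'$ approaches 1.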
In general, system status can be represented with multiple state attributes $x = (x_1, \ldots, x_l)$, and management with multivariate combinations $a = (a_1, \ldots, a_m)$ of actions. In this situation values are averaged over multivariate probabilities p(x) and p(x|z) to produce V(a) and $V_z(a)$, and the maximizations are taken over the multivariate actions a, with the value function $V_k(a, x, e, z) = B_k(a, x) - C(e, z)$.
Finally, there is a question about the functional form of the value function that will ensure non-negative values of sample information.
One way to ensure non-negativity is simply to impose that condition on results. Another way is to identify positive gradients in the value function such that reducing uncertainty with new data actually leads to increased value. Yet another way is to change the comparison $V_z(a_z^*) - V(a^*)$ based on optimal prior and posterior actions $a^*$ and $a_z^*$, to a comparison $V_z(a_z^*) - V_z(a^*)$ that instead uses the posterior distribution {p(x|z)} in both the prior and posterior valuations (McDonald & Smith, 1997). It is straightforward to show that the latter metric is necessarily non-negative. Using the posterior distribution for both values in the comparison isolates the influence of the actions, in much the same way that the analysis in scenario 5 isolates the influence of new data.
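The non-negativity of the second comparison follows because $a_z^*$ maximizes $V_z(a)$, so $V_z(a_z^*) \geq V_z(a)$ for every action, including $a^*$. The sketch below checks this numerically on randomly generated discrete problems. The function name `compare_metrics`, the problem sizes and the random inputs are assumptions made only for illustration.

```python
import random

def expected(p, V):
    """Expected value of each action (rows of V) under probabilities p(x)."""
    return [sum(px * v for px, v in zip(p, row)) for row in V]

def posterior(p, lik_z):
    """Bayes update p(x|z), with lik_z[x] = P(z | x)."""
    joint = [l * px for l, px in zip(lik_z, p)]
    total = sum(joint)
    return [j / total for j in joint]

def compare_metrics(p, V, lik_z):
    """Return (V_z(a*_z) - V(a*), V_z(a*_z) - V_z(a*)) for one datum z."""
    prior_vals = expected(p, V)
    post_vals = expected(posterior(p, lik_z), V)
    a_star = prior_vals.index(max(prior_vals))    # optimal prior action
    a_star_z = post_vals.index(max(post_vals))    # optimal posterior action
    standard = post_vals[a_star_z] - prior_vals[a_star]
    posterior_only = post_vals[a_star_z] - post_vals[a_star]
    return standard, posterior_only

rng = random.Random(0)
negatives = 0
for _ in range(1000):
    # Random hypothetical problem with 3 actions and 3 states.
    w = [rng.random() + 1e-6 for _ in range(3)]
    p = [wi / sum(w) for wi in w]
    V = [[rng.uniform(0, 10) for _ in range(3)] for _ in range(3)]
    lik_z = [rng.uniform(0.05, 1.0) for _ in range(3)]
    standard, posterior_only = compare_metrics(p, V, lik_z)
    assert posterior_only >= 0.0                  # holds by construction
    negatives += standard < 0
print("cases with negative standard comparison:", negatives)
```

The standard comparison can be negative for particular data, whereas the posterior-only comparison never is.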

| EVPI AS A FORM OF EVSI
In most discourses about value of information the introduction of EVSI occurs after a discussion of EVPI, perhaps because EVPI is analytically simpler. However, there are insights to be gained by beginning with EVSI and developing EVPI from it. Consider, for example, the use of data z from a sampling effort to produce an estimator $\hat{\theta}$ of parameter $\theta$ for an ecological system. Here we use the generic term $\theta$ to represent any system attribute, whether it is a system state, structural feature, or parameter, to express the idea that there is a system feature with uncertainty that can be reduced with additional

| EXAMPLES IN THE LITERATURE
In the ecological literature, studies of the value of sample information (e.g. Canessa et al., 2015; Costello et al., 2010; Runge et al., 2011; Sahlin et al., 2011; Williams, 2015) can be described in terms of the essential features in Section 2 and the computations in Sections 5 and 6, as shown by the examples in Table 1. The studies in Table 1
In the Appendix we use an example of frog population translocation from Canessa et al. (2015) to demonstrate the framing of a problem with discrete uncertainty, and an example involving sustainability of a hunted population to demonstrate the framing of a problem with continuous uncertainty.

| DISCUSSION
Most analyses of the value of information concern the valuation of perfect information (or partial perfect information). However, due to the ubiquity of uncertainty about ecological structure, processes and states, updating uncertainty with new but less-than-perfect information (that is, sample information) is especially useful in ecology.
In this paper, we have described several variants of VSI and a common framework for them, which includes system features, a range of actions, value functions, experimental data and uncertainties. Recognition of these variants expands the usual framing of the value of sample information. In particular, scenarios 4 and 5 and their hybrids include pre-selected actions that occur frequently in ecological management, even if such situations typically are not described in terms of VSI.
The value of sample information holds promise in developing areas of ecology such as ecological forecasting and the use of remote sensing in conservation. In forecasting, models can produce the new data needed for Bayesian updating. Thus, understanding expressed by a prior probability distribution can be model-generated (Dakins et al., 1994, 1996) and updated by means of model-based predictions (see e.g. Costello et al., 1998). Value is essentially based on knowledge of the system with and without forecasts, and a comparison of the two values represents the value of the forecasting information.
In the use of remote sensing for conservation, satellite data provide the new information for computation and analysis of VSI.
For example, a posterior analysis that compares system status or structure with versus without satellite coverage (e.g. Bernknopf et al., 2019; Macauley, 2006) expresses the value of the satellite coverage that accounts for spatial patterns of covariation in multispectral data. In remote sensing as well as forecasting, the value of new information can be determined prospectively through a comparative valuation before the information is collected, thus permitting prioritization of experimental as well as management decisions.
The variants of the value of sample information outlined in scenarios 1-5 may play different roles in actual decision-making. Expected values of sample information (EVSI and maximum EVSI) are forward-looking analyses that show the potential usefulness of new information before it is obtained. They therefore can help in making decisions about experimentation, as shown in the example about testing a proposed population translocation site for disease (Canessa et al., 2015; see Appendix). On the other hand, the conditional value of sample information (CVSI) measures the value of information that is in hand, and hence can be used for looking back at the comparative value of known data. Retrospective analyses with preselected actions (scenarios 4 and 5) are disconnected from active management, in the sense that decisions are made irrespective of the data. However, information value can be useful for experimental design, and there is at least implicit potential for it to guide future management. For example, remote sensing data that allow a new understanding of the ecosystem services provided by a nature preserve (and thus a perceived increase in the preserve's value) might lead to future management or policy changes, if ecosystem service assessments such as those described by the National Ecosystem Services Partnership (2014) are incorporated in decision-making.
As the complexity and expense of resource management grow, deciding whether further investigations are needed will become increasingly important. Wider use of the value of sample information can demonstrate the usefulness of collecting more data, and highlight research and monitoring programmes that are effective in enhancing ecological understanding and outcomes.