Estimating hazardous concentrations by an informative Bayesian approach

Authors


Abstract

The species sensitivity distribution (SSD) approach is recommended for assessing chemical risk. In practice, however, it can be used only for the few substances for which large-scale ecotoxicological results are available. Indeed, the statistical frequentist approaches used for building SSDs and for deriving hazardous concentrations (HC5) inherently require extensive data to guarantee goodness-of-fit. An alternative Bayesian approach to estimating HC5 from small data sets was developed. In contrast to the noninformative Bayesian approaches that have been tested to date, the authors' method used informative priors related to the expected species sensitivity variance. This method was tested on actual ecotoxicological data for 21 well-informed substances. A cross-validation compared the HC5 values calculated using frequentist approaches with the results of our Bayesian approach, using both complete and truncated data samples. The authors' informative Bayesian approach was compared with noninformative Bayesian methods published in the past, including those incorporating loss functions. The authors found that even for the truncated sample the HC5 values derived from the informative Bayesian approach were generally close to those obtained using the frequentist approach, which requires more data. In addition, the probability of overestimating an HC5 is rather limited. More robust HC5 estimates can be practically obtained from additional data without impairing regulatory protection levels, which will encourage collecting new ecotoxicological data. In conclusion, the Bayesian informative approach was shown to be relatively robust and could be a good surrogate approach for deriving HC5 values from small data sets. Environ. Toxicol. Chem. 2013;32:602–611. © 2012 SETAC

INTRODUCTION

The so-called species sensitivity distribution (SSD) methodology has been used since 1996 in the European Community to determine quality criteria such as the predicted no-effect concentration of a chemical substance or to derive probable impacts in exposed systems. This approach (which was developed by Kooijman 1 and expanded by Van Straalen and Denneman 2, Wagner and Løkke 3, and Aldenberg and Slob 4) is based on the assumption that the species for which ecotoxicological test results are available are representative of the sensitivity of the rest of the species in an ecosystem. A likely SSD is then estimated from these results, and a concentration that is likely to protect a given percentage of the species can be extrapolated. For the first use of SSDs (that is, for setting quality criteria), the agreed European concentration is HC5(50%), the hazardous concentration affecting 5% of species with 50% confidence, which is equivalent to 95% of the species being protected with a confidence level of 50% (in this article, this value will be denoted by HC5). The SSD approach is now widely used for operational risk assessments of chemical compounds, even if it raises a number of fundamental challenges 5, 6. First, the assumption that the data arise from randomly sampled species that represent the actual ecosystem has major flaws in practice 6, 7 because species are usually selected for laboratory purposes. Second, because of the scarcity of available tests, the data used to generate SSDs often represent a mixture of end points (e.g., growth, reproduction, and mortality), each of them being more or less relevant for protecting the target 6. Third, the trophic composition of taxa is heterogeneous within the data for a substance, which limits the relevance of comparative analyses 8.
The goal of this article is not to discuss these fundamental limitations of the SSD approach (even if they must be considered), especially those related to the quality of the ecotoxicological data, but to focus on the statistical method used to build SSDs.

Since the SSD concept was first developed, several frequentist statistical methods for estimating distributions and calculating HC5s have been suggested and compared 4, 8–16. These methods differ in the choice of the underlying statistical distribution (empirical distribution, log-normal or log-logistic distribution, and others) and in the method used to estimate the confidence interval (bootstrap, asymptotic theory, and nonparametric statistics). The results of these methods can vary significantly when applied to the same data 17. Although Aldenberg and Slob 4 have derived extrapolation factors for small data sets, frequentist approaches inherently require a minimum quantity of data to guarantee goodness-of-fit. For example, Forbes and Calow 6 proposed that a minimum of 19 and 9 data points are required to estimate the 5th and 10th percentiles, respectively. Thus, the SSD methodology can only be used if there are sufficient chronic ecotoxicological test results for the substance (a minimum of approximately 15). In addition, three taxonomic groups (algae, invertebrates, and vertebrates) need to be represented in aquatic ecosystems. As a consequence, the SSD approach can only be used in practice for a few substances, most of which are metals. When the quantity of chronic data is insufficient, the predicted no-effect concentration estimate is based on an alternative approach that uses assessment factors. Only the lowest no observed effect concentration (NOEC) or effect concentration (ECx) value divided by predefined assessment factors (10, 100, or 1,000, depending on the case) is used to estimate the predicted no-effect concentration, and no confidence intervals can be derived for it.

Alternative approaches that use Bayesian theory have also been proposed for creating SSDs 12, 14, 17, 18. In the Bayesian paradigm, the data sample is fixed and unique, and the distribution parameters themselves are uncertain. For example, if the SSD is assumed to be log-normally distributed, the Bayesian method aims at calculating the joint distribution of the mean and standard deviation of the SSD; in contrast to the frequentist approaches, the mean and standard deviation of the SSD are then probabilistically described. Aldenberg and Jaworska 12 used a Bayesian method to derive extrapolation factors for calculating an HC5 from the sample mean and standard deviation of small data sets. This approach was compared with frequentist methods (maximum likelihood, parametric bootstrapping, and nonparametric bootstrapping) by Verdonck et al. 14. Hickey and Craig 18 proposed modifying these extrapolation factors by introducing an asymmetric LINEX (linear-exponential) loss function for deriving asymmetric extrapolation factors for over- and underpredictions. However, all of these authors used noninformative priors for the means and standard deviations of the SSD. Grist et al. 17 introduced expert judgment to define priors; these expert judgments were used to define an a priori sensitivity index for several of the taxa used to construct the SSD. A panel of experts consulted by the European Food Safety Authority 19 recommended using informative Bayesian approaches to derive HC5s from small data sets, but their approach was theoretical and was not tested on real data sets.

Nevertheless, the experience gained from constructing SSDs for well-known substances allows collecting prior information. For example, Duboudin et al. 8, 20 calculated the chronic and acute SSDs of several substances for which sufficient data were available. If the standard deviation is relatively homogeneous among substances, prior information about the expected variance of the SSD can be defined for a new substance.

Therefore, our objective was to develop and test a Bayesian approach to deriving SSDs and HC5s using informative priors derived from past experiences related to species sensitivity variance. This method was tested on a set of 21 substances; the cross-validation allowed comparing the “true” HC5 (i.e., the HC5 calculated from the complete data set) and the simulated Bayesian HC5 derived from truncated samples.

MATERIALS AND METHODS

Bayesian approach: General considerations

Assume that the SSD follows a log-normal distribution (i.e., a normal distribution for log-transformed data). Log-normal and related distributions (e.g., the log-logistic) have been tested in many articles that address SSD construction 8, 10, 12, 14, and no theoretical justification for choosing other types of distributions has yet been identified. The SSD (after log transformation) can then be described as a normally distributed random variable: $X \sim N(\mu_{SSD}, \sigma^2_{SSD})$. Within the Bayesian framework, the parameter vector $\theta = (\mu_{SSD}, \sigma^2_{SSD})$ is modeled as a random quantity to which a prior distribution p(θ) is assigned. Suppose that a small sample of n ecotoxicological (log-transformed) data points, X = (x1 … xn), is available. Given the data vector X, our objective is to estimate the parameters of the posterior distribution $p(\theta \mid X)$ and then the posterior $p(\mathrm{HC5} \mid X)$. Bayesian inference provides a precise updated version (the posterior) of p(θ), denoted by $p(\theta \mid X)$, through the use of Bayes's formula. In essence, this formula computes $p(\theta \mid X) \propto p(X \mid \theta)\, p(\theta)$, where $p(X \mid \theta)$ is the likelihood of the data given a parameter vector θ.

In the following paragraphs, we will simplify our notation by omitting the index “SSD” for the mean and the variance of the SSD. We will use µ and σ2 rather than µSSD and σ2SSD. We will also replace the usual parameterization $(\mu, \sigma^2)$ of the Gaussian model with the equivalent parameterization $(\mu, \tau)$, where $\tau = 1/\sigma^2$ is the precision. We will also use $\bar{x}$ and s2 to denote the empirical mean and variance of the observed sample X.

The classical approach to Bayesian inference with a Gaussian model 21, 22 is to choose a conjugate prior, which implies, in practice, that prior and posterior distributions have the same parametric form.

A conjugate prior p(µ,τ) for the Gaussian model has the following conditional structure:

$\tau \sim \mathrm{Gamma}(\alpha, \beta)$ (1)
$\mu \mid \tau \sim N\left(\eta, (\kappa\tau)^{-1}\right)$ (2)

that is, τ follows a gamma distribution $\mathrm{Gamma}(\alpha, \beta)$ with shape and inverse-scale parameters α and β; and, conditional on τ, the mean µ follows a Gaussian distribution with precision that is proportional to τ. Because the gamma distribution can be parameterized in two different and equivalent ways, to avoid any possible confusion, we specify in the present study that the β parameter for the gamma distribution is a rate parameter, that is, the inverse of a scale parameter. Equivalently, the prior density $p(\tau)$, mean $E(\tau)$, and variance $\mathrm{Var}(\tau)$ of τ are given by the following equations:

$p(\tau) = \dfrac{\beta^{\alpha}}{\Gamma(\alpha)}\, \tau^{\alpha - 1} e^{-\beta\tau}, \quad \tau > 0$ (3)

$E(\tau) = \dfrac{\alpha}{\beta}$ (4)

$\mathrm{Var}(\tau) = \dfrac{\alpha}{\beta^{2}}$ (5)

The marginal prior distribution of µ is obtained by integrating the joint distribution of (µ,τ) over all possible values of τ. This integration shows that µ has a generalized Student's t distribution with 2α degrees of freedom and location and scale parameters equal to η and $\sqrt{\beta/(\alpha\kappa)}$, respectively:

$\mu \sim t_{2\alpha}\left(\eta, \sqrt{\beta/(\alpha\kappa)}\right)$ (6)

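which implies that $\sqrt{\alpha\kappa/\beta}\,(\mu - \eta)$ follows a standard Student's t distribution with 2α degrees of freedom. According to this prior, the posterior $p(\mu, \tau \mid X)$ and the marginal posterior $p(\mu \mid X)$ take the following parametric forms:

$\tau \mid X \sim \mathrm{Gamma}(\alpha', \beta')$ (7)
$\mu \mid \tau, X \sim N\left(\eta', (\kappa'\tau)^{-1}\right)$ (8)
$\mu \mid X \sim t_{2\alpha'}\left(\eta', \sqrt{\beta'/(\alpha'\kappa')}\right)$ (9)

which are identical to the prior distributions with the parameters updated by

$\kappa' = \kappa + n, \qquad \eta' = \dfrac{\kappa\eta + n\bar{x}}{\kappa + n}$
$\alpha' = \alpha + \dfrac{n}{2}, \qquad \beta' = \beta + \dfrac{1}{2}\sum_{i=1}^{n}(x_i - \bar{x})^2 + \dfrac{\kappa n(\bar{x} - \eta)^2}{2(\kappa + n)}$ (10)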
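
The updating rule of Equation 10 can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard normal-gamma update with β as a rate parameter and the sum of squares taken about the sample mean; the names `NormalGamma` and `update` are hypothetical and chosen here only for the example.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class NormalGamma:
    """Hyperparameters of the conjugate prior (hypothetical container)."""
    eta: float    # location of mu
    kappa: float  # scales the precision of mu given tau (0 = flat prior on mu)
    alpha: float  # gamma shape for the precision tau
    beta: float   # gamma RATE for the precision tau


def update(prior: NormalGamma, x: list) -> NormalGamma:
    """Standard normal-gamma update for log10-transformed data x."""
    n = len(x)
    xbar = fmean(x)
    ss = sum((xi - xbar) ** 2 for xi in x)  # sum of squares about the mean
    kappa_post = prior.kappa + n
    eta_post = (prior.kappa * prior.eta + n * xbar) / kappa_post
    alpha_post = prior.alpha + n / 2
    beta_post = (prior.beta + 0.5 * ss
                 + prior.kappa * n * (xbar - prior.eta) ** 2 / (2 * kappa_post))
    return NormalGamma(eta_post, kappa_post, alpha_post, beta_post)


# Illustrative values only (not data from the article).
post = update(NormalGamma(eta=0.0, kappa=0.0, alpha=1.0, beta=1.0), [1.0, 2.0, 3.0])
print(post)
```

With κ = 0 the prior location η has no influence on the updated parameters. Updating in two batches gives the same posterior as updating once with all the data, which is a convenient check on any implementation.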

Another benefit of this conjugate prior choice is that the posterior distributions of the quantiles qp of the Gaussian distribution of X have closed forms. More precisely, let qp be the quantile of probability p of X, that is, $P(X \le q_p) = p$. Because X is Gaussian with mean µ and precision τ, qp can be written as $q_p = \mu + u_p/\sqrt{\tau}$, where up is the p quantile of the standard Gaussian distribution. It can be shown that the posterior distribution of $\sqrt{\alpha'\kappa'/\beta'}\,(q_p - \eta')$ is a noncentral t distribution with 2α′ degrees of freedom and a noncentrality parameter equal to $u_p\sqrt{\kappa'}$. The expression for the density of this distribution is quite complex, but it can be easily handled by current scientific software, such as R. This result is particularly useful in our ecotoxicology context because the quantity of interest, HC5(θ), is precisely the quantile of the log concentration distribution associated with the probability p = 0.05 (up = −1.645).
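
The posterior of the quantile can also be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it works in log10 units, takes hypothetical posterior hyperparameters, computes the exact posterior mean of qp from the gamma posterior of τ, and approximates the posterior median by Monte Carlo using only the Python standard library (a noncentral t implementation such as `scipy.stats.nct` could be used instead). The function names are hypothetical.

```python
import math
import random
from statistics import fmean, median

U05 = -1.645  # 5% quantile of the standard Gaussian, as quoted in the text


def hc5_posterior_mean(eta, alpha, beta, u_p=U05):
    """Exact posterior mean of q_p = mu + u_p / sqrt(tau), in log10 units.

    Uses E[tau**-0.5] = sqrt(beta) * Gamma(alpha - 0.5) / Gamma(alpha)
    for tau ~ Gamma(shape=alpha, rate=beta); requires alpha > 0.5.
    """
    e_sigma = math.sqrt(beta) * math.exp(math.lgamma(alpha - 0.5) - math.lgamma(alpha))
    return eta + u_p * e_sigma


def hc5_posterior_draws(eta, kappa, alpha, beta, n_draws=20000, u_p=U05, seed=0):
    """Monte Carlo draws of q_p from the posterior (requires kappa > 0)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        tau = rng.gammavariate(alpha, 1.0 / beta)  # second argument is a SCALE
        mu = rng.gauss(eta, 1.0 / math.sqrt(kappa * tau))
        draws.append(mu + u_p / math.sqrt(tau))
    return draws


# Hypothetical posterior hyperparameters, for illustration only.
eta_post, kappa_post, alpha_post, beta_post = 1.5, 10.0, 6.0, 3.0
draws = hc5_posterior_draws(eta_post, kappa_post, alpha_post, beta_post)
log_mean = hc5_posterior_mean(eta_post, alpha_post, beta_post)
print(f"posterior mean of log10 HC5: {log_mean:.3f}")
print(f"Monte Carlo mean / median:   {fmean(draws):.3f} / {median(draws):.3f}")
```

Reporting a concentration would require back-transforming (for example `10 ** log_mean`). Whether the mean is taken before or after that transformation is a modelling choice that this sketch leaves open.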

It has to be noted that the idea of using an inverse-gamma informative prior for the variance of the SSD is one of the methodologies retained by European Food Safety Authority 19. The Bayesian methodology we followed in the present study is based on the implicit assumption that tolerances of all species are a priori exchangeable. The question of exchangeability has been recently discussed by Craig et al. 23, who proposed introducing two additional parameters for perturbing the mean and the variance of the sampling distribution of the nonexchangeable species, to be used when there is strong evidence against exchangeability. The question of nonexchangeability might have to be taken into consideration in the future for improving our methodology.

Ecotoxicological database and frequentist SSD building

Specifying prior distributions allows incorporation of formally elicited expert opinions or personal beliefs about the model parameters (µ and τ). A noninformative or vague prior is typically one that assigns equal weights to all of the values (such as a uniform distribution) or one that has a large variance. As indicated above (Introduction), most of the previous studies that focused on Bayesian methods for building SSDs and deriving HC5 10, 12, 14 used noninformative priors. Informative priors are chosen when knowledge of the range of likely parameter values is well defined. In this study, we analyzed past experiences with constructing SSDs for deriving informative priors.

More precisely, SSDs were derived for 21 substances using a frequentist approach. These substances each had at least 10 chronic NOEC-type data points, with at least one data point from each of the three taxonomic groups (vertebrates, invertebrates, and algae). These data were taken from the AQUIRE (U.S. Environmental Protection Agency, www.epa.gov/ecotox/), EAT (European Centre for Ecotoxicology and Toxicology of Chemicals, www.ecetoc.org/), Dutch National Institute for Public Health and the Environment (RIVM) 24, and INERIS (www.ineris.fr/index.php?module=cms&action=getContent&id_heading_object=3) databases. The data were initially provided in micrograms per liter and converted into base 10 logarithms. These 21 substances included 11 metals (chromium, copper, zinc, cadmium, lead, nickel, aluminum, arsenic, cobalt, selenium, and mercury), four pesticides or similar chemicals (parathion, lindane, atrazine, and malathion), three other organic compounds (butylbenzyl phthalate, dibutyl phthalate, and dimethyldioctadecylammonium chloride [DODMAC]), and one inorganic compound (boric acid). The number of data points for each substance was variable, ranging between 10 and 79 (Table 1), with an average of 26.

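Table 1. Number of data points for each toxicant and each taxonomic group, number of different species represented, arithmetic means, and variances

| Substance | Algae: number of data points | Algae: number of species | Invertebrates: number of data points | Invertebrates: number of species | Vertebrates: number of data points | Vertebrates: number of species | Total number of data points | AM (log10): $\bar{x}$ | Variance (log10): s2 |
|---|---|---|---|---|---|---|---|---|---|
| 2,4-Dichloroaniline | 5 | 5 | 6 | 5 | 8 | 6 | 19 | 3 | 0.53 |
| Aluminum | 4 | 2 | 2 | 2 | 4 | 3 | 10 | 1.58 | 0.39 |
| Arsenic | 4 | 4 | 4 | 3 | 6 | 5 | 14 | 2.82 | 0.78 |
| Atrazine | 28 | 15 | 12 | 11 | 12 | 9 | 52 | 1.82 | 0.75 |
| Boric acid | 4 | 3 | 9 | 2 | 17 | 7 | 30 | 4 | 0.39 |
| Butylbenzyl phthalate | 7 | 3 | 3 | 1 | 3 | 2 | 13 | 2.32 | 0.07 |
| Cadmium | 8 | 6 | 12 | 7 | 17 | 14 | 37 | 0.48 | 0.27 |
| Chromium | 10 | 9 | 12 | 7 | 13 | 9 | 35 | 2.1 | 0.72 |
| Cobalt | 6 | 4 | 1 | 1 | 3 | 3 | 10 | 1.82 | 0.48 |
| Copper | 17 | 11 | 24 | 11 | 38 | 14 | 79 | 1.12 | 0.44 |
| Dibutyl phthalate | 4 | 1 | 6 | 3 | 3 | 2 | 13 | 2.75 | 0.17 |
| Dieldrin | 3 | 3 | 7 | 4 | 7 | 6 | 17 | 0.18 | 0.57 |
| DODMAC | 9 | 2 | 7 | 3 | 3 | 2 | 19 | 2.1 | 0.22 |
| Lead | 13 | 9 | 4 | 3 | 14 | 11 | 31 | 2.22 | 0.53 |
| Lindane | 6 | 4 | 14 | 11 | 4 | 4 | 24 | 1.31 | 1.47 |
| Malathion | 2 | 2 | 8 | 8 | 16 | 7 | 26 | 0.66 | 1.27 |
| Inorganic mercury | 11 | 11 | 19 | 13 | 6 | 3 | 36 | 0.63 | 0.77 |
| Nickel | 8 | 7 | 13 | 9 | 20 | 10 | 41 | 2 | 0.6 |
| Parathion | 8 | 6 | 3 | 2 | 7 | 5 | 18 | 1.64 | 1.7 |
| Selenium | 7 | 5 | 10 | 5 | 5 | 3 | 22 | 1.87 | 0.44 |
| Zinc | 5 | 3 | 14 | 7 | 10 | 6 | 29 | 2 | 0.31 |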

Duboudin et al. 8 described different weighting assumptions for deriving SSDs using frequentist approaches that can be used to correct disproportions in the number of data for the three main taxonomic groups. According to the alternatives proposed by Duboudin et al. 8, two frequentist SSDs were then derived for each substance. First, all of the data were used to derive the SSDs and equally weighted, whatever their taxonomic group or redundancy (i.e., different data for the same species). This nonweighted frequentist SSD approach produced SSDs that were directly comparable with the Bayesian SSDs described below. Second, the respective proportions of algae, invertebrates, and vertebrates in the data shown in Table 1 vary by substance. In addition, the ratio of the number of data points to the number of species for each taxon shows that some species are replicated. To facilitate a homogeneous comparison between the substances, the weighting procedures defined by Duboudin et al. 8 were used to derive the SSDs in the weighted frequentist SSD approach. To correct under- or over-representation of some taxa in the data sets for some substances, the data are balanced to make the three taxonomic groups equally weighted in the SSD. To avoid giving more importance to species for which multiple data points are available, each data point is weighted to give each species the same contribution to the SSD. Intraspecies variation is then taken into account, and no species is given more importance than any other. It has to be noted that our Bayesian approach does not take into account intraspecies variation and is therefore not directly comparable with the weighted frequentist SSDs. This choice was made because our goal was to define a method suitable for small databases where there is low probability of finding several data points for a given species. Intraspecies variation was, however, incorporated into a Bayesian method for SSD building by Hickey et al. 25, and the priors that they defined could be incorporated in our approach as a future development.
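
One way to read the two weighting rules described above is sketched below. It is an assumption made here for illustration, not the published procedure of Duboudin et al. 8: each taxonomic group receives the same total weight, the species of a group share that weight equally, and replicate data points share their species' weight equally. The function names and the record layout are hypothetical.

```python
from collections import defaultdict


def ssd_weights(records):
    """Return one weight per record; records are (group, species, log10_value).

    ASSUMED scheme: equal total weight per taxonomic group, split equally
    among the species of that group, then equally among a species' replicates.
    """
    species_in_group = defaultdict(set)
    points_per_species = defaultdict(int)
    for group, species, _ in records:
        species_in_group[group].add(species)
        points_per_species[(group, species)] += 1
    n_groups = len(species_in_group)
    return [
        1.0 / (n_groups * len(species_in_group[group]) * points_per_species[(group, species)])
        for group, species, _ in records
    ]


def weighted_mean_and_variance(records):
    """Weighted mean and variance of the log10 values under ssd_weights."""
    w = ssd_weights(records)
    values = [value for _, _, value in records]
    mean = sum(wi * v for wi, v in zip(w, values))
    var = sum(wi * (v - mean) ** 2 for wi, v in zip(w, values))
    return mean, var


# Made-up records for illustration (values are log10 concentrations).
records = [
    ("algae", "species A", 1.0),
    ("invertebrates", "Daphnia magna", 0.2),
    ("invertebrates", "Daphnia magna", 0.4),
    ("invertebrates", "species B", 0.9),
    ("vertebrates", "species C", 2.0),
]
weights = ssd_weights(records)
print(weights, weighted_mean_and_variance(records))
```

Under this scheme the two replicate Daphnia magna points each carry half the weight of the single point for the other invertebrate species, so no species dominates its group.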

For each toxicant, the normality of the distribution of the available log-transformed data was tested using the chi-squared, Kolmogorov-Smirnov, and Shapiro tests. Only five of the 21 toxicants (boric acid, copper, dibutyl phthalate, mercury, and lindane) were found to follow nonnormal distributions at the 5% level. The nonnormality was caused by the presence of outliers with low values, and the distribution of the remaining data could be considered normal.

The empirical mean $\bar{x}$ and variance s2 of each SSD (expressed as the log10-transformed values) are presented in Table 1. It may be observed that the range of the standard deviations is relatively limited (from 0.25 for butylbenzyl phthalate to 1.46 for parathion).

Specification of the priors

From the latter observation (i.e., the limited range of the observed variances for the well-studied substances), it is safe to assume that preliminary information on the variance of the expected SSDs of new substances is available. The available information suggests that a reference value τref that does not depend on the particular pollutant can be used. This reference value is interpreted as the “central value” for τ and is a natural choice for the prior precision mean. Several methods may be used to formally translate this expertise into a prior distribution. We propose using a prior that satisfies two desirable conditions: (1) being as systematic as possible, and (2) being conjugate to the Gaussian model.

A classical method for defining a nonarbitrary informative prior is to choose the distribution that maximizes the statistical entropy among the distributions that are compatible with the expert information. This method is motivated by the argument that the maximum-entropy prior contains the least information that is not provided by the expert. In the context of the present study, the available prior information often takes the form of a reference value (interpreted as a mean) for the variance of the SSD. A classical information theory result states that the probability distribution on $[0, +\infty)$ with a given mean τref that maximizes the statistical entropy is the exponential distribution with rate parameter 1/τref 21. Because the exponential distribution is a special case of the gamma distribution (with shape parameter α = 1), the proposed informative prior for τ is conjugate with respect to the Gaussian model. In the following, an exponential prior will be used, based on these considerations.
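
As a short worked check of this choice, written with the rate parameterization used for Equations 3 to 5, setting α = 1 and β = 1/τref in the gamma density gives the exponential prior below. The mean and variance follow from the standard gamma moments; nothing beyond the text's own argument is assumed.

```latex
p(\tau) = \frac{1}{\tau_{ref}} \exp\!\left(-\frac{\tau}{\tau_{ref}}\right), \quad \tau \ge 0,
\qquad E(\tau) = \frac{\alpha}{\beta} = \tau_{ref},
\qquad \mathrm{Var}(\tau) = \frac{\alpha}{\beta^{2}} = \tau_{ref}^{2}.
```

The prior standard deviation therefore equals the prior mean τref, which makes the prior centered on the reference value but fairly diffuse around it.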

For a given value of τ, the parameter κ tunes the dispersion of the prior distribution $p(\mu \mid \tau)$, where µ is the mean: the lower the value of κ, the flatter the prior distribution will be. Intuitively, when κ tends to 0, the prior tends to a uniform distribution over the infinite support of the Gaussian distribution. This consideration motivates the classical choice of setting κ = 0 when creating a noninformative conjugate prior for µ. Of course, this prior is no longer a Gaussian distribution or even a proper probability distribution (we say that the prior is “improper”), but the result of Equation 7 remains valid. Note that if κ = 0, the value of η plays no role in the Bayesian updating formulas (Eqn. 10).

In the following, the τref is chosen using available information about some substances that are considered to be well-known (see next section), while a noninformative prior (κ = 0) is considered for the mean µ.
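
Combining the two choices above (exponential prior on τ with mean τref, flat prior on µ with κ = 0), the posterior hyperparameters depend only on the sample size, sample mean, and sample variance. The following sketch illustrates this under one loudly stated assumption: the variance s2 is taken with divisor n, so that the sum of squares is n·s2. The function names and the numerical inputs are hypothetical.

```python
def posterior_from_summary(n, xbar, s2, tau_ref):
    """Posterior hyperparameters for alpha = 1, beta = 1/tau_ref, kappa = 0.

    ASSUMPTION: s2 is the sample variance with divisor n (sum of squares = n * s2).
    """
    alpha_post = 1.0 + n / 2
    beta_post = 1.0 / tau_ref + 0.5 * n * s2
    return {"eta": xbar, "kappa": float(n), "alpha": alpha_post, "beta": beta_post}


def posterior_expected_variance(post):
    """E[1/tau | X] = beta' / (alpha' - 1), defined for alpha' > 1."""
    return post["beta"] / (post["alpha"] - 1.0)


# Hypothetical substance: same sample mean and variance, growing sample size.
for n in (3, 6, 15):
    post = posterior_from_summary(n, xbar=2.0, s2=0.3, tau_ref=0.7)
    print(n, round(posterior_expected_variance(post), 3))
```

In this example the posterior expected variance starts well above the sample variance when n = 3 and moves toward it as n grows, because the prior term 1/τref is shared among more observations.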

Cross-validation testing

Our objective was to test whether it was possible to derive relevant SSDs and their associated HC5s through an informative Bayesian approach when only limited data sets are available. To evaluate the reliability of the approach described above, a cross-validation exercise was undertaken.

We assumed that of the 21 substances in our database, one (denoted by k) was unknown; that is, the data vector was limited to a small number of data points (typically fewer than 15), which (according to adopted criteria guidelines) prevented calculating a frequentist SSD. In addition, 20 substances were assumed to be well-known; that is, enough data were available to use frequentist SSD methods. The information collected for these 20 substances allowed calculation of the τref value required by the Bayesian approach described above. The total variance is decomposed into the sum of the intrasubstance variance and the intersubstance variance as follows:

$\sigma_{tot}^2 = \dfrac{\sum_{i \ne k} n_i s_i^2}{\sum_{i \ne k} n_i} + \dfrac{\sum_{i \ne k} n_i (\bar{x}_i - \bar{x})^2}{\sum_{i \ne k} n_i}$ (11)

where $\sigma_{tot}^2$ is the total variance (i.e., $1/\tau_{ref}$), $s_i^2$ is the empirical variance for substance i ($i \ne k$), $\bar{x}_i$ is the empirical mean for substance i ($i \ne k$), $\bar{x}$ is the global empirical mean for all substances except k, and $n_i$ is the number of data points for substance i ($i \ne k$).
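
The leave-one-out calculation of τref can be illustrated with the summary statistics of Table 1. The sketch below assumes that both variance components of Equation 11 are weighted by the number of data points of each retained substance; the authors' exact weighting may differ, and Table 1 is rounded, so the output should only be of the same order as the τref column of Table 2. The names `TABLE1` and `tau_ref` are hypothetical.

```python
# (substance, n_i, mean_i, variance_i) transcribed from Table 1 (log10 units).
TABLE1 = [
    ("2,4-Dichloroaniline", 19, 3.00, 0.53),
    ("Aluminum", 10, 1.58, 0.39),
    ("Arsenic", 14, 2.82, 0.78),
    ("Atrazine", 52, 1.82, 0.75),
    ("Boric acid", 30, 4.00, 0.39),
    ("Butylbenzyl phthalate", 13, 2.32, 0.07),
    ("Cadmium", 37, 0.48, 0.27),
    ("Chromium", 35, 2.10, 0.72),
    ("Cobalt", 10, 1.82, 0.48),
    ("Copper", 79, 1.12, 0.44),
    ("Dibutyl phthalate", 13, 2.75, 0.17),
    ("Dieldrin", 17, 0.18, 0.57),
    ("DODMAC", 19, 2.10, 0.22),
    ("Lead", 31, 2.22, 0.53),
    ("Lindane", 24, 1.31, 1.47),
    ("Malathion", 26, 0.66, 1.27),
    ("Inorganic mercury", 36, 0.63, 0.77),
    ("Nickel", 41, 2.00, 0.60),
    ("Parathion", 18, 1.64, 1.70),
    ("Selenium", 22, 1.87, 0.44),
    ("Zinc", 29, 2.00, 0.31),
]


def tau_ref(rows, left_out):
    """Leave-one-out reference precision: 1 / (within + between variance).

    ASSUMPTION: both components are weighted by n_i and divided by the total
    number of data points of the retained substances.
    """
    kept = [row for row in rows if row[0] != left_out]
    n_tot = sum(n for _, n, _, _ in kept)
    grand_mean = sum(n * m for _, n, m, _ in kept) / n_tot
    within = sum(n * v for _, n, _, v in kept) / n_tot
    between = sum(n * (m - grand_mean) ** 2 for _, n, m, _ in kept) / n_tot
    return 1.0 / (within + between)


for name in ("Copper", "Boric acid"):
    print(name, round(tau_ref(TABLE1, name), 3))
```

Leaving out a substance whose mean lies far from the others (boric acid, for instance) shrinks the intersubstance component and therefore raises the reference precision.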

For substance k, the frequentist SSD was first derived as described above, and the HC5(50%) values were derived from this SSD. In parallel, the HC5(50%) values were derived using the Bayesian approach and the same complete data set. This comparison allowed us to observe whether the Bayesian approach is conservative when the identical data are used to derive safety values.

The second stage of the cross-validation exercise tested the ability of the Bayesian approach to derive relevant HC5 values when the data set is limited. For the unknown substance k, the data samples were randomly generated as follows. First, 1,000 samples of three data points were drawn, each containing one data point for algae, one for invertebrates, and one for vertebrates. Then, one data point (belonging to algae, invertebrates, or vertebrates at random) was added step by step to each of the 1,000 samples from the complete data set using a bootstrap process (random sampling with replacement), which generated 1,000 samples for each size from 4 to 15 data points. At the end of this sampling process, 13,000 samples $X_{p,q}$ were generated, where p = 1 to 1,000 is the sample index and q = 3 to 15 is the number of data points. For each of these samples, the HC5(50%) values were calculated using the Bayesian approach described above.
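
The sampling design can be sketched as follows. This is one reading of the description, assumed here for illustration: each of the 1,000 replicates starts from one point per taxonomic group and then grows by one bootstrap draw at a time from the pooled data, so the samples of a replicate are nested. The function name and the toy data are hypothetical.

```python
import random


def truncated_samples(data_by_group, n_rep=1000, max_size=15, seed=0):
    """Return {(p, q): sample} for replicate p and sample size q.

    data_by_group maps a taxonomic group to its list of log10 values.
    ASSUMPTION: samples of one replicate are nested (size q+1 extends size q).
    """
    rng = random.Random(seed)
    pooled = [x for values in data_by_group.values() for x in values]
    samples = {}
    for p in range(1, n_rep + 1):
        # Stratified start: one data point from each taxonomic group.
        current = [rng.choice(values) for values in data_by_group.values()]
        samples[(p, len(current))] = list(current)
        # Bootstrap growth: add one point at a time, with replacement.
        while len(current) < max_size:
            current.append(rng.choice(pooled))
            samples[(p, len(current))] = list(current)
    return samples


# Toy data for illustration only (log10 values, three taxonomic groups).
toy = {"algae": [0.1, 0.2], "invertebrates": [1.1, 1.2, 1.3], "vertebrates": [2.1]}
samples = truncated_samples(toy)
print(len(samples))
```

With three taxonomic groups and a maximum size of 15, each replicate yields 13 samples, which gives the 13,000 samples mentioned in the text.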

RESULTS AND DISCUSSION

Comparison between frequentist and Bayesian HC5 with the complete data sets

As previously described, the HC5(50%) values were calculated using the frequentist (with or without weighting assumptions) and Bayesian approaches and the complete data sets for each of the 21 substances (Table 2).

Table 2. Hazardous concentration affecting 5% of species with 50% confidence (HC5(50%)) values calculated by the frequentist approach, with or without weighting, and by the Bayesian approach with the complete data set for each substance

| Substance | HC5(50%): Nonweighted frequentist | HC5(50%): Weighted frequentist | Reference prior precision τref | Bayesian $E(\mathrm{HC5} \mid x)$ |
|---|---|---|---|---|
| 2,4-Dichloroaniline | 58.5 | 79.9 | 7.3 × 10⁻¹ | 41.7 |
| Aluminum | 3.6 | 7.2 | 7.16 × 10⁻¹ | 7.9 |
| Arsenic | 23.3 | 16.7 | 7.35 × 10⁻¹ | 20 |
| Atrazine | 2.5 | 4.9 | 6.94 × 10⁻¹ | 2.3 |
| Boric acid | 942 | 459 | 8.93 × 10⁻¹ | 822 |
| Butylbenzyl phthalate | 76.4 | 84.5 | 7.14 × 10⁻¹ | 31.7 |
| Cadmium | 0.42 | 0.54 | 7.36 × 10⁻¹ | 0.35 |
| Chromium | 5 | 7.4 | 7.1 × 10⁻¹ | 4.6 |
| Cobalt | 4.7 | 1.9 | 7.17 × 10⁻¹ | 3.2 |
| Copper | 1 | 1.5 | 6.8 × 10⁻¹ | 1 |
| Dibutyl phthalate | 114.7 | 92.9 | 7.24 × 10⁻¹ | 62.2 |
| Dieldrin | 8.8 × 10⁻² | 0.12 | 7.48 × 10⁻¹ | 7 × 10⁻² |
| DODMAC | 21 | 25.2 | 7.1 × 10⁻¹ | 13.3 |
| Lead | 10.5 | 7.7 | 7.1 × 10⁻¹ | 9.2 |
| Lindane | 0.21 | 0.3 | 7.3 × 10⁻¹ | 0.21 |
| Malathion | 6.4 × 10⁻² | 3.2 × 10⁻² | 6.8 × 10⁻¹ | 1.4 × 10⁻² |
| Inorganic mercury | 0.15 | 0.18 | 7.47 × 10⁻¹ | 0.14 |
| Nickel | 5.2 | 3.53 | 7 × 10⁻¹ | 4.7 |
| Parathion | 0.31 | 5.8 × 10⁻² | 7.3 × 10⁻¹ | 0.33 |
| Selenium | 5.9 | 4.85 | 7.1 × 10⁻¹ | 4.8 |
| Zinc | 11.8 | 16 | 7 × 10⁻¹ | 9.4 |

Before comparing the frequentist and Bayesian HC5s, the variability of the HC5(50%) with the weighting assumptions chosen for the frequentist methods was analyzed. The ratio between the nonweighted and weighted frequentist HC5(50%) values is close to 2 for some substances (aluminum, atrazine, cobalt) and can even reach a factor of 5.3 for parathion. Respective nonweighted frequentist and weighted frequentist SSDs for these four substances are presented in Figure 1. Such variability has already been observed by Duboudin et al. 8 and can be explained by the heterogeneity in the number of data points for the three taxonomic groups and by redundant data for some species (e.g., several data points for Daphnia magna). These results show that further comparisons between the frequentist and Bayesian results must be considered with intrafrequentist variability in mind.

Figure 1.

Nonweighted frequentist and weighted frequentist SSDs for aluminum, atrazine, cobalt, and parathion. [Color figure can be seen in the online version of this article, available at wileyonlinelibrary.com]

Regarding the Bayesian approach, a preliminary comparison between the posterior mean E(HC5|x) (which can be analytically calculated from Eqns. 7–10) and the posterior median of HC5 (which can be calculated using a specific function of the R software) was undertaken for the complete data set of each substance. It was observed that these two values differ by no more than 10%, showing that E(HC5|x) can be used as a reliable surrogate for the posterior median. The advantage of using E(HC5|x) is that it can be analytically calculated and easily used in further regulatory applications, while the posterior median requires a specific function of the R software. Therefore, only the E(HC5|x) values were analyzed further.

If we compare the Bayesian E(HC5|x) with the nonweighted frequentist HC5(50%), we find good agreement between these two approaches; they do not differ by a factor greater than 2, except for four substances. The main discrepancies concern aluminum, malathion, butylbenzyl phthalate, and dibutyl phthalate. The empirical variances calculated from the data sets for butylbenzyl phthalate and dibutyl phthalate are low compared to the other substances (0.07 and 0.17, respectively). This result can be explained by redundancies in the data set; there are 13 data points for each of these substances, but they originate from only six species (see Table 1). It follows that some data points were obtained from the same species and that the empirical variance most likely underestimates the actual variability that would be obtained using a more comprehensive sample of species. In this case, the Bayesian approach corrects this underestimation by incorporating (higher) prior expected variances. Because the posterior variance is higher than the purely empirical variance, the Bayesian E(HC5|x) produces a more conservative value. Aluminum also shows a low empirical variance due to redundancy in the data set (10 data points for seven species). In this case, the Bayesian approach again corrects the possible underestimation of the variance by incorporating prior knowledge. The Bayesian E(HC5|x) is also close to the result from the weighted frequentist approach, showing that the intrafrequentist variability can be as large as the difference between the frequentist and Bayesian approaches. Malathion has a high variance compared to the other substances, and a detailed analysis of the data set shows the presence of one low-value outlier. The high variance can also be explained by the specific mode of action of this pesticide, which particularly targets invertebrates.
This example shows that specific modes of action can generate breaks in data sequences and misfits or outliers when forced into an SSD. The approach presented in the present study must therefore be used with caution in such cases where higher variances can be expected. Nevertheless, in such cases, the Bayesian approach appeared to be conservative, the Bayesian E(HC5|x) being significantly smaller than the frequentist HC5(50%).

Effect of sample size on Bayesian HC5

In the following discussion, the HC5(50%) calculated using the nonweighted frequentist approach was considered to be the reference HC5 because it corresponds to the approach described in the European Technical Guidance Document and is generally used by regulators to determine safety limits. Thus, the reference HC5 was selected in the present study for regulatory rather than purely scientific reasons. Our goal was not to establish a hierarchy between the three HC5 values presented in Table 2, which were generated using different approaches and assumptions; rather, our goal was to compare our new approach with those currently employed by end-users and regulators. The nonweighted frequentist HC5s are therefore not seen here as a gold standard but only as the values currently adopted by regulators and thus suitable for relative comparison purposes.

To evaluate the consistency of the Bayesian approach for estimating suitable HC5 values, three criteria have to be examined in order of priority. The first is the probability of overestimating the reference HC5. Because the reference HC5 is currently accepted by regulators for establishing safety limits, it is essential to estimate to what extent the Bayesian approach may overestimate it. In ecotoxicological risk assessments, it is better to underestimate the HC5 than to overestimate it because overestimation would potentially put greater than 5% of species at risk. The second is the ability of the Bayesian approach to calculate relevant HC5 values (i.e., values close to the reference HC5), even for small data sets. The third is the probability of underestimating the reference HC5. Even if underestimation can be considered to be less critical than overestimation, it is essential to evaluate the impact of additional data to reduce the probability of underestimation. From a practical perspective, a slight increase in the HC5 that results from additional data can lead to a large reduction of costs (e.g., remediation costs) without impairing regulatory protection levels.

All of these issues are illustrated in Figure 2. For each substance, the E(HC5|x)/reference_HC5(50%) ratio was calculated for the 13,000 samples, $X_{p,q}$, where p = 1 to 1,000 is the sample index and q = 3 to 15 is the number of data points. For each q (q = 3 … 15), we obtained the mean value and the 10th and 90th percentiles from the 1,000 corresponding samples. It should be noted that the sampling design we selected for generating the $X_{p,q}$ samples (i.e., bootstrapping with replacement) is a conservative approach to estimating the variability of HC5 for a given q value (i.e., the range between the 10th and 90th percentiles). Because of the replacement process, the lowest and highest ecotoxicological data can indeed be used several times within a given sample, amplifying the probability of under- or overestimation of the HC5 value. The ranges presented in Figure 2 can then be seen as maximum ranges of variation in the HC5 calculation.

Figure 2.

The E(HC5 | x)/reference_HC5_50% ratio for all of the tested substances and data set sizes (3–15 data points). E(HC5 | x) is the point estimate of HC5 (the posterior mean) calculated using the Bayesian approach, and reference_HC5_50% represents the best estimate of HC5 calculated using the frequentist nonweighted approach.

The common symbols are used for all the substances and are indicated in the panel for zinc. [Color figure can be seen in the online version of this article, available at wileyonlinelibrary.com]

Even for small sample sizes, the probability of overestimation (as shown by the 90th percentiles) is rather limited. It is at a maximum for lindane and parathion with q = 3 (overestimation by a factor of approximately 6). For parathion, adding new data points (even one data point, as for q = 4) significantly limits the probability of overestimation. As far as lindane is concerned, it can be observed that this substance shows one of the highest variances (see Table 1). The overestimation observed for lindane can be explained by the bootstrap process for generating samples; as indicated above, some data can be resampled several times in a given sample, and this method is especially sensitive to the presence of large outliers (which are present for lindane). For the other substances, the probability of overestimation is limited (less than a factor of 3), even for small sample sizes. These results show that our Bayesian approach can be consistently used to predict HC5s.

For the majority of substances, the mean value calculated by the Bayesian approach is close to the reference HC5.

The probability of underestimation is higher than the probability of overestimation. This result can be seen as a positive feature of our approach, given that it is preferable to underestimate rather than overestimate HC5s for protection purposes. It can be observed that adding new data generally reduces the probability of underestimation, which encourages generating new ecotoxicological data (in contrast to the current assessment factor approach). The reduction in the underestimation probability as the number of data points increases can be explained by the change in the posterior variance with the sample size. For small sample sizes (e.g., q = 3), the posterior variance is dominated by the prior variance, which was chosen to be large compared with the empirical variance. When new data are added to the sample, the posterior variance is more sensitive to the actual empirical variance of the data and generally decreases. This evolution in the variance is illustrated for two substances (cadmium and DODMAC) in Figure 3, which represents the ratio between the posterior and empirical variances. The posterior variance is much higher than the empirical variance for small sample sizes (by a factor of 3.5–4.5); however, increasing the number of data points dramatically decreases the posterior variance, which becomes closer and closer to the empirical value. Because the 5th percentile q5 (HC5) of a normal SSD lies a fixed number of standard deviations (about 1.645) below the mean on the log scale, reducing the posterior variance raises the estimated HC5 and therefore reduces the risk of underestimation.
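The way a large prior variance dominates small samples and fades as data accumulate can be illustrated with a conjugate toy model. The sketch below assumes a flat prior on the mean and an inverse-gamma(a, b) prior on the variance of the log-transformed data; this prior, its parameter values, and the example data are assumptions chosen for illustration and are not the prior used in the study.

```python
import statistics

def posterior_mean_variance(logs, a, b):
    """E[sigma^2 | x] for normal data with a flat prior on the mean and
    an inverse-gamma(a, b) prior on sigma^2 (assumed toy model).
    The marginal posterior of sigma^2 is inverse-gamma with
    shape a + (n - 1)/2 and scale b + SS/2."""
    n = len(logs)
    xbar = statistics.fmean(logs)
    ss = sum((x - xbar) ** 2 for x in logs)
    shape = a + (n - 1) / 2
    scale = b + ss / 2
    return scale / (shape - 1)  # mean of an inverse-gamma, needs shape > 1

# Hypothetical log10 toxicity values, for illustration only.
logs = [0.9, -0.2, 1.4, 0.3, 0.7, 1.1, -0.5, 0.6, 0.2, 1.0, 0.4, 0.8, -0.1, 0.5, 1.2]
empirical_var = statistics.variance(logs)  # variance of the complete data set

# Prior mean of the variance is b / (a - 1) = 2.0, large relative to the data.
a, b = 3.0, 4.0
for q in range(3, len(logs) + 1):
    ratio = posterior_mean_variance(logs[:q], a, b) / empirical_var
    print(f"q={q:2d}  posterior/empirical variance ratio = {ratio:5.2f}")
```

In this toy model the prior contributes a fixed amount to the posterior while the data contribution grows with the sample size, so the ratio approaches 1 for large samples.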

Figure 3.

The E(Var | x)/EmpiricalVar ratio for cadmium and DODMAC using 3 to 15 data points.

E(Var | x) is the point estimate of the SSD variance calculated using the Bayesian approach, and EmpiricalVar represents the empirical variance calculated from the complete data set for the substance. [Color figure can be seen in the online version of this article, available at wileyonlinelibrary.com]

Comparison with the assessment factor approach and noninformative Bayesian approaches

As indicated above (see Introduction), noninformative Bayesian approaches have been proposed so far to derive HC5s from small data sets. We selected two noninformative Bayesian approaches to compare with ours, which uses informative knowledge of the variance.

The first noninformative Bayesian approach was proposed by Aldenberg and Jaworska 12. They calculated the probability $P(\bar{x} - k\,s \le \mu - u_p\,\sigma) = \gamma$, where $\bar{x}$ and $s$ are the sample mean and the unbiased sample standard deviation of the log-toxicity data, $\mu$ and $\sigma$ are the unknown mean and standard deviation of the SSD, $u_p$ is the (100–p)th percentile of the standard normal distribution (e.g., $u_5 = 1.6449$), and $k$ is an assessment shift factor ($p = 5$, corresponding to the HC5, was considered in the present study). This probability was shown to be equivalent to $P(T \le k\sqrt{n}) = \gamma$, where $T$ is a random variable distributed with a noncentral t distribution with n – 1 degrees of freedom and the noncentrality parameter $u_p\sqrt{n}$. Aldenberg and Jaworska 12 then established a table of the values of $k$ that give one-sided lower (95%), median (50%), and upper (5%) confidence/credibility limits by setting $\gamma$ to the required levels (i.e., 0.05, 0.5, and 0.95).
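The link between the shift factor and the noncentral t distribution can be checked numerically. For a standard normal SSD (mean 0, standard deviation 1), the estimate "sample mean minus k times sample standard deviation" falls below the true 5th percentile exactly when (sample mean + u5)/(sample standard deviation) is at most k, so k is a quantile of that statistic. The sketch below approximates it by simulation using only the Python standard library; the function name and simulation settings are hypothetical, and the published tables remain the reference for actual values.

```python
import math
import random
import statistics

def shift_factor(n, gamma, p=0.05, n_sim=20000, seed=0):
    """Monte Carlo approximation of the shift factor k such that
    P(xbar - k*s <= mu - u_p*sigma) = gamma for samples of size n.
    Equivalent to the gamma-quantile of a noncentral t statistic
    (n - 1 degrees of freedom, noncentrality u_p*sqrt(n)) divided by sqrt(n)."""
    rng = random.Random(seed)
    u_p = statistics.NormalDist().inv_cdf(1.0 - p)
    stats = []
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # mu = 0, sigma = 1
        xbar = sum(x) / n
        s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
        stats.append((xbar + u_p) / s)
    stats.sort()
    return stats[min(n_sim - 1, int(gamma * n_sim))]

# Shift factors for the three confidence levels used in the text, n = 3.
for gamma in (0.05, 0.5, 0.95):
    print(f"gamma={gamma:4.2f}  k is approximately {shift_factor(3, gamma):.2f}")
```

A higher confidence level gives a larger k and therefore a lower, more conservative HC5 estimate, which is why the 0.95 level is the most protective of the three.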

Hickey and Craig 18 extended this approach by introducing LINEX loss functions, which penalize under- and overestimation asymmetrically. Considering that overestimating HC5 is more critical than underestimating it, the LINEX loss increases approximately linearly on one side (underestimation) and exponentially on the other side (overestimation). The LINEX approach is based on a scale parameter, α, that controls the asymmetry of the loss function; as α increases, the conservatism increases (that is, overestimation is penalized increasingly more severely than underestimation). The value of α depends entirely on the risk assessor and his or her willingness to penalize overestimation. Similar to Aldenberg and Jaworska 12, Hickey and Craig 18 proposed a table of values for the assessment shift factor k under several assumptions for the α values (i.e., α = 0.5, 1, 3, and 5). Notice that the median and the mean, proposed here as point estimates for the HC5, can be formally justified with respect to the approach of Hickey and Craig 18, which is rooted in Bayesian decision theory. They are obtained by replacing the LINEX function with a symmetric piecewise linear loss function and a quadratic loss function, respectively. Both losses used in the present study are symmetric and equally penalize under- and overestimation. Eliciting appropriate loss functions for a specific environmental context can be seen as an interesting extension of the present study.
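The effect of the loss function on the point estimate can be illustrated as follows. The sketch uses the common LINEX form L(Δ) = exp(αΔ) − αΔ − 1, with Δ the estimate minus the true value on the log scale; the exact parametrization used by Hickey and Craig may differ, and the posterior draws below are hypothetical. Under this loss, the estimate minimizing the expected posterior loss is −(1/α) ln E[exp(−αθ)], whereas quadratic and absolute losses give the posterior mean and median.

```python
import math
import random
import statistics

def linex_loss(delta, alpha):
    """LINEX loss for an error delta = estimate - true value.
    For alpha > 0, overestimation (delta > 0) is penalized roughly
    exponentially and underestimation roughly linearly."""
    return math.exp(alpha * delta) - alpha * delta - 1.0

def linex_estimate(posterior_draws, alpha):
    """Point estimate minimizing the expected LINEX loss over the draws:
    -(1/alpha) * ln(mean(exp(-alpha * theta)))."""
    mean_exp = statistics.fmean([math.exp(-alpha * t) for t in posterior_draws])
    return -math.log(mean_exp) / alpha

# Hypothetical posterior draws of log10(HC5), for illustration only.
rng = random.Random(42)
draws = [rng.gauss(0.5, 0.4) for _ in range(50000)]

print("posterior mean  :", round(statistics.fmean(draws), 3))   # quadratic loss
print("posterior median:", round(statistics.median(draws), 3))  # absolute loss
for alpha in (0.5, 1, 3, 5):
    print(f"LINEX alpha={alpha}:", round(linex_estimate(draws, alpha), 3))
```

For a normal posterior with mean m and variance v, the LINEX estimate has the closed form m − αv/2, so larger values of α push the estimate further below the posterior mean.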

In addition to these Bayesian approaches, the assessment factor approach was tested because it is the approach currently adopted by regulators for calculating HC5s using small data sets. For small data sets, an assessment factor is applied to the lowest ecotoxicological value (i.e., the lowest value is divided by the factor); the factor supposedly addresses all of the uncertainties associated with extrapolating from single-species laboratory data to a multispecies ecosystem. In our case, at least three chronic data points across three different trophic levels (i.e., algae, invertebrates, and vertebrates) are available, and an assessment factor of 10 is normally applied to the lowest value.

All of these approaches (the assessment factor approach, Aldenberg and Jaworska's approach with three confidence levels γ (0.05, 0.5, and 0.95), Hickey and Craig's approach with four α values (0.5, 1, 3, and 5), and our approach) were used for each substance and for each of the 13,000 samples n_{p,q}. As previously described, the mean value and the 10th and 90th percentiles were calculated for each q value (q = 3 … 15). Example results are presented for three substances (cadmium, atrazine, and DODMAC) and four sample sizes (3, 4, 7, and 10) in Figure 4.

Figure 4.

The E(HC5 | x)/reference_HC5_50% ratio for cadmium, atrazine, and DODMAC with 3, 4, 7, and 10 data points, calculated using nine different approaches (from left to right: Assessment factor; Aldenberg 95%; Aldenberg 50%; Aldenberg 5%; Hickey alpha 0.5; Hickey alpha 1; Hickey alpha 3; Hickey alpha 5; this work). [Color figure can be seen in the online version of this article, available at wileyonlinelibrary.com]

When the most conservative approaches (i.e., Aldenberg and Jaworska's approach with γ = 0.95 and Hickey and Craig's approach with α = 5 or α = 3) are used, the probability of underestimation remains high until q = 7. It is possible to underestimate the HC5 value by more than two orders of magnitude. From a regulatory point of view, such an observation is important because it can lead to overregulation and prevent defining reliable priorities for chemical regulation and monitoring.

If, instead, low values of Aldenberg and Jaworska's and Hickey and Craig's parameters are chosen (i.e., γ = 0.05 and α = 0.5), the overestimation can exceed an order of magnitude for small samples (see atrazine for q = 3 or 4) and can remain significant even as the sample size increases.

Generally, the assessment factor approach is conservative, but it can still overestimate the reference HC5 (see atrazine for q = 3 or 4). It can also be observed that, as expected, this approach becomes more conservative as the number of ecotoxicological data points increases, because the lowest value in a data set can only decrease when data are added. This discourages generating new data: stakeholders (e.g., industry) have no incentive to produce new data if the resulting safety limits can only become stricter.
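The monotonic behaviour of the assessment factor approach follows directly from taking a minimum, as the short sketch below shows. The function names and the example values are hypothetical; the factor of 10 is the one quoted in the text for at least three chronic data points across three trophic levels.

```python
def assessment_factor_estimate(values, factor=10.0):
    """Safety limit derived by dividing the lowest toxicity value by the factor."""
    return min(values) / factor

def running_estimates(values, factor=10.0, start=3):
    """Estimate obtained from the first k values, for k = start .. len(values)."""
    return [assessment_factor_estimate(values[:k], factor)
            for k in range(start, len(values) + 1)]

# Hypothetical chronic toxicity values in the order they become available.
data = [5.0, 8.0, 2.0, 3.0, 0.5]
print(running_estimates(data))
```

Each additional data point either leaves the estimate unchanged or lowers it, so under this approach new data can never raise the derived limit.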

In conclusion, we observed that the approaches of Aldenberg and Jaworska and of Hickey and Craig were highly sensitive to parameter values that have to be chosen by risk assessors. For example, switching from α = 0.5 to α = 3 can dramatically modify the HC5 values. In addition, the assessment factor approach does not encourage generating new data. By comparison, the Bayesian informative approach presented in the present study was shown to be more robust (our robustness criterion being the comparison with the reference HC5 currently accepted by regulators, a choice that remains open to debate). The risks of over- and underestimation are both relatively limited, and the HC5 values are generally close to the reference values, even for small data sets.

CONCLUSIONS

Bayesian approaches are promising tools for deriving chemical safety levels using small data sets because they can merge prior knowledge and actual information (even if scarce) derived from experimental observations. However, only noninformative Bayesian approaches have been proposed to date, and such methods are highly sensitive to the risk assessors' choices of confidence levels or loss-function parameters. We used knowledge of the expected SSD variances to develop an informative Bayesian approach that we tested through a cross-validation exercise with actual ecotoxicological data. We found that even for small sample sizes, the probability of overestimating HC5 values is rather limited. The probability of underestimation is higher but is greatly reduced as the number of data points increases. From a practical perspective, increased HC5 values can then be obtained from the additional data without impairing regulatory protection levels, which encourages generating new ecotoxicological data. Finally, the HC5 values were generally close to the reference values, even for small data sets. In conclusion, the Bayesian informative approach presented in the present study was shown to be relatively robust and could be a better surrogate approach than the assessment factor method for deriving HC5 values from small data sets.

Acknowledgements

This study was supported by the French ministry in charge of ecology and sustainable development, within the framework of Programme 190, and by the National Research Agency, within the project AMORE (contract 2009 CESA 15 01).
