Proposing a vigorous hybrid of the LOEC and the ECx
Standardized toxicity tests are greatly oversimplified representations of the toxicological processes that occur when organisms encounter contaminants in the real world. However, toxicity tests belong to that select group of models that, despite being wrong, remain useful. The information captured by standardized toxicity testing has provided a quantitative, scientific basis for environmental regulation and risk assessment of thousands of substances in jurisdictions around the world.
Current practice in analyzing and reporting toxicity test data is to calculate estimators of effect, expressed as exposure levels associated with some biological response of interest following a particular duration of exposure. These values are then used to derive toxicological benchmarks for applications such as risk assessment and environmental regulation. For these applications, the key attributes of the estimator are toxicological relevance (i.e., the associated biological response must be known, and ideally of an acceptably low magnitude) and statistical confidence (i.e., uncertainty in the estimated concentration must be known, and ideally of an acceptably low magnitude). Unfortunately, the estimators in common use, based either on ANOVA-type hypothesis testing (i.e., no-observed-effect concentrations [NOECs] and lowest-observed-effect concentrations [LOECs]) or regression-type curve fitting and interpolation (e.g., effect concentration at a set percentage [ECx]), do not consistently exhibit these attributes.
The past decades have seen numerous pleas for researchers to eschew NOECs/LOECs in favor of reporting point estimates, including a call for an all-out ban on the publication of NOECs/LOECs (Landis and Chapman 2011). The impetus for the strong position against NOECs and LOECs stems from the well-known limitations of these values. Among other things, NOECs/LOECs are constrained to predefined test concentrations, are strongly influenced by within-treatment variability, and can be associated with any magnitude of effect (a magnitude that is typically not reported). Essentially, NOECs/LOECs are slaves to experimental design and statistical confidence, at the risk of retaining perilously little toxicological information.
The current alternative to NOECs/LOECs has quite the opposite limitations. By fixating on a predefined effect size, regardless of the statistical properties of the data, the ECx becomes a slave to (apparent) toxicological relevance, at the cost of statistical confidence. In fact, the toxicological relevance conveyed by standardizing the effect size x is largely illusory—the ecological implications of a 10% effect are considerably different for different test endpoints; for example, whether the effect is growth or reproduction, whether reproduction is expressed as mean offspring or cumulative offspring, or whether the offspring in question are mink or water fleas. Furthermore, x has a tendency to shrink under pressure—in situations where environmental concern is greatest, there can be a desire to base management on a smaller effect size, and this further reduces confidence in the derived value exactly when confidence is most needed. An EC10 or an EC5 or an EC1 can always be calculated, but in the majority of cases these values likely are not reliable estimates of the effect sizes they purport to represent. This is because, as x becomes smaller, the confidence limits on ECx balloon to include unrealistically low exposures and the precision of this point estimator dwindles. Estimators of low x are also sensitive to small differences in the performance of reference or control organisms. Variability in control performance within and among tests is an accepted part of toxicity testing, but the implications of this variability for estimates of low ECx values are rarely considered.
The respective limitations of NOECs/LOECs and ECx values propagate through their application to set toxicological benchmarks. Benchmarks based on compiled NOECs/LOECs represent a blend of varying effect sizes, whereas benchmarks based on ECx values represent a blend of high-confidence and low-confidence values with unknown statistical properties. We propose a hybrid alternative that retains the strengths of the NOEC/LOEC and the ECx. Our hybrid also retains some of the limitations of its progenitors, but these limitations are more fully disclosed, such that their implications are more difficult to ignore. Various methods of calculation are possible, but at least some methods are simple and readily carried out given output of standard toxicological analysis software.
The approach we propose is to report the lowest effects concentration (LEC), and the modeled effect size associated with that concentration (subscript “x”), given a predefined level of statistical confidence that this effect size differs from the control or baseline response. This value differs from the ECx in 2 important ways. First, the effect size associated with the LECx is not predefined; it is a statistical outcome of the test. Second, that effect size is determined by the desired level of statistical confidence (the selected probabilities of Type I and Type II errors), rather than the reverse. Like the LOEC, the LECx is defined as a value with known and acceptably low statistical uncertainty. However, the LECx improves on the LOEC in that it can be interpolated from a fitted response curve (i.e., it is not constrained to test concentrations). The LECx also improves on the LOEC in the clear disclosure of the magnitude of effect associated with the reported concentration. The toxicological relevance of the derived value will depend on the data and the type of effect being measured, but will in any case be disclosed in the subscript x.
Various methods can be used to calculate the LECx, depending on experimental design, statistical curve-fitting considerations, and how the desired level of statistical confidence is defined. We present below 2 simple calculation methods based on commonly used data analysis methods. If the LECx gains traction in the toxicological community, we are confident that qualified statisticians will develop much more robust methods that improve on these simple approaches.
The first simple calculation method can be applied to toxicity tests with replicate test organisms at a series of specified treatment levels. In this method, x is defined to be the percent minimum significant difference from the control (PMSD), given a specified level of desired statistical power in an ANOVA-type comparison. The PMSD is calculated from Dunnett's statistic or other appropriate hypothesis test, following approaches outlined in USEPA (2000) and Environment Canada (2005) and subject to the considerations of Denton et al. (2003). Most software applications for toxicological data analysis report the PMSD or the information required to calculate it. The LECx is simply the toxicant concentration associated with that PMSD, reported with the x explicitly specified. The LECx and associated confidence bands can be calculated from the fitted concentration–response curve or other interpolation procedure (USEPA 2000; Environment Canada 2005; OECD 2006).
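As a minimal illustration of this first method, the sketch below uses hypothetical replicated data. A Bonferroni-adjusted t statistic stands in for Dunnett's critical value, and a 3-parameter log-logistic curve is an assumed concentration–response model; confidence bands on the interpolated value are omitted for brevity.

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical replicated test: control plus 5 concentrations,
# 4 replicates each (e.g., mean dry weight per replicate).
conc = np.array([0.0, 1.0, 2.0, 4.0, 8.0, 16.0])
resp = np.array([
    [10.1, 9.8, 10.3, 9.9],   # control
    [9.7, 10.0, 9.5, 9.9],
    [9.2, 9.0, 9.4, 8.8],
    [8.1, 7.8, 8.3, 7.9],
    [6.0, 6.3, 5.8, 6.1],
    [3.1, 3.4, 2.9, 3.2],
])

k, n = resp.shape[0] - 1, resp.shape[1]          # treatments, replicates
df_err = resp.size - resp.shape[0]               # ANOVA error df
ms_err = np.mean(np.var(resp, axis=1, ddof=1))   # pooled within-group MS (equal n)

# Bonferroni-adjusted two-sided t as a stand-in for Dunnett's critical value.
t_crit = stats.t.ppf(1 - 0.05 / (2 * k), df_err)
msd = t_crit * np.sqrt(2 * ms_err / n)           # minimum significant difference
pmsd = 100 * msd / resp[0].mean()                # ...as a percentage of the control mean

# Fit an assumed 3-parameter log-logistic curve to the treatment means,
# then invert it at a PMSD% effect to interpolate the LECx.
def loglogistic(c, top, ec50, slope):
    return top / (1.0 + (c / ec50) ** slope)

means = resp.mean(axis=1)
popt, _ = optimize.curve_fit(loglogistic, conc[1:], means[1:],
                             p0=[means[0], 4.0, 2.0])
top, ec50, slope = popt
target = top * (1 - pmsd / 100)                  # modeled response at x = PMSD
lec_x = ec50 * (top / target - 1) ** (1 / slope)
print(f"x = PMSD = {pmsd:.1f}%, LECx = {lec_x:.2f}")
```

In practice, Dunnett's procedure itself (available in recent SciPy versions as scipy.stats.dunnett) and the laboratory's standard curve-fitting software would replace these stand-ins.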
The second simple calculation method can be applied to toxicity tests without replication. In this method, x is defined as the effect size at the limit of the 95% confidence band on the baseline response, and the LECx is the modeled concentration associated with that effect size. This method of calculating the LECx is equivalent to that proposed by Chèvre et al. (2002) for calculating a statistical no-effect concentration (SNEC), and is subject to the considerations outlined in Chèvre et al. (2002). However, whereas Chèvre et al. (2002) considered the SNEC to represent the highest concentration for which the modeled effect is not statistically different from the baseline response (and therefore represents no effect), we consider this value to be the lowest concentration for which the modeled effect is significantly different from the baseline response. Essentially, this reflects the concentration at which the dose-response curve departs from the range of performance of control and low-dose organisms. The LECx and associated confidence bands can be calculated in this manner using methods outlined in Chèvre et al. (2002).
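The sketch below illustrates this second method under similar caveats: the data are hypothetical and unreplicated, the log-logistic model is assumed, and a simple t-based bound on the baseline response stands in for the full confidence-band construction of Chèvre et al. (2002).

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical unreplicated test: one observation per concentration
# (e.g., algal biomass as percent of nominal baseline).
conc = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
resp = np.array([100.0, 99.0, 101.0, 97.0, 90.0, 75.0, 50.0, 22.0])

def loglogistic(c, top, ec50, slope):
    return top / (1.0 + (c / ec50) ** slope)

popt, _ = optimize.curve_fit(loglogistic, conc[1:], resp[1:],
                             p0=[100.0, 16.0, 2.0])
top, ec50, slope = popt

# Residual standard error of the fit.
resid = resp[1:] - loglogistic(conc[1:], *popt)
df = len(resid) - len(popt)
s = np.sqrt(np.sum(resid ** 2) / df)

# Lower 95% bound on the modeled baseline response: a simple stand-in
# for the confidence band used by Chèvre et al. (2002).
t_crit = stats.t.ppf(0.975, df)
baseline_lo = top - t_crit * s

# LECx: the lowest concentration whose modeled response falls below that
# bound, found by inverting the fitted curve; x is the implied effect size.
lec = ec50 * (top / baseline_lo - 1.0) ** (1.0 / slope)
x = 100.0 * (1.0 - baseline_lo / top)
print(f"x = {x:.1f}%, LECx = {lec:.2f}")
```

Note that x is an outcome here, not an input: noisier data widen the band, enlarge x, and push the LECx to a higher, more defensible concentration.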
The scientific community seems to be approaching consensus that the limitations of NOECs and LOECs are insupportable, and a change is needed. However, a wholesale shift to ECx approaches simply trades one set of limitations for another (Jager 2011). Traditional NOEC/LOEC approaches were initially adopted because they have merit, and much thought has subsequently gone into ways to capitalize on this merit and ameliorate the limitations inherent in null hypothesis-based interpretation of toxicity test data (e.g., USEPA 2000; Environment Canada 2005; OECD 2006). Abandoning NOECs and LOECs leaves behind the desirable statistical confidence that has supported their continued use despite decades of criticism.
We applaud Landis and Chapman (2011) for providing the toxicology community with a renewed rallying cry for change. However, what we believe should be the objective of that change—to adopt a standard point estimator with the desirable attributes of toxicological relevance and statistical confidence—is only partially achieved by the ECx. We suggest that the LECx, reported in conjunction with ECx values, achieves a better balance between these desirable attributes, and that it does so in a manner that facilitates a toxicologically and statistically honest interpretation of toxicity data.