I received a copy of the Fox et al. (this issue) Learned Discourse (LD) as it was being proposed for submission. David Fox, Elise Billoir, and I were co-chairs of the Statistical Methods session at the recent SETAC Berlin World Congress; this brief commentary is part of a larger ongoing discussion and collaboration on the topic.

The Fox et al. (this issue) LD is one of the best short summaries of the statistical issues that I have read. As soon as it is published I will require my students to read it as part of our toxicology program. In several ways I find this proposal more restrictive than what Peter Chapman and I have proposed (Landis and Chapman 2011). Although we called for a ban on the use of hypothesis testing, we left the modeling segment open. Fox et al. have actually set specific criteria for modeling and curve fitting. These criteria could become part of a reviewer's checklist when reviewing toxicity tests. I think that these criteria are excellent, but the learning curve is going to be steep for the general community.

The way Fox et al. (this issue) addressed common questions and concerns regarding curve fitting was also appropriate. Overall, I think the ecotoxicological community has little experience with modeling of almost any sort and is new to many of the fundamental concepts. I also dismiss the issues raised by van der Hoeven (1997) and Newman and Clements (2008) regarding data paucity. Let us do the experiments to get the data; as Peter Chapman and I pointed out (Landis and Chapman 2011), there are excellent examples of studies that do exactly that.

However, in screening level risk assessments I rarely see newly generated data. The information is usually from some published source and is often a no observed effect concentration/no observable effect level (NOEC/NOEL) or a toxicity reference value generated from hypothesis testing, typically corresponding to approximately a 10% to 15% effect level (in other words, not a no-effect level). Years ago, in a guideline document written for British Columbia (Canada), our team, which included Peter Chapman's group, suggested that an EC20 or lower be the cutoff even for what were essentially screening level assessments. So the question is not "Is there toxicity?" but "How much toxicity occurs at each exposure?"

In range-finding tests, my students use a range of concentrations and we plot the data. The n (sample size) at each concentration is often low, so we can test more concentrations with the same total number of organisms, and we understand the accompanying loss of power. I have found that plotting the data and fitting a curve is more useful for setting up the next set of tests than calculating a NOEC/LOEC in a range-finding exercise. After all, we are most interested in describing (modeling) toxicity at lower concentrations of the toxicant, levels likely to be seen in the environment.
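The range-finding workflow described above can be sketched in a few lines. This is a minimal illustration, not a method from the commentary itself: the data are invented, and the two-parameter log-logistic model and the EC20 formula are common choices rather than anything the authors prescribe.

```python
# Sketch (assumed, not from the commentary): fit a two-parameter
# log-logistic curve to hypothetical range-finding data and read off
# the EC20 analytically, rather than reporting a NOEC/LOEC.
import numpy as np
from scipy.optimize import curve_fit

def log_logistic(conc, ec50, slope):
    """Fraction of the control response remaining at each concentration."""
    return 1.0 / (1.0 + (conc / ec50) ** slope)

# Hypothetical data: few replicates, many concentrations (fraction of control).
conc = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
resp = np.array([0.98, 0.95, 0.80, 0.45, 0.15, 0.04])

(ec50, slope), _ = curve_fit(log_logistic, conc, resp, p0=[2.0, 1.0])

def ecx(x, ec50, slope):
    """Concentration producing a fractional effect x (e.g., 0.20 for EC20)."""
    return ec50 * (x / (1.0 - x)) ** (1.0 / slope)

ec20 = ecx(0.20, ec50, slope)
print(f"EC50 = {ec50:.2f}, EC20 = {ec20:.2f}")
```

Because the fitted curve is continuous, any ECx of interest can be extracted from the same fit, which is what makes a model-based summary more informative than a NOEC at low, environmentally relevant concentrations.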

One item that I did try to bring up in my talk at the SETAC Berlin World Congress was that this switch means that the design of toxicity tests will have to be optimized for model construction, not hypothesis testing. This means more concentrations at lower doses (Olmstead and LeBlanc 2005; Rider and LeBlanc 2005).

Finally, my experience is that the ecotoxicological community in general is highly conservative. For instance, although the paradigm change regarding not using NOELs/NOECs occurred decades ago, the approach remains in widespread use. In the year since Landis and Chapman (2011), 39 and 14 additional articles have been published in Environmental Toxicology and Chemistry and Integrated Environmental Assessment and Management, respectively (search term NOEC, May 14, 2012). Without a clear reason to change, I see little change occurring, hence the call for a ban. Bleaney (2012) concurs that a ban may be the only reasonable impetus for change. I am for an orderly transition to such a ban, but eventually the use of inappropriate hypothesis testing has to stop.
