DEAR SIR:

Recently, aquatic semifield experiments have been criticised in the scientific literature for their lack of clear criteria for acceptability of effects and for their limited power to detect effects at the population level (e.g., De Jong and Montforts 2006). In this letter, I wish to highlight that the lack of clear evaluation criteria is not a problem of semifield experiments alone, but of ecological risk assessment (ERA) in general, since responsible risk managers have not yet described or proposed protection goals in terms of quantifiable measurement endpoints. In fact, we should be happy that these tests force risk managers (and stakeholders) to further investigate the fundamental questions of cause and effect. I also wish to encourage evaluation of statistical power at the community level rather than at the population level, since aquatic semifield experiments are often not performed to evaluate effects on specific populations. Instead, these experiments should be seen in the context of evaluating overall community responses.

Evolution of current U.S. and European practices

The use of microcosms and mesocosms (hereafter referred to as “cosms”) in aquatic ERA was pioneered in the U.S. in the 1970s by researchers such as Hurlbert (e.g. Hurlbert 1975). Cosms were increasingly used in the 1980s but, unfortunately, many of the older U.S. mesocosm studies were difficult to interpret because of high variability, their very large size, the presence of fish, and ‘demonic intrusion’ sensu Hurlbert (e.g. frogs). The studies were also considered to demand too much time and too many resources. Therefore, the U.S. Environmental Protection Agency dropped its requirement for these studies in 1992 and took the route of improving laboratory-based risk assessment (De Jong et al. 2005), placing more emphasis on mitigation and on monitoring to demonstrate the adequacy of that mitigation.

Europe took a different direction, focusing instead on improvement of cosm studies. Several workshops were organized in which the experimental design of the studies, their statistical analysis, and the interpretation of their results were discussed. This led to the production of several guidance documents, such as Higher-tier Aquatic Risk Assessment for Pesticides (Campbell et al. 1999), Community-Level Aquatic Systems Studies-Interpretation Criteria (Giddings et al. 2002), Effects of Pesticides in the Field (Liess et al. 2005), and the guidance document on aquatic toxicology in the context of Directive 91/414/EEC (European Union 2002). In addition, these guidance documents stimulated research on the spatio-temporal extrapolation of results of freshwater cosm studies performed with pesticides (e.g. Van Wijngaarden 2006; Brock et al. in press), the development of effect models (e.g. Van den Brink et al. 2005), and the extrapolation of effects observed in cosms to the field (Liess et al. 2005).

Acceptability of effects

A common criticism of higher tier semifield experiments is that, in contrast to the lower tiers, they lack sound criteria for acceptability of effects (e.g. De Jong and Montforts 2006). In the past, risk managers decided that the first tier risk assessment procedure as described in the Uniform Principles (EU 1997), based on the sensitivity of fixed standard test species and assessment factors, results in acceptable risks. Since one does not know which risks to field populations are associated with the outcome of the first tier risk assessment under Directive 91/414/EEC, policy criteria are lacking at the lower tiers as well. The failure to set clear criteria at these lower tiers is concealed by the use of supposedly worst-case assessment factors.

It is, therefore, a blessing in disguise that higher tier studies make the need for these criteria explicit. They force us to answer the fundamental question of ERA and the registration of chemicals: which risks of which effects are we willing to accept?

I do not wish to elaborate on the acceptability debate here, but since ecology has no morality, disciplines other than ecotoxicology and ecology certainly have to be included (Crane and Giddings 2004; see Crane et al. 2006 for a constructive start to this discussion). The effect classes as described in EU (2002), and further elaborated by Brock et al. (in press), are an attempt to objectively classify the treatment-related effects observed in cosms. Once risk managers reach consensus on the relevant protection goals (e.g. the regulatory acceptable time frame for recovery), these effect classes can be adapted accordingly.

Power to detect effects

An issue that is increasingly discussed is the statistical power of cosm experiments to demonstrate effects at the population level (e.g. Sanderson 2002; Wogram 2005; De Jong et al. 2005). In raising this issue, one has to bear in mind that cosm experiments are often performed to evaluate effects on communities rather than populations. Therefore, the power of these experiments should also be calculated at the community level and not necessarily at the population level. When cosm experiments are criticised for their low power to detect effects at the population level, they are being criticised for something for which they are often not designed. Of course, cosm experiments are sometimes performed to evaluate effects at the population level, for example, on macrophyte taxa (Hanson et al. 2003). But if particular invertebrate taxa such as Gammarus or Asellus are of concern, then one should make sure that these taxa are present in sufficiently high numbers, evenly distributed over the replicate cosms, to ensure adequate statistical power (Wogram 2005). Alternatively, one could perform a population study.

Interestingly, it is possible to calculate the multivariate power of cosm experiments (Timmerman and Ter Braak 2006), but initiatives in this direction unfortunately have not received much attention. So although I completely agree with the calls of Sanderson (2002) and Wogram (2005) for more attention to the power issue when evaluating cosm experiments, the level at which the statistical power should be calculated is, in my opinion, different (i.e., community instead of population level). Besides this, the power argument is premature until one decides what sizes of effects are meaningful to detect; i.e., which sizes of effects should be considered unacceptable (Sanderson and Petersen 2001).
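To make concrete how power depends on replication, between-cosm variability, and the effect size one chooses to care about, the following is a minimal simulation sketch. It is not taken from any of the studies cited here. It assumes that the log-abundance of a single taxon is normally distributed among replicate cosms, that the treatment reduces abundance by a fixed proportion, and that control and treated cosms are compared with a one-sided exact permutation test. The function names and all parameter values (`n_replicates`, `reduction`, `sigma_log`) are hypothetical.

```python
import itertools
import math
import random


def permutation_p_value(control, treated):
    """One-sided exact permutation test on the difference in means.

    Returns the fraction of all reassignments of cosms to groups in which
    (control mean - treated mean) is at least as large as observed.
    """
    pooled = control + treated
    n_c, n_t = len(control), len(treated)
    total = sum(pooled)
    observed = sum(control) / n_c - sum(treated) / n_t
    extreme = 0
    n_perm = 0
    for idx in itertools.combinations(range(len(pooled)), n_c):
        sum_c = sum(pooled[i] for i in idx)
        diff = sum_c / n_c - (total - sum_c) / n_t
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_perm += 1
    return extreme / n_perm


def estimate_power(n_replicates, reduction, sigma_log,
                   alpha=0.05, n_sim=2000, seed=1):
    """Monte Carlo estimate of the power to detect a proportional reduction.

    n_replicates: cosms per group (hypothetical design)
    reduction:    assumed proportional decrease in abundance (0 <= r < 1)
    sigma_log:    assumed SD of log-abundance among replicate cosms
    """
    rng = random.Random(seed)
    shift = math.log(1.0 - reduction)  # treatment effect on the log scale
    hits = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sigma_log) for _ in range(n_replicates)]
        treated = [rng.gauss(shift, sigma_log) for _ in range(n_replicates)]
        if permutation_p_value(control, treated) <= alpha:
            hits += 1
    return hits / n_sim
```

Calling, for instance, `estimate_power(4, 0.5, 0.8)` returns the simulated power to detect a 50% reduction with four cosms per group under these assumptions. The answer depends entirely on the `reduction` passed in, which is the point made above: without an agreed effect size, a power figure has no meaning. A community-level analogue would replace the univariate test statistic with a multivariate one, which this sketch does not attempt.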

Concluding remarks

Ecotoxicology, and with it ERA, should permit itself to grow out of its “single species” armour and become a more complete and mature science. By that I mean that often the wrong data are gathered (i.e., the sensitivity of individuals of one species in an artificial environment) for an ERA, which should be focused on the protection of populations and communities in the field. The misfit between the question of ERA and the answer provided by single species tests is concealed by the use of large assessment factors.

I, therefore, disagree with the statement by De Jong et al. (2005) that “field tests do not necessarily give more or better results than laboratory tests.” In cosm experiments, more or less natural species assemblages are exposed to a stressor under a realistic exposure scenario. In my opinion, cosm experiments always give a more honest view of the risk assessment problem, which is about protecting populations and communities that live in a variable environment and have the ability to recover.

In addition, recent studies indicate that the results of properly performed cosm experiments stressed by single or repeated applications of pesticides can be reproduced fairly well (see e.g. Van Wijngaarden 2006; Brock et al. in press). Since the protection aims of the registration procedure for pesticides in Europe (EU 1997) are set at the population and community levels, and short-term effects on these populations and communities are considered acceptable, we have to admit ecology into our field and allow population and community level considerations into our experimentation, whether we like it or not. If we fail to reject the conservatism of testing individuals in single species tests and the use of assessment factors, then we risk converting ERA from a field of science to an art of bookkeeping.

I recognize that, for economic and practical reasons, we will continue to use the results of single species tests in the coming decades. But this does not justify criticising cosm experiments. Cosm experiments force regulators to answer the right question (what is acceptable?). In my view, cosm experiments provide the opportunity for the science to progress by finding ways to quantify the assessment endpoints that are of relevance to regulators. Moreover, a recent opinion of the PPR Panel (EFSA 2006) showed that, when evaluated by experts, cosm experiments yield interpretable results and can provide unambiguous answers.

Maybe we should replace the old joke of the spelling checker changing mesocosms to masochism with a new one: bringing the buzzword “omics” into our field of science and renaming the field “mesocomics”. It is, indeed, fun to work on the interface of ecotoxicology and ecology!


References
  • Brock TCM, Arts GHP, Maltby L, Van den Brink PJ. In press. Aquatic risks of pesticides, ecological protection goals and common claims in EU Legislation. Integr Environ Assess Manag.
  • Campbell PJ, Arnold DJS, Brock TCM, Grandy NJ, Heger W, Heimbach F, Maund SJ, Streloke M. 1999. HARAP Guidance Document: Higher-tier Aquatic Risk Assessment for Pesticides. Society of Environmental Toxicology and Chemistry (SETAC)-Europe, Brussels, Belgium.
  • Crane M, Giddings J. 2004. “Ecologically Acceptable Concentrations” when assessing the environmental risks of pesticides under European Directive 91/414/EEC. Human and Ecological Risk Assessment 10: 733–747.
  • Crane M, Norton A, Leaman J, Chalak A, Bailey A, Yoxon M, Smith J, Fenlon J. 2006. Acceptability of pesticide impacts on the environment: what do United Kingdom stakeholders and the public value? Pest Manag Sci 62: 5–19.
  • De Jong FMW, Mensink BJWG, Smit CE, Montforts MHMM. 2005. Evaluation of ecotoxicological field studies for authorization of plant protection products in Europe. Human and Ecological Risk Assessment 11: 1157–1176.
  • De Jong FMW, Montforts MHMM. 2006. Workshop critical effect values for field studies with pesticides. National Institute for Public Health and the Environment. SEC report 10308a00, Bilthoven, The Netherlands.
  • EFSA. 2006. Opinion of the scientific panel on plant health, plant protection products and their residues on a request from the EFSA related to the aquatic risk assessment for cyprodinil and the use of a mesocosm study in particular. The EFSA Journal 329: 1–77.
  • [EU] European Union. 1997. Council Directive 97/57/EC of September 21, 1997; Establishing annex VI to Directive 91/414/EEC Concerning the placing of plant protection products on the market. Official Journal of the European Communities L265: 87–109.
  • [EU] European Union. 2002. Guidance document on aquatic toxicology in the context of the Directive 91/414/EEC. Working Document of the European Commission Health & Consumer Protection Directorate-General. Brussels, Belgium.
  • Giddings JM, Brock TCM, Heger W, Heimbach F, Maund SJ, Norman SM, Ratte HT, Schafers C, Streloke M. 2002. Community-Level Aquatic Systems Studies—Interpretation Criteria: CLASSIC. Society of Environmental Toxicology and Chemistry (SETAC), Pensacola, FL, USA.
  • Hanson ML, Sanderson H, Solomon KR. 2003. Variation, replication, and power analysis of Myriophyllum spp. microcosm toxicity data. Environ Toxicol Chem 22: 1318–1329.
  • Hurlbert SH. 1975. Secondary effects of pesticides on aquatic ecosystems. Residue Reviews 57: 81–148.
  • Liess M, Brown C, Dohmen P, Duquesne S, Hart A, Heimbach F, Kreuger J, Lagadic L, Maund S, Reinert W, Streloke M, Tarazona JV. 2005. Effects of pesticides in the field. Society of Environmental Toxicology and Chemistry (SETAC). Brussels, Belgium.
  • Sanderson H. 2002. Pesticide studies. Replicability of micro/mesocosms. Environ Sci & Pollut Res 9: 429–435.
  • Sanderson H, Petersen S. 2001. Power analysis as a reflexive scientific tool for interpretation and implementation of the precautionary principle in the European Union. Environ Sci & Pollut Res 8: 1–6.
  • Timmerman ME, Ter Braak CJF. 2006. Bootstrap Confidence Intervals for Principal Response Curves. Submitted.
  • Van den Brink PJ, Brown CD, Dubus IG. 2005. Using the expert model PERPEST to translate measured and predicted pesticide exposure data into ecological risks. Ecological Modelling 191: 106–117.
  • Van Wijngaarden RPA. 2006. Interpretation and extrapolation of ecological responses in model ecosystems stressed with non-persistent insecticides. [PhD Thesis]. Wageningen, The Netherlands: Wageningen University.
  • Wogram J. 2005. Aquatic higher-tier tests in regulatory ecotoxicology—between easy way out and dead end. Platform presentation at the 5th ECOTOX Conference, 2005 Dec 8–9, Cologne, Germany.