Furthering the derivation of predictive wildlife toxicity reference values for use in soil cleanup decisions


  • Editor's Note: This paper represents 1 of 5 articles generated from a workshop entitled “Ecological soil levels: next steps in the development of metal clean-up values” (September 2012, Sundance, Utah, USA). The purpose of the workshop was to provide managers and decision makers of contaminated sites in North America with appropriate methods for developing soil clean-up values that are protective of ecological resources. The workshop focused on metals and other inorganic contaminants because of their ubiquity at contaminated sites and because their natural occurrence makes it difficult to determine adverse levels.


The development of media-specific ecological values for risk assessment includes the derivation of acceptable levels of exposure for terrestrial wildlife (e.g., birds, mammals, reptiles, and amphibians). Although the derivation and subsequent application of these values can be used for screening purposes, there is a need to identify toxicological effects thresholds specifically for making remedial decisions at individual contaminated sites. A workshop was held in the fall of 2012 to evaluate existing methods and recent scientific developments for refining ecological soil screening levels (Eco-SSLs) and improving the derivation of site-specific ecological soil clean-up values for metals (Eco-SCVs). This included a focused session on the development and derivation of toxicity reference values (TRVs) for terrestrial wildlife. Topics that were examined included: methods for toxicological endpoint selection, techniques for dose–response assessment, approaches for cross-species extrapolation, and tools to incorporate environmental factors (e.g., metal bioavailability and chemistry) into a reference value. The workgroup also made recommendations to risk assessors and regulators on how to incorporate site-specific wildlife life history and toxicity information into the derivation of TRVs to be used in the further development of soil cleanup levels. Integr Environ Assess Manag 2014;10:358–371. © 2013 The Authors. Integrated Environmental Assessment and Management Published by SETAC


Derivation of exposure levels to protect wildlife is an integral component of regulatory ecological risk assessment. Environmental media exposure levels, such as the ecological soil screening levels (Eco-SSLs) established by the US Environmental Protection Agency (USEPA), were developed to facilitate initial screening of chemicals found at contaminated sites for common pollutants (USEPA 2005). The Eco-SSL program used a systematic process to identify and qualify appropriate toxicity data to derive toxicity reference values (TRVs) that were intended to be protective of most species of wildlife. Concurrent with the development of Eco-SSLs, significant advancements in the fields of soil ecotoxicology and ecological risk assessment (particularly for metals and metalloids) have been realized (see reviews by Merrington and Schoeters 2011; Smolders et al. 2009; USEPA 2007a). In the present work, we review recent advances in the practice of wildlife toxicology and risk assessment, with specific emphasis on establishing protective levels of exposure for the purposes of remediation of metal contaminated soils. Furthermore, we explore the application of new approaches to progress from generic soil screening values designed to be protective of any species (e.g., Eco-SSLs) to site-specific ecological soil clean-up values (Eco-SCVs) directed toward particular species of concern at individual sites.

Development of soil contaminant thresholds protective of wildlife (defined herein to include birds, mammals, reptiles, and amphibians), whether in the United States or other countries, consists of 2 primary components: exposure estimation and toxicity evaluation. The focus of this article is on wildlife toxicity assessment, whereas Sample et al. (2013, this issue) examine advances related to wildlife exposure estimation for development of Eco-SCVs. USEPA defines a toxicity reference value (TRV) as a “dose above which ecologically-relevant effects might occur to wildlife species following chronic dietary exposure and below which it is reasonably expected that such effects will not occur” (USEPA 2005). The equivalent metric developed in Europe is termed the “predicted no effect concentration” (PNEC) and is regarded as a concentration (or dose) below which an unacceptable effect will most likely not occur (ECB 2003). The current method for deriving USEPA's Eco-SSLs combines the lowest available TRV with modeled exposure dose estimates to approximate a concentration in soil considered safe for all species of wildlife (USEPA 2005). On the other hand, TRVs used for developing Eco-SCVs specific to individual contaminated sites will be derived for only those species present on the site and directed toward the ecologically relevant endpoint for each species.

The TRV procedures established by other countries (Europe, Canada, Australia) share some similarities with those of the United States (ECB 2003; USEPA 2005; CCME 2006; CSIRO 2009). Typically, the TRV (or equivalent toxicological benchmark) is based on a no observed adverse effect level (NOAEL) from repetitive, long-term oral toxicity studies generally using soluble metal salts and representing a sensitive endpoint (e.g., reproductive impairment). Protective soil concentrations are then based either on the lowest available TRV or on a species sensitivity distribution (SSD), often with the addition of uncertainty factors to account for interspecies variability and laboratory-to-field extrapolations. Although this general approach has been standardized and adopted by various regulatory entities for screening level ecological risk analyses, such an approach is not suitable for developing a site-specific Eco-SCV as it fails to incorporate the effect of local environmental factors, bioavailability differences between laboratory and field diets, or dose–response information needed for parameterizing population models or making risk policy choices (Kapustka 2008; DeForest et al. 2012). It also assumes that exposure is continuous, occurs at a constant rate, and is unaffected by ecological interactions with the environment (e.g., habitat preferences or feeding frequency). Numerous recent publications have discussed the importance of each of these factors: environmental parameters (McLaughlin and Smolders 2001; USEPA 2007a; Zhao et al. 2007); bioavailability adjustments (Anderson et al. 2012; DeForest et al. 2012); dose–response information (Chapman et al. 1996; Crane and Newman 2000; Allard et al. 2010; Landis and Chapman 2011); and cross-species extrapolation (Newman et al. 2000; Raimondo et al. 2007; Awkerman et al. 2008, 2009).
Furthermore, whereas existing Eco-SSLs for wildlife have focused exclusively on birds and mammals, recent toxicity data for amphibians and reptiles have become available such that these receptors can be examined in greater depth with regard to protective soil levels (James et al. 2004a, 2004b; Johnson et al. 2004; Johnson et al. 2007; Bazar et al. 2008, 2009, 2010; McFarland et al. 2008, 2009, 2011; Sparling et al. 2010).
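
To illustrate the SSD alternative to selecting the lowest available TRV, the following Python sketch fits a log-normal distribution to a set of species-level toxicity values and returns the dose expected to protect all but a chosen fraction of species (commonly written HCp, e.g., HC5). The function name, the log-normal assumption, and the example values are illustrative assumptions and are not taken from any of the guidance documents cited above.

```python
import math
from statistics import NormalDist, mean, stdev


def hazardous_dose(toxicity_values, p=0.05):
    """Return the dose below which a fraction p of species thresholds fall.

    Assumes (hypothetically) that log10-transformed species thresholds are
    normally distributed. Requires at least 2 values.
    """
    logs = [math.log10(v) for v in toxicity_values]
    mu = mean(logs)
    sigma = stdev(logs)               # sample standard deviation
    z = NormalDist().inv_cdf(p)       # standard normal quantile for p
    return 10 ** (mu + z * sigma)


# Hypothetical chronic thresholds (mg/kg body weight/d) for 6 species
example_thresholds = [1.2, 3.5, 4.8, 9.0, 15.0, 40.0]
hc5 = hazardous_dose(example_thresholds, p=0.05)
print(f"HC5 = {hc5:.2f} mg/kg bw/d")
```

With few species the fitted percentile is highly uncertain, which is one reason uncertainty factors are often layered on top of SSD-based values.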

This discussion presents the results of a workshop focused on moving from soil screening values (Eco-SSLs) to site-specific ecological soil clean-up values for metals (Eco-SCVs), incorporating new scientific advancements into the development of wildlife TRVs. We review international methodologies for establishing wildlife soil criteria and discuss inherent limitations and uncertainties. Furthermore, we review recent scientific publications that provide improvements upon existing methods and recommendations to improve toxicity threshold derivation procedures for use in site-specific risk assessment. The focus of this article is specifically on refinements that can be made to the selection, analysis, and interpretation of toxicity data in the calculation of wildlife TRVs, as refinements of dose (environmental factors and bioavailability adjustments) were discussed in a different workgroup (Sample et al. 2013, this issue).


In this section, we review the current state-of-the-practice for wildlife TRV development in the United States and elsewhere, with a focus on evaluation and use of laboratory toxicity information. We recognize that these procedures were developed to establish screening level values for initial site evaluation, and there have since been advances in the science of wildlife toxicology. However, they provide valuable data and information that can be used as a starting point when developing site-specific protection goals, so it is instructive to briefly review the available approaches and their (current) limitations.

United States (ecological soil screening levels)

USEPA's Eco-SSL process established standardized protocols for conducting literature searches, data quality evaluation, dose conversion, and selection of final TRVs (USEPA 2005). Toxicity data extraction and dose conversion are complex due to inconsistencies in test protocols, dosing regimens, duration of exposure, and species differences. The final database consists of a uniform set of oral daily doses (reported as mg·kg−1·d−1) for each species and for biochemical, behavioral, physiological, pathological, reproductive, growth, and mortality endpoints.

Final TRV point estimates are selected from databases of toxicity thresholds for avian and mammalian species. Thresholds consist of no observed and lowest observed adverse effects levels (NOAELs and LOAELs). Final Eco-SSL TRVs are set equal to the geometric mean of NOAEL values for growth and reproduction or the highest bounded NOAEL below the lowest bounded LOAEL for growth, reproduction or survival (USEPA 2005). To derive a TRV, the Eco-SSL guidance requires a minimum of 3 NOAEL or LOAEL results for at least 2 test species for either growth, reproduction, or survival effects (USEPA 2005). TRVs are then combined with a food web model for 3 ecological receptors (herbivore, insectivore, and carnivore) to calculate a soil concentration that represents a level that does not result in adverse effects in any of the species represented in the toxicity database (and presumably species for which no data are available).
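
The selection logic described above can be sketched in Python as follows. This is a simplified illustration, not the USEPA procedure itself: the `Result` record, the `bounded` flag, and in particular the rule used to choose between the 2 candidate values (keep the geometric mean only when it falls below the lowest bounded LOAEL) are assumptions made for the example and should be checked against the Eco-SSL guidance before any use.

```python
from dataclasses import dataclass
from statistics import geometric_mean
from typing import List, Optional

GRS = {"growth", "reproduction", "survival"}


@dataclass
class Result:
    species: str
    endpoint: str          # "growth", "reproduction", or "survival"
    kind: str              # "NOAEL" or "LOAEL"
    dose: float            # mg/kg body weight/d
    bounded: bool = True   # hypothetical flag: study reported both NOAEL and LOAEL


def eco_ssl_style_trv(results: List[Result]) -> Optional[float]:
    grs = [r for r in results if r.endpoint in GRS]
    # Minimum data requirement: at least 3 results from at least 2 species.
    if len(grs) < 3 or len({r.species for r in grs}) < 2:
        return None
    noaels_gr = [r.dose for r in grs
                 if r.kind == "NOAEL" and r.endpoint in {"growth", "reproduction"}]
    if not noaels_gr:
        return None
    gm = geometric_mean(noaels_gr)
    bounded_loaels = [r.dose for r in grs if r.kind == "LOAEL" and r.bounded]
    # Assumed decision rule: use the geometric mean only if it lies below
    # the lowest bounded LOAEL for growth, reproduction, or survival.
    if not bounded_loaels or gm < min(bounded_loaels):
        return gm
    lowest_loael = min(bounded_loaels)
    below = [r.dose for r in grs
             if r.kind == "NOAEL" and r.bounded and r.dose < lowest_loael]
    return max(below) if below else None
```

Encoding the rule explicitly makes it easy to see how sensitive the final point estimate is to the inclusion or exclusion of a single study.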

Europe (secondary poisoning)

The European method for wildlife risk assessment used for the registration of new compounds (termed secondary poisoning) has evolved since the initial algorithms were developed (Romijn et al. 1991; Van de Plassche 1994) to implementation under formalized guidance (ECB 2003) and increasing use as a consequence of new regulations (i.e., Registration, Evaluation, Authorisation and Restriction of Chemicals [REACH]) enacted in 2007 (ECHA 2008). It should be noted that the scope of the European secondary poisoning assessment is for the registration of chemical substances, as opposed to the evaluation of contaminated sites. Although there are some considerations for site-specific soil conditions, the assessment is explicitly generic. This method involves the derivation of a predicted no effect concentration for oral exposure (PNECoral). The PNECoral, in the context of secondary poisoning, is a level below which food concentrations are not expected to pose a risk to birds or mammals (ECB 2003; ECHA 2008). The risk assessment guidance recommends that only long-term toxicity studies reporting on dietary and oral exposure be considered relevant, because the secondary poisoning pathway refers exclusively to uptake through the food web (ECB 2003; ECHA 2008). The recommended effect endpoints under this framework include no observed effect levels (NOELs) (i.e., concentrations or doses) for mortality, reproduction, or growth. The final PNECoral is ultimately selected from the available toxicity data with the application of assessment factors (or uncertainty factors) that account for interspecies variation, extrapolation from acute or subchronic exposures to chronic exposures, and extrapolation from laboratory organisms to field organisms. European guidance also allows for 2 levels of assessment (Tier 1 and Tier 2) (ECB 2003; ECHA 2008).
In Tier 1, PNECoral values are set equivalent to the lowest NOECoral divided by a default assessment factor of 30, and any species-specific differences in food ingestion rates and body weights are not taken into account. In Tier 2, by contrast, PNECoral values can be derived based on species-specific food ingestion rate-to-body weight ratios for birds and mammals considered to be more relevant for the exposure scenarios under evaluation. In addition to species-specific exposure modifications under Tier 2, bioavailability of the chemical can be factored into the assessment. It is recognized that toxicity studies with metals (i.e., metal salts) may overestimate the bioavailability of (and exposure to) the compound in the diet of the test species (ECB 2003; ECHA 2008). Thus, a relative absorption factor (RAF) is used in secondary poisoning assessments to refine the PNEC to account for more realistic bioavailability scenarios.
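
The 2 tiers can be contrasted with a short Python sketch. The Tier 1 calculation follows the description above (lowest NOECoral divided by 30). The Tier 2 function is an assumption-laden illustration: the function names, the reuse of the factor of 30, and the way the RAF is applied (dividing the food-based PNEC by an RAF of 1 or less, so that lower field bioavailability raises the acceptable food concentration) are hypothetical choices, not quotations from ECB (2003) or ECHA (2008).

```python
def pnec_oral_tier1(noec_oral_values, assessment_factor=30.0):
    """Tier 1: lowest dietary NOEC (mg/kg food) divided by the default factor."""
    return min(noec_oral_values) / assessment_factor


def pnec_oral_tier2(noael_dose, body_weight_kg, food_intake_kg_per_d,
                    assessment_factor=30.0, raf=1.0):
    """Tier 2 sketch: convert a dose-based NOAEL (mg/kg bw/d) to a food
    concentration for a chosen receptor, then adjust for bioavailability.

    raf is the assumed ratio of absorption from the field diet to
    absorption from the laboratory diet (0 < raf <= 1).
    """
    if not 0.0 < raf <= 1.0:
        raise ValueError("raf must be in (0, 1]")
    # Species-specific conversion: mg/kg bw/d -> mg/kg food
    noec_food = noael_dose * body_weight_kg / food_intake_kg_per_d
    return noec_food / assessment_factor / raf


# Hypothetical inputs: three dietary NOECs, and a 20 g insectivore
# eating 4 g of food per day with an assumed RAF of 0.5.
print(pnec_oral_tier1([150.0, 300.0, 90.0]))
print(pnec_oral_tier2(10.0, 0.020, 0.004, raf=0.5))
```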

Canada (soil quality guidelines)

Wildlife risk assessment in Canada also applies the secondary poisoning concept for developing soil quality guidelines (SQGs) (CCME 2006). The intent is to account for secondary poisoning in higher trophic level organisms in the terrestrial food web. However, soil quality guidelines are only developed for agricultural land use and only consider the herbivore foraging guild (e.g., grazing livestock and wildlife). Consequently, the oral TRV (termed the daily threshold effects dose [DTED]) is established from toxicological data for grazing and foraging species. The Canadian guidance recommends that a minimum of 3 toxicological studies be considered, of which at least 2 must be oral mammalian studies and 1 should be an oral avian study (CCME 2006). Furthermore, a maximum of 1 laboratory rodent study and consideration of data for a grazing herbivore (e.g., ungulates) with a high ingestion rate-to-body weight ratio are recommended. Field data may also be considered in conjunction with laboratory data if available (CCME 2006). The final DTED used for developing the SQGs is represented (if possible) by the lowest effects level (EC25). However, in most cases, the lowest available LOAEL is used due to limitations in data availability and/or quality (CCME 2006). Uncertainty factors (UFs) from 1 to 5 are also applied to the DTED (based on professional judgment) to account for biological significance, exposure duration (e.g., subchronic to chronic extrapolation), meeting minimum data requirements, and taxonomic representation (i.e., fewer than 3 groups) in the underlying database (CCME 2006).
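
As a rough illustration of how these data requirements and the DTED calculation fit together, the sketch below checks a study list against the minimums described above and divides the lowest effect dose by a UF in the stated range of 1 to 5. The dictionary keys, function names, and example studies are hypothetical, and dividing the lowest effect dose by the UF is an assumed reading of how the factor is applied.

```python
def meets_minimum_data(studies):
    """Check the minimum study mix described for the Canadian approach."""
    mammals = sum(1 for s in studies if s["taxon"] == "mammal")
    birds = sum(1 for s in studies if s["taxon"] == "bird")
    lab_rodents = sum(1 for s in studies if s.get("lab_rodent", False))
    return len(studies) >= 3 and mammals >= 2 and birds >= 1 and lab_rodents <= 1


def dted(effect_doses, uncertainty_factor=1.0):
    """Lowest effect dose (mg/kg bw/d) divided by a UF between 1 and 5."""
    if not 1.0 <= uncertainty_factor <= 5.0:
        raise ValueError("uncertainty factor expected in the range 1 to 5")
    return min(effect_doses) / uncertainty_factor


# Hypothetical study set: effect doses in mg/kg bw/d
studies = [
    {"taxon": "mammal", "species": "sheep", "effect_dose": 12.0},
    {"taxon": "mammal", "species": "rat", "effect_dose": 20.0, "lab_rodent": True},
    {"taxon": "bird", "species": "chicken", "effect_dose": 30.0},
]
if meets_minimum_data(studies):
    print(dted([s["effect_dose"] for s in studies], uncertainty_factor=2.0))
```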

Australia (ecological investigation levels)

The method for developing ecological investigation levels (EILs) for soil in Australia is under development (CSIRO 2009). The current draft method focuses on soil invertebrates and plants and, at present, does not use complex food web models to estimate soil concentrations protective of wildlife. This exception is due to a recognized lack of Australian species-specific data necessary for the models (CSIRO 2009). In the current draft guidance, wildlife receptors are only considered for contaminants showing biomagnification potential. In cases where a chemical is determined to biomagnify in the food web, a more conservative level of protection is set for EILs established for soil invertebrates or plants (CSIRO 2009). Lack of detailed methods to evaluate wildlife is considered a significant limitation of this approach (CSIRO 2009).


Existing TRV derivation methods contain several conservative assumptions and have a number of inherent uncertainties (Allard et al. 2010). During development of the Eco-SSL process many of these limitations were discussed (USEPA 2000). Examples include exclusion of some data sets due to the strict scoring scheme, lack of data for amphibians and reptiles, and use of point estimates rather than distributional estimates of toxicity. Because Eco-SSLs were developed as a screening tool to remove contaminants with negligible potential for risk, many of the simplifying assumptions to ensure conservatism were intended. However, these same assumptions limit their application to more detailed risk analyses and ultimately site remediation. Several of the key limitations in these approaches are further explored below.

Toxicity reference value methods limit toxicological data to long-term exposure studies using standard organisms. For some chemicals, available toxicological data are sparse, and strict quality screening protocols may further limit the data available for soil criteria development (USEPA 2000, 2005). Acute and subacute toxicity studies and field studies are largely restricted in these frameworks. However, beyond the screening risk assessment phase, these data may provide valuable information to incorporate in weight-of-evidence analyses or to refine judgments for exposure and effect assumptions. Use of acute data or data of limited quality in the development of TRVs should only occur when the data support their use, and any assumptions should be clearly documented (Allard et al. 2010).

Ecological endpoint selection under current practices is focused on a limited set of toxicological endpoints (i.e., only growth, reproduction and [chronic] mortality deemed ecologically-relevant) (USEPA 2005). However, when developing Eco-SCVs, these endpoints should be reviewed at each site and if necessary alternative endpoints explored that match the goals described in the problem formulation of the ecological risk assessment (Allard et al. 2010). Furthermore, the soil screening levels developed under existing regulatory programs rely on developing point estimates for these ecological endpoints (i.e., NOAELs and LOAELs). Use of this approach in ecological risk assessment has been widely debated (Chapman et al. 1996; Crane and Newman 2000; Allard et al. 2010; Landis and Chapman 2011). Alternative statistical approaches have been gaining wider acceptance and should be used to analyze the full dose–response relationship rather than narrow the assessment to an effect level that is an artifact of study design.
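
To make the contrast with NOAEL and LOAEL point estimates concrete, the following Python sketch fits a 2-parameter log-logistic curve to dose–response data and reads off the dose associated with a chosen effect level (EDx). It is deliberately simplified: the fit is an ordinary least-squares regression on the logit scale rather than the maximum likelihood fitting used in dedicated benchmark dose software, the function names are hypothetical, and the example data are invented for illustration.

```python
import math


def fit_log_logistic(doses, fractions_affected):
    """Least-squares fit of logit(p) = a + b * ln(dose).

    fractions_affected must lie strictly between 0 and 1.
    Returns the intercept a and slope b.
    """
    xs = [math.log(d) for d in doses]
    ys = [math.log(p / (1.0 - p)) for p in fractions_affected]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def ed_x(a, b, x):
    """Dose at which the fitted fraction affected equals x (0 < x < 1)."""
    return math.exp((math.log(x / (1.0 - x)) - a) / b)


# Hypothetical data: dose (mg/kg bw/d) and fraction of animals affected
doses = [1.0, 3.0, 10.0, 30.0, 100.0]
affected = [0.05, 0.15, 0.45, 0.80, 0.95]
a, b = fit_log_logistic(doses, affected)
print(f"ED10 = {ed_x(a, b, 0.10):.2f}, ED50 = {ed_x(a, b, 0.50):.2f} mg/kg bw/d")
```

Unlike a NOAEL, the resulting EDx does not depend on which dose levels the original investigators happened to test, and the fitted curve can be carried forward into population models.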

The Eco-SSL process relies on numerous assumptions (e.g., body weights and food ingestion rates) to convert the administered exposures (usually dietary concentrations) to a uniform daily oral dose (USEPA 2005). The conversion process can result in variable estimates of dose, depending on the judgments of the risk assessor, resulting in imprecise estimates of NOAELs or LOAELs (McDonald and Wilcockson 2003; Mayfield and Fairbrother 2013). Although the protocols developed for the Eco-SSLs standardized the exposure conversion assumptions, these may not be applicable to all sites and should be reviewed after the screening phase of the risk assessment. Sample et al. (2013, this issue) further explore the effects of changing exposure assumptions on soil screening levels.
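
The sensitivity of the converted dose to these judgments is easy to demonstrate. The sketch below uses the standard conversion of a dietary concentration to a body weight-normalized daily dose (concentration times food ingestion rate divided by body weight); the function name and all input values are hypothetical.

```python
def daily_dose(diet_conc_mg_per_kg, food_intake_kg_per_d, body_weight_kg):
    """Convert a dietary concentration (mg/kg food) to mg/kg body weight/d."""
    return diet_conc_mg_per_kg * food_intake_kg_per_d / body_weight_kg


# The same reported dietary NOAEL (100 mg/kg food) converted under two
# plausible sets of assumptions for a laboratory rodent.
assessor_a = daily_dose(100.0, food_intake_kg_per_d=0.020, body_weight_kg=0.25)
assessor_b = daily_dose(100.0, food_intake_kg_per_d=0.030, body_weight_kg=0.25)
print(assessor_a, assessor_b)
```

In this invented case, a 50% difference in the assumed ingestion rate produces a 50% difference in the dose-based NOAEL derived from the same study.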


Because of advancements in the science of wildlife toxicology and our understanding about how risk assessors use TRVs, we can identify areas where their derivation can be improved, regardless of their use (screening vs clean-up values) (Allard et al. 2010; Mayfield and Fairbrother 2013). In this section, we review the types of decisions made when selecting literature from which TRVs are derived, as well as how to combine data into effects models. We build off the uncertainties noted in existing methods (described above) and the discussion in Allard et al. (2010) to suggest the most appropriate path forward for TRV derivation (summarized in Table 1).

Table 1. Summary of improvements useful in refining toxicity-based criteria for remediation
Refine species of concern | Narrow species of concern to those on the site that did not pass the screening assessment. | Consider choosing either high profile species or species representative of feeding guilds that were not screened out during the SLERA.
Cross species extrapolations | Choose and/or select data from species that best represent species of concern. Consider mode of action and physiological similarities, especially in gastric physiology. | Strive to explain likelihood of adverse effect in species of concern using knowledge of toxicology and physiology.
Endpoint selection | Select toxicity endpoints that are related to site-specific routes of exposure and species-specific natural history. | Implement conceivable adverse population consequences into endpoint selection (e.g., reproduction or mortality-driven population dynamics).
Use of critical body or tissue-specific TRVs | Ascertain the possibility of using a tissue (e.g., blood, liver) threshold for adverse effects. | Tissue-specific values are less variable in describing the probability for adverse effects than oral estimates. Use physiologically based pharmacokinetic (PBPK) models for extrapolation across species, when available.
Acute and chronic values | Develop both acute and chronic thresholds for use in different exposure scenarios. | Short-term exposures during dispersal, migration, or other movement patterns would result in acute exposure, whereas residence on the site is better modeled with chronic values.
Determine biological versus statistical significance | Reduce data set to only include adverse effects that are outside the range of normal variability. | Requires comparative review of primary literature and summary of range of normal values.
Bioavailability adjustment | Adjust data to account for higher bioavailability of metal salts typically used in laboratory studies. | In the absence of such data, consider information from human health studies to include a qualitative approach in the uncertainty discussion.
Refine threshold for adverse effects | Model relevant data using Benchmark dose models, or EDx values from dose-response curves. | Use dose response relationships to develop predictive estimates for population-level effects. Use NOAEL or LOAEL values only if no other data are available or dose-response assessment is not applicable.
Integrate field data | Use field data as alternative lines of evidence to support or discount the relevance of specific endpoints of toxicity and accuracy of threshold predictions. | Field observations may provide information about bioavailability or whether effects actually occur at predicted exposures.

Endpoint selection

Primary adverse responses in wildlife exposed to contaminants are those that affect mortality and reproduction, as these are typically considered the drivers for population persistence, growth or decline (USEPA 2005). Although it is relatively simple to select studies that directly measure mortality rates or some obvious part or parts of the reproductive cycle (e.g., pregnancies, egg production, or birth and hatch rates), it is less obvious how to choose among adverse outcomes that indirectly affect survival or reproduction. For example, exposures to compounds that affect neurobehavior and produce laboratory observations (e.g., lethargy and ataxia) can conceivably result in mortality under natural conditions as they will influence predator avoidance. An understanding of the species and community-specific population regulatory mechanisms may help in the selection of ecologically relevant adverse effects. For example, if a population of small mammals is primarily regulated by reproduction rate (i.e., to offset the normal high mortality due to predation), then metal-induced changes in productivity will have the greatest effect on population persistence. If populations are food limited, then energetic demands of somatic maintenance and repair resulting from toxicity may subsequently reduce offspring production (Gill and Elliott 2003). Impairment of learned and innate behavior may detrimentally affect breeding success (Hoogesteijn et al. 2005; Frederick and Jayasena 2010). Other adverse sublethal effects should be included if they can be related to changes in the fitness parameters of most importance to the species. Changes to enzyme functions, induction of metallothionein, gene expression, or other similar biomarkers that cannot conceivably be related to changes in survival or productivity rates can be informative about exposure but are not relevant for establishing clean up values if the goal is population persistence. 
Furthermore, due to homeostatic mechanisms within and among individuals, measurable changes at a suborganismal level may not translate directly into population level responses (Ferson et al. 1996; Grant 1998). Therefore, a strong basis for making such a connection is necessary before use of any endpoint in the TRV derivation process. A hypothesized connection that requires multiple steps may carry too much uncertainty about the ecological relevance of a suborganismal change to justify selecting such an endpoint for regulatory purposes. Tarlow and Blumstein (2007) critically reviewed several endpoints used or suggested previously and ranked them, from good fitness indicators to those that simply indicate environmental disturbance, as follows: breeding success, mate choice, fluctuating asymmetry, flight initiation distance, immunocompetence, glucocorticoids, and cardiac response. They conclude that there is no single optimal endpoint that quantifies ecologically-relevant effects. Therefore, selection should be for a suite of endpoints most appropriate for the species and area of concern considering their ecology.

Although growth is a well-established endpoint in aquatic toxicology where many organisms are indeterminate growers and larger size can be correlated to greater survival and fecundity (Mebane and Arthaud 2010), most wildlife species are growth limited. Once adult size is reached, growth stops and has no influence on reproduction or survival rates (note that fluctuations in body mass occur on an annual basis in many species to cope with cyclic climatic changes, but this is different from continual growth over time) (Sebens 1987). For some species, slower initial growth can be compensated later, such that adult size or morphology is still achieved within the normal time frame (Schalk et al. 2002). This has been shown to be the case for some contaminant exposures (Fairbrother et al. 1994). Conversely, a reduction in growth may be an adaptive, useful response in maintaining a population that is resource limited. Furthermore, controlled laboratory toxicity studies that monitor growth frequently employ ad libitum feeding regimes, which are typically not representative of environmental situations. Therefore, selection of growth as an endpoint should be carefully reviewed in the context of the species of concern.

The most relevant toxicity endpoint is dependent on the attribute that is most critical to regulating the population in question. Some species (r-selected), most notably small mammals, have high reproductive rates to offset short life-spans and high mortality rates as a result of high predation pressures. Therefore, the population is vulnerable to reduced reproductive rates. On the other hand, K-selected species are relatively long-lived and have multiple opportunities in their lifetime to successfully reproduce, so adult survival is more likely to be the population-limiting factor. Often, for any specific population, regulating mechanisms are unknown, regardless of the species of interest (e.g., may depend on predator composition of the community). Therefore, for TRVs to be appropriate, site-specific ecological information is needed so that both the sensitivity and the relevance of each endpoint can be assessed. This selection of studies and data used to derive TRVs requires a degree of professional judgment and thus may be more useful when setting site-specific cleanup goals than when developing generic screening TRVs.

Biological significance versus statistical significance

Data used in TRV derivation are largely gleaned from the primary literature (USEPA 2005). Many of the studies incorporate designs not intended for the development of environmental risk assessment criteria. Statistical differences in measurements of health-related criteria are derived relative to treatment levels. Inferential statistics (such as analysis of variance [ANOVA] or Student's t tests) are strongly influenced by the amount of variability in the metric being measured and the sample size (i.e., number of replicates). Therefore, a study with high variability and low replication may not show any statistical differences among treatments, even when some of the animals within a treatment are exhibiting biologically relevant changes. Conversely, very high replication may result in statistical differences among treatments when all animals are responding within the normal range of variability. One way to ascertain whether either of these cases is occurring is to compare the measurements in all exposed animals to ranges of values considered to be normal for that species (e.g., species-specific normal ranges for blood cell or chemistry data). It is incumbent on the author and/or TRV developer to review the chemical study data together with what is known to be normal for the species under study to reach a finding of an adverse effect, regardless of statistical significance. This may involve using various lines of evidence to converge on a toxicity effect concentration (e.g., packed cell volume, red blood cell counts, hemoglobin concentration are all used to determine a finding of anemia).
However, for many nontraditional laboratory animal species, use of historical data may not be an option and a weight-of-evidence analysis should be employed (i.e., consider mode of action of the contaminant given the functional moieties of the molecule; review field data when available to determine if effects are observed at concentrations above estimated thresholds; and review of information from similar species).
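
The comparison against normal ranges described in this section can be sketched very simply in Python. The function name, the reference range, and the packed cell volume values below are all hypothetical and serve only to show the form of the check.

```python
def fraction_outside_range(values, low, high):
    """Fraction of individual measurements falling outside [low, high]."""
    if not values:
        raise ValueError("no measurements supplied")
    outside = sum(1 for v in values if v < low or v > high)
    return outside / len(values)


# Hypothetical packed cell volume (%) with an assumed normal range of 35-50
normal_low, normal_high = 35.0, 50.0
control = [40.0, 42.0, 38.0, 45.0]
treated = [30.0, 33.0, 41.0, 44.0]

print(fraction_outside_range(control, normal_low, normal_high))
print(fraction_outside_range(treated, normal_low, normal_high))
```

A treatment group in which a substantial fraction of animals falls outside the normal range may warrant attention even when a group-mean comparison is not statistically significant, and the reverse also holds.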

Laboratory toxicology models

As previously mentioned, wildlife animal models useful for controlled laboratory toxicity investigations are limited, but are constantly expanding. Regardless, given the effort studies require and ethical considerations about live animal toxicity testing, species-specific data are few. Still lacking are studies conducted under environmentally stressful conditions (e.g., extreme temperatures) that would be more representative of natural exposures. These limitations result in uncertainties when extrapolating from laboratory data to species of concern at contaminated sites. However, extrapolation requirements differ depending on the phase of the assessment process. Early screening necessitates that data from a few species be extrapolated to many (any possible species that might use the site for some or all of the year), whereas site-specific, detailed assessments will apply the data from appropriate surrogates to the specific species of concern.

Data from traditional laboratory animal models (e.g., rats, mice, or rabbits) are common, primarily from their use in human health risk assessment, as are data (from inorganic substances) on livestock (e.g., cattle, horses, or poultry) due to nutritional considerations. Use of isogenic strains of rats and mice typically produces data that are less variable and more sensitive than data from their outbred wildlife counterparts; however, data from both can be useful in TRV extrapolations. Studies with outbred strains (or wild species) typically result in greater variability, which may make statistical inferences more difficult. An analysis of statistical power can help ascertain whether a lack of statistical differences reflects the variability associated with the measurement or a true absence of differences in sensitivity. This type of evaluation can reduce the potential for committing Type II errors (false negatives). Currently available mammalian wildlife laboratory models include voles (Microtus spp.), shrews (Sorex spp.), New World mice (Peromyscus spp.), and mink (Neovison vison) (Smith et al. 2006; Basu et al. 2007).
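
A simple power calculation of the kind referred to above can be sketched as follows. The example uses a normal approximation for a 2-sided, 2-group comparison of means with equal group sizes and a common standard deviation; the function name and the input values are hypothetical, and an exact calculation based on the t distribution would give somewhat lower power at small sample sizes.

```python
import math
from statistics import NormalDist


def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a true mean difference delta.

    Normal approximation for a 2-sided, 2-sample test with equal group
    sizes; the negligible contribution of the opposite tail is ignored.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    return NormalDist().cdf(abs(delta) / se - z_crit)


# Hypothetical comparison: the same true effect measured in a low-variability
# inbred strain and in a more variable outbred or wild species.
print(approx_power(delta=10.0, sd=10.0, n_per_group=16))
print(approx_power(delta=10.0, sd=20.0, n_per_group=16))
```

If the calculated power for a biologically meaningful difference is low, the absence of a statistically significant effect in that study provides little evidence that the dose was without effect.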

Laboratory animal models for birds are largely restricted to granivorous (gallinaceous birds, e.g., chickens, quail, pheasant, and turkeys) and some predatory species because of issues associated with feeding, although there has been some success with brief captive studies with migrant songbirds. Examples of established wildlife models include northern bobwhite (Colinus virginianus), Japanese quail (Coturnix japonica), mallard (Anas platyrhynchos), doves (Columba spp.), and some finch species (e.g., house finch [Carpodacus mexicanus] and zebra finch [Taeniopygia guttata]) (Johnson et al. 2005; USEPA 2012a). Model predatory species include the eastern screech owl (Megascops asio), great horned owl (Bubo virginianus), and American kestrel (Falco sparverius) (Marteinson et al. 2010; Rattner et al. 2012).

Examples of amphibian models include exotic and native species. The African clawed frog (Xenopus laevis), the northern leopard frog (Rana pipiens), and other frog models (Anura spp.) are available and can be used for exposures from egg through tadpole and adult metamorphosis (Bleiler et al. 2009). Toad (Bufo) models have also been used with success (James et al. 2004a, 2004b). However, many species are used mostly for aquatic exposures, which are outside the scope of this article. Some salamander species have been used successfully in terrestrial toxicological testing (Ambystoma maculatum, Ambystoma tigrinum) (Johnson et al. 1999, 2004, 2007; Bazar et al. 2010). Although some species are entirely terrestrial, quantification of exposure through the oral route may be inconsequential, because dermal exposures may be more significant for some compounds (Johnson et al. 1999); therefore, studies often track media concentrations, not oral exposure estimates (see review by Sparling et al. 2010). The impact of oral exposures to environmental contaminants for these species is currently unknown.

Laboratory reptile models are limited. Studies have been conducted in captive-bred colubrid snakes (e.g., Elaphe guttata) (Jones and Holladay 2006). Because many species do not eat on a daily basis, oral exposure estimates are best based on weekly intake of food for young and monthly intake for adult individuals. The western fence lizard (Sceloporus occidentalis) has been gaining use in the conduct of controlled laboratory toxicity studies (Talent et al. 2002; McFarland et al. 2008, 2009). Individual lizards can be orally exposed daily for up to 60 days to a compound in a liquid vehicle through the use of a pipette and evaluated for adverse outcomes. Other lizard models include anoles (Anolis carolinensis) (Talent 2009) and lacertid lizards (e.g., Podarcis bocagei, Bocage's wall lizard) (Amaral et al. 2012). Turtle models have included young snapping turtles (Chelydra serpentina) and red-eared sliders (Chrysemys scripta) in laboratory settings (de Solla et al. 2006; Sparling et al. 2010). Together, they represent several useful laboratory models from which to develop toxicity data; however, there remains the need for standardized methods.

Field studies

Field studies provide additional lines of evidence that may be useful in refining TRV estimates. Although most studies are descriptive of effects from field observations and estimates of exposure are lacking, they can provide corroborative results. For example, if severe effects are predicted at the measured soil concentration, but there are robust resident populations of target species, then toxicity may have been overestimated, exposure overestimated, or the population has been able to compensate. Most field studies address marked adverse outcomes in a coarse-grained manner (wildlife epidemiology) and are of too short a duration to incorporate stochastic climate events; therefore, they are unlikely to produce information useful for establishing an appropriate remedial concentration. However, focused field observations can help corroborate mode of action for some specific substances to ascertain whether effects are occurring.

Cross-species extrapolations

Physiological differences between classes of vertebrates (e.g., birds and mammals) are considered profound, and extrapolation of toxicology data between them is discouraged (Allard et al. 2010). However, the same criticism is likely true for physiological differences among families within Reptilia and also across lifestages within species of anurans (e.g., egg-tadpole-adult frog). Therefore, extrapolation of data from tested species to nontested species of concern should be done in a thoughtful and knowledgeable manner. Screening values often consider the range of potential sensitivities by combining all acceptable data for species within a wildlife class. Because the detailed assessment is focused on a shorter list of species, knowledge about phylogenetic conservation of physiological pathways, mode of action, mechanism of toxicity, and toxicokinetics can help with selection of appropriate surrogate species and reasonable extrapolations to species of concern. The application of modifying or uncertainty factors has been suggested by many but should be considered a risk management decision and not a scientifically based recommendation (Chapman et al. 1998; Duke and Taggart 2000; USACHPPM 2000). As a rule, when physiological differences are profound and either toxicokinetic or toxicodynamic differences between tested and receptor species could occur, extrapolation is contraindicated.

Species sensitivity distributions (SSDs) are a valuable tool for screening assessments, as they capture the range of differences among species (Posthuma et al. 2001), but they are not particularly useful for developing site-specific, species-dependent TRVs for setting local Eco-SCVs. Differences in study design can affect the variability in the results, even when the same species and endpoints are tested. SSDs frequently use the geometric mean of studies for data from the same species. This also keeps commonly tested species from overwhelming the data set, thereby giving similar weight to all species. Although SSDs are most commonly developed using mortality or reproduction endpoints, any ecologically relevant endpoints can be included. Whether different distributions are derived for each endpoint or they are grouped on a single distribution depends on the user and the question. In essence, all SSDs are based on the range in sensitivity of a response for the species included in the distribution. The uncertainty around each point (i.e., each species mean) can be incorporated into the SSD, or the input values to the SSD can vary, such as NOAECs/NOAELs or an ECx or EDx. A weight-of-evidence approach can then be used for selection of a particular regulatory endpoint. For example, one could select the 5th percentile value of the SSD (often referred to as the Hazard Concentration at the 5th percentile, or the HC5) or its lower confidence limit. However, because a site clean-up value is focused on the species known to be present at the site, and because remediation can have profound adverse habitat consequences, SSDs generally are not used when setting soil clean-up values; predictive, but not necessarily broadly protective, values are needed.
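To make the SSD mechanics concrete, here is a minimal Python sketch that collapses multiple studies per species to a geometric mean and then reads a percentile from a fitted distribution. The log-normal form, the function name `hc_p`, and all toxicity values are assumptions for illustration; other distribution shapes and fitting methods are also used in practice.

```python
from math import exp, log
from statistics import NormalDist, geometric_mean, mean, stdev


def hc_p(effect_values_by_species, p=0.05):
    """Return the p-th percentile hazard value from a log-normal SSD."""
    # One value per species (the geometric mean of its studies), so that
    # heavily tested species do not dominate the distribution.
    species_means = [geometric_mean(v) for v in effect_values_by_species.values()]
    logs = [log(x) for x in species_means]
    mu, sigma = mean(logs), stdev(logs)
    return exp(NormalDist(mu, sigma).inv_cdf(p))


# Hypothetical effect doses (mg/kg/d); species labels are placeholders
effects = {
    "species A": [10.0, 40.0],
    "species B": [5.0],
    "species C": [50.0, 50.0, 50.0],
    "species D": [100.0],
    "species E": [25.0],
}
hc5 = hc_p(effects, p=0.05)
print(f"HC5 = {hc5:.1f} mg/kg/d")
```

With only five species the percentile estimate is highly uncertain, which is one reason a lower confidence limit on the HC5 is sometimes reported instead.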

Exposure–response assessment approaches

The dose–response (or concentration–response) function is the cornerstone of toxicological evaluations. These functions are developed for varying exposure times (acute, subchronic, or chronic) and for any measurable endpoint that has a clear dose–response relationship. Precision of the exposure–response function depends on the study design, including the number of treatments, the spacing of the treatments along the exposure gradient, and the variability in the measured response. The proportionate response (and its associated confidence intervals) at any exposure level can be calculated using a logit, probit, or other similar model. Many wildlife studies with inorganics have been conducted with only 2 or 3 exposure levels, especially those designed for dietary optimization. These typically use ANOVA procedures to determine NOAELs and LOAELs to attempt to bracket the effects threshold. However, they lack information on the magnitude of the effect at those levels or the proportion of the tested population that responded. Therefore, entire exposure–response functions are preferred when establishing wildlife TRVs to be used for site cleanup (i.e., beyond the screening stage). Exposure–response functions can be calculated post hoc, albeit quite likely with relatively large confidence intervals, if the central tendency and measures of variation are available (e.g., means and standard deviations of animals affected at each exposure level).

The benchmark dose procedure is 1 example of an exposure–response function that provides a proportion of the population affected (with a specific confidence level) at a given exposure level (USEPA 2012b). The benchmark dose procedure plots the best fit curve to the measured exposure and response data, and then calculates the proportion affected at that level, along with its confidence limits. Because it fits a curve to the data, the benchmark dose is more robust than an ECx calculation typically based on linear responses (e.g., using the probit model). The effect level of 10% (BMD10) or 20% (BMD20) is suggested for regulatory purposes; frequently, a 50% or 95% confidence interval, respectively, is applied to the BMD to bracket effect levels to reduce the potential for adverse population-level effects. The BMD10 level is suggested because below this value predictions are more uncertain. Furthermore, this value represents the proportion of the population expected to develop a specific sublethal adverse effect. This method is robust in that it ascertains the threshold of effect at a desired level of the proportion of the population and desired confidence level, while still providing flexibility to consider population implications of the toxic endpoint.
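The curve-fitting step can be illustrated with a rough Python sketch. It fits a log-logistic model with a background term to quantal data by a coarse grid-search maximum likelihood and then inverts the curve at a chosen benchmark response (extra risk). This is not the USEPA benchmark dose software; the model form, the grid, fixing the background at the control rate, and the response counts are all assumptions made for illustration. The lower confidence limit (BMDL) would require a profile-likelihood or bootstrap step that is omitted here.

```python
from math import exp, log

# Hypothetical quantal data: (dose in mg/kg/d, animals affected, animals tested)
DATA = [(0, 1, 20), (1, 1, 20), (5, 3, 20), (10, 6, 20), (25, 12, 20), (50, 18, 20)]


def prob(dose, g, a, b):
    """Log-logistic response probability with background response g."""
    if dose <= 0:
        return g
    return g + (1.0 - g) / (1.0 + exp(-(a + b * log(dose))))


def neg_log_lik(g, a, b, data):
    """Binomial negative log-likelihood (constant terms dropped)."""
    total = 0.0
    for dose, k, n in data:
        p = min(max(prob(dose, g, a, b), 1e-9), 1.0 - 1e-9)
        total -= k * log(p) + (n - k) * log(1.0 - p)
    return total


def fit(data):
    """Coarse grid-search fit; background is fixed at the control response rate."""
    g = data[0][1] / data[0][2]
    grid_a = [x / 20.0 for x in range(-200, 1)]  # intercept from -10.0 to 0.0
    grid_b = [x / 20.0 for x in range(1, 101)]   # slope from 0.05 to 5.0
    best = min((neg_log_lik(g, a, b, data), a, b) for a in grid_a for b in grid_b)
    return g, best[1], best[2]


def bmd(bmr, a, b):
    """Dose at which extra risk over background equals bmr."""
    return exp((log(bmr / (1.0 - bmr)) - a) / b)


g, a, b = fit(DATA)
bmd10, bmd20 = bmd(0.10, a, b), bmd(0.20, a, b)
print(f"BMD10 = {bmd10:.1f}, BMD20 = {bmd20:.1f} mg/kg/d")
```

Because the whole curve is fitted, the same parameters yield the BMD at any effect level, which is what allows the effect level to be chosen afterward in consultation with the risk manager.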

With the exception of single-dose oral gavage studies for acute endpoints (e.g., mortality), most of the wildlife toxicity studies with metals or other inorganics do not provide information on the administered dose, reporting instead the concentrations in the feed. Conversion from feed concentration to ingested dose requires knowledge of the food ingestion rate and body weight of the animal. If food ingestion rates were not measured, they can be estimated, but this adds considerable uncertainty to the estimate of the dose–response function and toxicity threshold. Therefore, consideration should be given to comparing dietary concentrations in the laboratory study with measured concentrations in field organisms (plants, invertebrates, or other prey items). This is most likely to be done during the final, detailed site assessment that has been focused onto a few chemicals and individual species so appropriate surrogate test animals can be identified.
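The conversion from feed concentration to ingested dose is simple arithmetic, sketched below in Python. The function name and the numerical inputs are hypothetical; the units must be consistent (here, mg per kg feed, kg feed per day, and kg body weight).

```python
def dietary_dose(feed_conc_mg_per_kg, food_ingestion_kg_per_day, body_weight_kg):
    """Ingested dose in mg per kg body weight per day."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return feed_conc_mg_per_kg * food_ingestion_kg_per_day / body_weight_kg


# Hypothetical small mammal: 100 mg/kg in feed, 5 g feed per day, 25 g body weight
dose = dietary_dose(100.0, 0.005, 0.025)

# If the ingestion rate had to be estimated, carrying a plausible range through
# the calculation shows how much uncertainty it adds to the dose.
dose_low = dietary_dose(100.0, 0.004, 0.025)
dose_high = dietary_dose(100.0, 0.006, 0.025)
print(f"dose = {dose:.0f} mg/kg/d (range {dose_low:.0f} to {dose_high:.0f})")
```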

Foraging guild analysis

The topic of assimilation efficiency is reviewed by US EPA (2007). Briefly, the gut physiology influences the amount of metal that is removed from the ingesta and absorbed across the gastrointestinal tract mucosal layer into the blood stream of the animal. Because wildlife represent an array of gut physiologies (from ruminants to hind gut fermenters to simple monogastrics), assimilation efficiency will differ across species. This can have a significant effect on the absorbed dose of the ingested metal. Therefore, when extrapolating dietary exposure from tested to nontested species during a detailed site assessment, we recommend doing so within groups that have similar gut physiologies. Broadly speaking, these animals will also have similar diets, placing them in the same feeding guild. Thus, exposure estimates will be more similar as well, thereby increasing the accuracy and precision of the risk estimate.

Laboratory-to-field extrapolations

Many remedial risk assessments assume long-term continuous exposure of resident wildlife at the site and therefore use the chronic, low-dose (concentration) exposure extrapolations in the TRV. However, not all wildlife that use the site are resident species. Some stop at the site only briefly (hours to days) as they migrate through, whereas others spend the breeding season on-site and over-winter elsewhere (or vice versa). Species often diet-shift and, hence, are exposed at potentially different levels that vary seasonally. Acute or subacute exposure studies would provide relevant information for the migrant species whereas subchronic data may be most suitable for part-time residents and chronic, full life-cycle studies reflective of the year-round residents.

Furthermore, there may be significant differences between responses in the laboratory and those in the field. Organisms respond in adaptive and compensatory ways, and animals tested under controlled laboratory conditions may respond differently than wildlife under more variable environmental conditions (e.g., food availability, climatic factors, intra- and interspecies interactions). Results can differ in either direction (i.e., field responses may be greater or lesser than laboratory responses) (Keenan et al. 1997). The direction and magnitude of such differences generally are not known, so the laboratory-to-field comparison may need to be addressed qualitatively in the uncertainty analysis when characterizing risk predictions.

Critical body residues or tissue levels

Concentrations in whole body or specific tissue of interest can be used to develop more precise toxicological estimates of exposure (McCarty et al. 2011). Examples for wildlife include blood Pb (Johnson, Wickwire, et al. 2007; Buekers et al. 2009), liver Pb levels (Ma 2011), or Cd concentrations in kidneys of small mammals (Cooke 2011). Critical tissue levels assume steady-state exposures, which seems reasonable for resident species but less so for transients or migrants. However, very few data are available for wildlife (Beyer and Meador 2011). Nevertheless, tissue-specific TRVs for metals have been used in field applications to address binary risk assessment questions (acceptable vs. unacceptable risk) or to use as exposure metrics to correlate soil concentrations with clean-up goals (Sample et al. 2011).


Ecological risk assessments at contaminated sites in the United States and Canada typically are conducted in a phased, or stepwise, manner (CCME 1996, 1997; USEPA 1997). The initial phase (termed the “screening phase” by USEPA) broadly addresses the questions of risk in an effort to narrow down the chemicals of interest at the site and focus on the plant and wildlife species that could reasonably be expected to be present. Wildlife TRVs used in the screening assessment tend to be protective and inclusive to reduce the probability of false negatives; often the final value is based on the most sensitive species (which may or may not occur at the site where the Eco-SCV is being developed).

Risk assessors typically adopt wildlife TRVs from either Eco-SSLs or secondary sources (Sample et al. 1996; Mayfield and Fairbrother 2013). However, the approach and much of the data used in both the Eco-SSLs and Sample et al. (1996) are now 15 to 20 years old, so new paradigms such as the benchmark dose and species sensitivity distributions, as well as any new literature, have not been captured. One approach may begin by using the data collected as part of the Eco-SSL effort as a basis from which to build, with updates to the literature in subsequent steps. However, as the risk assessment moves beyond the screening stage, additional refinements of the wildlife TRV are warranted.

Refining wildlife TRVs so they are applicable to a contaminated site once the screening assessment is completed involves focusing on both the chemical stressors and the specific wildlife species of management concern at the site. An assessment of species of management concern allows the risk assessor to concentrate on the physiological attributes (e.g., gut physiology and chemical factors) affecting bioaccessibility of chemicals in an environmental matrix and helps to narrow the toxicological data set. The first logical step is to sort valued wildlife species into specific feeding guilds. Feeding guilds are groups of animals that eat similar foods (e.g., herbivores, soil invertivores, or carnivores). Often species within these foraging guilds share gut physiological structures that enable the risk assessors to use site-specific measures of bioavailability and bioaccessibility and may also share other physiological attributes that reduce the variability of toxic responses. Second, the literature review for the chemicals of concern is updated and reviewed. The search methods and acceptance criteria should be similar to those used for Eco-SSLs (USEPA 2005). The search should be broad enough to encompass endpoints and species specific to the site being remediated, but it can be narrowed to include only the species of concern and appropriate surrogates.

Data on surrogate species with physiological similarities are preferred, but species within the same foraging guild that best reflect physiological gastrointestinal structure can also be used to reduce uncertainty. Example guilds include herbivores (ruminants [e.g., sheep, cattle, deer, and moose] and hindgut fermenters [e.g., horses, rabbits, pheasants, and quail]), omnivores (e.g., waterfowl and raccoons), invertivores (e.g., shrews and robins), and carnivores (e.g., mustelids and hawks). If necessary, cross-species extrapolations can be done on the basis of known mechanisms of toxic action and interspecies physiology. Once the studies are available, all data for each species of concern should be considered to generate species-specific benchmark doses as described in the previous section. Selection of the effects level for the benchmark dose (e.g., BMD10, BMD20, or higher) is made in consultation with the site manager, should not be based on NOAEL or LOAEL values (as in the Eco-SSLs, for example), and will likely be less conservative and more realistic. Considerations of mode of action should help to focus data on relevant endpoints to include in the model and be useful in species extrapolation. Discussion in the above sections describes the attributes to consider when selecting an effects level for derivation of a benchmark dose. As a final check, the TRV for essential micronutrients (e.g., Cu, Mg, Zn, Se) should be above the value considered essential in maintaining health in wildlife. Using the updated and species-specific TRVs, in combination with the refined exposure estimate, an Eco-SCV can be developed that is specific to the location of concern.

Critical tissue levels (CTL) or critical body burdens (CBB) can also be used to further reduce uncertainty. Because tissue concentrations already account for differences and variability due to absorption, distribution, metabolism, and excretion, and for exposure choices and consequences due to spatial and temporal circumstances of individuals using a site, this approach would be preferable to modeling dietary (or inhalation) exposures. Animals will need to be collected and metal concentrations measured in the appropriate tissue (or whole body, if sufficiently small). This information can be used in a “yes/no” decision to determine whether tissue concentrations are above the critical level. If the home range and site utilization patterns of the species are known, the area of concern can be reduced by eliminating any areas with animals whose tissue concentrations are all below the critical levels. Animal use patterns can be overlaid on soil concentrations, and the relationship between the tissue and soil values determined. For example, spatially explicit exposure models can be used to integrate site use throughout the foraging range of species of concern (Wickwire et al. 2011; Hope et al. 2011; Sample et al. 2013, this issue). The soil concentration that results in a tissue value at the critical threshold level would be the minimum required clean-up value. This approach was used successfully at Coeur d'Alene, Idaho, to select clean-up levels protective of songbirds (Sample et al. 2011).
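One way to relate tissue and soil values, and then back-calculate the soil concentration at the critical tissue level, is a simple regression. The Python sketch below assumes a log-log linear relationship fitted by ordinary least squares; that functional form, the function name, and the paired soil and tissue values are illustrative assumptions, not the method used in the cited studies.

```python
from math import exp, log


def soil_at_critical_tissue(soil, tissue, critical_tissue):
    """Fit ln(tissue) = intercept + slope * ln(soil); invert at the critical level."""
    x = [log(s) for s in soil]
    y = [log(t) for t in tissue]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    if slope <= 0:
        raise ValueError("tissue does not increase with soil concentration")
    intercept = my - slope * mx
    return exp((log(critical_tissue) - intercept) / slope)


# Hypothetical paired data: soil (mg/kg) where each animal forages,
# and the concentration measured in the tissue of interest (mg/kg)
soil_mg_kg = [50.0, 120.0, 300.0, 650.0, 1200.0]
tissue_mg_kg = [1.1, 2.0, 3.6, 5.9, 8.4]
cleanup_candidate = soil_at_critical_tissue(soil_mg_kg, tissue_mg_kg, critical_tissue=5.0)
print(f"soil concentration at critical tissue level: {cleanup_candidate:.0f} mg/kg")
```

In practice the soil value paired with each animal would itself come from an area-weighted average over its foraging range, which is where the spatially explicit models come in.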


Application of TRV improvements (see Table 1) is further examined through a hypothetical case example following the USEPA risk assessment framework for Superfund sites (USEPA 1997). For illustrative purposes, the example contaminated site includes an industrial facility surrounded by grassland environments. Metal contamination in soils was suspected at the site due to waste generation from the industrial facility, and a preliminary site investigation identified soil concentrations of several metals above their respective Eco-SSLs. Wildlife (mammals and birds) are the preliminary ecological receptor groups of concern. The results of the preliminary assessment led to additional data collection (chemical and biological attributes) to support a screening level ecological risk assessment (SLERA) which, in turn, resulted in a baseline ecological risk assessment (BERA) and the derivation of soil clean-up goals.

In the problem formulation phase of the SLERA, a conceptual site model was prepared to define the ecological attributes of the site, exposure pathways, species of concern, assessment and measurement endpoints. Generic wildlife receptor groups (e.g., herbivores, carnivores, and omnivores) and surrogate species were identified for each receptor group (e.g., American woodcock, northern cardinal, red-tailed hawk, meadow vole, short-tailed shrew). Assessment (e.g., protection of growth, survival and reproduction) and measurement (e.g., environmental media concentrations compared to toxicological benchmarks) endpoints were defined to allow for the evaluation of broad taxonomic groups and screening of large chemical suites and wide spatial extents. Food web modeling was performed using the generic receptor groups, exposure assumptions and NOAEL TRVs as identified in the Eco-SSL documents (USEPA 2005). Based on the results of the SLERA, all ecological receptors, metals, and areas of concern were screened out from further assessment with the exception of one metal (Me) in a grassland portion of the site. At this stage of the risk assessment, a scientific decision management point was reached and further assessment of risks from Me was recommended for this portion of the site through a BERA, following which clean-up values would be determined.

In the BERA, the problem formulation was refined to address ecological receptors at risk and to plan additional data collection (if necessary) to support detailed risk analyses (including TRV refinements as highlighted in Table 1). Refinement of the receptors of concern (for this example) resulted in the selection of 2 herbivorous species (i.e., the meadow vole [Microtus pennsylvanicus] and northern cardinal [Cardinalis cardinalis]) known to forage and breed in the grassland areas of the site and found to be at risk in the SLERA. Thus, both site-specific and species-specific exposure and toxicological information needed to be reviewed for these 2 wildlife species and the most relevant data sets applied in the BERA. Both the cardinal and the meadow vole are r-selected species (i.e., shorter life-span and ability to reproduce quickly), so reproductive effects are likely to have high biological significance relative to population dynamics. A review of existing toxicological data for this metal indicates that reproduction is a sensitive endpoint and data are available for 2 avian species (i.e., chicken and duck) and 3 mammalian species (i.e., dog, mouse, and rat). Based on the available toxicity data, a cross-species extrapolation assessment is necessary to identify relevant species, because no data are available on the species of concern. In this case, risk assessors should strive to document the sensitivity of various species based on the available toxicity data, mechanistic differences in chemical metabolism, and/or sensitivity of similar species exposed to chemicals with similar modes of action. This analysis is intended to identify the most relevant toxicity data set for the species of concern at the site rather than defaulting to the lowest available TRV for any species from the same class.

For the case study, hypothetical toxicity data and dose–response assessments are discussed to illustrate the process for developing alternative TRVs.

  • For the meadow vole, the mouse was selected as the most appropriate species for development of a TRV, because these species have very similar physiologies. A review of the literature was performed and a study was selected that met quality criteria (similar to those in the Eco-SSL guidance) and reported on the reproductive effects of chronic exposure to the metal of concern. Specifically, oral exposure to doses of 0, 1.0, 5.0, 10, 25, and 50 mg·kg−1·d−1 resulted in a dose-dependent decrease in the number of offspring from exposed female mice (Figure 1A).
  • For the northern cardinal, relevant reproduction studies were available for chickens and ducks. These studies examined chronic exposure to Me for a number of reproductive endpoints (e.g., egg production, egg viability, hormone levels, and hematological parameters) at dose levels of 0, 5, 10, 20, and 50 mg·kg−1·d−1 for the chicken and at 0, 3, 6, 10, 30, and 60 mg·kg−1·d−1 for the duck. Dose-dependent decreases in egg production or egg viability were the most sensitive endpoints for both species. Both studies were of acceptable quality, although the chicken study did not follow standard test guideline protocols. Therefore, both sets of data were used to generate dose–response models (Figure 1B and C).

Figure 1. Hypothetical dose–response analysis for reproductive effects on mice (A), chickens (B), and ducks (C) after oral exposure to a metal contaminant.

Benchmark doses (e.g., BMD10, BMD20) and their lower confidence limits were calculated for each species (Table 2), from which the risk manager could select the appropriate level of conservatism for the site clean-up. A broad review of the literature on reproductive effects of chemicals (both organic and inorganic) was also conducted to provide a qualitative estimate of relative sensitivity among species. This information was provided to the risk manager in the Uncertainty section of the BERA, to guide selection of the benchmark dose with the appropriate level of conservatism to meet the site clean-up goals. The uncertainty of this extrapolation was acknowledged by presenting both species benchmarks with their confidence limits in the final BERA, so risk managers could decide if additional risk reduction strategies should be implemented (see Greenberg et al. [2013, this issue] for further discussion of risk management considerations).

Table 2. An illustration of the phased analysis for a hypothetical case study

| Risk assessment parameter | SLERA phase | BERA phase | ROD phase |
| --- | --- | --- | --- |
| Species of concern | Herbivore (surrogate species used in wildlife models) | Meadow vole and northern cardinal (site-specific receptors to be protected) | Meadow vole and northern cardinal (site-specific receptors to be protected) |
| Endpoint of concern | Generic (e.g., growth, survival, and reproduction) | Reproduction (e.g., number of offspring and egg production) | Reproduction (e.g., number of offspring and egg production) |
| Dose–response evaluation | Conservative estimates (often using the lowest available reliable TRVs from broad taxonomic classes) | Species and endpoint specific (developed from exposure response relationships from data representative of a specific species of concern) | Species and endpoint specific (developed from exposure response relationships from data representative of a specific species of concern) |
| Toxicity threshold values (mammalian species) | NOAEL TRV = 1.0 mg·kg−1·d−1 (lowest NOAEL from all available mammalian toxicity data) | Derived from mouse study: BMD10 = 2.6 mg·kg−1·d−1 [a]; BMDL10 = 2.0 mg·kg−1·d−1 [a]; BMD20 = 5.4 mg·kg−1·d−1 [a]; BMDL20 = 4.3 mg·kg−1·d−1 [a] | Eco-SCVs estimated by combining species and site-specific TRVs and exposure information |
| Toxicity threshold values (avian species) | NOAEL TRV = 5.0 mg·kg−1·d−1 (lowest NOAEL from all available avian toxicity data) | Derived from chicken study: BMD10 = 12.7 mg·kg−1·d−1 [a]; BMDL10 = 10.3 mg·kg−1·d−1 [a]; BMD20 = 18.3 mg·kg−1·d−1 [a]; BMDL20 = 15.7 mg·kg−1·d−1 [a]. Derived from duck study: BMD10 = 9.8 mg·kg−1·d−1 [a]; BMDL10 = 6.8 mg·kg−1·d−1 [a]; BMD20 = 13.8 mg·kg−1·d−1 [a]; BMDL20 = 10.6 mg·kg−1·d−1 [a] | Eco-SCVs estimated by combining species and site-specific TRVs and exposure information |

  • [a] Effect doses are hypothetical representations of the results from a benchmark dose analysis. Each effect dose represents the benchmark dose (BMD) and lower confidence limit (BMDL) on the benchmark dose for a 10% or 20% effect relative to controls (see Figures 1A–C).

Risk managers should recognize that the example species-specific TRVs (although less conservative in numerical value than the NOAEL) are consistent with the assessment endpoint defined in the problem formulation and the overall goal of environmental protection. In the remedy phase of the site assessment (documented in the record of decision [ROD]), the species and site-specific exposure and toxicological adjustments are applied to determine clean-up values (Eco-SCVs). In addition, field measures of exposure are compared to the laboratory-based dose–response curve for risk determination to evaluate the predictive ability of the dose–response models. Eco-SCVs can be estimated using a food web modeling procedure with site-specific parameter inputs (similar to the Eco-SSL process) or can be developed using concentration–response data from laboratory studies or field collections (see discussion by Sample et al. [2013, this issue]). The final Eco-SCVs are likely to represent a concentration that best balances the site managers' risk tolerance with the results from the BERA.
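The food web back-calculation can be sketched in a few lines of Python. The model below is a deliberately simplified version of a dietary exposure equation: dose equals food ingestion rate times soil concentration times the sum of incidental soil ingestion and diet-weighted bioaccumulation, scaled by an area use factor. Setting dose equal to the TRV and solving for soil concentration gives a candidate Eco-SCV. The function name, the constant bioaccumulation factor, and every exposure parameter are hypothetical; only the TRVs (the mammalian NOAEL and mouse BMDL10 from Table 2) come from the case example, and actual guidance may use regression-based uptake models instead.

```python
def eco_scv(trv, fir, soil_fraction, diet, area_use=1.0):
    """Soil concentration (mg/kg dry weight) at which the modeled dose equals the TRV.

    trv: toxicity reference value, mg/kg body weight/d
    fir: food ingestion rate, kg food (dry weight) per kg body weight per day
    soil_fraction: incidental soil ingestion as a fraction of food intake
    diet: list of (fraction of diet, soil-to-food bioaccumulation factor)
    area_use: fraction of foraging that occurs on the contaminated area
    """
    if abs(sum(fraction for fraction, _ in diet) - 1.0) > 1e-6:
        raise ValueError("diet fractions must sum to 1")
    uptake = soil_fraction + sum(fraction * baf for fraction, baf in diet)
    dose_per_unit_soil = fir * uptake * area_use
    return trv / dose_per_unit_soil


# Hypothetical herbivorous small mammal eating only plants (BAF of 0.2 is assumed)
vole_inputs = dict(fir=0.1, soil_fraction=0.03, diet=[(1.0, 0.2)])
scv_noael = eco_scv(trv=1.0, **vole_inputs)   # screening-style NOAEL TRV
scv_bmdl10 = eco_scv(trv=2.0, **vole_inputs)  # mouse BMDL10 from Table 2
print(f"NOAEL-based: {scv_noael:.0f} mg/kg; BMDL10-based: {scv_bmdl10:.0f} mg/kg")
```

Because this simplified model is linear in soil concentration, the candidate clean-up value scales directly with the TRV, so the choice between a NOAEL and a benchmark dose carries straight through to the soil value.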


The development of wildlife TRVs is often limited by the available data. Recent advances in wildlife ecotoxicology and dose–response assessment provide an opportunity to reduce uncertainties in the current practice of ecological risk assessment. Furthermore, these tools can be used to reduce arbitrary conservative assumptions and introduce more realistic analysis into soil cleanup decisions. Therefore, the development of toxicity exposure–response functions for representative wildlife species exposed to the most prevalent metals at contaminated sites should be a priority. Appropriate test designs and endpoints should be established a priori, so the studies provide the information needed for development of a robust TRV (see above and USEPA [2003]). Of course, ad hoc studies will continue to populate the literature, and will be retrieved and reviewed to possibly update the Eco-SSL database during the refinement of the screening assessment. Although data limitations exist in some cases, there are several areas where the TRVs generated for Eco-SSLs can be modified and improved to develop site-specific Eco-SCVs and recommendations are offered below.

  • Because TRV development results in considerable redundant effort across sites as well as potential differences in the selection and use of key references, we advocate the creation of an expert panel to develop and regularly update the TRV database. This would constitute a board of professionals that would meet regularly to develop (through consensus) TRVs that would be supported with publicly available technical documentation. TRVs would then be updated at regular intervals to keep them current, allow for discussion about the inclusion of new endpoints, and incorporate advances in the science of wildlife toxicology.
  • Toxicity reference value determinations can move beyond simple point estimates (e.g., NOAELs) and include robust cross-species extrapolation and dose–response analysis to tailor the toxicological thresholds to the specific needs of the contaminated site. When appropriate data are available, risk assessors should model exposure and response relationships using statistical tools (e.g., benchmark dose models or other regression techniques). The product of the analysis should provide ranges of adverse effects (rather than single points) to further describe potential risks (and uncertainty bounds) and allow flexibility in selecting clean-up values.
  • Detailed risk assessments (postscreening) should endeavor to include explicit definitions for the receptors and species of concern and the adverse biological endpoint to be protected. At this stage of the risk assessment, the added specificity and quantitative analysis serve to reduce uncertainty.
  • Adverse endpoints need to reflect biologically plausible perturbations of wildlife populations (whether from acute or chronic exposure), and be related to site-specific routes of exposure and species-specific natural history. The risk assessor should refine the biological endpoint of concern and document the link to population-level effects.
  • Risk assessors should ascertain whether a critical tissue (e.g., blood, liver) threshold for adverse effects can be used when such data are more reliable than oral-dose estimates derived from laboratory exposures. This may provide a more realistic account of contaminant mobility, bioavailability, and assimilation at the site.
  • Risk assessors should compare in silico exposure–response analyses with field data to reduce uncertainty in risk predictions and evaluate remedial alternatives.
  • Future risk assessments may consider information from physiologically based toxicokinetic (PBTK) models or high-throughput genetic screens to support cross-species extrapolation, depending on how well physiological processes or genes are conserved across the species.
  • Finally, thorough documentation of all decisions used to select and interpret toxicity data sets should be provided so that risk managers can understand the uncertainties in the risk assessment and engage on solutions to reduce data gaps and strengthen the final remedy selection.
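
As an illustration of the recommendation to model exposure–response relationships rather than rely on a single point estimate, the following minimal sketch fits a two-parameter log-logistic curve to dose–response data and reports the doses associated with several effect levels. It is not a method prescribed by the workgroup: the data are hypothetical, the function names are invented for this example, and the fit uses ordinary least squares on logit-transformed response fractions for simplicity. A formal benchmark dose analysis would typically fit the model by maximum likelihood to the underlying counts and also report a lower confidence limit on the dose.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fit_log_logistic(doses, fractions):
    """Least-squares fit of logit(fraction affected) against ln(dose).

    Returns (intercept, slope) of the fitted line. This is a simplified
    stand-in for a maximum-likelihood benchmark dose fit (assumption).
    """
    xs = [math.log(d) for d in doses]
    ys = [logit(f) for f in fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def effective_dose(intercept, slope, effect):
    """Dose at which the fitted curve predicts the given effect fraction."""
    return math.exp((logit(effect) - intercept) / slope)


# Hypothetical data for illustration only: oral dose (mg/kg body weight/day)
# and the fraction of test animals showing the adverse endpoint.
doses = [1.0, 3.0, 10.0, 30.0, 100.0]
fractions = [0.02, 0.08, 0.30, 0.65, 0.92]

intercept, slope = fit_log_logistic(doses, fractions)

# A range of effect levels, rather than a single point such as a NOAEL.
ed10 = effective_dose(intercept, slope, 0.10)
ed20 = effective_dose(intercept, slope, 0.20)
ed50 = effective_dose(intercept, slope, 0.50)

for label, value in (("ED10", ed10), ("ED20", ed20), ("ED50", ed50)):
    print(f"{label}: {value:.2f} mg/kg bw/day")
```

Reporting several effect levels from one fitted curve gives risk managers a range of candidate thresholds, so the level of protection can be matched to the receptor and endpoint defined for the site.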


The authors recognize and thank the workshop sponsors: Copper Development Association, Exponent, International Molybdenum Association, International Zinc Association, Nickel Producers Environmental Research Association, North American Metals Council, Rio Tinto, US Army Environmental Center, and Vale Canada, Ltd. In addition, the authors acknowledge the helpful contributions to this article from Scott McMurray, and thank our colleagues (Bradley E Sample, Christian E Schlekat, Gladys L Stephenson, and Steve P McGrath) for their comments and suggestions on the manuscript. All views and opinions expressed herein are those of the authors and do not necessarily represent those of any other government, public, or private entity. No official endorsement is suggested or is to be inferred.