TRV must relate to problem formulation
Wildlife risk assessments are initiated to help environmental managers address generic questions such as “What will happen to the wildlife if these chemicals remain in (or are introduced to) the area?” However, generic protection goals (e.g., “protect wildlife from contaminants”) are not sufficient for directing TRV selection; dialogue among risk assessors, environmental managers, and, in many cases, other stakeholders during problem formulation is required to link TRVs more clearly to assessment and measurement endpoints. Factors such as existing and proposed land uses, geographic location, regulatory requirements, societal concerns, and the degree of risk aversion of the manager, their organization, and other interested parties (i.e., the public and other stakeholders) will influence the level of protection to be afforded to wildlife species. Risk assessors should formulate assessment endpoints that capture the risk managers' concerns in terms of an attribute and entity to be protected (Suter et al. 2004).
Clarity in the assessment and measurement endpoints is required so the applicable toxicity data can be accessed to derive a TRV. Often, environmental protection goals are directed at population or community level protection, but TRVs are not available for these levels of ecological organization. Because TRVs are thresholds for effects to individual animals, measurement endpoints that are clearly related to the higher level assessment goals (e.g., survival, growth, reproductive output) are typically used to formulate a toxicity threshold. Data describing other responses to chemicals (e.g., behavior or physiological changes) may also be available and can have profound influences (e.g., toxicant-induced lethargy will alter survival when predator vigilance is critical for adults and offspring), but are often not clearly linked to most environmental protection goals. Data should be analyzed, corroborated, and endpoints evaluated without bias of a priori assumptions that survival, growth and reproduction are the only endpoints that lead to adverse population outcomes. Rather, one should assume that any physiological responses that result in direct or indirect changes to the survival, growth, reproduction, or immigration of organisms may have the potential to result in adverse consequences to a population when many organisms are exposed. Although most risk assessments have the inherent assumption that population-level effects will be absent if no effects are predicted for individual organisms, it is incorrect to suggest that the converse is true; that is, effects on individuals do not always result in changes in population density or age/sex structure due to many compensatory mechanisms that are present in ecological systems (Fairbrother 2001). 
Food chain models that consider exposure levels relative to toxicity thresholds for individuals can be combined with population models to translate effects on individual organisms into estimates of changes in population growth rates as a result of exposure to chemicals (Fairbrother 2001).
Thus, selection of appropriate TRVs will differ with each risk assessment as they are ultimately dependent upon the stated environmental protection goals and assessment endpoints. Risk assessors must be able to articulate this connection clearly, so the results of the assessment can be presented to the environmental managers in terms that are compatible with the decisions to be made.
Historical and ongoing use of NOAELs or LOAELs as TRVs
EDx-based TRVs (see further discussion below) are clearly preferable; however, many TRVs are based on no-observed-adverse-effect levels (NOAELs) or lowest-observed-adverse-effect levels (LOAELs), despite consensus that NOAELs and LOAELs have significant shortcomings. (For the purpose of this study, EDx is defined as a dose resulting in an x% reduction in an endpoint relative to a control group.) Note that the ASTM-I E47 Committee on Biological Effects and Chemical Fate has recently defined these terms as concentrations (NOAECs and LOAECs) and not levels; however, we have opted to use the acronyms NOAEL and LOAEL to reflect common usage among practitioners. The use of NOAEL- and LOAEL-based TRVs has been facilitated by their broad availability in easy-to-access compendia (e.g., Sample et al. 1996; USEPA 2005 ecological soil screening levels [Eco-SSLs]; USACHPPM 2000; Los Alamos National Laboratory [LANL] ecorisk database); however, this practice has little technical merit for screening assessments and no merit for risk assessment purposes. NOAEL and LOAEL values are not innately related to biologically relevant thresholds and do not provide information about the actual magnitude of effects in the reported studies. NOAELs do not necessarily equate to a “no effect” dose; they reflect only the test concentrations used in the study and are strongly influenced by factors related to statistical power (e.g., study design, replication). NOAELs and LOAELs (or equivalent terms) in ERA have been criticized elsewhere (Hoekstra and van Ewijk 1993; Laskowski 1995; Chapman et al. 1996; Bailer and Oris 1997; OECD 1998); however, their use continues to be widespread, in part due to policy decisions that the NOAEL- and LOAEL-based TRVs provide an adequate basis for evaluating hazards to wildlife and complicity by practitioners who continue to emphasize policy over science (Kapustka 2008).
Notwithstanding this policy decision, this study provides a spectrum of alternatives to NOAELs and LOAELs for consideration by risk assessors and policy makers alike.
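The dependence of NOAELs and LOAELs on study design can be illustrated with a minimal sketch. The treatment doses, effect sizes, and significance flags below are hypothetical and are not taken from any cited study; the names `Treatment`, `noael`, and `loael` are illustrative only.

```python
from typing import NamedTuple, Optional

class Treatment(NamedTuple):
    dose: float           # tested dose, mg/kg body weight/d (hypothetical)
    reduction_pct: float  # % reduction in the endpoint relative to control
    significant: bool     # statistically different from control?

def loael(treatments: list[Treatment]) -> Optional[float]:
    """Lowest tested dose with a statistically significant effect."""
    doses = [t.dose for t in treatments if t.significant]
    return min(doses) if doses else None

def noael(treatments: list[Treatment]) -> Optional[float]:
    """Highest tested dose below the LOAEL without a significant effect."""
    lo = loael(treatments)
    doses = [t.dose for t in treatments
             if not t.significant and (lo is None or t.dose < lo)]
    return max(doses) if doses else None

# Hypothetical study with low replication: the 15% reduction at the
# middle dose is not statistically significant.
study = [
    Treatment(1.0, 2.0, False),
    Treatment(10.0, 15.0, False),
    Treatment(100.0, 60.0, True),
]
print(noael(study), loael(study))
```

In this hypothetical case both values are simply tested doses, and the NOAEL coincides with a 15% reduction in the endpoint. A study with different dose spacing or more replicates could report different values for the same underlying dose–response relationship.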
Selecting appropriate toxicological data
The reliability of the TRV depends on the quality and quantity of data used. A comprehensive literature search with careful evaluation of all data retrieved is the essential foundation of TRV development and not a trivial exercise. Guidance on the literature search and evaluation process (USACHPPM 2000; USEPA 2005) is available and is not duplicated here.
Extrapolating between species
The majority of TRVs are based on common laboratory test species. The near-complete absence of toxicity data for most wildlife species means that extrapolation of toxic responses observed in laboratory test species to species of interest is necessary.
Allometric scaling (e.g., as was used in Sample et al. 1996) is one extrapolation approach that is widely applied in human toxicology and that has been used for wildlife risk evaluations despite its multiple limitations. However, it is no longer recommended for use in wildlife risk assessment (USEPA 2005), for two reasons. First, supporting data are limited: much of the mammalian data are based on anticancer drugs evaluated in Freireich et al. (1966) rather than contaminants typically evaluated in wildlife risk assessments. Second, the allometric scaling models developed for both human and wildlife risk assessment are all based on acute toxicity data, and their applicability to chronic toxicity data is unknown.
Recently, Raimondo et al. (2007) developed interspecies correlation estimation (ICE) as an alternate approach for quantifying interspecies toxicity relationships. ICE is based on log-linear regression models that describe acute toxicity relationships between pairs of species over a range of chemicals. ICE models were developed for all chemicals for which adequate data were available, and for chemicals grouped by similar mode of action. Consideration of modes of action improved regression models for some, but not all, groups (e.g., neurotoxicants, carbamates, and organophosphates). This indicates that mode of action can be an important determinant in interspecies toxicity extrapolation. Ultimately, although ICE models are a step forward, they are currently similar to allometric models in that they are based solely on data from acute studies.
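The general form of a log-linear interspecies regression can be sketched as follows. This is an illustration of the technique only, assuming ordinary least squares on log10-transformed values; it is not the fitting procedure of Raimondo et al. (2007), and the LD50 values and function names are hypothetical.

```python
import math

def fit_log_linear(surrogate, target):
    """Least-squares fit of log10(target) = a + b * log10(surrogate)."""
    x = [math.log10(v) for v in surrogate]
    y = [math.log10(v) for v in target]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def predict(a, b, surrogate_value):
    """Back-transform the fitted line to estimate the target-species value."""
    return 10 ** (a + b * math.log10(surrogate_value))

# Hypothetical acute LD50s (mg/kg) for the same four chemicals in a
# surrogate species and a target species.
surrogate_ld50 = [1.0, 10.0, 100.0, 1000.0]
target_ld50 = [2.0, 20.0, 200.0, 2000.0]

a, b = fit_log_linear(surrogate_ld50, target_ld50)
estimate = predict(a, b, 50.0)  # target LD50 for a chemical tested only in the surrogate
```

Because every input here is an acute LD50, the fitted relationship describes acute toxicity only, which is the limitation noted above.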
Because modes of action can vary dramatically for the same chemical over acute and chronic exposures (discussed in more detail below), it is likely that interspecific scaling factors based on chronic toxicity data also will differ from those based on acute toxicity data. Additionally, given the variation in cross-species physiological responses in different organ systems, it is reasonable to expect multiple chronic scaling factors for a given chemical, depending on the mode of action considered. In their current forms, neither allometric scaling nor ICE models represent chronic toxicity, and, therefore, their application to chronic data is not recommended. In the absence of suitable models, we favor the use of toxicity information as reported, because it is often unknown whether target species would be more resistant or more sensitive. To take a biased approach without sufficient information, in our view, is unwarranted. Alternatively, uncertainty factors may be applied to adjust toxicity values. However, generic uncertainty factors (e.g., 10-fold for any uncertainty) should not be used. Rather, if uncertainty factors are used, there should be a scientific basis for their application (Chapman et al. 1998). Whether applying uncertainty factors or not, uncertainty can be minimized by selecting test species that are as taxonomically or physiologically related to the wildlife species of interest as possible.
Extrapolating across taxonomic classes is not acceptable
Cross-class extrapolation of toxicity data has been done when data to support TRV derivation are extremely limited. However, extrapolation between classes is not recommended for development of chronic TRVs. These extrapolations are highly uncertain under the best of circumstances, with uncertainty increasing with greater taxonomic distance. Examples from acute toxicity data for aquatic taxa show that uncertainty increased as taxonomic relatedness decreased (Suter et al. 1986; Suter and Rosen 1988). Few studies have directly investigated cross-class extrapolations for wildlife; however, Luttik and Aldenberg (1997) concluded that differences between birds and mammals preclude extrapolations between these two classes based on LD50 values. Sample and Arenal (1999) observed similar LD50-based allometric scaling factors for birds and mammals for a majority of the chemicals evaluated, but because no clear pattern was observed for differences based on chemical categories, they concluded that extrapolations between birds and mammals should be approached with extreme caution. More recently, Raimondo et al. (2007) observed that uncertainty in LD50-based ICE models increased as taxonomic distance between surrogate taxa and the target taxon increased. Similar findings were observed for chronic wildlife toxicity data. Conversely, Johnson, Quinn, et al. (2007) found onset of central nervous system effects (i.e., convulsions) to be remarkably similar between species of 3 classes of vertebrates (reptiles, birds, and mammals) from daily oral exposures of RDX (hexahydro-1,3,5-trinitro-1,3,5-triazine) for 14, 60, and 90 d, respectively. This relationship was not true for two other energetic compounds tested using the same species. No clear pattern of cross-class sensitivity to these other energetic compounds was apparent, and, therefore, different conclusions on relative toxicity for each taxonomic class could be drawn depending upon which compound was considered.
Although data are limited, clear patterns in relative cross-class sensitivity are lacking for both acute and chronic toxicity data and, therefore, extrapolations across classes should be avoided.
Extrapolating chronic TRVs from acute data is not acceptable without scientific support
Most wildlife risk assessments are intended to evaluate long-term exposure to low concentrations of chemicals (although there are exceptions such as spills and pesticide applications). However, a significant fraction of the mammalian and avian toxicity data are based on acute (short-term exposure) studies. This is particularly true for mortality studies involving a single dose delivered orally in a highly absorbable form (e.g., in water or vegetable oil). Reproduction studies may involve longer exposure periods; however, these are still typically shorter than an organism's life span. In the past, many risk assessments have extrapolated chronic TRVs from studies with acute or subacute exposure durations. However, these extrapolations are uncertain because the relationships between acute and chronic responses are not known for most species–chemical combinations. Consequently, extrapolation of chronic effects from acute data is not recommended unless there are data to support the extrapolation.
There is no generic acute-to-chronic extrapolation factor that can be applied for wildlife TRV derivation. Hill (1994) compared responses of mallards (Anas platyrhynchos) from studies with single dose versus 5-d exposure to a variety of pesticides (organophosphorus, carbamate, and organochlorine compounds) and showed very different responses between the two exposure regimes. Most notably, the 5-d exposure values were much more variable among chemicals than were the single dose values; however, this may be due to differences in concentration-to-dose conversions. Overall, Hill (1994) found no statistical relationship between the two sets of values. Studies compiled in the aquatic organism database show that acute-to-chronic ratios (ACRs) vary considerably by species and by substance class (Länge et al. 1998). Metals have the largest ACRs, and other inorganics also show considerably different responses between acute and chronic exposures (inorganic substance ACRs vary from 20 to nearly 200). ACRs for organic substances are lower, but still vary by an order of magnitude (range: 2 to 28). Thus, there is no empirical basis for a universal ACR to extrapolate chronic exposure TRVs from acute exposure toxicity data.
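The practical consequence of this variability can be shown with a small arithmetic sketch. The acute value is hypothetical and in arbitrary units; the two ACRs are the bounds quoted above for organic substances, and `chronic_estimate` is an illustrative helper, not a recommended method.

```python
def chronic_estimate(acute_value: float, acr: float) -> float:
    """Chronic value implied by dividing an acute value by an assumed ACR."""
    return acute_value / acr

acute_value = 280.0            # hypothetical acute toxicity value (arbitrary units)
low_acr, high_acr = 2.0, 28.0  # ACR range for organic substances quoted in the text

estimates = (chronic_estimate(acute_value, high_acr),
             chronic_estimate(acute_value, low_acr))
spread = estimates[1] / estimates[0]  # fold-difference between the two estimates
print(estimates, spread)
```

The same acute value yields chronic estimates that differ 14-fold depending only on which ACR is assumed, before any difference in toxicological endpoint is considered.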
Acute and chronic exposures result in significantly different physiological effects due to species- and exposure-specific variation in absorption, distribution, metabolism, and excretion (ADME) rates. For example, after administration of a single oral dose of DDT to dogs, the highest concentration of DDT was found in the bile, with moderate amounts in the central nervous system (CNS) and blood, and small amounts in the kidney and liver (St. Omer 1970). However, after 2 weeks of feeding, DDT was found in fat, skin, muscle, and kidney; it did not begin to show up in other organs until at least 4 weeks of exposure. Cook and Trainer (1966) showed that lead poisoning from acute exposures in mallards results in a rapid increase in blood lead levels and subsequent mortality due to peripheral nervous system effects, while a lower dose chronic exposure has much slower uptake, as evidenced by lower blood lead levels and neurotoxic effects to the central nervous system. Acute poisoning by DDT is evidenced by CNS signs quickly followed by death (St. Omer 1970), whereas chronic toxicity affects endocrine functions (e.g., prostaglandin synthesis in the eggshell gland mucosa by p,p′-DDE, resulting in eggshell thinning; Lundholm 1997) and acts as a potent androgen receptor antagonist (Kelce et al. 1996). Therefore, differences in ADME for acute and chronic exposures may prevent the development of chronic TRVs from acute toxicity data.
Given the likelihood of different ADME rates, toxicological endpoints, and dose–response curves between acute and chronic exposures to the same chemical, it seems unwise to use TRVs based on acute studies when assessing risks from chronic exposures. Chronic TRVs based on acute data will likely be incorrect and targeted on inappropriate toxicological endpoints. Dividing the acute TRV by an uncertainty factor will not correct this misalignment or necessarily result in a more conservative estimate. Rather, the lack of a chronic TRV should be identified as a data gap and discussed as part of the uncertainty analysis. If there are data to support an extrapolation from acute exposures to develop a chronic TRV, then these data should be documented and the extrapolation can be done.
Data-dependent options for deriving TRVs
Ideally, all TRVs should be derived with a thorough understanding of the underlying mechanism of toxicity and physiological differences between species. Dose–response relationships are best used to illustrate these points. In screening assessments, this information can be used to derive single point estimates (e.g., EDx values), while in a risk assessment context (see text box), the underlying dose–response distribution can be used directly for understanding the likelihood and magnitude of potential effects as well as the response to incremental increases in exposure.
The use of EDx-based TRVs (which still leads to the calculation of hazard quotients [HQs]; see next section of the study) is flexible in that different protection goals (i.e., allowable magnitude of effects) can be tailored for each assessment endpoint, which in turn may reflect different land uses or different target species (e.g., rare vs common species). Practically, the development of dose–response relationships for many chemicals and wildlife species often is constrained by data limitations. These challenges have led to the use of TRVs based on NOAELs and/or LOAELs. However, even in data-poor situations, we recommend that TRVs be derived by extracting dose–response information (e.g., dose and effects level for each treatment) from the study reports or publications, rather than relying on the reported NOAEL or LOAEL. We recognize that moving away from NOAEL- and LOAEL-based TRVs is challenging for various reasons, not the least of which is the existing regulatory precedent (cf. Hope 2009). To bridge this practice with that recommended herein, risk practitioners may wish to show where NOAELs and LOAELs fall on the dose–response relationship to provide context to the EDx-based screening assessment.
Because the selection of a TRV for use in wildlife assessment is inherently a data-dependent process, TRV derivation options will vary according to the quantity and specificity of toxicity data used. All options require extraction of dose and response data from pertinent wildlife toxicological studies, when possible. For example, if a study has 5 treatments, individual doses and effect sizes can be extracted for each treatment, instead of simply determining NOAEL/LOAEL values. In some cases, this is relatively straightforward, because the studies were designed for the purpose of establishing dose–response relationships. Unfortunately, this is not typically the case and published studies often do not include enough data to reconstruct a dose–response curve. Thus, “data mining” of the scientific literature needs to be conducted to build a relevant data set. Once a dose–response dataset has been assembled, the decision on which TRV derivation approach is appropriate to follow is dependent on data quantity and receptor specificity, but mostly is driven by the data quality objectives from the problem formulation. Three different potential approaches are described below. (When no toxicity data are available [e.g., evaluation of proposed compounds], QSARs may provide a method of estimating the toxic properties of a compound, using the physical and structural characteristics of this compound relative to a toxicity data set from similarly structured compounds. However, options for QSAR derivation are beyond the scope of this study.)
Ideally, enough data are available to fit species-specific dose–response curves for many species. Point estimate (e.g., EDx) values from each curve could be combined to build a species sensitivity distribution (SSD) (e.g., to estimate an EDx-based TRV protective of y% of species) or variability among the curves could be used to predict a range of possible dose–response relationships for any species. However, because toxicity studies for vertebrates collect a wide range of continuous data and integrate many different methods, all such data are rarely equivalent (an important criterion in developing SSDs). While the SSD approach is commonly used in aquatic risk assessment (e.g., Newman et al. 2000; Baird and Van den Brink 2007), SSDs have rarely been developed for vertebrate wildlife species (e.g., Moore et al. 2006) because of their substantial data needs and differences between study designs and reported results.
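A species sensitivity distribution of this kind can be sketched in a few lines. The example assumes that EDx values are log-normally distributed across species, which is one common SSD choice and not necessarily the one an assessor would select; the ED20 values and the function name `hazardous_dose` are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def hazardous_dose(edx_values, fraction_affected=0.05):
    """Dose below which `fraction_affected` of species' EDx values fall,
    assuming EDx values are log-normally distributed across species."""
    logs = [math.log10(v) for v in edx_values]
    dist = NormalDist(mu=mean(logs), sigma=stdev(logs))
    return 10 ** dist.inv_cdf(fraction_affected)

# Hypothetical ED20 values (mg/kg body weight/d) for six species,
# all for the same endpoint and comparable study designs.
ed20 = [12.0, 30.0, 45.0, 80.0, 150.0, 400.0]

# ED20-based TRV intended to be protective of 95% of species (y = 95).
hd5 = hazardous_dose(ed20, fraction_affected=0.05)
print(round(hd5, 1))
```

The sketch makes the equivalence requirement concrete: every value in `ed20` must describe the same effect level and a comparable endpoint, which is rarely the case for vertebrate wildlife data.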
It is more likely that TRVs will be generated using single dose–response curves. Receptor-specific models can be developed when sufficient data exist (e.g., Kerr and Meador 1996; Moore et al. 1997, 1999, 2003; Wayland et al. 2007). In cases where data are more limited, it may be possible to combine dose and response data from different species or endpoints and use the whole data set for deriving the model. EDx or Benchmark Dose (BMD) methods can be used to derive TRVs (Caux and Moore 1997; Moore and Caux 1997; USACHPPM 2000). A BMD corresponds to the statistical lower confidence limit on the study dose producing a predetermined level of change in adverse response compared to the response in untreated animals, and it considers the whole dose–response relationship.
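Deriving an EDx from a single dose–response curve can be sketched as follows. The example assumes a two-parameter log-logistic model fitted by linearisation, chosen here only to keep the sketch dependency-free; nonlinear regression or maximum likelihood would normally be preferred, and no confidence limit (as required for a BMD) is computed. The doses, responses, and function names are hypothetical.

```python
import math

def fit_log_logistic(doses, responses):
    """Fit r = 1 / (1 + (d / ED50)**b) via the linearised form
    ln((1 - r) / r) = b * ln(d) - b * ln(ED50).
    `responses` are fractions of the control response, strictly in (0, 1)."""
    x = [math.log(d) for d in doses]
    y = [math.log((1 - r) / r) for r in responses]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    ed50 = math.exp(mx - my / b)
    return ed50, b

def ed_x(ed50, b, x_pct):
    """Dose giving an x% reduction relative to control under the fitted model."""
    return ed50 * (x_pct / (100 - x_pct)) ** (1 / b)

# Hypothetical study: 5 treatments, response expressed as a fraction of control.
doses = [10.0, 25.0, 50.0, 100.0, 250.0]   # mg/kg body weight/d
responses = [0.96, 0.80, 0.50, 0.20, 0.04]

ed50, b = fit_log_logistic(doses, responses)
ed20 = ed_x(ed50, b, 20)  # candidate ED20-based TRV
```

Changing the `x_pct` argument is all that is needed to tailor the allowable magnitude of effect to a different protection goal.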
Finally, when available data do not support the formal derivation of dose–response curves, the assessor still has better options for TRV derivation than relying on NOAEL/LOAEL estimates. For example, the dose–response data can be plotted (e.g., a scatterplot) and examined visually; the underlying relationship can be used to select a TRV. The dose–response relationship (strong or weak) can also provide insights into uncertainty and the possible implications of exposure to doses exceeding the TRV.