Key Opportunities to Replace, Reduce, and Refine Regulatory Fish Acute Toxicity Tests

Abstract Fish acute toxicity tests are conducted as part of regulatory hazard identification and risk‐assessment packages for industrial chemicals and plant protection products. The aim of these tests is to determine the concentration which would be lethal to 50% of the animals treated. These tests are therefore associated with suffering in the test animals, and Organisation for Economic Co‐operation and Development test guideline 203 (fish, acute toxicity) studies are the most widely conducted regulatory vertebrate ecotoxicology tests for prospective chemical safety assessment. There is great scope to apply the 3Rs principles—the reduction, refinement, and replacement of animals—in this area of testing. An expert ecotoxicology working group, led by the UK National Centre for the Replacement, Refinement and Reduction of Animals in Research, including members from government, academia, and industry, reviewed global fish acute test data requirements for the major chemical sectors. The present study highlights ongoing initiatives and provides an overview of the key challenges and opportunities associated with replacing, reducing, and/or refining fish acute toxicity studies—without compromising environmental protection. Environ Toxicol Chem 2020;39:2076–2089. © 2020 The Authors. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC.


INTRODUCTION
Originally employed in the nineteenth century as a forensic tool for investigating fish kills from effluent pollution, fish acute toxicity tests have become a core requirement for prospective safety assessment under many global regulatory frameworks for manufactured chemicals, including plant protection products (PPPs) and industrial chemicals (Table 1; Hunn 1989; Hutchinson et al. 2016). They are used to determine the concentration of a substance that causes death in 50% of the test population (the median lethal concentration, LC50) during short-term exposure over hours or days. Fish acute toxicity studies are now the most frequently conducted vertebrate ecotoxicology tests (Figure 1; Burden et al. 2017) and one of the very few standardized vertebrate ecotoxicity tests where death is the intended endpoint, often causing significant suffering to test animals.

Table 1 (fragment recovered from the page layout)
• In-country testing may be required. For example, only test reports obtained from qualified testing facilities can be recognized for registration review in China. Unilateral acceptance of data generated following OECD TGs to GLP from overseas laboratories is not possible unless there is a multilateral acceptance of data agreement with China.
• Local fish species may be required to be tested; e.g., in Bangladesh 2 local species are required to be tested in-country for formulations. Some countries require testing of specific species for specific uses; e.g., in South Korea loach is required for paddy uses.
• Testing of products and ingredients can be required.
Notes: Estimated total number of different species tested in practice for global registration of an active substance = 4 (rainbow trout in the European Union, fathead minnow and sheepshead minnow in the United States, and carp in Japan). Can be higher if multiple Asian countries require specific native species. Multiple additional studies may be required, depending on the number of products requiring formulation testing.
This prompts the question: What are the key opportunities to apply the 3Rs principles of reduction, refinement, and replacement (see Table 2) to the fish acute toxicity test in a regulatory setting? Standardized protocols are followed when data from fish acute toxicity studies are generated to meet regulatory requirements, including the Organisation for Economic Co-operation and Development's (OECD's) test guideline 203 (Organisation for Economic Co-operation and Development 2019a) and the US Environmental Protection Agency's (USEPA's) test OCSPP 850.1075 (US Environmental Protection Agency 2016a). Results are most often submitted along with data on acute toxicity to invertebrates and algae. For the prospective safety assessment of chemicals (the scope of the present study), information is used for hazard classification and labeling purposes and to assess risk to fish potentially exposed to substances following their use or discharge into the environment. It can also be used for nonregulatory purposes or for internal company decision-making, such as molecule development and optimization, and to help select concentration ranges for testing in longer-term (chronic) toxicity or bioaccumulation studies. Regulatory testing is also conducted for the purposes of measuring water quality (e.g., effluent assessments), although this aspect is not the focus of the present study. Numerous ethical, scientific, business, and legislative drivers are compelling a shift in the current safety-assessment paradigm away from established approaches, which largely rely on the generation of data from whole-animal (in vivo) studies (Burden et al. 2015). Many laws governing chemical safety assessment, particularly in Europe, demand that vertebrate animal tests are conducted as a last resort, that nonanimal methods are used where possible, and that existing data are shared (e.g., European Commission 2006, 2009a, 2009b).
Regional bans on the marketing of cosmetics containing new ingredients tested on animals for human safety assessment purposes have also come into force (e.g., Europe, India, Israel, and Brazil; European Commission 2009b; Burden et al. 2016a; Table 1). In 2019, the USEPA, the body responsible for regulating industrial chemicals and PPPs in the United States, made a bold move by announcing its plan to eliminate requests for mammalian safety tests by 2035 (US Environmental Protection Agency 2019). Efforts have already started to reduce and refine mammalian acute toxicity tests where they remain mandatory, for example, through activities to modernize the USEPA's PPP data requirements (US Environmental Protection Agency 2017; Prior et al. 2019), deletion of the OECD's acute oral toxicity test (test guideline 401; Organisation for Economic Co-operation and Development 1987) in 2002 specifically because of animal welfare concerns, and the recent adoption of OECD test guideline 433 (Organisation for Economic Co-operation and Development 2018) to increase the use of more humane endpoints in place of death in acute inhalation studies (Sewell et al. 2018). In nonmammalian toxicology there have been similar efforts to improve the welfare implications of avian acute toxicity testing, through introduction and adoption of OECD test guideline 223 (avian acute oral toxicity test; Organisation for Economic Co-operation and Development 2016a). This test uses fewer animals than the alternative method (OCSPP 850.2100; US Environmental Protection Agency 2012) and provides refinements, for example, through the option for single-dose limit testing where toxicity is likely to be low.
At the same time, there have been significant developments in biomedical science which will impact how (eco)toxicology and safety assessments are conducted in the future, together with a greater focus on environmentally realistic exposure scenarios. This includes the development of more sophisticated and physiologically relevant cell-based approaches and the application of computational chemistry and mathematical modeling to biological systems. There are many current efforts focused on developing frameworks to allow data from these newer, nonanimal approaches to be used in place of traditional in vivo results when making chemical safety decisions (e.g., Organisation for Economic Co-operation and Development 2016b). There may also be a need to rethink how decisions are arrived at as a result of changing environmental factors (e.g., climate change) and the emergence and evolution of new substance types (e.g., nanomaterials). Now is the time to focus on harnessing the opportunities to improve the science underlying fish acute toxicity assessments, while at the same time enabling traditional in vivo approaches to be waived or replaced and, where still absolutely necessary, reducing or refining in vivo studies to use fewer animals and/or induce less suffering across all geographic regions and industry sectors. The present study has been prepared by the UK's National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs) ecotoxicology expert working group, following several discussions around the potential to apply the 3Rs in fish acute toxicity testing. Firstly, in the section Current State of the Science, Testing Frameworks, and Key 3Rs Opportunities, we identify pertinent areas for applying the 3Rs, including novel developments and existing opportunities with scope for wider uptake across a range of regulatory sectors. These are then summarized in the section Harnessing the Opportunities, with specific examples of data analysis or other work needed to clarify some of the uncertainties and enable definitive improvements to fish acute toxicity testing and assessment frameworks.

CURRENT STATE OF THE SCIENCE, TESTING FRAMEWORKS, AND KEY 3Rs OPPORTUNITIES
Question 1: Are new fish acute toxicity data always needed?
The first aspects to consider before carrying out any in vivo test should be whether the data are relevant to the regulatory and scientific question and whether data already exist that could answer that question.
Are fish acute toxicity data needed at all? Exposure considerations. In the "real world," fish may rarely be exposed to chemicals at high concentrations and/or within the time frame of an acute toxicity test (which may better reflect an accidental spill, inadequately treated discharge, or improper disposal scenario). There may be limited uptake by fish if a substance has certain physicochemical properties such as low water solubility or a high melting point (Mayer and Reichenberg 2006). Some substances are readily biodegradable (or highly labile), and exposure in the environment may be very low following removal in wastewater-treatment plants, in regions where these exist. These properties can often be readily identified, particularly if the substance will not dissolve (e.g., solid waxes), or by using modeling and analytical means (e.g., Thomas et al. 2015). Evidence of no or very low uptake may justify waiving acute in vivo tests in some regulatory regimes. Negligible release of a substance into the environment would also lead to no or very low environmental exposure (in the absence of spill events), or release may take place over longer durations. In this case, results from chronic tests may be more relevant. Human and veterinary medicinal products are most likely to be present over the long term at low levels because they mainly enter the environment following patient use. Therefore, acute testing for European pharmaceutical registrations is only triggered when a predicted environmental concentration reaches a threshold level, which has been based on historical acute environmental toxicology data (Table 1; European Medicines Agency 2006). This approach is considered highly protective (Gunnarsson et al. 2019) and has contributed in part to the practical removal of the requirement for fish acute toxicity tests for pharmaceuticals. 
The European Union's Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation for industrial chemicals also uses a pseudo-exposure-driven approach because only substances produced or imported over 10 tonnes/yr require fish acute toxicity data (Table 1). These "exposure-driven" approaches are lacking in other legislation and regulations, including those for PPPs, where testing is in practice mandatory irrespective of environmental exposure. A stronger focus on exposure-driven approaches has emerged in human health safety assessment (National Research Council 2012; Sewell et al. 2017a) and is equally applicable and beneficial to ecotoxicology. Wider implementation of exposure-based rather than hazard-based assessments could greatly reduce the number of vertebrate studies. Retrospective analysis of existing data would help to determine suitable exposure triggers for testing (per Gunnarsson et al. 2019). Evidence of low or negligible exposure could at the very least help prioritize vertebrate testing, if not all ecotoxicity testing. Future modeling efforts should consider how chemical property and fate and effect data may be effectively used to predict impacts from variable exposure scenarios such as chemical spills.
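The exposure-trigger logic described above can be sketched in a few lines of Python. This is only an illustration: the `Substance` record, the function names, and the predicted environmental concentration (PEC) threshold of 0.01 µg/L are hypothetical placeholders, and only the 10 tonnes/yr figure comes from the text (whether that boundary is inclusive is also an assumption here).

```python
from dataclasses import dataclass

# Illustrative trigger values; real triggers are set by each regulatory framework.
PEC_TRIGGER_UG_PER_L = 0.01      # hypothetical placeholder threshold
TONNAGE_TRIGGER_T_PER_YR = 10.0  # tonnage figure quoted in the text for REACH


@dataclass
class Substance:
    name: str
    pec_ug_per_l: float      # predicted environmental concentration
    tonnage_t_per_yr: float  # annual production or import volume


def exposure_triggers(s: Substance) -> dict:
    """Report which exposure-based triggers a substance meets."""
    return {
        "pec_trigger": s.pec_ug_per_l >= PEC_TRIGGER_UG_PER_L,
        "tonnage_trigger": s.tonnage_t_per_yr >= TONNAGE_TRIGGER_T_PER_YR,
    }


def prioritise(substances):
    """Order substances so those meeting more triggers come first."""
    return sorted(
        substances,
        key=lambda s: sum(exposure_triggers(s).values()),
        reverse=True,
    )
```

A screen of this kind would not replace a risk assessment; it only shows how low or negligible exposure could be used to prioritize which substances are considered for vertebrate testing first.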
Do data already exist to answer the scientific question? Exploiting options to waive new experimental tests. Existing data on the chemical or information on similar chemicals may be available to conduct safety assessments without the need for new data. The first step is to collate and review all available ecotoxicity, fate, and physicochemical data. Fish acute data may not be needed to characterize environmental risk because fish are not always the most sensitive taxon for acute effects, as has been shown for many pharmaceuticals, industrial chemicals, and PPPs (Weyers et al. 2000; Hutchinson et al. 2003; Jeram et al. 2005; Rawlings et al. 2019; Lillicrap et al. 2020). Although many regulatory regimes require ecotoxicity data across 3 trophic levels, conducting tests using nonvertebrate species (i.e., invertebrates and algae/aquatic plants) first may provide sufficient information for safety assessment when considered with other data, particularly if the chemical falls into a class where potential toxic effects are well known (see section Grouping and read-across approaches). In chronic assessments of antibiotics, fish testing is now not required because of the known lack of sensitivity (Baumann et al. 2015; Brandt et al. 2015; Le Page et al. 2017; Committee for Medicinal Products for Human Use 2018). Better understanding of evolutionary target conservation (i.e., biological similarity) across ecological taxa would be useful to determine whether fish data are relevant. This has started to be explored for the chronic effects of pharmaceuticals using tools such as ECOdrug (n.d.; Gunnarsson et al. 2019).
Existing fish in vivo data, even with limitations or restrictions, may be sufficient for decision-making. Where regulations allow, studies that do not adhere to a good laboratory practice (GLP) framework or open literature studies may be used to consider relative trophic-level sensitivity or support assumptions of low toxicity. All data should be evaluated in terms of reliability and relevance (e.g., Moermond et al. 2016), in a way applicable to the regulatory framework (Martin et al. 2019). Existing reliable data on chronic fish toxicity may be relevant to acute effects or support the assumption that fish are less sensitive than other aquatic species. The European Union Biocidal Products Regulation guidance already indicates that if a valid chronic study on fish is available, acute data are not needed (European Chemicals Agency 2018). This can also be the case on a substance registration-specific basis for industrial chemicals under the European Union's REACH regulation (Annex VIII; European Commission 2006). Existing data could also provide justification for less animal-intensive or refined acute toxicity studies (e.g., application of an in vivo fish limit test, fish embryo acute toxicity test [FET], or fish in vitro cell line assay; discussed in the section Embryo and in vitro assays).
Grouping and read-across approaches. Existing data on substances that fall within the same "group" as the test substance can be useful. Grouping is based on, for example, structural similarities or the sharing of common metabolic pathways. These properties can be used to predict likely physicochemical, fate, and ecotoxicity profiles, the principle being that the group of substances will exert similar effects. For example, the PETROTOX model has been developed to perform aquatic hazard assessment of petroleum substances based on substance composition (Redman et al. 2017). Here, hydrocarbons of similar structure and size within a petroleum substance are grouped together because they are known to behave similarly in terms of environmental distribution and fate (Concawe 1996). Toxicity or environmental risk limits of these complex mixtures can then be calculated without the need for all components to be tested experimentally.
A similar approach involves read-across of experimental data from one substance (source) to another (target); for an example, see Supplemental Data 2. It is already possible to use read-across for classification and labeling purposes in the European Union if fish acute data are not available (European Chemicals Agency 2017a). Significant documentation is required to justify the validity of grouping or source data and demonstrate the relevance of the grouping or read-across approach, as set out, for example, by the European Chemicals Agency (2008, 2017b), and the OECD (Organisation for Economic Co-operation and Development 2017a). One drawback with read-across approaches is that they cannot be used where there are no existing experimental toxicity data for a new chemical class to read-across from. In addition, when considering read-across for risk management, the uncertainties associated with it may prove unfavorable.
Question 2: Can standard fish acute toxicity studies be replaced?
Information on fish acute toxicity will often be necessary to enable safety assessment and environmental protection. But must this always come from traditional in vivo studies? Wording in the recently revised test guideline 203 (Organisation for Economic Co-operation and Development 2019a) suggests that existing in vivo data as well as data from alternative approaches should be considered prior to conducting a full test guideline 203 test. Several alternative/replacement approaches which could provide useful data exist or are under development. These include predictive computational tools or in silico methods, such as quantitative structure-activity relationship models (QSARs), or experimental techniques which do not require the use of protected animals (defined as vertebrate animals used for scientific purposes, protected under legislation such as European Union Directive 2010/63 [European Commission 2010], including early life-stage embryos).
Computational models. A QSAR is a statistical model used to relate a substance's structure to toxic outcomes. Such models have started to be used in a regulatory context to predict fish acute toxicity. The majority of QSARs conducted for lower-tier endpoints under the European Union REACH regulation were for this purpose (European Chemicals Agency 2017c), and regulatory agencies themselves have developed models, such as the ECOlogical Structure Activity Relationships (ECOSAR) Class Program provided by the USEPA. The ECOSAR program was used to assess the utility of QSARs in predicting fish acute toxicity of PPP metabolites and was shown to be a robust approach worthy of further investigation, with potential for use in regulatory decision-making (Burden et al. 2016b). There is already one example in Europe where ECOSAR was used to estimate the toxicity of metabolites of the biocidal active substance imiprothrin (European Chemicals Agency 2018). Although this is promising, substances can only be assessed using QSARs if they fall into one of the chemical classes for which the model has been validated. Their use for chemicals with novel chemistries is limited until experimental data are available and subsequently incorporated into the models. Further, significant documentation describing the QSAR inputs and model is required to justify the prediction before QSAR endpoints will be considered in a regulatory setting (e.g., see European Chemicals Agency 2008).
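As a rough illustration of how a class-specific QSAR of this kind works, the sketch below fits a straight line relating log Kow to log LC50 for one chemical class and refuses to predict outside the range of its training data. It is a minimal sketch under stated assumptions, not the ECOSAR implementation: the training values are invented for illustration, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical training set for a single chemical class:
# (log Kow, log10 of fish LC50 in mmol/L). Values are invented.
TRAINING = [(1.0, 1.1), (2.0, 0.2), (3.0, -0.7), (4.0, -1.6), (5.0, -2.5)]


def predict_log_lc50(log_kow, training=TRAINING):
    """Predict log10 LC50, but only inside the training range of log Kow."""
    xs = [t[0] for t in training]
    ys = [t[1] for t in training]
    if not (min(xs) <= log_kow <= max(xs)):
        raise ValueError("log Kow is outside the model's applicability domain")
    a, b = fit_line(xs, ys)
    return a + b * log_kow
```

The range check is a deliberately crude stand-in for an applicability domain; it reflects the point made above that predictions are only defensible for chemistries the model has been built and validated on.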
Other computational models that could prove useful include the USEPA's Interspecies Correlation Estimation platform (US Environmental Protection Agency 2016b). These models estimate the acute toxicity of a chemical to a species, genus, or family with no test data (the predicted taxon) from the known toxicity of the chemical to a species with test data (the surrogate species). Invertebrate data could therefore be used to make predictions of fish acute toxicity without conducting a fish test, although further work will be needed to support such extrapolations for regulatory acceptance. Alternatively, where experimental data are available for one fish species, they could be used to 1) extrapolate to other fish species (particularly valuable for sectors where multiple species currently require testing; see section Embryo and in vitro assays and Table 1) and/or 2) aid in evaluation of interspecies variability for refinement of risk assessments (European Food Safety Authority 2013). The Sequence Alignment to Predict Across Species Susceptibility tool is another promising approach which could in future be used in a regulatory context to support extrapolation of toxicity information across species (LaLone et al. 2016).
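The surrogate-to-predicted-taxon extrapolation described above can be written, assuming a log-linear least-squares relationship between the two taxa (a common way of describing such interspecies correlation models; the symbols a and b are notation introduced here, not taken from the text), as:

```latex
\log_{10}\!\left(\mathrm{LC50}_{\text{predicted taxon}}\right)
  = a + b \,\log_{10}\!\left(\mathrm{LC50}_{\text{surrogate species}}\right)
```

Here a and b would be fitted to chemicals for which both taxa have measured acute toxicity values, and the uncertainty of any prediction depends on how many such chemicals are available and how closely related the two taxa are.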
Embryo and in vitro assays. Prior to embryos becoming free-feeding, they are considered to be incapable of experiencing pain, distress, suffering, or lasting harm (European Food Safety Authority 2005) and in many regions are not protected under animal welfare legislation in the same way as juvenile and adult fish (e.g., European Commission 2010). Their use is therefore considered beneficial from a 3Rs perspective, and the potential for fish embryo assays such as the FET to be used in place of studies conducted at later life stages has been extensively investigated. A version of the FET is already used in Germany to assess the fish acute toxicity of effluents (International Organization for Standardization 2007; Bundesministerium für Umwelt, Naturschutz und nukleare Sicherheit 2009; Lillicrap et al. 2016). For prospective chemical assessment there is an internationally validated method (OECD test guideline 236; Organisation for Economic Co-operation and Development 2011a, 2011b). Results have correlated well with standard in vivo fish acute LC50 data for most chemicals tested (Braunbeck et al. 2005; Lammer et al. 2009; Knöbel et al. 2012; Belanger et al. 2013; European Commission 2014). Belanger et al. (2013) indicated that the high correlation observed between the FET and traditional fish acute toxicity studies was similar to that seen for traditional LC50 values generated between different species of fish, further supporting the robustness of the approach. The broad regulatory applicability of this assay has, however, been questioned (European Chemicals Agency 2016; see next section).
Great investment has also been made in developing in vitro cytotoxicity assays. The rainbow trout RTgill-W1 cell line assay (International Organization for Standardization 2019; OECD test guideline development in progress) demonstrates a good correlation with published in vivo LC50 endpoints for 35 chemicals with a range of properties and modes of action (Tanneberger et al. 2013). It has been shown to be robust and reproducible following a large-scale international interlaboratory ring test (Fischer et al. 2019). The applicability domain of the assay has started to be characterized. For example, the acute toxicity of fragrance chemicals has been accurately demonstrated (Natsch et al. 2018), but the assay is known to be unsuitable for detecting the acute toxicity of certain neurotoxicants such as those which act directly via brain tissue. Like many in vitro test systems, it has limited metabolic capacity (Tanneberger et al. 2013), although efforts are being made more widely to increase its metabolic competence and utility (e.g., Luckert et al. 2017).
Replacement opportunity: Applying weight-of-evidence approaches
By nature, whole-organism tests incorporate biological complexity; and in reality, no single nonanimal or alternative method can be used in isolation to replace this. It is natural for registrants and assessors to continue to use the "tried and trusted" (and legally mandated) methods such as in vivo fish acute toxicity tests, with which they are most familiar and for which data interpretation is relatively straightforward. There is still a need to build confidence in the use and interpretation of data from alternative methods, despite them being validated to a high standard that is greater than that of many of the classical in vivo test guidelines.
Despite the availability of a comprehensive data set, FET data are still not considered by regulatory authorities as a direct alternative to standard fish acute toxicity data, with the European Chemicals Agency (2016) stating that "a lack of quality data makes it challenging to conclude on several aspects of the applicability domain." The FET and other alternative data will gain wider acceptance if data from various methods are considered in combination, as part of so-called weight-of-evidence approaches (defined in European Chemicals Agency 2017b; also see European Chemicals Agency 2017d; Organisation for Economic Co-operation and Development 2019b). Examples of the successful use of combination approaches have just started to emerge under REACH (Supplemental Data 2). The practical implementation of the FET within a combination approach has started to be explored (Lillicrap et al. 2020; HUGIN SWiFT n.d.). A current OECD project is also examining the potential to incorporate the FET into the threshold approach (discussed in the section Refinement and reduction opportunity: Reducing the number of test groups), as part of an integrated approach to testing and assessment (IATA) to minimize testing on juvenile or adult fish (unpublished data). Data from assays such as the RTgill-W1 cell line assay may also be incorporated into the IATA. Once published, the IATA approach would still need to be adopted within regulatory frameworks before it could be applied in practice. The key consideration before this happens is whether the combination approach can sufficiently protect the environment from the potential impacts of chemicals; to our knowledge, no attempts have been made to assess how well even traditional methods achieve this. For sectors such as the cosmetics industry, the option to fall back on in vivo study data no longer exists, at least in the assessment of human health effects.
In a hypothetical situation where fish acute toxicity tests were no longer an option across all sectors, what would the new data package look like? In practice, changes in safety-assessment practices will realistically need to be driven top-down by legislative and regulatory change.

Question 3: Can standard studies be reduced and refined?
In the shorter term, where standard studies remain absolutely necessary, key opportunity areas center around 1) reducing the number of studies conducted and the number of animals used within studies and 2) refining the tests to decrease the suffering experienced by the test animals.

Reduction opportunity: Harmonizing global test guidelines and regulatory requirements
Because most new substances/products are developed for global use, data packages are usually generated to meet the data requirements of all the regions and countries for which registration and marketing are intended (see Table 1). The most commonly used fish acute toxicity guideline for prospective assessments globally is OECD test guideline 203 (Figure 1). Under the Mutual Acceptance of Data (MAD) agreement, when an OECD test guideline is conducted in one adherent country under GLP, the data generated will be accepted in other countries adhering to MAD, with the aim of avoiding test duplication. Recent estimates predicted that MAD reduces the number of animals needed in testing new industrial chemicals by 32,702 (Organisation for Economic Co-operation and Development 2019c). Despite adhering to MAD, some geographical regions specify a preference for other standardized methods (e.g., US Environmental Protection Agency 2016a), which can have marked differences in study designs that translate to variations in the overall number of fish used (see Supplemental Data 3; n.b., part of the rationale for the last revision of test guideline 203 was to improve harmonization with the US guideline). There are some examples of non-OECD member countries adopting OECD test guidelines or having their own but virtually identical protocols which specify differences such as test species (Table 1). It can also be the case, such as in China (ChemicalWatch 2019), that testing must be conducted within the country of registration using local species. There is no guarantee that data generated in non-OECD member countries will be accepted in other regions.
Reducing the number of fish species used in PPP testing. The acute toxicity testing of PPPs is a prime example where differing regional requirements drive high fish use because 4 or more different species may be required for truly global registrations (Table 1). The test species required in different regions is guided by a range of considerations, including the intended application of the chemical; differing environmental exposure scenario(s) (for example, likely release into fresh or saltwater); commercially or ecologically important species in a given region; or special conservation measures for certain groups of endangered freshwater, estuarine, or marine species (see Table 1). It may be the case, however, that testing of the same substance in multiple fish species is not necessary if they all share the same key biological pathways resulting in adverse effects (Ankley et al. 2010; Groh and Tollefsen 2015). Understanding the similarities and differences in xenobiotic metabolism in different fish species is also important here (Kleinow et al. 1987; Nichols et al. 2007).
The key needs are to 1) better understand and/or 2) account for differences in species sensitivity. In reference to point 1, some investigations have been made to explore differences in sensitivity to acute effects in different fish species. Several papers suggest that rainbow trout (Oncorhynchus mykiss) is the most sensitive species across a diverse range of chemicals (Dyer et al. 1997, 2006; Lammer et al. 2009), and the authors question the need for separate testing of tropical fish species. An analysis conducted by the European Food Safety Authority (2006) also showed that rainbow trout was generally the most sensitive species (though the comparator species was not the same for each substance, and caution should be exercised in the interpretation of these data). The sensitivity of saltwater versus freshwater fish has also been compared (Hutchinson et al. 1998; Leung et al. 2001; Wheeler et al. 2002, 2014; Maltby et al. 2005), with this literature suggesting that saltwater and freshwater fish are similarly sensitive. In line with this there has been a rationalization of testing requirements in certain regions. For PPP active substances in Europe only a fish acute toxicity test conducted using rainbow trout is now required (European Commission 2013), where previously a warm-water species was also mandatory (European Commission 1991). The USEPA in collaboration with the National Toxicology Program's Interagency Center for the Evaluation of Alternative Toxicological Methods is currently conducting a retrospective analysis project to assess whether current US regulatory requirements for the testing of warm- and cold-freshwater plus estuarine/marine fish can be reduced to 2 or even one species (National Toxicology Program 2020). To point 2, part of the reason for applying "assessment factors" to the lowest (eco)toxicity value in the derivation of predicted-no-effect concentrations is to account for variability in species sensitivity.
Where species sensitivity differences are not anticipated, additional species testing should not be required because the degree of sensitivity difference should in theory fall within the variation accounted for by the assessment factor. Greater consideration of likely exposure scenarios can also reduce the number of species used for testing (as discussed in the section Reducing the number of fish species in PPP testing). This is currently an option in the United States; if, for example, a PPP is not going to be applied in regions adjacent to mangroves, it may be possible to waive testing in a saltwater species.
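The assessment-factor reasoning above can be made concrete with a small numerical sketch. The species LC50 values and the assessment factor of 100 below are hypothetical, chosen only for illustration, and the function names are not taken from any regulatory tool. Comparing the interspecies spread against the whole factor is also a simplification, because only part of an assessment factor is intended to cover species differences.

```python
def pnec(toxicity_values_mg_per_l, assessment_factor):
    """Predicted-no-effect concentration: lowest toxicity value / assessment factor."""
    if assessment_factor <= 0:
        raise ValueError("assessment factor must be positive")
    return min(toxicity_values_mg_per_l.values()) / assessment_factor


def sensitivity_spread(lc50_by_species):
    """Ratio of the highest to the lowest LC50 across species (always >= 1)."""
    values = list(lc50_by_species.values())
    return max(values) / min(values)


# Hypothetical 96-h LC50 values (mg/L) for one substance in three species.
lc50 = {"rainbow trout": 2.0, "fathead minnow": 5.0, "sheepshead minnow": 8.0}
ASSESSMENT_FACTOR = 100  # illustrative value only

spread = sensitivity_spread(lc50)
pnec_all_species = pnec(lc50, ASSESSMENT_FACTOR)
pnec_trout_only = pnec({"rainbow trout": lc50["rainbow trout"]}, ASSESSMENT_FACTOR)

print(f"Interspecies spread: {spread:.1f}-fold")
print(f"PNEC using all species: {pnec_all_species:.3f} mg/L")
print(f"PNEC using trout only:  {pnec_trout_only:.3f} mg/L")
print("Spread within assessment factor:", spread <= ASSESSMENT_FACTOR)
```

In this hypothetical case the most sensitive species alone gives the same predicted-no-effect concentration as the full three-species data set, and the 4-fold spread between species is far smaller than the factor applied.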
Clearly, individual regional provisions that reduce the number of fish species tested are likely to have a limited impact on the 3Rs. Global harmonization of vertebrate regulatory requirements is needed, but this will be a difficult and slow process because local legal and societal needs must be considered.
Reducing the need for formulation testing. For some substance types, likely environmental exposure scenarios mean that fish acute testing of finished/formulated products is not a standard requirement (Table 1)-for example, human medicines are most likely to enter the environment following patient use (German Advisory Council on the Environment 2007) having undergone significant modification from the administered formulated product. However, PPP active substances may be used in differing formulations with, for example, slight differences in the adjuvants/solvents used; and under some regional requirements each mixture must be assessed independently (Table 1). This often requires that in vivo data be generated for each formulation. This is despite evidence that tests conducted with the active substance could potentially predict the formulation toxicity (Schmuck et al. 1994) and that most formulations are not expected to exhibit more than additive toxicity compared with their constituent active substance components (Creton et al. 2014). Even where there are provisions to waive testing of "similar" formulations, in practice this can be challenging. For example, the European Union PPP regulation (European Commission 2009a) states that fish acute toxicity testing shall be performed where the acute toxicity of the PPP cannot be predicted on the basis of the data for the active substance and extrapolation on the basis of available data for a similar PPP is not possible. However, there is a lack of agreed guidance on how to robustly demonstrate that a preparation is sufficiently similar or that toxicity can be reliably predicted. There are already ongoing efforts to decrease the need for mammalian acute toxicity testing of formulations for human health assessments. 
This includes a program by the USEPA to evaluate the ability of the Globally Harmonised System of Classification and Labelling of Chemicals dose-additive mixtures equation to predict acute toxicity categories for PPP formulations/products (US Environmental Protection Agency 2017; note that for classification and labeling of industrial chemical mixtures in Europe, hazard classification is already calculated on the basis of the proportional ecotoxicity of components and their percentage concentrations). More sophisticated platforms are also under development for predicting the toxicity of PPP formulations (e.g., see National Centre for the Replacement, Refinement and Reduction of Animals in Research 2018). Similar approaches for fish acute toxicity data, particularly if the purpose of data generation is solely hazard identification and classification, may be viable.
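As an illustration of the calculation-based approaches mentioned above, the following minimal sketch assumes the concentration-addition form of the additivity formula used for aquatic hazard classification of mixtures, in which the summed component percentages divided by the mixture L(E)C50 equal the sum of each component's percentage divided by its own L(E)C50. The formulation composition and LC50 values are hypothetical, the function names are illustrative, and the category bands (1, 10, and 100 mg/L) are those of the Globally Harmonised System acute aquatic hazard scheme.

```python
def mixture_lc50(components):
    """Estimate a mixture LC50 (mg/L) by concentration addition.

    components: list of (percent_by_weight, lc50_mg_per_l) pairs for the
    ingredients that have acute fish toxicity data.
    """
    if not components:
        raise ValueError("at least one component is required")
    total_percent = sum(percent for percent, _ in components)
    toxic_fraction = sum(percent / lc50 for percent, lc50 in components)
    return total_percent / toxic_fraction


def ghs_acute_category(lc50_mg_per_l):
    """Assign a GHS acute aquatic hazard category from an LC50 in mg/L."""
    if lc50_mg_per_l <= 1:
        return "Acute 1"
    if lc50_mg_per_l <= 10:
        return "Acute 2"
    if lc50_mg_per_l <= 100:
        return "Acute 3"
    return "Not classified"


# Hypothetical formulation: 20% active substance (LC50 1 mg/L) and
# 80% co-formulant (LC50 100 mg/L).
formulation = [(20.0, 1.0), (80.0, 100.0)]
calculated = mixture_lc50(formulation)
category = ghs_acute_category(calculated)

print(f"Calculated formulation LC50: {calculated:.2f} mg/L ({category})")
```

A retrospective comparison of such calculated values with measured formulation LC50 values, where these already exist, would indicate how often the calculation assigns the same category as the in vivo test.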
Refinement and reduction opportunity: Reducing the number of test groups. Limit tests are performed with a single concentration of 100 mg/L (or limit of solubility) using 7 or more fish. In the absence of mortality, there is at least 99% confidence that the LC50 is greater than the tested concentration, and the substance is considered nontoxic to fish for most safety-assessment purposes (Organisation for Economic Co-operation and Development 2019a). Another way to reduce fish numbers is through the "threshold approach"-where an initial test is carried out in fish at one concentration selected based on the results of Daphnia and algae toxicity tests. The full concentration-response series is only triggered if mortality is observed at this threshold concentration. Retrospective data analysis has demonstrated that a reduction of between approximately 38 and 73% of fish use could be achieved using a threshold approach, depending on the sector (Hutchinson et al. 2003; Jeram et al. 2005; Creton et al. 2014). It can also be viewed as a refinement because high doses need not be tested so often. The approach works well for risk assessments which consider the aquatic compartment as a whole, where endpoints are combined and an assessment factor is applied to the lowest value from the different taxa. However, its application can be complicated for regulations that require risk assessments for individual taxa because varying assessment factors and refinements may be needed (Creton et al. 2014). It has been suggested that FET data could be used in concentration range-finding (Rufli and Springer 2011; Rawlings et al. 2019; Organisation for Economic Co-operation and Development 2019a), although the logistics of incorporating such testing strategies into commercial operations would be challenging (related to scheduling, analytical verification, etc., and the limited number of laboratories offering FET as a routine assay currently because of the lack of regulatory need and thus industry demand).
It is unclear how widely threshold approaches are currently being applied in practice.
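The limit-test confidence statement and the threshold-approach decision logic described above can be sketched as follows. The confidence calculation rests on a simple binomial argument: if the LC50 were at or below the tested concentration, each fish would have at least a 50% chance of dying, so the probability of seeing no deaths among n fish is at most 0.5 to the power n. Taking the threshold concentration as the lower of the Daphnia and algae EC50 values, capped at the 100 mg/L limit concentration, is an assumption made here for illustration; the function names and input values are likewise hypothetical.

```python
def limit_test_confidence(n_fish):
    """Confidence that the LC50 exceeds the tested concentration
    when none of n_fish die at that concentration."""
    if n_fish < 1:
        raise ValueError("at least one fish is required")
    return 1.0 - 0.5 ** n_fish


def min_fish_for_confidence(target):
    """Smallest group size giving at least the target confidence."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be between 0 and 1")
    n = 1
    while limit_test_confidence(n) < target:
        n += 1
    return n


def threshold_concentration(daphnia_ec50, algae_ec50, limit=100.0):
    """Assumed rule: lowest invertebrate/algal EC50, capped at the limit (mg/L)."""
    return min(daphnia_ec50, algae_ec50, limit)


def next_step(deaths_at_threshold):
    """Full concentration-response testing is triggered only by mortality."""
    if deaths_at_threshold > 0:
        return "run full concentration-response test"
    return "stop: LC50 is above the threshold concentration"


confidence = limit_test_confidence(7)
group_size = min_fish_for_confidence(0.99)
threshold = threshold_concentration(daphnia_ec50=3.2, algae_ec50=12.0)

print(f"Confidence with 7 fish and no deaths: {confidence:.4f}")
print(f"Smallest group size for 99% confidence: {group_size}")
print(f"Threshold concentration: {threshold} mg/L")
print(next_step(deaths_at_threshold=0))
```

With 7 fish and no mortality the confidence is 1 - 0.5^7, or about 99.2%, which is consistent with the "at least 99%" figure quoted for the limit test; with 6 fish it would fall to about 98.4%.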
Test substances often require the use of solvents to aid dissolution, and typically a solvent control group is included in the study in addition to the dilution water control. Following the recent OECD test guideline 203 revision, there is now the option to omit the dilution water control group when a solvent is used, decreasing the number of fish used by 7 per test (Organisation for Economic Co-operation and Development 2019a). The OECD has an ongoing project to determine the statistical power of test guideline 203 studies with solely a solvent control, and results indicate that there is no need to include a water control (unpublished data). It is not clear yet how widely such an approach will be accepted across different regulatory authorities/jurisdictions, particularly those which prefer non-OECD test guidelines. The recently updated OECD guidance document 23 on aqueous-phase aquatic toxicity testing of difficult test chemicals (Organisation for Economic Co-operation and Development 2019d) includes revisions to reduce occasions when solvents need to be used for poorly water-soluble substances. This includes better use of different generator systems for dosing, including passive dosing methods. Where solvent use is avoided, the solvent control group is not necessary. However, it should be acknowledged that there are situations where solvent-assisted test item delivery is required to achieve appropriate exposure and ensure that a valid test is performed (Green and Wheeler 2013). In such cases, it is better to have employed a solvent control and deliver a robust study that does not subsequently need repeating. The provision of improved technical guidance such as the updated guidance document 23 can also help to reduce the need to repeat studies by supporting the fulfillment of recommendations or validity criteria within test guidelines (Burden et al. 2017).
Refinement opportunity: Using humane/early endpoints in place of lethality. Early euthanasia of moribund fish is often practiced in Europe and Canada (with European Union Directive 2010/63/EU stating that death should be "substituted by more humane endpoints using clinical signs that determine the impending death" [European Commission 2010]). Humane endpoints are defined by the OECD as "the earliest indicator in an animal experiment of severe pain, severe distress, suffering, or impending death" (Organisation for Economic Co-operation and Development 2000). The routine, global use of "sublethal" humane endpoints in place of death would substantially reduce the degree of suffering experienced by test animals. It may also be considered more environmentally relevant and indicative of "ecological death"-in the wild these fish would not survive (e.g., because of predation). There is already a precedent in mammalian acute toxicity testing where "evident toxicity" is used as the endpoint in place of death (e.g., Organisation for Economic Co-operation and Development 2002; Sewell et al. 2015). "Evident toxicity" is defined as clear signs of toxicity without causing severe toxic effects or mortality, which predict that exposure to the next highest concentration will cause severe toxicity or death/moribundity in most animals. There is, however, no international consensus on which sublethal clinical signs define moribundity or are predictive of death in fish. Early termination of studies when fish are showing signs of "considerable suffering" is featured in other test guidelines, but these are designed to examine less severe endpoints, for example, OECD 240 (Organisation for Economic Co-operation and Development 2015). The 2019 revision to OECD test guideline 203 requires the recording of certain sublethal clinical signs, with the option to record further/more detailed signs. The aim of collecting these data is to enable the development of detailed guidance on how to identify when individual fish should be humanely terminated before the end of the test. A UK Department for the Environment, Food and Rural Affairs-sponsored expert workshop was held in early 2020 with the aim of identifying the knowledge gaps impeding the standardized recording and reporting of sublethal clinical signs of toxicity in fish and ultimately moving away from using mortality as the endpoint in OECD test guideline 203. It is critical that the signs identified are genuinely predictive of death because the use of humane endpoints could affect experimental endpoint outcome by lowering LC50 values. Lower LC50 values could trigger unnecessary higher-tier testing within the same risk assessment framework and lead to the use of more animals.

Table 3 (excerpt)
Places a relatively high financial and resource burden on industry and requires cross-industry alignment in the approaches applied. Regulatory agencies would need to be open to and encouraging of such an approach and enter into formal agreements to retrospectively assess and publish scenarios where the nonstandard approaches are acceptable.
Increase proficiency/capacity to conduct nonstandard approaches in contract laboratories.
Currently challenging because of lack of regulatory need/industry demand. May be incentivized by provision of safe haven approaches.
Collect case studies on registered chemicals where alternative information types have been generated/used successfully. This could involve a "blinded" comparison of regulatory decisions/outcomes based on traditional vs alternative data.
This would also support the definition of the applicability domain of alternative approaches (based, e.g., on chemical class or mode of action).
Support regulatory bodies in interpretation of integrated approaches to testing and assessment-style assessments for the fish acute toxicity endpoint, drawing on the new tools available.
This would be aided by the provision of relevant and clear guidance.

Improving global harmonization of data requirements
To definitively determine whether 1) multiple species in PPP and biocide testing and 2) formulation in addition to active substance testing is needed and, where appropriate, align global data requirements.
Conduct retrospective data analysis to determine the difference in interspecies sensitivity on a range of compounds encompassing a broad range of modes of action that have been tested on a range of fish species. This should allow conclusions to be made on whether there is a generally sensitive fish species on which the application of assessment factors would provide suitable levels of protection.
Such an analysis should, if possible, also consider whether there are any significant differences in marine, brackish, freshwater, cold-water, or warm-water species and whether there is a difference between classes or subclasses of fish. Publicly available databases may be drawn from, such as the EnviroTox database and the OECD's eChem portal.
Conduct retrospective data analysis comparing the acute toxicity of a range of different formulation types with the same active substance tested in the same species, to determine whether there are any patterns and if any general rules can be established. Further exploration of the use of reliable prediction methods (e.g., machine learning techniques), particularly for hazard identification and classification where substances are assigned into broad categories.
Calculated formulation toxicity could be compared with existing in vivo formulation testing data to validate the calculation approach. Guidance would then be prepared to advise on when a preparation is sufficiently similar to the active substance, negating the need for new testing. Prediction approaches may also be applicable to assess the toxicity of metabolites/degradation products, particularly for PPPs. This has started to be explored using QSAR models (see Question 2: Can standard fish acute toxicity studies be replaced?).

Reducing/refining animal use in mandatory in vivo studies
To increase the uptake of the threshold approach and use of humane endpoints in place of lethality.
Identify the extent of and barriers to uptake of the threshold approach within industry and regulatory bodies, and devise efforts to overcome these. Establish who will be responsible for collating sublethal clinical signs data with the cooperation of industry (as owners of the data). Conduct data analysis (retrospectively and through prospective data collection) to determine which sublethal signs are relevant and predictive and could be used as the modified endpoint.
These aspects were the subject of discussion at the 2020 UK-Defra-funded workshop (manuscript in preparation). This will require consistency between laboratories regarding identification and recording of signs within individual fish, which may not be possible retrospectively. Critical that there is global agreement on the standard use and definition of humane endpoints so as not to undermine the mutual acceptance of data agreement if some countries continue to require the lethality endpoint (note the authors have not experienced this to be the case to date).
Defra = Department for Environment, Food and Rural Affairs; OECD = Organisation for Economic Co-operation and Development; PPP = plant protection product; QSAR = quantitative structure-activity relationship.

HARNESSING THE OPPORTUNITIES
The present exploration of the current situation has identified opportunities for further work to reduce uncertainty in key areas and genuinely support greater application of the 3Rs. This will only occur if strong scientific evidence is provided which demonstrates that protection goals will not be compromised. Table 3 summarizes examples of how these opportunities could be harnessed, under the themes of accelerating the acceptance of alternative approaches, improving global harmonization of data requirements, and reducing/refining within mandatory in vivo studies.

IN SUMMARY
There is undoubtedly value in the different industry and regulatory sectors coming together in this unique way to share experiences and explore how approaches applied in one sector may be applicable to another. A key example includes considering the wider relevance and application of exposure-driven approaches. For some sectors, such as the cosmetics industry, generating in vivo data to assess human health effects is already no longer a legal option in many regions, and absolute replacements must be utilized to address safety concerns. Global harmonization of requirements is paramount in maximizing the 3Rs impact of regulatory change. Pertinent questions remain: 1) Can alternative approaches be used in existing or modified regulatory hazard and/or risk assessment schemes to achieve at least the same level of protection compared with current practices (and if the appropriate level of protection is not afforded, what would be required to ensure that it is)? 2) What is actually needed to address perceived data gaps? There are further scientific reasons for reconsidering the current paradigm, to meet the challenges that changing environmental landscapes and technologies will raise. Ultimately, change in practice needs to be driven by a "top-down" approach which compels the community to think differently-in line with the recent commitment by the USEPA to eliminate mammalian safety testing by 2035. It is the perfect time to reflect on the current situation and consider how to make best use of the new techniques and approaches available, while still ensuring that testing is science-driven and protective-to genuinely reduce the number of fish experiencing severe suffering in the quest to assess chemical-induced acute toxicity.
Supplemental Data-The Supplemental Data are available on the Wiley Online Library at https://doi.org/10.1002/etc.4824.
Acknowledgment-We thank N. Gellatly and F. Sewell (NC3Rs) for their input into the preparation of the manuscript. We also thank P. Xirogiannopoulou (Syngenta) for her assistance in devising Supplemental Data 3.
Disclaimer-The views and statements expressed in the present study are those of the authors alone. The views or statements expressed in this publication do not necessarily represent the views of the organizations to which the authors are affiliated, and those organizations cannot accept any responsibility for such views or statements.
Data Availability Statement-Data, associated metadata, and calculation tools are available from the corresponding author (natalie.burden@nc3rs.org.uk).