Big Question to Developing Solutions: A Decade of Progress in the Development of Aquatic New Approach Methodologies from 2012 to 2022

In 2012, 20 key questions related to hazard and exposure assessment and environmental and health risks of pharmaceuticals and personal care products in the natural environment were identified. A decade later, this article examines the current level of knowledge around one of the lowest‐ranking questions at that time, number 19: “Can nonanimal testing methods be developed that will provide equivalent or better hazard data compared with current in vivo methods?” The inclusion of alternative methods that replace, reduce, or refine animal testing within the regulatory context of risk and hazard assessment of chemicals generally faces many hurdles, although this varies both by organism (human‐centric vs. other), sector, and geographical region or country. Focusing on the past 10 years, only works that might reasonably be considered to contribute to advancements in the field of aquatic environmental risk assessment are highlighted. Particular attention is paid to methods of contemporary interest and importance, representing progress in (1) the development of methods which provide equivalent or better data compared with current in vivo methods such as bioaccumulation, (2) weight of evidence, or (3) ‐omic‐based applications. Evolution and convergence of these risk assessment areas offer the basis for fundamental frameshifts in how data are collated and used for the protection of taxa across the breadth of the aquatic environment. Looking to the future, we are at a tipping point, with a need for a global and inclusive approach to establish consensus. Bringing together these methods (both new and old) for regulatory assessment and decision‐making will require a concerted effort and orchestration. Environ Toxicol Chem 2024;43:559–574. © 2023 The Authors. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC.


Increased awareness for sustainability needs
In 2012, 20 key questions related to hazard and exposure assessment and environmental and health risks of pharmaceuticals and personal care products in the natural environment were identified (Boxall et al., 2012). A decade later, the present study examines the current level of knowledge around one of the lower-ranking questions at that time, number 19 (out of 20): "Can nonanimal testing methods be developed that will provide equivalent or better hazard data compared with current in vivo methods?" The increase in production and diversification of synthetic chemicals poses a global challenge because of complex human and environmental exposure scenarios, coupled with a lack of toxicity data for the majority of chemicals in the environment. Although the question ranked low at the time, today the need for nonanimal-based methods is widely recognized as essential for all three pillars of sustainability (economic, environmental, and social). Nonanimal methods offer potential economic and ethical opportunities for "greener" chemicals and new assessment tools with high-throughput testing services, allowing for a safer environment with broad societal support. This promise has heralded shifts at the top levels of government and industry to reduce reliance on in vivo animal testing for risk and safety assessment purposes, as reflected in the increasing number of groups supporting and working on nonanimal-based methods (Supporting Information, Table S1).

Human and environmental risk assessment share similar challenges
Early on, efforts to reduce reliance on in vivo animal testing were largely focused on developing new methods to replace individual tests or at least to find ways to refine or reduce the number of animals used in a laboratory study. However, in the past 10 years the terminology of nonanimal testing methods has evolved to new approach methods, reflecting the idea that nonanimal methods are new approaches providing new types of data that usually need to be used within a structured assessment process based on data from multiple methods, and which need new concepts for risk assessment. The abbreviation "NAMs" now integrates both "nonanimal methods" and "new approach methods" for a multimethod-based approach, and is used interchangeably for both. Chemical risk assessment and management was established as a scientific field over 40 years ago, with principles and methods developed on how to conceptualize, assess, and manage risk. Historically, human risk assessment (HRA) and environmental risk assessment (ERA) developed independently within specific regulatory chemical sectors (chemicals, biocides, pharmaceuticals, etc.), resulting in the use of different terminology, separate databases, and varying regional requirements. Traditionally, HRA includes an assessment of possible exposure pathways, kinetics including the potential for bioaccumulation within the organism, sensitive organs, modes of action, and no-effect levels. Conceptually similar, ERA encompasses an assessment of exposure pathways and the fate (i.e., kinetics) of a substance within the environment (surface water, sewage treatment plants, soil, sediment, and groundwater), including its persistence and bioaccumulation in addition to its impact on numerous organisms. However, unlike HRA, toxicity to multiple organisms (aquatic, terrestrial, and microorganisms) is examined, and no-effect concentrations in the more sensitive organisms are identified. Both HRA and ERA require extrapolation of results from one or a few experimental species within their specific artificial environments to define protection levels for other relevant organisms, be it humans or various environmental organisms. Notably, HRA aims to provide specific information on a multitude of human organ systems, modes of action, and their interaction for humans with their variable genetics and lifestyle, whereas ERA aims to provide specific information on a multitude of organisms and their interactions within their variable environment (Figure 1). Extrapolation can be carried out pragmatically using standard assessment factors or adaptations thereof or, more scientifically, using probabilistic data-based extrapolation models. There is also growing recognition that comprehensive and reliable identification of similarities and differences between organisms will enhance cross-species extrapolation of potential adverse toxicological effects (Rivetti et al., 2020), with international consortiums established to start to address the challenges in extrapolating knowledge across classes (e.g., International Consortium for Advancing Cross-Species Extrapolation in Regulation).
The use of animals in HRA and ERA is still increasing, despite growing awareness of the need to reduce reliance on animal testing, in line with the "reduction" and "replacement" aspects of the 3Rs principles (with the third R being refinement, centered on minimizing the pain, suffering, distress, or lasting harm that research animals might experience) first proposed over 60 years ago. In Europe, the total number of animals used in testing under the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation (European Chemicals Agency, n.d.) has essentially doubled in 4 years, from 1.1 million as reported in 2016 to 2.4 million animals (European Chemicals Agency, 2020). This trend is likely to continue for REACH given the imminent addition of testing requirements for endocrine disruptor identification, and similar changes expected in other regions may further increase animal use.
A major challenge in terms of the use of laboratory animals arises from the need to evaluate bioaccumulation potential, which may lead to chronic outcomes not necessarily revealed in current regulatory toxicity testing. Bioaccumulation generally relies on determining the bioconcentration factor (BCF) as the sole decisive metric, with the recognition that slow metabolism can result in potentially higher bioaccumulation, with implications for both the environment and human health. It is important to highlight that there are more complex metrics which can also be calculated including the BCF (aqueous exposure routes), the biomagnification factor (BMF; dietary exposure route), the bioaccumulation factor (all possible exposure routes), and trophic magnification factors (TMFs) derived from mesocosm studies. While standard test protocols are available for the determination of BCFs and BMFs under well-defined laboratory conditions, a BCF is typically required in risk assessment to estimate concentrations in prey for the investigation of risks from secondary poisoning.
FIGURE 1: Outline of conceptual similarities between human health risk assessment (HRA) and ecological risk assessment (ERA): Risk assessment requires tasks which are conceptually similar for HRA and ERA. Current regulatory approaches are based on a battery of animal tests for HRA and ERA, which are slowly being replaced by new approach methodologies (NAMs). Challenges for extrapolation from the testing models to reality relate for HRA to the level of detail for the multitude of modes of action, organs, and interactions (all used for the globally harmonized system of classification and labeling of chemicals), as well as human variability. For ERA, target organs of environmental organisms are not of immediate interest for regulators, with the extrapolation challenges relating to the identification of an overall low-/no-observed-effect concentration for the multitude of organisms, populations, their interactions, and environmental variability. Importantly, mode of action* information in ERA currently is critical only for the identification of endocrine disruption. Despite these conceptual similarities, availability of NAMs and related guidance is much more limited for ERA compared with HRA. OECD = Organisation for Economic Co-operation and Development.
For example, the current European Medicines Agency (EMA) guideline for environmental assessment of human medicinal products requires a fish bioconcentration test (Organisation for Economic Co-operation and Development [OECD], 2012) in Phase I for persistence, bioaccumulation, and toxicity screening of drug substances with a log octanol-water partition coefficient (KOW) > 4.5 and in Phase II for those with a log KOW > 3. Although this guideline is currently under revision, the log KOW criteria in the version currently in effect mean that an animal study is required. Although the test can use hundreds of fish per chemical, efforts to reduce the number of organisms are reflected in the current guidelines. Specifically, the option to use a "minimized" test requires fewer organisms to estimate the kinetic BCF using fewer sampling time points, provided that uptake and depuration are expected to follow first-order kinetics. In addition, a single test concentration may be used in the full or minimized test design when it is likely that the BCF is independent of the test concentration (Burden et al., 2017).
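The first-order kinetics assumption behind the kinetic BCF can be illustrated with a minimal sketch. It assumes the standard one-compartment model, in which the concentration in fish rises as C_fish(t) = (k1/k2) · C_water · (1 − exp(−k2·t)) during uptake and the kinetic BCF is the ratio k1/k2. The function names and all numerical values are hypothetical and chosen for illustration only; they are not taken from the test guideline or from any data set cited here.

```python
import math


def kinetic_bcf(k1: float, k2: float) -> float:
    """Kinetic BCF (L/kg) as the ratio of the uptake rate constant k1 (L/kg/day)
    to the depuration rate constant k2 (1/day), assuming first-order kinetics."""
    if k1 < 0 or k2 <= 0:
        raise ValueError("k1 must be non-negative and k2 must be positive")
    return k1 / k2


def fish_concentration(t_days: float, c_water: float, k1: float, k2: float) -> float:
    """Concentration in fish (mg/kg) after t_days of exposure to a constant
    water concentration c_water (mg/L), one-compartment first-order model."""
    return kinetic_bcf(k1, k2) * c_water * (1.0 - math.exp(-k2 * t_days))


def time_to_fraction_of_steady_state(k2: float, fraction: float = 0.95) -> float:
    """Exposure time (days) needed to reach a given fraction of steady state."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -math.log(1.0 - fraction) / k2


# Hypothetical rate constants, for illustration only.
K1_EXAMPLE = 400.0  # L/kg/day
K2_EXAMPLE = 0.2    # 1/day

if __name__ == "__main__":
    print("kinetic BCF (L/kg):", kinetic_bcf(K1_EXAMPLE, K2_EXAMPLE))
    print("days to 95% of steady state:", time_to_fraction_of_steady_state(K2_EXAMPLE))
```

Because the BCF in this model depends only on the two rate constants, it is independent of the water concentration, which is the condition under which a single test concentration can be justified.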

Integration of NAMs in HRA and ERA, at disparate pace
Significant efforts to reduce the use of animals via technological, computational, and scientific advances have given rise to NAMs or nonanimal methods, both of which have been used interchangeably in the literature and specifically reference any nonanimal technology, methodology, approach, or combination thereof that can be used to provide information on chemical hazard and risk assessment that avoids the use of intact animals (US Environmental Protection Agency [USEPA], 2022). NAMs include various predictive in silico methods and models (e.g., quantitative structure-activity relationships [QSARs], physiologically based toxicokinetic modeling [PBTK]) and in vitro testing (e.g., cell-based, cell-free, biochemical assays). In addition, embryo testing is considered to represent a NAM (e.g., whole-animal exposure prior to independent feeding such as the fish embryo toxicity [FET] test). Assays such as FET use organisms in the eleutheroembryonic stage that are not yet capable of independent feeding. It has been considered that at this stage the embryos are not capable of experiencing pain, distress, suffering, or lasting harm (Strähle et al., 2012); and the assays are considerably shorter in duration than traditional test guideline amphibian and fish assays. Furthermore, NAMs may also include a variety of state-of-the-art methods, such as "high-throughput screening" and "high-content methods," as well as some of the more conventional methods that aim to improve understanding of toxic effects using toxicokinetic-toxicodynamic (TK-TD) knowledge. Further information can be found in the Supporting Information. To provide information on chemical hazard and risk assessment that avoids the use of intact animals, the OECD has standardized and internationally approved test guidelines for several NAMs for HRA used to evaluate dermal absorption, dermal irritation, eye irritation/corrosion, dermal sensitization potential, and genotoxicity. Further, HRA NAM validation for carcinogenicity and developmental neurotoxicity is in progress at the OECD level. In contrast, OECD standardized ERA NAMs are available for acute aquatic toxicity, aquatic bioconcentration/clearance, and some tests for endocrine mechanisms.
Originally suggested as an alternative to animal studies, NAMs may also be used as a complement to animal testing, increasing our understanding of internal concentrations of compounds and how they relate to mechanisms of toxicity, as well as answering scientific questions which cannot be well addressed by current in vivo regulatory testing. A series of reviews (European Commission, n.d.) highlight the nonanimal models that are being used for basic and applied biomedical research such as on neurodegenerative diseases and immune oncology. Indeed, several of the NAMs established for HRA or biomedical research are also relevant for ERA, although they have not been applied in that context yet. While progress toward change in the field of HRA draws on numerous articles, experiences, and recommendations resulting in an explosion of scientific and policy initiatives, a similar level of engagement and change has so far not been observed for ERA (Figure 1).

The (need for) actions toward the use of NAMs
Catalyzed by the announcement of the USEPA (2019) to reduce animal testing and its funding by 30% by 2025 and eliminate both by 2035, numerous new USEPA policies and new guidance for the use of NAMs have been issued for HRA. Likewise, directly European Union-funded research projects, together with research partnerships, have facilitated the development and use of NAMs alongside the establishment of new assessment frameworks for regulatory toxicology (e.g., EU-ToxRisk [European Union, n.d.], ASPIS Cluster [n.d.], the Partnership for the Assessment of Risk from Chemicals).
Some countries are outpacing others, at least in HRA, and appear to be acting as global catalysts for change. Ongoing discussions with relevant stakeholders across the globe have resulted in some movement in terms of adoption of some NAMs, often reinforced by changes to legislation, examples of which are summarized in Supporting Information, Table S1, with further examples for specific working groups and associated legislations also briefly outlined. Yet further action within numerous individual overlapping sectors but especially at the governmental level will be key to ERA-specific change, with some suggestions outlined in Figure 2. These actions by various sectors create the necessary change to support the adoption of NAMs in aquatic ERA and specifically the priority research questions outlined (Textbox 1).
The increasing availability of NAMs, combined with political change, has now created an opportunity to redefine how we carry out risk assessment, providing a rare opportunity to enhance knowledge of associated hazard and exposure. Although the technological and methodological landscape has evolved rapidly in support of these changes, regulatory acceptance of alternatives to in vivo testing methods, and capacity and training in them, have not kept pace, with various reasons cited (Mondou et al., 2020). One important reason is that the regulatory use of NAMs needs the consensus of hundreds of experts and stakeholders, which is far beyond any usual scientific review process. However, even in the consumer products sector, where animal testing has been phased out in some global regions by law, conflicting requirements mean that traditional toxicity tests continue to be conducted in addition to NAMs (Fentem et al., 2021). More optimistically though, there are recent examples of regulatory agencies actively encouraging discussion with registrants regarding the use and submission of data from innovative technologies including NAMs (e.g., EMA Innovation Task Force [EMA, n.d.]). Further action by regulatory agencies appears essential for the evolution toward a NAM-based regulation. Moreover, for success and a truly sustainable transformation, parallel initiatives by academic and industry actors and collaboration between all sectors are necessary, and we indicate some ideas toward these goals in Figure 2.
The aim of the present study is to focus on the status quo and potential evolutions in the use of NAMs in ERA, in particular for fish and other aquatic organisms. A nonexhaustive list of articles of interest which benefited developments in ERA is supplied in the Supporting Information. Focusing on the past 10 years, only works that might reasonably be considered to contribute to advancements in the field or methods of particular contemporary interest and importance are highlighted.

CURRENT STATE-OF-THE-SCIENCE FOR AQUATIC SPECIES
Environmental risk assessment of chemicals is largely based on aquatic ecotoxicology and faces several scientific challenges including the large number of species that are potentially affected in addition to the large number of chemicals emitted into the environment, various life stages, potentially chronic exposures, and the need to assess impacts at a population level, with a vast array of abiotic and biotic modifiers (Textbox 1). Several of the aforementioned variables can be modeled using interspecies correlation estimates (Raimondo et al., 2015), species sensitivity distributions, chemical toxicity distributions, and the ecological threshold of concern (EcoTTC). Building on available experimental data, they can in principle be used with input data from in vivo methods or NAMs, allowing the prediction of acute fish toxicity, for example. Further work is refining the available knowledge as to what extent plants or invertebrates are more sensitive than fish for the evaluation of acute toxicity of many compounds (e.g., Rawlings et al., 2019) because this could provide better returns for the protection of ecosystems, while also benefiting efforts to reduce animal testing. Moreover, scientifically improved predictions for environmentally safe concentrations may be generated by large multidisciplinary studies which incorporate both the development and use of NAMs that provide mechanistic data as well as the compilation of systematic knowledge about evolutionary conservation of (eco)toxicological mechanisms among species. Several online resources for comparative and predictive toxicology supporting this goal are available.
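As an illustration of how a species sensitivity distribution can be turned into a protective concentration, the following minimal sketch fits a log-normal distribution to a set of species-level toxicity values and derives the concentration expected to be hazardous to 5% of species (HC5). The log-normal form, the function name, and the input values are assumptions made for illustration; regulatory SSD practice involves additional steps (data quality checks, goodness of fit, confidence limits) that are not shown.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Sequence


def hazardous_concentration(toxicity_mg_per_l: Sequence[float], p: float = 0.05) -> float:
    """Concentration (mg/L) hazardous to a fraction p of species, assuming the
    log10-transformed toxicity values follow a normal distribution."""
    if len(toxicity_mg_per_l) < 2:
        raise ValueError("at least two species values are needed")
    if any(v <= 0 for v in toxicity_mg_per_l):
        raise ValueError("toxicity values must be positive")
    logs = [math.log10(v) for v in toxicity_mg_per_l]
    mu, sigma = mean(logs), stdev(logs)   # sample mean and standard deviation
    z = NormalDist().inv_cdf(p)           # standard normal quantile for p
    return 10 ** (mu + z * sigma)


# Hypothetical median effect concentrations for several taxa (mg/L).
EXAMPLE_VALUES = [0.8, 2.5, 4.1, 12.0, 30.0, 55.0]

if __name__ == "__main__":
    print("HC5 (mg/L):", hazardous_concentration(EXAMPLE_VALUES))
```

The input values can in principle come from in vivo tests, NAMs, or a mixture of both, which is what makes this type of model relevant to animal test reduction.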
Yet, some assessments (e.g., bioaccumulation) are being carried out in fish, not just as an important component of ERA for ecosystems but also for human health protection (i.e., ingestion of contaminated fish via the diet). Thus, the fish as a model organism is pragmatically being employed in various fields including toxicology, pharmacology, and etiology of human disorders. Because of its small size, rapid growth, and freely accessible embryonic stages, it may provide some practical advantages compared with mammalian species. Based on its evolutionary relationship with mammals, it may also offer opportunities for molecular mechanistic studies and cross-species extrapolation. Furthermore, if testing is limited to the embryonic stages, then a higher throughput and a reduction in laboratory animal usage are achievable, while still protecting the environment. However, to be fully implemented, more fundamental knowledge concerning mechanisms of toxicological outcomes in nonmodel or target organisms needs to be generated. Nevertheless, and despite the slow pace, there have been some recent achievements (not exhaustive) in aquatic nonanimal alternatives, in terms of regulatory accepted methods but also emerging technologies which offer glimpses of a nonanimal-based framework for aquatic ERA (see below).

Aquatic embryo testing and weight-of-evidence assessment
Assessment of acute fish toxicity is an integral part of environmental hazard and risk assessment regulations and is classically carried out using the acute fish toxicity test, which is conducted according to OECD test guideline 203 (OECD, 2019a) or similar guidelines, although other ecotoxicological endpoints are also used depending on specific need. The acute fish toxicity test is the most frequently used vertebrate ecotoxicology assay because it is required in nearly all global regulatory schemes for the purposes of risk assessment in addition to classification and labeling of chemicals (Burden et al., 2020). Although this test has a number of recognized limitations, such as being low-throughput, lacking mechanistic information, and carrying significant reported uncertainties, in addition to the severe suffering involved because of the nature of the test, there is currently a lack of consensus by regulators on an alternative approach. However, studies are emerging which demonstrate that many of these issues can be addressed using alternative methods (see Paparella et al., 2021).
Currently, two experimental methods are standardized and approved as OECD test guidelines as alternatives to the in vivo fish toxicity test: the fish gill cell line acute toxicity test using the rainbow trout (Oncorhynchus mykiss) RTgill-W1 cell line (OECD, 2021a) and the fish embryo acute toxicity test (OECD, 2013), although so far their use has been limited because of a continued preference for traditionally accepted approaches and perceived difficulties in interpreting and combining new data types. While the FET test has not been accepted as a standalone replacement for regulatory purposes such as under the REACH regime (Sobanska et al., 2018), it can provide significantly more information about test compounds than originally envisioned during the guideline development (see von Hellfeld et al., 2022). In part because of the versatility of the protocol, it has been beneficial to the development of numerous decision-making tools, some of which will be discussed in later sections. Furthermore, other alternative regulatory assessment assays are emerging, with the OECD recently releasing test guidelines utilizing transgenic Xenopus laevis, Danio rerio, and Oryzias latipes embryos for evaluation of potential endocrine activity (OECD, 2019b, 2021b, 2022). Further guidelines are in draft form at various levels using Oryzias latipes (rapid estrogen activity in vivo assay) and Daphnia magna (short-term juvenile hormone activity screening assay). Notably, the latter tests currently cannot contribute to any reduction or replacement of adult animal tests that definitively assess endocrine disruption because they are considered to inform on endocrine mode of action only and not on potential adversity as required to meet the current definition of an endocrine disruptor (International Programme on Chemical Safety, 2002). However, guidance outlining the specific conditions for the use of the Xenopus eleutheroembryonic thyroid assay as a mechanistic assay to detect thyroid active substances as an alternative to the in vivo amphibian metamorphosis assay (OECD, 2018a) for plant protection products has recently been published by the European Chemicals Agency and the European Food Safety Authority (Andersson et al., 2018). For a broader up-to-date review of the current and potential use of NAMs in the assessment of endocrine activity and disruption, please refer to Mitchell et al. (2023).
TEXTBOX 1: Priority research and implementation questions for the next 5-10 years to support adoption of new approach methodologies in aquatic environmental risk assessment. The order of the questions does not indicate their relative priority.
1) What is necessary to evolve the discussion and mutual understanding between developers and end-users (industry/regulatory agencies) on when and how to further methods/approaches as new science or regulatory change emerges?
2) How can potential divergence of environmental no-observed-effect concentrations be better assessed? Can variability between taxa and variability attributable to multiple environmental modifiers, like chemical mixture effects, abiotic stressors including climate change, and biotic stressors like variable food-webs, be incorporated?
3) Increase fundamental research on the following:
   o Toxicokinetic-toxicodynamic divergence in sensitivity between a wider selection of organisms
   o Kinetic in vitro-in vivo extrapolation models which protect numerous organisms' population demographics
   o Better use/reuse and interoperability of -omics-based risk assessment toxicology data.
4) (How) Can knowledge about the uncertainties of traditional tests for environmental protection be better used to define benchmark criteria for the scientific and regulatory acceptance of new approach methodologies data?
Weight-of-evidence (WoE) assessment is frequently cited as necessary for a wide variety of decision-making needs because of the complexity of environmental data (Hall et al., 2017). It is generally understood as a method for decision-making which relies on multiple sources of information and lines of evidence and is expected to be the game changer for the regulatory acceptance of NAMs for acute fish toxicity. One such example can be found in the European Chemical Industry Council's Long-Range Research Initiative ECO51 project SwiFT (HUGIN, 2020), which has developed a comprehensive online toxicity assessment system with built-in examples to facilitate the acceptance of NAM data to routinely fill the regulatory requirements currently provided by the acute fish toxicity test. This Bayesian model integrates FET data with numerous lines of evidence including toxicity data from algae, daphnids, and the RTgill-W1 cell line (OECD, 2021a) and information on fish neurotoxicity and biotransformation in addition to QSARs and diverse ecotoxicological and physicochemical data sets (Moe et al., 2020). Combined, there is over 87% agreement with acute fish toxicity test outcomes (when the aim is to predict if the median lethal concentration is above or below 1 mg/L), which, considering the uncertainties of acute fish toxicity test data, represents a practically perfect correlation. Although this project demonstrated how a quantitative WoE approach can successfully lead to animal test replacement for the acute fish toxicity test, it has become increasingly clear that the integration of numerous lines of evidence will also be important in progressing the development and acceptance of nonanimal methods in ERA, beyond this specific endpoint.
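The agreement figure quoted above is a binary concordance: each chemical is placed above or below a 1 mg/L median lethal concentration (LC50) by both the model and the in vivo test, and the fraction of matching classifications is reported. A minimal sketch of that calculation follows. It does not reproduce the Bayesian model itself; the function names, the handling of values exactly at the threshold, and the example values are assumptions for illustration only.

```python
from typing import Sequence


def is_at_or_below(lc50_mg_per_l: float, threshold: float = 1.0) -> bool:
    """Binary class: True if the LC50 is at or below the threshold.
    Treating equality as 'below' is an assumption of this sketch."""
    return lc50_mg_per_l <= threshold


def class_agreement(predicted: Sequence[float], observed: Sequence[float],
                    threshold: float = 1.0) -> float:
    """Fraction of chemicals for which predicted and observed LC50 values
    fall on the same side of the threshold."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        is_at_or_below(p, threshold) == is_at_or_below(o, threshold)
        for p, o in zip(predicted, observed)
    )
    return matches / len(predicted)


# Hypothetical LC50 values (mg/L) for four chemicals.
PREDICTED = [0.5, 2.0, 0.8, 15.0]
OBSERVED = [0.3, 0.9, 0.6, 20.0]

if __name__ == "__main__":
    print("agreement:", class_agreement(PREDICTED, OBSERVED))
```

A concordance of this kind is best read against the repeatability of the in vivo test itself, since two independent runs of the animal test would not agree perfectly either.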

Omics technologies, complex data, and computational approaches
Emerging and existing technologies have a significant impact on toxicological investigations and regulatory science alike. Increased access to mechanistic information via -omic approaches can directly inform adverse outcome pathway (AOP) frameworks, assisting in identification of mode of action. Perspectives on how high-content -omic data sets can support ERA through the AOP framework were recently summarized (Brockmeier et al., 2017). Technical guidance for the use of AOPs in developing integrated approaches to testing and assessment (IATA) was harmonized at the OECD level (OECD, n.d.a). However, although these methods (e.g., proteomics, lipidomics, metabolomics, and transcriptomics) have been reviewed in numerous contexts including in the development of AOPs, chemical risk assessment, and the prospects and challenges of multi-omics data integration in toxicological research, these technologies have had limited acceptance for regulatory purposes (Textbox 1; Viant et al., 2019). Their absence in decision-making has been attributed to the lack of best practice, standardization, and reporting guidance, all of which build confidence in methodological results. Different multistakeholder groups, including the OECD and specific advisory groups (OECD, n.d.b), have collaborated to improve the adoption of these approaches in ERA through the development of guidance documents and frameworks (Harrill et al., 2021; Viant et al., 2019). But without a clear strategy to evaluate emerging technologies which is both rapid and appropriate, their full potential will remain largely unrecognized and unused (Anklam et al., 2022). Yet, notable change is emerging. In April 2022, the first transcriptomics-based NAM, the Genomic Allergen Rapid Detection Test Method for Skin Sensitization (GARD™skin), was approved at the OECD level, representing a breakthrough for regulatory acceptance of such technology. Furthermore, this sets a new standard for new OECD test guideline development, which will be beneficial to the whole field.
In the adoption of -omic technologies, an enormous volume of complex data will be generated, which, when combined with experimental databases, should provide sufficient data to enable in silico modeling of the ecotoxicity of existing and new chemicals. In parallel, this may also increase confidence in -omics data and computational approaches via their mutual support. Although numerous examples of developed and validated computational approaches for hazard assessment are available from academia, their application outside this sector is limited (see Luechtefeld, Marsh, et al., 2018), with limited regulatory acceptance of in silico modeling. To facilitate the adoption of such tools, the OECD has developed the QSAR Toolbox to improve regulatory acceptance for computational approaches, which can be used for both the prediction of simple toxicity and fate properties and for more complex endpoints such as reproductive or repeated dose toxicity. Especially for the latter, the combined use of (Q)SARs (see Burden et al., 2016) and experimental in vitro data may be useful to support an expert-based chemical category formation and read-across of traditional in vivo data between chemicals within the same chemical category. Other (Q)SAR software tools, like the free VEGA platform, include the use of read-across via a user-friendly interface which enhances transparency for the uncertainty of the proposed model versus experimental data for the structurally nearest neighbor. Furthermore, advances in technology are enabling complex algorithms in the form of machine learning and artificial intelligence to be applied in this context, underpinned by curated compound libraries (e.g., Tox21) specifically designed for the purpose of gaining better understanding of the chemical basis of toxicology (chemoinformatics). The use of artificial intelligence approaches (alongside other computational models) is also driven by the increasing availability of data whereby in vivo or in vitro databases for exposure or effect endpoints can be leveraged to increase the generalizability of predictive models. Machine learning algorithms are reported to be more powerful than traditional (Q)SAR approaches in terms of predictivity (see Chen et al., 2022; Huang et al., 2016). In addition, they may enable interspecies extrapolation for chemical safety and the prediction of chemical hazard across fish taxa for prioritization purposes (Wu et al., 2022), all without additional animal testing. Although cases have demonstrated that these machine learning models could generate equivalent or better hazard data (Luechtefeld, Rowlands, & Hartung, 2018), significant barriers remain to artificial intelligence/machine learning adoption in the regulatory setting (Miller et al., 2018). Addressing these issues will be directly beneficial in reducing animal use.
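The idea of read-across from the structurally nearest neighbor can be sketched in a few lines. The example below assumes that each chemical is described by a set of structural feature identifiers and that structural similarity is measured with the Tanimoto coefficient; it is not the algorithm used by the QSAR Toolbox or VEGA, and all names, fingerprints, and toxicity values are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

Fingerprint = FrozenSet[int]  # set of structural feature identifiers (hypothetical)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared features divided by all features present."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbour_read_across(
    query: Fingerprint,
    analogues: Dict[str, Tuple[Fingerprint, float]],
) -> Tuple[str, float, float]:
    """Return (name, similarity, measured value) of the most similar analogue.
    The similarity is returned so the user can judge how far to trust the value."""
    if not analogues:
        raise ValueError("at least one analogue with measured data is required")
    name, (fingerprint, value) = max(
        analogues.items(), key=lambda item: tanimoto(query, item[1][0])
    )
    return name, tanimoto(query, fingerprint), value


# Hypothetical analogues: fingerprint and measured LC50 (mg/L).
ANALOGUES = {
    "analogue_A": (frozenset({1, 2, 3, 5}), 2.5),
    "analogue_B": (frozenset({1, 6}), 40.0),
}
QUERY = frozenset({1, 2, 3, 4})

if __name__ == "__main__":
    print(nearest_neighbour_read_across(QUERY, ANALOGUES))
```

Reporting the similarity alongside the borrowed value reflects the transparency point made above: a read-across from a distant neighbor should carry visibly more uncertainty than one from a close analogue.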
In addition, although acute fish toxicity appears to correlate well with the in vitro rainbow trout gill cell line assay (Rtgill-W1), other situations where NAM data could be used to predict specifically systemic toxicity (e.g., toxicity to fish liver cells) require further information. Specifically, in linking the observed effects to the internal concentrations of a chemical at the target site (e.g., in the blood or in a target tissue), rather than to concentrations in the water, information on the absorption and interaction of chemicals within the living fish is critical. In silico techniques such as physiologically based pharmacokinetic (PBPK) models, or absorption, distribution, metabolism, and excretion PK (ADME-PK) models, are increasingly being adopted to link exposure to physiological and health outcomes. One such example lies in prior in vivo work with fish demonstrating that pharmaceuticals with comparable pharmacological in vitro activity can have highly different in vivo risk because of their different uptake and PK profiles (Margiotta-Casaluci et al., 2016). Moreover, it was previously demonstrated that the explicit consideration of internal exposure parameters (i.e., blood concentrations) can dramatically improve the accuracy of toxicity prediction for pharmaceuticals and facilitate the extrapolation of clinical and preclinical data to fish species (Margiotta-Casaluci et al., 2014). The systematic implementation of fish-specific PK considerations in the ERA process would allow the development of in vitro-in vivo extrapolation (IVIVE) approaches to interpret the relevance of in vitro data for the specific fish toxicity, in line with significant efforts ongoing in HRA (Punt et al., 2020). To overcome the challenge of species specificity and availability of chemical-specific parameters, Wang et al. (2022) recently reported on a generalized fish PBK model that can be applied to a broader range of fish species and chemicals. A recent perspective article highlighted that PBTK model coverage combined with external exposure modeling provided better support for protective decisions, allowing a shift toward new technologies that enable holistic evaluation of chemicals (Textbox 1; Cohen Hubal et al., 2019). In this context, specific priorities requiring further work to build sufficient confidence were identified in a joint European Partnership for Alternative Approaches to Animal Testing (EPAA)-European Union Reference Laboratory for Alternatives to Animal Testing (EURL ECVAM) ADME workshop (Bessems et al., 2014). As with most computational approaches, barriers to their use have been recognized as a multipronged issue that relates to the availability and reliability of training data, the extrapolation of outcomes beyond the model's domain of applicability, and the lack of computational literacy among relevant stakeholders (Miller et al., 2018).

Toward increasingly complex in vitro culture systems
Bioaccumulation potential of chemicals is traditionally assessed in terms of a reductionist BCF, although this is not necessarily ecologically relevant for hydrophobic chemicals because dietary exposure, and hence the potential for biomagnification, is not included. Moreover, the complexity of environmental food webs, possibly assessed via TMFs using mesocosm studies, may affect bioaccumulation and biomagnification in real environments via a wealth of biological modifiers. Nevertheless, regulators usually and deliberately disregard these biological modifiers and prefer to regulate by reductionist BCF or BMF values because these allow between-chemical comparisons of intrinsic bioaccumulation or biomagnification potential without complex biological modifiers. Such modifiers may be relevant for one mesocosm or environment but not for another, and in practice they cannot be assessed comprehensively. This scientifically well-defensible preference for reductionist approaches focusing on relative effect sizes between chemicals rather than on absolute effect sizes in real environments may be thought-provoking for the utility and acceptability of NAMs in general. Specifically, to what extent may we reduce complexity within the test systems, recognizing that regulatory toxicology can only assess comparative toxicity between chemicals?
Long-term research has produced NAMs to assess the bioaccumulation potential of chemicals, including in silico and in vitro methods that estimate bioaccumulation potential comparably to in vivo methods (see Kropf et al., 2020), with accompanying information on reliability through an international ring trial (Nichols et al., 2018). Acceptance of the in vitro biotransformation assays with rainbow trout (fish) primary hepatocytes and S9 fractions by the OECD (2018b, 2018c), as well as OECD guidance referencing the uncertainties for the computational in vitro data integration for BCF prediction and the uncertainties of the experimental BCF value (OECD, 2018d), represents a significant step forward in this field, although discussions on their harmonized regulatory application for HRA (using human or rat hepatocytes) are still ongoing (Louisse et al., 2020). Likewise, the recent acceptance of the fish gill cell line acute toxicity test as a predictor of acute fish toxicity is the result of decades of work (Fischer et al., 2019). Further OECD validation work is ongoing using the freshwater amphipod Hyalella azteca (bioconcentration test) for bioaccumulation testing. These recent steps toward the ethically and scientifically desired regulatory acceptance and use of these protocols represent solid foundations from which we can expand.
Regulatory acceptance of NAM data in this field may be considered "low-hanging fruit" given the broad use of pragmatic log K OW or QSAR-based regulatory decision tools for environmental assessments as well as the established regulatory acceptance of in vitro mammalian data in the drug development and approval process. But to realize its full potential, regulatory acceptance of IVIVE would be beneficial, and this requires a concerted effort. For example, by generating fish in vitro hepatocyte and liver S9 data for various compounds, application of the IVIVE approach for the estimation of bioaccumulation potential can be established, provided that the necessary information is available for case study development which builds confidence. Such work may profit from the growing consensus on the use of IVIVE for human safety and efficacy assessments (Bell et al., 2018).
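To illustrate why in vitro biotransformation data matter for bioaccumulation estimates, the following Python sketch uses a simple one-compartment, first-order kinetic model in which the steady-state BCF equals the uptake rate constant divided by the sum of the loss rate constants. This is a simplification and not the procedure laid out in the OECD guidance: the function name is hypothetical, the whole-body biotransformation rate constant is assumed to have already been extrapolated from hepatocyte or S9 clearance data, and the numerical values are invented for illustration.

```python
import math


def bcf_kinetic(k_uptake: float, k_elimination: float,
                k_biotransformation: float = 0.0) -> float:
    """Steady-state BCF (L/kg) from first-order rate constants.

    k_uptake            -- uptake from water (L/kg/day)
    k_elimination       -- passive elimination, e.g. via gills (1/day)
    k_biotransformation -- whole-body biotransformation (1/day); assumed to be
                           the result of an IVIVE step not modeled here
    """
    total_loss = k_elimination + k_biotransformation
    if total_loss <= 0:
        raise ValueError("total loss rate constant must be positive")
    return k_uptake / total_loss


# Hypothetical rate constants (illustration only, not measured values).
bcf_without_metabolism = bcf_kinetic(k_uptake=400.0, k_elimination=0.1)
bcf_with_metabolism = bcf_kinetic(k_uptake=400.0, k_elimination=0.1,
                                  k_biotransformation=0.3)

print(f"BCF ignoring biotransformation: {bcf_without_metabolism:.0f} L/kg")
print(f"BCF including biotransformation: {bcf_with_metabolism:.0f} L/kg")
```

Under these assumed values, accounting for biotransformation lowers the predicted BCF by a factor of four, which shows how strongly an in vitro-derived clearance term can influence a bioaccumulation assessment.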
For decades, two-dimensional cell cultures have been primarily used as in vitro screening tools to evaluate toxicity and predict drug impact in humans and, more recently, fish. Several NAMs have been developed and deployed to generate fish-specific ADME parameters in different compartments, including both in vitro systems for relevant organs (i.e., gills, liver, intestine) and computational approaches integrating multiorgan or system-level data. Examples of these include the fish gill cell culture systems used to predict chemical BCFs and uptake/excretion dynamics (E.D. Chang et al., 2021). A more sophisticated metabolically competent three-dimensional (3D) in vitro system based on spheroidal aggregate cultures (spheroids) was first applied to humans and later successfully applied using rainbow trout (Oncorhynchus mykiss) liver to study chemical metabolism, with promising results (Baron et al., 2017; Hultman et al., 2019; Lammel et al., 2019).
The intestine also represents a major site of chemical interaction and toxicity, but until recently data on uptake through food chains were almost nonexistent, although this is changing slowly. Several groups have demonstrated that rainbow trout primary intestinal cells can be maintained in vitro in both two and three dimensions and used to investigate chemical metabolism in this important but often overlooked compartment (Langan et al., 2018). Limitations associated with the use of primary cells have been overcome with the generation of an immortalized rainbow trout intestinal cell line (RtgutGC; Kawano et al., 2011), which has been successfully used to understand chemical transfer (Schug et al., 2019) and improved further using coculture with an intestinal fibroblast cell line (RtgutF; Drieschner, Vo, et al., 2019). In keeping with the WoE approach, the development of a tiered testing strategy which integrates these in vitro systems could increase the chance that regulators accept risk assessments without data from the in vivo OECD 305 test (OECD, 2012) to determine bioaccumulation in fish. Furthermore, the data generated with such NAMs could be used to accelerate the growing field of fish-specific PBPK models, which is currently limited (Wang et al., 2022), with a recent review highlighting how in vitro toxicity data can be used in risk assessment and decision-making in the European Union (X. Chang et al., 2022).
In line with human organ-on-chip (OOC) development, it is possible to foresee the application of microfluidic devices to enhance the biological relevance of the in vitro models mentioned above or even to integrate multiple organs in a single chip as a complex model recapitulating the complexity of an intact living organism. Since the early 1990s, microfluidics has been increasingly used in chemical and biological research because of its numerous potential benefits, including improved physiological complexity and emulation of systemic effects in vitro. Also known as microphysiological systems, OOC devices have seen dramatic advances in the sophistication of biology and engineering over the past decade, facilitated by the convergence of multiple previously disparate technologies. Although progress has been primarily driven by human studies, with the common use of 3D human liver-on-a-chip (see Moradi et al., 2020), the application of OOCs in aquatic toxicity testing is limited. Despite early developments in toxicity testing on flow-through Rtgill-W1 cultures (Glawdel et al., 2009) and the development of a two-compartment intestinal barrier model with similar properties to salmonid intestines (Drieschner, Könemann, et al., 2019), little progress has been made in other fish organs. Furthermore, despite the availability of primary and immortalized static 3D fish hepatic cultures, progress on adaptation of this technique to the development of a fish-specific liver-on-a-chip in vitro model to evaluate bioaccumulation potential has been limited.
The key drivers of (human) OOC development, including improved and longer-term phenotypic maturity such as expression and activity of xenobiotic metabolizing enzymes, are expected to improve the IVIVE of fish and other organism hepatocyte models. Yet, there are certain fundamental characteristics of microfluidics which still render these assays technically demanding, especially for metabolic clearance predictions. For example, in flow-through systems, the chemical's residence time within the cell culture chamber (~10 mm long) is defined by the linear flow rate and is thus inherently very short. Consequently, the majority of OOC experiments focus on rapid measurement of pharmacological/toxicological endpoints or transepithelial transport and (human) disease modeling in vitro. To achieve hours-long chemical exposure times on microfluidic devices, often necessary in metabolic clearance determinations, dedicated recirculation systems need to be established for mammalian and aquatic organisms alike. Furthermore, technological advances for measurement of the associated tiny volumes (microliters) and improvements in the limit of detection of analytical methodologies for more environmentally relevant concentrations are necessary for wider adoption. Although this technology is relatively new, its implementation in the context of risk assessment of chemicals is already underway (Nitsche et al., 2022). To realize its potential, further work must address the lack of standardization of applicable materials (culture platforms) and protocols (e.g., shear force), which presently poses major challenges to interlaboratory comparisons and regulatory acceptance of microfluidics-derived data (Allwardt et al., 2020). In this regard, the initiative established in 2021 by the European Committee for Standardization and the European Committee for Electrotechnical Standardization, the results of an EURL ECVAM survey (Batista Leite et al., 2021), and the respective activities by the Standards Coordinating Body in the United States are foreseen to accelerate wider adaptation of the OOC concept to in vitro assessment of human and fish pharmacokinetics alike.
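The residence-time constraint described above can be made concrete with a short Python sketch. It assumes plug flow, so that the single-pass residence time is simply the chamber length divided by the linear flow velocity. The 10 mm chamber length is taken from the text; the flow velocity and the 2-hour target exposure are hypothetical values chosen only for illustration.

```python
import math


def residence_time_s(chamber_length_mm: float, linear_flow_mm_per_s: float) -> float:
    """Single-pass residence time (s), assuming plug flow through the chamber."""
    if linear_flow_mm_per_s <= 0:
        raise ValueError("linear flow velocity must be positive")
    return chamber_length_mm / linear_flow_mm_per_s


def passes_needed(target_exposure_s: float, single_pass_s: float) -> int:
    """Number of recirculation passes needed to reach the target exposure time."""
    return math.ceil(target_exposure_s / single_pass_s)


CHAMBER_LENGTH_MM = 10.0       # chamber length stated in the text
ASSUMED_FLOW_MM_PER_S = 0.5    # hypothetical linear flow velocity
TARGET_EXPOSURE_S = 2 * 3600   # hypothetical 2-hour exposure for a clearance assay

single_pass = residence_time_s(CHAMBER_LENGTH_MM, ASSUMED_FLOW_MM_PER_S)
passes = passes_needed(TARGET_EXPOSURE_S, single_pass)
print(f"single-pass residence time: {single_pass:.0f} s; passes for 2 h: {passes}")
```

Under these assumptions, a single pass lasts only seconds, so reaching an hours-long exposure requires hundreds of passes, which is why recirculation is needed for metabolic clearance determinations.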

Integrated approach to testing and assessment
The methods outlined provide some of the raw data or information about the hazard of and/or exposure to a chemical but still require a framework and workflow to be used for regulation. Thus, any regulatory testing and assessment would require the initial definition of the regulatory purpose and preferably a societal agreement that the goals and tools used for the evaluation are suitable for this purpose. A conceptual example may build on OECD guidance for HRA (OECD, 2017) and is outlined here for environmental protection:
Step 1: Establish if conservative estimates for environmental protection levels are sufficient for the specific regulatory situation. Such estimates may be based on computational EcoTTC values or refinements (already being applied in HRA), including a very broad set of in vitro bioactivity data and kinetic modeling (termed next-generation risk assessment [NGRA]; Friedman et al., 2020), intending to protect against, but not to predict, any specific adverse effects at the organism or population level. For ERA, such an approach may include, besides NAMs, a relevant number of plants and invertebrates to provide a useful environmental no-adverse-effect concentration.
Step 2: Where more information on possible vertebrate-level effects appears necessary, NAM-derived data and kinetic information may be used to derive a mode-of-action hypothesis, which can be tested with increasingly complex in vitro methods (e.g., OOC/microphysiological systems) with a stepwise improvement in kinetic modeling with increasing information. If relevant and available, additional in vivo vertebrate data from similar chemicals may be integrated by read-across.
Importantly, the approach referenced here builds on exposure considerations and integrates these with hazard data. Exposure information is legally required for pesticides, biocides, and chemicals. Especially in Europe, regulation is also based on the Globally Harmonized System classification, such that a potential hazard may have severe regulatory consequences independent of any exposure considerations, for example, endocrine-disrupting properties or chronic aquatic toxicity in combination with persistence and bioaccumulation criteria. Nevertheless, exposure information may lead to adaptations of regulatory hazard information requirements on a case-by-case basis, and future regulations could implement new default approaches, if this could increase the sustainability of regulations (as recently discussed, e.g., for HRA; Ball et al., 2022).
However, where to stop a tiered assessment, as indicated above, would depend on available resources, acceptable uncertainties, and societal values. Such an approach could be positioned as part of an IATA and would logically include the interplay of TK-TD, environmental/interspecies extrapolation, and an appropriate and transparent level of uncertainty (Laroche et al., 2018). Integrated approaches to testing and assessments were intended to be flexible, but some elements can be standardized, which are referred to as defined approaches, consisting of a testing strategy and a fixed data interpretation procedure. Support for such approaches has resulted in guidance documents by the OECD in addition to case studies demonstrating proof of concept for regulatory fit (OECD, n.d.a). Moreover, such IATAs may also include the integration of NAM data to support the read-across of traditional animal test data between chemically and biologically similar chemicals.

Importance of collaboration
Numerous methods have been developed and implemented at various stages of the risk assessment process; however, direct efforts are needed which allow for the integration and connection of these different approaches (Textbox 1). A significant advancement of scientific confidence in the practical integration of NAM data into read-across approaches for risk assessment was achieved during the HORIZON 2020 EU-ToxRisk project (European Union, n.d.). This resulted in a unified strategy for the development of case studies established in partnership with regulatory agencies and contextualizing NAM data in a scientifically defensible way (Krebs et al., 2020), related workshops on how to make this approach global (Rovida et al., 2020), and ultimately recognition within the OECD Mutual Acceptance of Data system. The HORIZON 2020 RISK-HUNT3R project builds on prior outcomes to establish an overall human-centric NGRA framework for chemicals, which is designed to promote a combination of computational toxicology, in vitro toxicology, and systems biology, assuming this approach will lead to faster and more risk-accurate procedures (Pallocca et al., 2022). Such an approach may also be undertaken in other organisms, with precedent already established.
Importantly, such activity alongside the 4C's principle (communication, cooperation, commitment, and coordination) may start to overcome some of the previously reported cross-sectional barriers to the adoption of NAMs, which include uncertainty about the value of the new models, the lack of harmonization of regulatory requirements and acceptance criteria, and the high levels of risk aversion (Punt et al., 2017). Integrated approaches to testing and assessments building on NAMs will provide different types of data, and this will require a new/expanded scientific understanding of risk and uncertainty and a multistakeholder agreement on the regulatory use of these different data. Therefore, collaboration between government, regulators, academia, industry, and nongovernmental organizations is key to success, and we outline some options for respective and practical collaborative actions in Figure 2.

Importance of standardization
To improve confidence in the validity of NAMs and reproducibility and (re)usability among end users, transparent and comprehensive reporting may be furthered with an approach similar to the Animal Research: Reporting of In Vivo Experiments guidelines (Percie du Sert et al., 2020), which were developed to promote robust and reproducible animal research. These guidelines set out the minimum information that should be included in any publication reporting the use of animals; they have been endorsed by over 1000 journals and are highly cited. In addition, an important component of reducing animal usage is enhancement of the reusability of data, for which the findable, accessible, interoperable, and reusable principles for scientific data management were developed (Wilkinson et al., 2016) and which should be applied to optimize knowledge growth. However, although standardized guidelines for reporting animal studies are available, standardized reporting for NAMs is limited. The call for improved reporting standards is not new in the ecotoxicology field, although more effort should now be made to focus on transparency within NAM studies going forward. Reproducible science requires reproducible reporting, which builds confidence and trust in the process. In this respect, having a unified strategy on reporting facilitates adequate interpretation of the data to ensure overall scientific and toxicological validity. In line with this need for harmonized and comprehensive reporting, the EU-ToxRisk project built on earlier work for the standardization of nonguideline methods, reporting on the results of a case study for the regulatory use of 23 NAMs, which involved regulators reviewing the case studies and reporting established method documentation, data processing, and chemical testing pipelines (Krebs et al., 2020). The method documentation readily incorporated well-established guidance documents on good in vitro methods practice and good cell culture practice (GCCP), both of which apply to all in vitro testing irrespective of organism. It should further be noted that these documents are not static, and with increasing technological advances in addition to increasingly complex culture systems, proposed strategies need space to incorporate these complementary recommendations for increased reproducibility and transparency, such as the latest version of the GCCP guidelines (Pamies et al., 2020). Likewise, the National Centre for the 3Rs has initiated the development of reporting guidelines for in vitro research (Reporting In Vitro Experiments Responsibly). As with the established reporting guidelines, once NAM-specific guidelines are developed, there need to be mechanisms employed to ensure their uptake, including endorsement by internationally renowned journals. This need is also echoed in Figure 2.

Importance of the recognition of uncertainties
Following decades of research, the reality is that routine toxicity testing cannot fill the large gaps that experimental scientists and assessors/regulators regularly identify. The shift away from studying whole organisms, sometimes in advance of legislative change, to increased data availability and predictive power will assist in strengthening our confidence in the establishment of cause-effect relationships, a basic tenet of risk assessment. For pragmatic reasons, regulatory toxicology is based on data from relatively few methods, which were internationally standardized. Such a practically manageable, well-standardized set of methods should allow a minimum safety standard and a harmonized regulation of chemicals with minimal trade barriers from differential data availability. However, an additional scientific estimate for the toxicity of many chemicals might be provided using systematic reviews of all available data going forward, including scientific literature. Within such a new assessment, the uncertainties and inconsistencies could be spelled out in a scientifically correct way.
Traditionally, the approach for NAM validation was to assess NAM data relative to data from the existing regulatory standards, rather than all the available scientific evidence. Yet, we do not know empirically if these standard tests actually represent the best science or how predictive they are of environmental outcomes. This is especially true for some of the more complex animal tests (see OECD, 2015a, 2015b), which could not be validated during their development because of cost. So, how can one scientifically demonstrate that any new regulatory approach is at least as useful as the current one? As a first step, this requires full transparency of practical limitations and scientific uncertainties of the current animal-based approaches (Textbox 1). Scientists, together with regulators, must start to routinely report and discuss these in their daily work, which may provide the necessary common understanding for any next steps (Figure 2). By applying a systematic approach to characterize methods, current and new approaches can then be qualitatively or semiquantitatively compared for practicalities and scientific uncertainties. This has been demonstrated for human developmental neurotoxicity (Paparella et al., 2020) and fish toxicity (Paparella et al., 2021), with the same principle also applied to quantify the variability of rodent repeat dose studies, recognizing that any new method cannot predict data from the traditional method more precisely than the traditional method can predict itself via replication (Pham et al., 2020).
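The replication argument above can be sketched numerically. The following Python example is a simplified illustration, not the analysis performed by Pham et al. (2020): it assumes that repeated studies of the same chemical differ only by random error and estimates the largest fraction of variance in the traditional test data that any predictor could explain, namely one minus the ratio of the within-chemical (replicate) sum of squares to the total sum of squares. The function name and the data are hypothetical.

```python
def max_explainable_r2(replicate_groups):
    """Upper bound on R^2 for predicting replicate test results.

    replicate_groups -- list of lists; each inner list holds repeated results
                        of the traditional test for one chemical.
    """
    all_values = [v for group in replicate_groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("no variance in the data")
    # Variance that remains even if each chemical's mean is predicted perfectly.
    ss_within = sum(
        (v - sum(group) / len(group)) ** 2
        for group in replicate_groups
        for v in group
    )
    return 1.0 - ss_within / ss_total


# Hypothetical replicate results (e.g., log effect levels) for three chemicals.
replicates = [[1.0, 1.2], [2.0, 2.4], [3.1, 2.9]]
bound = max_explainable_r2(replicates)
print(f"maximum explainable R^2: {bound:.3f}")
```

If a NAM reaches an agreement with the traditional test that is close to this bound, the remaining disagreement may reflect the variability of the traditional test itself rather than a shortcoming of the new method.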

CONCLUSIONS
Regulatory toxicology must be recognized as a natural and social science, allowing for the regular reexamination of its basic concepts, by asking questions such as the following: What type of testing for which type of chemicals is required from an economic, societal, and ethical perspective? Besides cosmetics, are there other chemicals for which animal testing may become unacceptable, such as biocides which can be replaced by nonchemical alternatives? When do we need to predict adverse effect types, and when are estimates for no-adverse-effect concentrations sufficient? What level of uncertainty is acceptable for decision-making, and what level of pragmatic precaution can be taken to compensate for uncertainties? Inclusive societal forums and deliberate actions may need to be established to discuss and answer such questions (for example, see Figure 2).
The challenge to change from animal testing to NAMs is different between sectors. Personal education, moral values, professional experience, opportunities, and hierarchies as well as peer group forces are influential in the development, use, and selection of scientific methods. However, in principle, academic researchers are free to formulate their research questions in a way that such questions may be directly addressed with NAMs. Here, innovation and exploration of new methods is encouraged, enabling "big data" generation while avoiding animal use. Such activity may significantly contribute to building evidence, confidence, and potentially case studies to support regulatory needs. Indeed, this is the de facto case for many regions where animals in science are increasingly strictly regulated by government or institutions. So, what needs to be done to effect transformation? We outline some suggested actions for the various stakeholders in Figure 2. As highlighted in the EPAA Blue Sky workshop, "disruptive thinking" is required to reconsider chemical legislation and validation of NAMs and to embrace the opportunities to move away from reliance on animal tests (Mahony et al., 2020).
Therefore, the answer to the initial question of "Can nonanimal testing methods be developed that will provide equivalent or better hazard data compared with current in vivo methods?" is nuanced but clearly yes for some of the available approaches, while also recognizing that NAMs may be used in a protective standalone approach without predicting any specific in vivo method, applying to HRA and ERA alike. Views diverge on which regulations can be based exclusively on NAMs and on when this will occur. The answer to such questions heavily depends on resources invested in regulatory evolution, policy change, and the readiness by all for substantive changes versus minor adaptations of regulatory practices. Regardless, to realize the full potential of NAMs, more work is needed, much of which overlaps with various publications following HRA workshops outlined in supplemental additional reading, in addition to progress in other sectors and countries paving the way for the adoption of NGRA (see Bhuller et al., 2021; Escher et al., 2022; Friedman et al., 2020). With regard to the goal of providing equivalent or better hazard protection compared with current in vivo methods, the authors acknowledge that NAMs and their combinations in a standardized WoE approach have been developed which could provide such protection, while also acknowledging that research is always required to move ahead. In light of this review, the authors recommend prioritizing the research questions outlined in Textbox 1.
Furthermore, we have also outlined some policy-research questions (Supporting Information, Table S2) that need to be addressed to increase pace and diversity and to future-proof the area of toxicology and risk assessment. Answering all outlined questions will require funding, leadership, guidance, and active endorsement. Looking to the future, we are at a tipping point, with a need for a global and inclusive approach to establish consensus. Bringing together all of this work for regulatory assessment and decision-making will require a concerted effort and orchestration.
Supporting Information-The Supporting Information is available on the Wiley Online Library at https://doi.org/10.1002/etc.5578.

FIGURE 2: Suggested activities by various stakeholders which could support the necessary change toward the increased use of new approach methodologies. NAM = new approach methodology; NC3R = National Centre for the 3Rs; RA = risk assessment; ADME = absorption, distribution, metabolism, and excretion; ERA = environmental risk assessment; AOP = adverse outcome pathway; ARRIVE = Animal Research: Reporting of In Vivo Experiments.