Precise Yet Uncertain: Broadening Understandings of Uncertainty and Policy in the BPA Controversy

Bisphenol A (BPA) is one of the most studied and most controversial chemicals used by the food packaging industry, because of its endocrine disruptive properties. Part of the controversy is due to the uncertainty that surrounds the effects of BPA on the endocrine system. Uncertainty includes data gaps, methodological hurdles, incompatibilities between toxicology and endocrinology‐based approaches, and so on. In this article, we analyze how uncertainty has been conceptualized and treated. We focus on the European Food Safety Authority assessments of BPA, and study how exposure and hazard assessments have evolved over time, how uncertainty has been analyzed, and how the agency responded to controversies. Results show that in the attempt to reduce knowledge gaps, assessments have become progressively larger, including more references, evidence, and effects. There is a tendency toward greater precision and specification of results, and toward protocolization of all processes included in the assessment (from literature review, to uncertainty assessments, and public consultation). Yet, the uncertainty has not diminished following the increase in evidence. We argue that the strategy used to reduce uncertainty within risk assessment, namely including more variables, studies, data, and methods, amplifies the uncertainty linked to indeterminacy (as more results increase the fragmentation of the knowledge base due to the open‐ended nature of complex issues) and ambiguity (as complexity gives way to multiple nonequivalent interpretations of results). For this reason, it is important to consider different types of uncertainty and how these uncertainties interact with each other.

disruptor is contested, and entire studies have been devoted to this discussion (see, for example, the 2018 special issue in the Journal of Molecular and Cellular Endocrinology, "Is BPA an endocrine disruptor?"). The specific toxicity of BPA challenges the conventional testing and risk assessments of chemicals. For example, BPA can induce different effects at different life stages, and effects can be induced at (very) low doses and can appear many years after exposure. Also, it is still debated whether there is a safe threshold of exposure for this chemical and to what extent the dose-response is nonmonotonic (Vandenberg, Maffini, Sonnenschein, Rubin, & Soto, 2009).

Different types of uncertainty in the Bisphenol A controversy
There are different definitions of risk and uncertainty in different bodies of literature. In the field of risk analysis, uncertainty is understood as a main component of risk (Aven, 2018). "Simple" risk is often described as a function of probabilities and (specific) consequences (Aven & Renn, 2010) and contrasted with complex, uncertain, and ambiguous "risk problems" (Aven & Renn, 2020) that go beyond probabilistic approaches. Qualitative descriptions of risk and uncertainty exist too, where instead of probabilities (and expected values) one addresses known unknowns and unknown unknowns (ignorance) (Wynne, 1992) and, instead of specific consequences, one addresses decision stakes (Funtowicz & Ravetz, 1985), outcome stakes (Rosa, 1998), or severity of consequences (Aven & Renn, 2009). The Society for Risk Analysis defines the two main features of the concept of risk as: (i) consequences with respect to something that humans value and (ii) uncertainties (Aven, 2018; SRA, 2015). Common to these approaches is the understanding that risk goes beyond probabilities and that uncertainty cannot be handled solely as a technical problem of quantitative modeling but reveals deeper problems of incomplete knowledge, complexity, and ambiguity (Funtowicz & Ravetz, 1993; Stirling, 2003; Wynne, 1992).
The standard approach, used by the European Food Safety Authority (EFSA) in the assessment of BPA, is to conduct a risk assessment (where risk is understood as "a function of the probability of an adverse health effect and the severity of that effect, consequential to a hazard"; EFSA, 2012, p. 21) and to analyze the uncertainties within this assessment. Risk assessment is expected to help regulate the use of chemicals by establishing an acceptable level of risk, defined through the tolerable daily intake (TDI). The uncertainties in the estimation of the TDI of BPA, however, are so high, that an increasing controversy has emerged to the point where the possibility of establishing a TDI for BPA regulation has been questioned (Heindel et al., 2020). We argue that the controversy lingers also in part because of the untamable parts of uncertainty in BPA assessment. Because every risk assessment is an imperfect reduction of a complex practical problem (Ravetz, 1971), a mismatch between the complexity of the issue and its technical representation in a risk assessment will often exist. Gaps in knowledge and plurality of perspectives may not be a temporary problem, but an underlying condition in which decisions have to be made.
Distinguishing different types of uncertainty makes it possible to theorize uncertainty beyond challenges of quantification and to include epistemic, social, and ethical considerations, all of which are central to the BPA controversy. Our intention is to adopt a broad understanding of uncertainty and to see which insights about the BPA controversy can be gained by stepping outside EFSA's risk assessment practice.
EFSA limits its consideration of uncertainty to uncertainty within risk assessment (as defined by EFSA), which is associated with "all types of limitations in available knowledge that affect the range and probability of possible answers to an assessment question" (EFSA, 2018, p. 4), including variability, absence of data, and other limitations that affect the assessment of risk. In this case, reducing the uncertainty (preferably through quantification) improves the reliability of risk assessments and their policy relevance. In doing so, EFSA does not sufficiently account for the following:
(1) Indeterminacy, which is the uncertainty that emerges when cause-effect relations are not fully determined because of their open-ended nature (Wynne, 1992). Because the issue is more complex than what can be captured by its limited technical representation in the risk assessment, no definitive answers can be given even as more data and better analysis are produced. In the case of BPA, indeterminacy emerges, for instance, with regard to low-dose effects, an issue in which more quantification does not provide definitive answers for policy. Indeterminacy thus points to the untamable types of uncertainty.
(2) Ambiguity is defined as a specific type of uncertainty created by pluralism and complexity (Kovacic & Di Felice, 2019; Sarewitz, 2004; Stirling, 2003, 2010). Ambiguity can be understood as "the plurality of scientifically justifiable viewpoints on the meaning and implications of scientific evidence" (SAPEA, 2019, p. 15). Different scientific views are difficult to combine into a single coherent perspective without losing information. For this reason, we define ambiguity as the type of uncertainty that emerges from the difficult task of handling and making sense of multiple scientific perspectives. In this context, assessments can be, and are, contested (Oreskes & Conway, 2010). In the case of BPA, ambiguity arises because both the body of knowledge related to toxicology and that of endocrinology can be mobilized, giving rise to scientific differences. Differences in interpretation may be attributed to expert bias or value discrepancies (defined as normative ambiguity in Johansen & Rausand, 2015), may arise from legitimate pluralism in the knowledge base (defined as interpretative ambiguity by Aven & Renn, 2020), or may be a result of both normative and interpretative differences, which are seen by many scholars as interwoven (Andersen, Anjum, & Rocca, 2019; Douglas, 2009; Elliott, 2017; Funtowicz & Ravetz, 1993; Jasanoff, 2004; Sarewitz, 2004; Stirling, 1998; Wynne, 1989). Although desirable, it may not always be possible to neatly separate normative and interpretative ambiguity. This is the case when assigning weights to different criteria: weights reflect a normative value judgment about the relative importance of different criteria and at the same time are part of standard practice in the construction of composite indicators as an interpretative analytical tool (Munda & Nardo, 2005). We highlight how both aspects of ambiguity are interwoven to argue that even if methods that minimize expert bias are deployed, pluralism in the knowledge base may persist.
In this article, we are interested in the manifestations of ambiguity as a type of uncertainty, rather than in discerning the origin of ambiguity (whether interpretative, normative, or both). We therefore use the general term "ambiguity" in our analysis and discussion of results.
These different types of uncertainty are interrelated: complexity may give rise both to ambiguity, by allowing for multiple legitimate but nonequivalent interpretations, and to indeterminacy, by creating open-ended causal relations. The existence of multiple problem framings generates both indeterminacy and ambiguity. It should also be noted that indeterminacy and ambiguity are not reduced by quantification and increased precision. We hypothesize that if uncertainty is only considered within its technical and methodological dimensions, other types of uncertainty such as ambiguity may be overlooked.
The article asks: How does EFSA's treatment of uncertainty in BPA risk assessments fare with respect to other types of uncertainty? Our analysis suggests that the strategy used to characterize risk (quantification, increasing precision, protocolization) may have the opposite effect in terms of indeterminacy and ambiguity, as more and more data and methods can lead to more and more interpretations of how to manage the risks of BPA, deepening the controversy rather than acknowledging and fostering epistemic pluralism in risk appraisal.

METHODS
In order to understand how uncertainty assessment procedures have evolved and which types of uncertainty have been taken into account, we analyze the assessments of BPA published by EFSA between 2002 and 2020. EFSA is an interesting case study because it plays a role both in assessment and in advising governance, although it is not directly responsible for food-related risk management in the European Union.
The conclusions on whether or not BPA poses a risk to human health vary greatly between assessments, even among those where the scientific evidence is identical (Beronius, Rudén, Håkansson, & Hanberg, 2010; Gies & Soto, 2013). For this reason, we take a broad approach to the analysis of risk, uncertainty, and controversy. We broaden our analysis of uncertainty by taking into account not only possible knowledge gaps but also the quality of existing knowledge, consistent with, for example, Aven (2018).
For our analysis, we focus on the reports published by EFSA and the main studies that have informed, or influenced, EFSA's work. We do not assess uncertainties ourselves, but rather analyze which types of uncertainties have been considered and how they were assessed by EFSA. For the text analysis, we have selected: (i) the Opinion of the Scientific Committee on Food (SCF) (SCF, 2002) (published the same year EFSA was created), which is used as reference for subsequent studies, and (ii) EFSA's assessments from 2006 to 2015 (EFSA, 2006, 2008a, 2010b, 2015b) and preliminary results of the ongoing assessment due in 2020 (EFSA, 2017a).
The texts are analyzed according to the following criteria: (i) Criteria that refer to the pedigree of the assessment, namely, the type of assessment (report, full risk assessment, updated advice on BPA), the size (number of pages), the time needed to complete the assessment, the number of references, the critical study used for reference in the hazard assessment, and the literature review supporting the risk assessment. The concept of pedigree was first introduced by Funtowicz and Ravetz (1990) as a tool to assess quantitative data. Vesnic-Alujevic, Breitegger, and Pereira (2016) apply the pedigree approach to the assessment of the genealogy of text documents, as a systematic assessment of the quality of knowledge inputs in decision-making processes. The criteria we analyze help us assess the quality of the BPA publications and understand how quality criteria have evolved over time (for example, by increasing the literature review), as uncertainty and controversies increased. (ii) Criteria that refer to the analytical choices that underpin the assessment, namely: parts of the assessment; analytical choices of the exposure assessment (population; type of exposure; source of exposure; methodology); analytical choices of the hazard assessment (methodology; effects analyzed). Analytical choices make it possible to assess the procedural quality of BPA assessments, as well as their ability to assess different types of uncertainty. Analytical choices that define well-determined cause-effect relations may overlook indeterminacy and present a reductionist account of uncertainty. We observe how procedures evolve over time, paying particular attention to the adoption of protocols and guidelines. (iii) Criteria that are used to manage uncertainty.
We consider three types of uncertainty: uncertainty within risk assessment, indeterminacy, and ambiguity. We focus on uncertainty as defined and analyzed in BPA risk assessments, including uncertainties in the exposure estimates, weight of evidence, uncertainties in hazard characterization, and so on. Indeterminacy leads to the revision of the analytical choices and can be observed in the changes in assessment methods. Changes include shifts in focus, longer lists of parameters, guidelines and protocols put in place, and changes in the degree of precision with which quantified results (including protocols, guidelines, as well as TDI values) are produced. We analyze ambiguity by taking into account the increase in studies and factors considered. (iv) Criteria that create controversy in the BPA assessments: request received by EFSA; initial concerns regarding BPA; effects analyzed in the hazard assessment. The focus on controversy draws on Marres' (2007) work on controversy studies. Marres argues that the study of controversies is issue-oriented, and places great emphasis on the process of issue definition, which she contends is a crucial dimension of democratic politics. Our categories aim to capture the issues that are considered in the BPA controversy, both as inputs to the assessments (initial concerns) and as outputs (effects).

RESULTS
We present the results in four parts: (i) pedigree of the assessments, (ii) analytical choices, (iii) types of uncertainty, and (iv) controversies.

Pedigree of the Assessments
The pedigree analysis aims at assessing the quality of the BPA assessments and how quality criteria have evolved, as new uncertainties emerged and controversies increased over time. Results are summarized in Table I.
Most assessments are full risk assessments (SCF, 2002; EFSA, 2006, 2015b). The exceptions are EFSA (2008a) and EFSA (2010b), which were updates of the 2006 study, and the 2017 assessment, which does not include an exposure assessment (EFSA, 2017a). With regard to size, the number of pages shows a clear tendency to increase, as more and more information is taken into account and more and more aspects of the study are specified. The time needed to complete the assessment also increases from study to study, indicating that the consideration of more information (references and, as we discuss below, criteria, protocols, public consultations, etc.) makes the assessments more and more onerous. The assessments are becoming costlier, take longer, and are substantially more complex; experts need training in the new methods (i.e., systematic review methodologies, uncertainty analysis); and the division of tasks can lead to fragmentation of results and conclusions. With respect to EFSA's hazard assessment protocol draft of 2017, ANSES (French Agency for Food, Environmental and Occupational Health & Safety) commented: "It is also likely to increase the workload and make difficult for the experts to have a complete view on the entire database and therefore, minimize an integrated evaluation" (EFSA, 2017a, p. 15).
In terms of the genealogy of inputs, the number of references in SCF and EFSA's full assessments increases over time, showing that a greater diversity of studies is considered. References should not be read as endorsement, but the tendency to increase the number of references signals that EFSA is less and less able to overlook, or ignore, other studies and has to, at minimum, engage with the increasing literature on BPA.
In contrast to the increasing number of references, we find that the critical study used as reference for the hazard assessment does not change. Two publications by the same researchers (Tyl et al., 2002, 2008) have informed all hazard assessments done since 2006, that is, since EFSA's first in-house assessment. The biggest changes in quality can be observed with regard to literature review, which evolves from simple and then more structured narrative reviews (SCF, 2002; EFSA, 2006, 2008a, 2010b), all the way to a systematic literature review in the ongoing assessment (EFSA, 2017a).
The analysis of pedigree shows both a clear trend to increase inputs to the assessments and a strong continuity with regard to the critical study. The quality of the assessments seems to be ensured by adherence to the same approach. This strategy can contribute to ambiguity: while the use of the same criteria allows for consistency, it does not open up the debate to different ways of understanding quality. Maxim and van der Sluijs (2014, 2018) have found that the assessment of the quality of BPA studies also depends on disciplinary background. Continuity, in this case, may narrow down the range of interpretation of results, reducing the uncertainty within risk assessment but also leaving space for controversy, as alternative criteria are not considered.

Analytical Choices of the Assessments
Analytical choices are considered in order to assess the procedural quality of the assessments and how procedures evolve over time. The openness or protocolization of procedures reflects how responsive (or unresponsive) the assessments are to anomalies of endocrine disruptors as compared to other toxicological studies. Openness allows for cases of indeterminacy to be seen, while protocolization reduces the ability of assessments to account for undetermined causal relations. The general tendency is for assessments to become wider in scope and more refined in methodology and results. Table II summarizes the analytical choices taken into account.
The main three parts of the risk assessment (the exposure assessment, the hazard assessment, and the risk characterization) remain the same throughout the period analyzed. However, each individual part gets much more detailed, for example: from an exposure assessment based on external exposure estimates in SCF (2002) and EFSA (2006) to an exposure assessment based on external, internal and aggregated exposure estimates in EFSA (2015b). The "one step" approach of 2002, 2006, 2008, and 2010 (that is, the preparation of the risk assessment and the endorsement by the Panel) is substituted in 2015 by the "three steps" approach, comprising two public consultations and reviewing the drafts taking into consideration the input from the public consultations (EFSA, 2015b).
We further discuss methodological and analytical choices for the exposure assessment and the hazard assessment separately. In general terms, both types of assessments become broader and include more and more elements, as shown by Table II.

Exposure Assessment
The first exposure assessments of BPA from SCF (2002) and EFSA (2006) focused exclusively on oral exposures to BPA through the diet. Dietary exposure estimates were calculated for the general population, with a refinement in 2006 to consider the exposure of infants.
Dietary exposure estimates were based on conservative assumptions, using high intake estimates from model diets. For the presence of BPA in food, mean concentrations were used, with the exception of the refined exposure estimate for infants in 2006, where the assessors derived an additional exposure scenario accounting for possible high levels of BPA migration from polycarbonate feeding bottles. Both exposure assessments concluded that infants had the highest dietary exposure: 1.6 µg/kg bw/day of BPA (SCF, 2002) and up to 13 µg/kg bw/day of BPA (EFSA, 2006). When EFSA reassessed BPA's exposures in 2015, there was a move toward more refinement and transparency, including an uncertainty assessment and a public consultation of the draft assessment (EFSA, 2015b). This assessment was also unique in that, for the first time, nondietary sources of exposure were included (i.e., exposures via thermal paper, cosmetics, indoor air and dust, toys, and pacifiers). To this end, the methodology was expanded to conduct a multiroute aggregate exposure assessment (EFSA, 2015b), resulting in one of the most extensive chemical exposure assessments ever conducted for a food chemical (von Goetz et al., 2017). A thorough literature review was used to retrieve all available data on BPA migration, occurrence, and biomonitoring. Additional literature was also collected via a call for data, and the selection of all studies was based on predefined eligibility and quality criteria. The information on food consumption patterns was more complete and representative than in previous years. The assessors used EFSA's relatively new Comprehensive European Food Consumption Database from 2011 to estimate dietary exposures for very specific groups (13 subgroups in 2015 as opposed to four in 2002 and 2006). These were stratified according to age, diet, and sex when necessary to represent the most vulnerable segments of the population.

[Table II (Continued), text recovered from flattened cells]
- Call for data.
- Public consultation of drafts on exposure and on hazard assessment and risk characterization.
- Revision of the assessment after receiving comments from public consultation.
- Development of hazard assessment protocol draft.
- Public consultation and workshop on hazard assessment protocol draft.
- Revision of hazard assessment protocol.
- Test of the hazard assessment protocol.
- Call for data.
- Hazard assessment and risk characterization using the hazard assessment protocol (including detailed uncertainty analysis).
- Mammary gland effects were also identified as likely. But given that the BMD approach could not be applied to the data set on mammary gland effects due to its variability, the TDI was derived from Tyl et al. (2008) for kidney effects.
The conclusions with respect to dietary exposures were similar to those of previous reports, namely, that the diet is still the main source of BPA exposure and that infants (and toddlers) are the most highly exposed via the diet (up to 0.857 µg/kg bw/day) (EFSA, 2015b). However, the exposures were 4-15 times lower (depending on the age group considered) than previously estimated by EFSA in 2006. This was explained as the result of better databases on food consumption and occurrence, and less conservative assumptions for the exposure calculations, leading to more precise and less uncertain dietary estimates.
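As a rough check on the scale of this revision, the sketch below compares the highest infant dietary estimates quoted above. It assumes that the 2006 and 2015 upper values refer to comparable groups, which the reports themselves do not guarantee, and the helper name `fold_reduction` is hypothetical.

```python
def fold_reduction(earlier: float, later: float) -> float:
    """Return how many times lower `later` is than `earlier` (same units)."""
    if later <= 0:
        raise ValueError("later estimate must be positive")
    return earlier / later


# Highest infant dietary exposure estimates quoted in the text (µg/kg bw/day).
efsa_2006_infants = 13.0
efsa_2015_infants = 0.857

ratio = fold_reduction(efsa_2006_infants, efsa_2015_infants)
print(round(ratio, 1))
```

The ratio comes out at roughly 15, which sits at the upper end of the reported 4-15 range.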
The evolution of the exposure assessments thus shows a trend toward more sophisticated methodologies, greater precision, and specification of exposures, which are characterized with progressively more significant digits (from 1.6 µg/kg bw/day to 0.857 µg/kg bw/day), but no significant changes in the conclusions about the possible risk from aggregated sources of exposure. This evolution suggests that even though more causal chains are taken into account, cause-effect relations are considered as well-determined, and the challenges of indeterminacy may be underestimated.

Hazard Assessment
The early hazard assessments of BPA were based on the identification of one (or few) key toxicity studies following standardized OECD test guidelines and performed in compliance with good laboratory practice. Such so-called "guideline studies" were then used for the identification and characterization of the critical (most relevant and sensitive) adverse effects. In 2002, the Scientific Committee on Food identified the guideline study of Tyl et al. (2002) as the most appropriate to establish a temporary TDI for BPA of 10 µg/kg bw/day. This TDI was based on a No-Observed-Adverse-Effect Level (NOAEL) of 5 mg/kg bw/day for effects in body and organ weight, applying an uncertainty factor of 500 (the default 100 for inter- and intraspecies differences, and five for uncertainties in the database on reproductive and developmental toxicity) (SCF, 2002). The same study of Tyl et al. (2002), together with the new guideline study by Tyl et al. (2008) (made available to EFSA before publication), was also pivotal in EFSA's (2006) reevaluation of BPA's toxicity. This time, the assessors set a full TDI for BPA at 50 µg/kg bw/day based on an overall NOAEL of 5 mg/kg bw/day for effects on the liver and applying a default uncertainty factor of 100. The assessors also found that the new study by Tyl et al. (2008) cleared previous uncertainties and removed the additional uncertainty factor of 5 used by SCF in 2002. In the same vein, in EFSA's (2010b) review of BPA's hazard assessment covering new evidence from the period 2007-2010, the previous TDI of 50 µg/kg bw/day, based on Tyl et al. (2002, 2008), was confirmed.
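The arithmetic behind the two TDI values can be checked with a minimal sketch. It assumes the simple relation TDI = NOAEL / uncertainty factor, which is consistent with the figures reported above; the function name `tdi_ug_per_kg` is ours for illustration and is not part of any EFSA tool.

```python
def tdi_ug_per_kg(noael_mg_per_kg: float, uncertainty_factor: float) -> float:
    """Return a TDI in µg/kg bw/day from a NOAEL in mg/kg bw/day.

    Assumes TDI = NOAEL / uncertainty factor (an illustrative simplification).
    """
    if uncertainty_factor <= 0:
        raise ValueError("uncertainty factor must be positive")
    return noael_mg_per_kg * 1000 / uncertainty_factor  # convert mg to µg


# SCF (2002): NOAEL of 5 mg/kg bw/day, default factor 100 times an extra factor 5.
scf_2002 = tdi_ug_per_kg(5, 100 * 5)
# EFSA (2006): same NOAEL, default factor 100 only.
efsa_2006 = tdi_ug_per_kg(5, 100)

print(scf_2002, efsa_2006)
```

Because the NOAEL is the same in both cases, dropping the additional factor of 5 accounts for the entire fivefold increase in the TDI.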
One of the reviewers of this article proposed an alternative interpretation, namely that "the evolution of the debate shows the desperation with which the European chemical risk assessment establishment wants to cling to the outcome of their past assessments on BPA." While it is possible that the continued use of the same critical studies reflects some type of anchoring in the BPA case, we may also speculate that changing critical studies could possibly increase the controversy. Both hypotheses support the argument that technical approaches to the reduction of uncertainty can be contested when there is indeterminacy because there is no obvious "right" choice from a scientific point of view.
In more recent assessments, EFSA places a stronger focus on methodological rigor and transparency. In EFSA (2015b), for example, a more complex hazard assessment approach was used, including a systematic literature review (with a call for data), a structured weight of evidence approach, an elaborated uncertainty analysis, and a public consultation of the draft on the BPA health risk. Transparency is based on the use of clearly predefined criteria, although the selection of criteria is not subject to debate. The weighting of the evidence was also more systematic and transparent, using a likelihood scale system that gives numerical scores to the different lines of evidence via expert judgment (for more details see Table II). In 2017, the level of standardization and protocolization was substantially increased. The ongoing reevaluation of BPA's toxicity makes use of several methodology guidelines that EFSA has developed in the last 10 years, among them the application of systematic review methodology (EFSA, 2010a), expert knowledge elicitation (EFSA, 2014), the (updated) use of the benchmark dose (EFSA, 2017d), the weight of evidence approach (EFSA, 2017b), and uncertainty analysis (EFSA, 2018). First, EFSA developed a protocol detailing a strategy for collecting data, appraising, and integrating evidence (EFSA, 2017a). Then, the draft protocol was opened to public consultation, amended accordingly (EFSA, 2017c), and tried on a pilot scale (EFSA, 2019) before starting the actual assessment.
What we observe in the hazard assessment is a tendency toward more protocols, guidelines, and methodologies. This is motivated by a move toward greater transparency, which serves to justify EFSA's assessments and protect the results from criticism. The pursuit of transparency through protocolization means that assessments take longer to be developed, and that the controversies are displaced, from a matter of identifying concerns to a series of technical puzzles to be tackled by analytical methods. We also observe a fragmentation of the assessment, which can be a consequence of indeterminacy: causal relations can only partially be determined, leaving small "gaps" in the assessment, which becomes somewhat of a patchwork of multiple results and methodologies.

Types of Uncertainty
Despite the increase in precision, protocolization, and transparency in the exposure assessments and the hazard assessments, the uncertainty has not been tamed. Specific issues of lack of data may have been solved, but as new sources of exposure and effects are detected, uncertainty remains a major obstacle to the establishment of recommendations and conclusions that are to withstand criticism from member states. This situation may suggest the existence of indeterminacy. In this section, we thus turn to the analysis of how uncertainty has been addressed, including the types of uncertainty identified and tackled by EFSA, and the higher-level uncertainties that "resist" quantification. Results are reported in Table III.
Overall, we observe that the type of uncertainty considered is exclusively the uncertainty within risk assessment. Uncertainty is tackled in a more systematic (protocolized) way starting from 2015. In previous assessments, uncertainty was reduced to an issue of lack of data and was expected to be "solved" by including more studies, more evidence, and more factors (of exposure, effect, etc.). We report how some of the uncertainties were treated in the exposure assessment and the hazard assessment.
(1) One important uncertainty in the exposure assessment concerns the sources and routes of exposure to BPA. The position of SCF in 2002 and that of EFSA (2006) was that only dietary exposures were relevant. In 2006, the uncertainties in the assessment related to the absence of data (i.e., concentrations of BPA in drinking water) and factors affecting BPA's release (i.e., heating of plastic containers) (EFSA, 2006).
In the following years, human biomonitoring studies started reporting small amounts of bioactive BPA in the blood. These results challenged the conclusions of toxicokinetic models, which predicted a rapid and efficient clearance of bioactive BPA from the body after oral absorption. These discrepancies suggested either that clearance was not as efficient as previously assumed or that additional routes of exposure exist (Gies, Heinzow, Dieter, & Heindel, 2009;Vandenberg, Hunt, Myers, & Saal, 2013). Besides initiating a heated debate about the toxicokinetics of BPA (specifically addressed by EFSA 2008) and the analytical approaches used in biomonitoring studies, these differences also led to the recognition of the importance of other routes of exposure besides food uptake. This is an example of indeterminacy, as Hazard characterization based on guideline study.
Use of refined uncertainty factor.
Follow a systematic review methodology described in the hazard assessment protocol that was previously developed and validated.
the well-defined character of known causal relations is questioned. The need to consider new sources of exposure required a broadening and contestation of the problem framing, which we have associated with ambiguity. However, the issue was treated as a matter of uncertainty within the risk assessment. In 2015, EFSA conducted a multiroute aggregate exposure assessment, including dietary and nondietary sources. Average and high aggregate exposure estimates were calculated from source concentrations and corresponding use frequencies (e.g., food intake, handling of thermal paper) for different age groups and compared with estimates based on urinary biomonitoring. With respect to dietary exposures, the more comprehensive databases filled previous data gaps, but also revealed unanticipated uncertainties (e.g., the detection of high levels of BPA in food of animal origin, which could not be linked to food packaging). The emergence of new uncertainties at every assessment suggests the existence of indeterminacy as an underlying condition. Aggregated exposures (dietary and nondietary) were highest for adolescents (up to 1.449 µg/kg bw/day) and results had large uncertainty ranges. The uncertainties were mainly related to the exposure estimates for nondietary sources, for which data on necessary parameters were scarce (e.g., patterns of thermal paper handling, absorption of BPA through the skin, how to convert dermal exposures to oral equivalents), leading to conservative assumptions and extrapolations. Expert judgment was used to assess the uncertainty of each source of exposure and sensitivity analysis was used to assess how uncertainties combined across multiple sources. The assessors reported that the good correlation between internal and biomonitoring exposure estimates suggested that no major exposure sources had been overlooked. However, they also noted that the uncertainties in both estimates were considerable (EFSA, 2015b).
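The logic of an aggregate estimate of this kind, and its sensitivity to a poorly known parameter, can be illustrated with a minimal sketch. This is not EFSA's model: the class and function names, the two sources, and all numerical inputs (amounts, use frequencies, absorbed fractions, body weight) are hypothetical placeholders, and the simple "absorbed fraction" stands in for the more elaborate conversion of dermal exposures to oral equivalents.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExposureSource:
    name: str
    route: str                  # "oral" or "dermal"
    amount_per_event_ug: float  # hypothetical BPA amount per use event (µg)
    events_per_day: float       # hypothetical use frequency (events/day)
    absorbed_fraction: float    # hypothetical fraction absorbed (0 to 1)


def aggregate_exposure(sources, body_weight_kg):
    """Return total absorbed BPA in µg per kg body weight per day."""
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    total_ug_per_day = sum(
        s.amount_per_event_ug * s.events_per_day * s.absorbed_fraction
        for s in sources
    )
    return total_ug_per_day / body_weight_kg


def vary_dermal_absorption(sources, body_weight_kg, fractions):
    """Recompute the aggregate estimate for each assumed dermal absorbed fraction."""
    results = {}
    for fraction in fractions:
        adjusted = [
            replace(s, absorbed_fraction=fraction) if s.route == "dermal" else s
            for s in sources
        ]
        results[fraction] = aggregate_exposure(adjusted, body_weight_kg)
    return results


# Hypothetical inputs, for illustration only (not EFSA data).
EXAMPLE_SOURCES = [
    ExposureSource("canned food", "oral", 20.0, 1.0, 1.0),
    ExposureSource("thermal paper", "dermal", 5.0, 2.0, 0.1),
]

if __name__ == "__main__":
    estimates = vary_dermal_absorption(EXAMPLE_SOURCES, 50.0, [0.1, 0.5])
    for fraction, estimate in estimates.items():
        print(f"dermal absorbed fraction {fraction}: {estimate:.2f} µg/kg bw/day")
```

Even in this toy setting, the aggregate figure depends directly on the assumed dermal parameter, which is the kind of dependence that sensitivity analysis is meant to expose and that scarce data leave open.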

Overall uncertainty analysis
(2) In the hazard assessment of BPA, the main source of uncertainty has been the widely diverging views on the quality of low-dose studies (reporting effects below 50 mg/kg bw/day) and the overall significance of low-dose effects in the risk assessment of BPA (Beronius et al., 2010). Some of these diverging views come from institutionalized practices in the regulatory context that give more weight to guideline studies, considering them reliable by default (Beronius, Molander, Rudén, & Hanberg, 2014). At the same time, the reliability of nonguideline studies and their significance for hazard identification is questioned, for reasons such as limitations in the execution and reporting of the experiments, or the apparent inconsistencies in the database (SCF, 2002; EFSA, 2006, 2010b, 2015b). Guideline studies, on the other hand, have been criticized for their limitations in identifying and evaluating adverse health effects caused by endocrine disruptors, and it has been noted that nonguideline studies, using novel methods, are more sensitive and relevant for this purpose (Kortenkamp et al., 2011; Myers et al., 2009).
In 2015, EFSA addressed these uncertainties by introducing a more structured weight of evidence approach (to make better use of the whole body of evidence) and a more refined uncertainty analysis (where, instead of using default uncertainty factors, the assessors tried to quantify as far as possible all remaining uncertainties) (EFSA, 2015b). Yet, as the public consultation feedback reveals, these technical solutions are not enough to resolve the underlying uncertainties: "(…) although the presentation of the tool is welcome, as things stand it is not possible to judge if the tool successfully distinguishes between better studies and worse. There appears to be a certain amount of confusion about the sort of criteria which differentiate good research from bad versus direct from indirect, precise from imprecise, included from excluded, and research which is conducive to calculating a TDI against research that is not. These are not the same thing, yet they appear to be conflated in this tool" (EFSA, 2015a, p. 263).
There were discrepancies not only about individual studies, but also regarding the overall weight of evidence assessment. For example, while EFSA (and the British Food Standards Agency) considered the body of evidence with regard to neurotoxicity insufficient to draw conclusions on this effect, the French, Danish, and Swedish regulators found the available evidence sufficient to include these effects in the hazard characterization. The Swedish agency suggested: "We would encourage the EFSA panel to re-assess this endpoint as the overall assessment of the data in our opinion leads to the conclusion that neurodevelopmental effects are likely (…)" (EFSA, 2015a, p. 178).
We argue that in addition to uncertainty with regard to data gaps and models (i.e., how to extrapolate from animal to human) within risk assessment, the evolution of the debate shows increasing ambiguity. The increasing number of issues that are taken into account in exposure and hazard assessments gives way to an increasing number of interpretations of the data and an increasing number of instances in which it is difficult to disentangle interpretative differences from differences in value-judgments. More information amplifies, rather than settles, the discussion. In the next section, we discuss how ambiguity feeds the never-ending controversy on BPA.

Controversy
As uncertainty is only recognized within the risk assessment, the issues of indeterminacy and ambiguity remain unanswered, generating controversy. In Table IV we show how controversies are closely linked to the many rounds of assessment and reevaluation of BPA, and we comment below on how assessments from different agencies have reignited the ongoing debate.
An analysis of the different requests sent to EFSA shows that, as time passes, the requests get more specific, for example from the evaluation of hazard to human health from BPA in foodstuff in SCF (2002), to the much more specific request in EFSA (2015b), specifying which vulnerable groups to include, which sources of exposure to address, and which type of toxicological information to include. The request gets even more specific in EFSA (2017a), where, among other things, the European Commission specifies which methodology has to be used (a protocol detailing criteria for study inclusion and for toxicological evidence appraisal) and places greater demands in terms of transparency.
When it comes to the specific concerns, we see a tendency to include more and more factors: from the initial interest in general toxicity and carcinogenicity in SCF (2002), to concerns about low-dose effects on reproduction and development in EFSA (2006), to neurobehavioral effects in EFSA (2010b), to a full array of endocrine-related endpoints in EFSA (2015b, 2017a), including effects on the mammary gland and effects on the immune and metabolic system. The increase in scope suggests that initial concerns are not resolved; rather, they accumulate. For example, the early concerns about low-dose effects on development and reproduction from SCF (2002) and the neurobehavioral effects from EFSA (2006) are still unresolved in 2020.
As studies become more specific, and as more factors and more specialized scientific evidence are taken into account, controversy also increases. The proliferation of studies does not lead to a shared scientific understanding, but rather to a variety of different opinions, conclusions, and recommendations.
Risk assessments at the international, European, and member state levels report different conclusions with respect to the risk posed by BPA, mainly due to differing views on the significance attributed to the low-dose data for human health risk assessment (for a review, see Beronius et al., 2010). As a consequence, EFSA has been constantly asked to reevaluate its opinion on BPA (for example when new studies are published), to advise on regulatory decisions taken by member states, or due to questions raised to the Commission. Just between 2008 and 2010, EFSA received five different requests (EFSA, 2008a, 2008b, 2010b), including a reevaluation of all new evidence published between -2010 (EFSA, 2010b). From 2006 onwards, the controversy concerning possible health risk during sensitive phases of development (in particular for unborn and young children) intensified. New animal studies suggested an increased risk for these segments of the population, while infants have repeatedly been identified as the most highly exposed. Academic circles criticized how EFSA weighted the evidence in the 2006 and 2010 assessments, basing the conclusions on the results of two standardized guideline studies and not taking into consideration the results of hundreds of available nonguideline studies (Gies et al., 2009; Myers et al., 2009). The 2006 assessment also raised concerns with regard to a lack of transparency and possible conflicts of interest among Panel members (Gies & Soto, 2013). There were increasing questions concerning the contribution of nondietary exposures and the fact that EFSA had not systematically reviewed all available information in its assessments (Gies et al., 2009).
Member state agencies have also challenged the sufficiency of current TDIs to protect the most vulnerable (and highly exposed). In 2012, for example, a report from the Swedish Chemicals Agency (KemI) proposed an alternative TDI 2 to 3 orders of magnitude lower than the then current TDI (50 µg/kg bw/day). The report concluded that "although no single study reviewed here was considered reliable enough to serve as a key study for the derivation of an alternative reference dose, if the data is considered as a whole, effects are consistently observed at doses well below those which serve as the basis for the current TDI for BPA" (KEMI, 2012, p. 6). Similarly, after the publication of EFSA's 2015 comprehensive reassessment, the Danish National Food Institute concluded that the new temporary TDI (4 µg/kg bw/day) was not sufficiently protective against BPA's effects on the mammary gland and proposed an alternative TDI an order of magnitude lower (DTU-Food, 2015). In 2013, the French food regulator ANSES reported they could "identify risk situations for the unborn child, associated with exposure to BPA during pregnancy" (ANSES, 2013, p. 9). The Dutch National Institute of Public Health and the Environment (RIVM) reached a similar conclusion in 2016, arguing that EFSA had overestimated the safe daily exposure to BPA, this time concerning the effects on the immune system of unborn and young children (RIVM, 2016). One response to the increasing controversy has been to arrange meetings with member state experts to "build mutual understanding" (EFSA, 2013) and, more recently, to include public consultations (engaging a broader spectrum of stakeholders) in the assessment process. EFSA has used comments from the public consultations to revise the 2015 and 2017 draft protocols, and to address suggestions made by other national agencies (e.g., reanalyzing specific studies and including them in the hazard identification, or addressing effects that were left out via an additional uncertainty factor) (EFSA, 2015a, 2017c).

DISCUSSION
As the controversy over BPA increases and assessments are contested by more and more parties, EFSA's approach has been to increase precision and methodological specification, to pursue transparency through protocolization, to rely on the characterization of uncertainty within the risk assessment through quantification, and to tame uncertainty by providing conservative approaches to conclusions and recommendations. We argue that this approach renders controversies and uncertainty as technical problems, limiting the understanding of uncertainty to a matter of lack of precision, incomplete assessments, and methodological underspecification. As a result, more and more data are produced in the hope of reducing the technical uncertainties within risk assessment through greater precision, specification, and refinement. Paradoxically, the proliferation of studies also leads to a proliferation of interpretations about how to govern BPA, generating other types of uncertainty such as indeterminacy and ambiguity. We discuss these two types of uncertainty in turn.
Indeterminacy can be related to the fact that cause-effect relations are not fully defined, creating several sources of uncertainty in the exposure assessment. Besides the inherent variability of some of the parameters (e.g., eating habits), there are also unidentified sources of exposure and a lack of clarity about what different routes of exposure imply in terms of "toxicological relevant" exposures (bioactive BPA). During the last decades, EFSA has dealt with these uncertainties by paying particular attention to the exposure of vulnerable groups, as well as by including more routes and sources of exposure and improving the methodology (e.g., comparing different estimates, aggregating different sources and routes of exposure, and analyzing the uncertainties quantitatively). Our results show that, even if dietary estimates are more precise and reliable than before, surprises can still occur (e.g., the identification of fresh animal products as the second largest dietary source of exposure). Although central estimates of aggregate exposures (dietary and nondietary) are below the TDI, these have very large uncertainty ranges. Even if the contribution of nonoral exposures to total exposures is clearer, the toxicological relevance of those relatively low dermal contributions is still unknown. Low dermal exposures could be of equal or higher toxicological relevance than dietary exposures because they bypass clearance mechanisms in the liver and the intestines, resulting in higher circulating levels of bioactive BPA (von Goetz et al., 2017). Additionally, the toxicological relevance of current BPA exposures in the context of real-life coexposures with other chemicals (e.g., other similar acting bisphenols) is unknown. This reflects how greater specification in the context of indeterminacy produces fragmented evidence, as we only have knowledge of some parts of the problem for which causal relations can be determined. This illustrates the untamable nature of uncertainty in the exposure assessment, where more data and better analysis cannot always give a definitive answer. Some of these issues might be resolved in the future (e.g., dermal exposure), but in today's context, such irreducible uncertainty results in the mobilization of different legitimate interpretations of what is the best thing to do. For example, including additional potential sources of exposure (e.g., medical equipment, toilet paper), using different analytic approaches (e.g., biomonitoring studies based on blood instead of urine), choosing between different pharmacokinetic (PBPK) models, using mixture allocation factors for coexposures, and so on, can lead to diverging safety recommendations.
The existence of multiple nonequivalent interpretations of the same data also generates ambiguity, which fuels the controversy. In the case of BPA, different (interpretative and normative) understandings have resulted in very different assessments of the body of toxicological data available. Differences concern, among other things: the assessment of the relevance and quality of low-dose studies, the selection of key studies and most sensitive endpoints to set a health-based guidance value (including considerations on the adversity of the effect and their relevance to humans), the characterization of the dose-response curve, the selection of the most suitable methodology to reach risk conclusions, the evaluation of the overall uncertainties, and so on. We consider one example in detail, to show how ambiguity in the hazard characterization can lead to divergent safety recommendations. While both EFSA and the Danish DTU identified effects on mammary gland proliferation as likely (based on the study by Delclos et al., 2014), they disagreed on whether or not a point of departure could be identified for the derivation of a reference dose. EFSA reported that the dose-response relationship in Delclos et al. (2014) was "not suitable to derive with any confidence a reference point for this endpoint" (EFSA, 2015b, p. 70), while DTU, using a different statistical analysis, identified a reference point (LOAEL) from this study that was used to derive an alternative TDI of 0.7 µg/kg bw (an order of magnitude lower than EFSA's) (DTU-Food, 2015). Ambiguity may be hard to distinguish from expert bias, for example when it comes to preferences for guideline studies over nonguideline studies. EFSA has tried to address the issue of bias through protocols (for example, systematic literature reviews, structured weight of evidence methodologies, and sophisticated uncertainty assessment, including expert elicitation methods) to promote a more equal treatment of available studies and to guard against (some cognitive) biases. However, the plurality of interpretations is also due to the complexity of the BPA case, and to the extent that complexity cannot be reduced, neither can the plurality of the scientific knowledge base.
When it comes to policy recommendations, the new protocols and their resulting assessments do not yield enough evidence to support a change in conclusions. We take the issue of establishing the TDI as a case in point. TDI recommendations have undergone significant, if slow, changes, from 10 µg/kg bw/day in 2002 (SCF, 2002), to 50 µg/kg bw/day from 2006 to 2010 (EFSA, 2006, 2008a, 2010b), to the much lower recommendation of 4 µg/kg bw/day in 2015 (EFSA, 2015b). Notwithstanding the difference between 2006-2010 and 2015 (an order of magnitude!), the controversy about low-dose effects rages on. The reliance on quantification, as well as the increase in precision and specification, helps reduce the uncertainty within risk assessment, but more data also increase the ambiguity. We argue that the technical treatment of the uncertainties generated by ambiguity, and the insistence on providing a single metric as the answer to the mounting doubts that surround BPA, point to the possible need to complement the risk management approach with the analysis of other types of uncertainty in the context of untamable uncertainty.
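The size of these shifts can be checked with simple arithmetic. The sketch below uses only the three TDI values quoted in this paragraph; the dictionary labels and function names are our own illustrative choices.

```python
import math

# TDI values in µg/kg bw per day, as quoted in the paragraph above.
TDI_BY_ASSESSMENT = {
    "SCF 2002": 10.0,
    "EFSA 2006-2010": 50.0,
    "EFSA 2015": 4.0,
}


def fold_reduction(old_tdi, new_tdi):
    """How many times lower the new TDI is (values below 1 mean it was raised)."""
    return old_tdi / new_tdi


def orders_of_magnitude(old_tdi, new_tdi):
    """Base-10 logarithm of the fold reduction."""
    return math.log10(fold_reduction(old_tdi, new_tdi))


if __name__ == "__main__":
    labels = list(TDI_BY_ASSESSMENT)
    for earlier, later in zip(labels, labels[1:]):
        old, new = TDI_BY_ASSESSMENT[earlier], TDI_BY_ASSESSMENT[later]
        print(
            f"{earlier} -> {later}: "
            f"{fold_reduction(old, new):.2f}-fold, "
            f"{orders_of_magnitude(old, new):.2f} orders of magnitude"
        )
```

On these figures, the 2015 value is 12.5 times lower than the 2006-2010 value, or roughly 1.1 orders of magnitude, whereas the 2006 change raised the TDI fivefold relative to 2002.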
EFSA's limited responsiveness to new studies does not mean that the agency has not devoted impressive amounts of work to the assessment of BPA. Considerable effort has gone into the creation of protocols, guidelines, and so on. Our results show a clear trend toward greater protocolization, standardization, and refinement of methods. Many of the changes in EFSA's risk assessment practice are commendable (taking all evidence and toxicological endpoints into consideration, explicit appraisal of the reliability and relevance of the studies). Yet, methodological advancements can only deal with technical types of uncertainty and fall short of addressing indeterminacy and ambiguity, since there is no unique (and uncontested) strategy for aggregating and summarizing the BPA evidence. Multiple scientific strategies can be chosen, reflecting varying levels of comprehensiveness and detail, objectives, methodology, criteria, and so on (see for example Whaley et al., 2020), each having different social ramifications.

CONCLUSION
In the conclusion, we return to the question of how different types of uncertainty interact with each other. Our results show that between 2002 and 2020, EFSA's risk assessments of BPA have become progressively more comprehensive, including more references, types of evidence, and effects. There is a tendency toward greater precision and specification of results, and toward protocolization and standardization of all processes included in the assessment (from literature review, to uncertainty assessments, and public consultation). Yet, results are also more fragmented. We argue that fragmentation of the knowledge base is a result of indeterminacy, which does not allow for a complete and coherent aggregation of all results, as causal chains are only partially understood and determined. A further consequence of fragmentation is the irreducible pluralism of results, which generates ambiguity. Because ambiguity stems from the coexistence of multiple tenable interpretations of evidence, from differences in value-judgments about how to deal with nonequivalent types of evidence, and from the sometimes interwoven nature of interpretative and normative differences, a tension is created between the strategies used to deal with different types of uncertainty. On the one hand, the uncertainty within risk assessment is expected to be reduced through better specification of methods and results; on the other hand, the increase in data, protocols, and methods increases the number of possible interpretations of the results, leading to greater ambiguity and controversy.
Our results confirm the findings of several scholars working at the science-policy interface (Funtowicz & Ravetz, 1993; Sarewitz, 2004; Stirling, 2003), that scientific knowledge plays an essential role in informing policy decisions, but is not sufficient. In the context of untamable uncertainties, it is not clear that more research helps inform policy decisions. For this reason, some scholars are critical of the tendency to invest significant resources and efforts in refining results: "Scientific resources end up focused on the meaningless task of reducing uncertainties pertinent to political dispute, rather than addressing societal problems as identified through open political processes" (Sarewitz, 2004, p. 399). Complementary mechanisms for decision making are required in case of high ambiguity. This does not discard the science, but it does raise doubts about the level of precision that is needed. For this reason, Saltelli and Giampietro (2017) speak of spurious precision. Because of the fragmented nature of evidence in the BPA case, advances in specification of results do not warrant sufficient confidence for adjustments in overall conclusions.
Together with Jasanoff (2007), we claim that there is a need for more humility in policy advice about BPA. An alternative approach that considers multiple types of uncertainty would embrace the temporary and fragmented nature of evidence and may help construct a more precautionary model of governance, in which uncertainties are not seen as a problem to be quantified and managed, but as an indication that decision making cannot avoid social and ethical considerations alongside quantitative evidence. In their research on controversies on endocrine disruptors, McIlroy-Young, Leopold, and Öberg (2021, p. 481) note that "when dealing with complex problems, all scientific perspectives are limited, often in ways that are invisible to those involved. As such, an evaluation process that engages with conflicting perspectives is beneficial." We agree, and propose moving from a single, overconfident result to plural, conditional advice. A more prudent approach to governance may be to regulate all bisphenols, not just BPA, reducing their usage as much as possible and investing in the development of (inherently safe) alternative substances and materials, rather than in more research about harm.