Looking Back to Look Forward: A Review of FDA's Food Additives Safety Assessment and Recommendations for Modernizing its Program

Authors


Direct inquiries to author Maffini (E-mail: mmaffini@pewtrusts.org).

Abstract

Scientists participating in 2 multistakeholder meetings in 2011 and in other events have identified a number of ways in which the methods the U.S. Food and Drug Administration (FDA) uses to assess the safety of chemicals in human food should be improved and updated. We evaluated whether FDA's current methods, including its decision-making process, are outdated, as alleged by its critics. We examined a 1982 report by the Select Committee on GRAS Substances (SCOGS) that included suggestions to enhance food additive safety. FDA established SCOGS to review the safety of “generally recognized as safe” (GRAS) substances in response to a directive by President Nixon. When evaluating FDA's response to SCOGS’ suggestions, we found that many remain unresolved and relevant today. Our analysis demonstrates that in many cases FDA has not kept pace with scientific developments. Although the timing is difficult to pinpoint, we concluded that this situation became more significant after 1997, when FDA launched the voluntary GRAS notification program aimed at enticing manufacturers to inform the agency of their own safety decisions. Looking forward, we recommend that the agency convene an unbiased and independent expert workgroup to conduct a comprehensive review of FDA's science and decision making and develop a path to modernize food additives safety assessment. Areas of concern include toxicology test guidelines, tools used to predict health outcomes, conflict of interest in manufacturers’ decisions, lack of a reassessment strategy, and lack of a definition of harm.

Introduction

In 1969, President Richard Nixon requested that the U.S. Food and Drug Administration (FDA) reexamine the structure and procedures used by the agency to approve chemical additives to ensure they were “fully adequate.” The administration initiated “a full review of food additives” with particular focus on the safety evaluation of chemicals that FDA designated as “generally recognized as safe” (GRAS) (Nixon 1969).

In response, FDA established the Select Committee on GRAS Substances (SCOGS) with the task of “evaluating the safety” of more than 450 GRAS substances. The committee conducted its analysis from 1972 to 1982; it was made up of experts in biochemistry, pharmacology, and medicine and was operated by a 3rd party (Contract Nr FDA-223-81-2394). Its final report, “Insights on Food Safety Evaluation” (SCOGS 1982), published in 1982, contained SCOGS’ conclusions for each of the chemicals and provided FDA with a roadmap to evaluate the safety of additives by making suggestions to enhance the agency's program.

Since that time, an array of peer-reviewed scientific studies and published reports has questioned the safety of certain artificial colors, sweeteners, preservatives, and pesticide residues in food. The controversy over bisphenol-A intensified the questions about whether FDA uses the best scientific evidence to make safety decisions.

The food additive safety issue came to a head in 2010 with a commentary by the editors of the journal Nature (Nature 2010) calling for “scientific reform” to limit the “approval of chemicals of questionable safety.” The editorial prompted a thoughtful response by 2 senior FDA scientists (Lorentzen and Hattan 2010). They stated that “safety regulation depends upon scientific consensus” and that “without a consensus over the state of science for a particular issue, there is no end in sight to the claims that can be made for which a consensus does not exist.” They continued by saying that “[t]o abandon this conceptual objective for any temporary, subjective, personal convenience is a formula for a general regulatory impasse.” The discussion initially focused on bisphenol-A because of its impacts on the hormonal system; however, the questions about FDA's use of science in safety decision making have expanded beyond endocrine disruption.

In light of these criticisms, The Pew Charitable Trusts, Nature, and the Institute of Food Technologists hosted 2 multistakeholder events (hereafter Pew workshops) in 2011 to explore the roots of these criticisms. FDA was a key participant in the discussions and material preparation. The discussions made clear that there are significant questions about hazard assessment, estimating dietary exposure, and the decision-making process for food additives (Maffini and others 2011; Alger and others 2013).

Because FDA is tasked with both assessing the science and making policy decisions, its actions have enormous public health implications and significant consequences for the food industry, and could affect the cost and availability of foods. Therefore, controversies should be expected. The objective for a regulatory agency grounded in science, such as FDA, is to base decisions on the best available and most up-to-date data and develop science-based policies. To earn the confidence of industry, public health advocates, and consumers, the agency's methods must also be clearly explained and transparent.

Here we present a detailed evaluation of FDA's use of science to identify whether the agency's safety assessment methods are outdated, as alleged by its critics. We compared the current controversies surrounding food chemical safety identified by the Pew workshops’ participants with the suggestions made by SCOGS in 1982 and analyzed the common areas of concern. We used SCOGS’ final report as a reference because it was the most comprehensive and structured analysis of the science and decision making used in food additive safety available and was prepared in close coordination with FDA. We conclude with our recommendations for FDA looking forward.

Methods

We reviewed SCOGS’ final report and identified statements where the committee members explicitly described a measure that, in their opinion, would enhance safety assessment. For simplicity, we refer to each of SCOGS’ points as suggestions, although the committee may have characterized them as an insight, opinion, recommendation, or action the agency should take. For each suggestion, we evaluated FDA's handling of it and also determined whether a similar issue was raised by participants at either of the Pew workshops.

We also reviewed the proceedings from the 2 Pew workshops held in 2011 to identify whether the issues raised by SCOGS were also raised as concerns by the workshop participants. These workshops dealt with chemical hazard identification and characterization and dietary exposure assessment (Maffini and others 2011; Alger and others 2013).

We identified the SCOGS suggestions that FDA did not implement and compared them to the ongoing concerns raised in the workshop proceedings. Where we found matches, we consolidated them into broad “areas of concern.”

For simplicity, when we use the term “food additives” we include food and color additives as well as GRAS substances. In addition, we refer to FDA only in the context of its food additive regulatory program.

Background

FDA's criteria for food additives safety assessment

In 1982, the same year as SCOGS’ final report, FDA released its “Toxicological Principles for the Safety Assessment of Direct Food Additives and Color Additives Used in Food,” also known as the Redbook, which contained guidance for industry in performing toxicological tests. The 1st edition was more than a compilation of recommended test methods. It contained FDA's criteria for assessing safety and ensuring that chemicals remain safe after their approval and introduction into the market (FDA 1982). The criteria were based on the concept of “concern” where the “degree of concern” was determined by the “extent of human exposure and the toxicity of the additive.”

The concern levels determined “the extent and type of basic toxicological testing of an additive” (FDA 1982; Rulis and Hattan 1985). The levels were developed using:

  • Estimated per capita exposure based on information that food and additive manufacturers voluntarily submitted to the National Academy of Sciences; and
  • Chemical structure based on a qualitative decision tree assigning chemicals to 3 broad categories established by inferences between structure and known toxicity.

The higher the concern level, the higher the extent of toxicity testing. The levels are still used today to determine the “minimum toxicity tests to be performed for safety evaluation of direct food additives and color additives” (FDA 1982, 2006b).

The 1982 Redbook also recognized that additional toxicity information may be needed because either the original data were insufficient to make a final determination or “[a]dditives once approved do not always remain static relative to the exposure and toxicological criteria used originally to evaluate their safety.” The agency developed a framework for reassessment that considered public health concerns, demonstrated toxic potential at levels present in the food, and accounted for economic and administrative realities (FDA 1982).

Since 1982, science and technology have made great strides and, as a result, our knowledge of normal body functions and disease development and our capability to measure chemicals have grown dramatically. Historically, as our understanding of new scientific findings has developed, regulatory programs have incorporated them into guidance and rules.

Introduction to SCOGS’ suggestions

The SCOGS report provides a useful starting point for understanding FDA's response to scientific progress and the origin of some of the current controversies for 3 reasons: 1st, FDA established the committee and “concurred that a critique of the experience was appropriate” (SCOGS REF), so it was fully aware of the suggestions; 2nd, the SCOGS report was the most comprehensive and structured analysis of the science and decision making used in food additive safety available; and 3rd, FDA has had sufficient time to respond to the suggestions.

Introduction to the Pew workshops

The 2 workshops held in 2011 were hosted by The Pew Charitable Trusts, Nature, and the Institute of Food Technologists. More than 80 scientists, lawyers, and public health advocates representing industry, academia, public interest organizations, and government agencies participated. These workshops dealt with chemical hazard identification and characterization and dietary exposure assessment.

When the organizers developed the agendas, they were not aware of the SCOGS report. Therefore, there was no attempt to synchronize the topics of discussion with the committee's suggestions. Several months after the last workshop, the authors of this article read SCOGS’ final publication and noticed startling similarities between many of the issues even though 30 y had passed.

Results

Analysis of SCOGS’ suggestions

SCOGS made 35 suggestions, some of which overlapped. We grouped them into 9 topics:

  1. Substance identification
  2. Exposure assessment
  3. General toxicity testing
  4. Specific testing needs
  5. Role of human studies
  6. Evaluating evidence
  7. Consistency in approaches
  8. Definition of safety
  9. Revisiting assessments.

Table 1 summarizes the suggestions and our analysis of their current status based on whether FDA addressed each of them. The 1st column contains SCOGS suggestions; we kept much of the original language and structure. The 2nd column contains their current status.

Table 1. Summary of the current status of the suggestions SCOGS made to FDA in 1982
What did SCOGS say? | Authors’ summary of current status
Topic 1. Substance identification
Fully characterize chemicals to ensure material actually used in food is the same as the product on which toxicity tests were performed. Use Food Chemicals Codex (FCC) specifications to facilitate process. (Chapter III, Identity and characterization of substances added to foods) | FDA has explicit requirements that petitions for agency approval must have detailed characterization of the chemical. Its decisions establish specifications in rules and frequently reference the FCC. As the FCC has been updated, these references are outdated. In the late 1990s, FDA moved away from petitions for direct additives to an informal voluntary notification approach without rulemaking or regulations. Our review of a representative sample of notifications indicates that 50% do not adequately describe the chemical.
Clearly establish the identity of all food ingredients. Include in specification the levels of toxicologically relevant components. (Chapter VI, Specific suggestions, Identity of substances added to foods) |
Carefully consider byproducts of reactions of chemicals in food. (Chapter III, Identity of substances consumed) | OFAS considers byproducts on a case-by-case basis.
Topic 2. Exposure assessment
Estimate migration from packaging. (Chapter III, Identity of substances consumed) | FDA developed methods to measure chemical migration from packaging and published guidance on their use. It continues to refine the methods.
Develop accurate estimates of consumption of food products using data from those who eat the food and determine upper percentiles of intake. (Chapter III, Food consumption data) | FDA developed sophisticated methods to use federal food consumption information and supplements it with market research reports. It assesses safety based on the exposure by “high” consumers, typically those who consume quantities of food at the 90th percentile level.
Need to know total quantity added to foods and per capita disappearance when chemical is widely distributed in food supply. (Chapter VI, Specific suggestions, Consumer exposure data) | FDA monitors postmarket exposure by analyzing retail food for selected elements, nutrients, pesticides, and pollutants. It does not have the authority to systematically collect information from food manufacturers after approving an additive's use.
Consider subpopulations who: • Are particularly susceptible; • Consume substantial quantities of a type of food; and • May benefit from nutrient supplementation. (Chapter VI, Specific suggestions, Consumer exposure data) | FDA typically does not consider all dietary sources such as pesticides, chemicals in drinking water and dietary supplements, which limits the accuracy of the exposure assessment. The accuracy is also limited by uses allowed through GRAS determinations made by food manufacturers about which the agency is not notified. See Subpopulations, and Reassessment and consistency across substances sections for more information.
Topic 3. General toxicology testing
Do not assume there is no hazard below an arbitrary threshold. (Chapter IV, Toxicological insignificance) | In 1995, FDA exempted chemicals from all toxicological testing if they are not suspected carcinogens and the amount in the diet is estimated to be less than 0.5 ppb. It also calls for only genotoxicity testing for chemicals with estimated cumulative exposure from 0.5 ppb up to and including 50 ppb. These levels were based on a limited set of data that do not consider hormonal or behavioral impacts. See Toxicological insignificance section for additional analysis.
Know how chemicals are absorbed, distributed, metabolized and excreted (ADME) by the body. In studies, use doses at levels relevant to human exposure. Develop tools to measure relevance of different test methods. (Chapter V, Relevancy) | FDA's Redbook requests ADME information for all but direct additives with the lowest level of concern. These levels define minimum toxicology testing needs. Authors reviewed a representative sample of GRAS notices for which FDA had no questions and found that the notifier did not provide ADME information in 50% of the notices. Level of concern was not identified in any of the notices. See ADME section for additional analysis. FDA does not call for animal studies to use doses relevant to consumers’ exposure.
Use health statistics to set priorities to develop and implement test methods considering cardiovascular disease, cancer, hyperactivity, neurological effects, immune system impacts, and reversibility of potential danger. (Chapter VI, Future priorities) | FDA considers health statistics on a case-by-case basis. For example, it is evaluating the contribution of the additives sodium and trans fats to cardiovascular disease. In response to concerns with cancer, OFAS calls for genotoxicity tests for all direct additives and for food contact substances that may exceed 0.5 ppb in the diet. FDA does not systematically reassess safety of substances or set priorities for review of prior decisions. See Reassessment and consistency across substances section for additional analysis.
Develop reliable test methods to evaluate the significant health hazards relevant to food ingredients. (Chapter VI, Future priorities) | In 1982, FDA was on the leading edge of safety assessment, releasing its Redbook guidance for industry in performing toxicological studies. It updated the document in 1993 and 2000, and periodically since then. However, the document has fallen behind guidance by the Organisation for Economic Co-Operation and Development (OECD) and EPA. See Endocrine systems, Subpopulations and Behavioral impacts sections for additional analysis.
Maintain a balance between a comprehensive battery of tests and simpler, less time-consuming and less expensive model systems to evaluate toxicity. (Chapter VI, Future priorities) | FDA established a battery of tests in the Redbook and integrates predictive models into that analysis. It is participating in Tox21 to develop less expensive tests and has supported and used computational models. See Behavioral impacts and Endocrine systems sections for additional analysis.
Topic 4. Specific testing needs
Use in vitro tests as an indicator of need for in vivo tests for potential carcinogenicity. (Chapter IV, Single-cell systems and submammalian models) | FDA systematically uses in vitro tests for genotoxicity.
As tissue hormone receptor tests become available, use them to identify potential toxicity concerns for further evaluation. Test at doses relevant to actual exposure, combined with potentially interactive chemicals. (Chapter IV, Nonhuman mammalian systems) | FDA's Redbook considers endpoints such as reproductive toxicity that may result from endocrine disruption. It does not recommend measuring more sensitive but less evident endpoints that reflect potentially significant harm from hormonally active chemicals. Nor does it require doses that are relevant to actual human exposure. When alerted to potential problems, FDA considers these issues on a case-by-case basis. FDA does not recommend any of the available validated screening tests. See Endocrine systems section for additional analysis.
Develop animal tests of relative simplicity to provide quantifiable and reproducible information on behavioral effects at levels relevant to human exposure. (Chapter VI, Specific suggestions, Behavioral tests) | FDA's Redbook recommends screening for neurotoxicity by observing animals for seizures, paralysis, and so on and examining the histopathology of the brain and nervous system. When problems are identified, FDA recommends further testing on a case-by-case basis. The Redbook makes no use of the many available tests to quantify behavior impacts of significance to humans, such as those for learning and memory adopted by EPA and OECD. See Behavioral impacts section for additional analysis.
Develop methods to detect and study hypersensitivity. (Chapter VI, Specific suggestions, Hypersensitivity) | FDA does not recommend tests for hypersensitivity or severe allergic reactions. See Subpopulations section for additional analysis.
Topic 5. Role of human studies
Take special care to consider subpopulations for nutrients and consider human prospective studies when possible. Do not ignore behavioral tests to assess safety. (Chapter IV, Observations on human subjects) | FDA considers human studies on a case-by-case basis and does not call for their use. There are several examples where FDA relied on and, in at least one case, requested, human studies.
Evaluate special risks of subpopulations, especially for nutrients. (Chapter IV, Subpopulations) | FDA's analysis of nutrient exposure in diet may be incomplete since its exposure model does not include consumption of nutrients from dietary supplements even though that information is available through NHANES.
Require controlled evaluation in human subjects. (Chapter VI, Specific suggestions, Human data) | FDA does not systematically consider subpopulations except for infants and young children. See Subpopulations and Behavioral impacts sections for additional analysis.
If hypersensitivity is possible, conduct human prospective studies considering subpopulations with different sensitivities. (Chapter IV, Subpopulations) | FDA guidance examines potential hypersensitivity concerns for genetically engineered plants. However, it recommends no tests for hypersensitivity or severe allergic reaction and does not expect human prospective studies to be conducted. See Subpopulations section for additional analysis. FDA requires labeling for 8 common ingredients associated with severe allergic reactions. The agency accepts consumer reports of allergic reactions, but the system is rudimentary and not consumer-friendly.
Topic 6. Evaluating evidence
Conduct literature search for relevant biological information on related substances to assure direct functional ties between literature compilation and safety evaluation. (Chapter III, Literature surveys and data collection) | FDA requests that industry compile and summarize the literature and relevant data in the notifications and petitions it receives. It conducts additional searches to confirm completeness.
Consider bioavailability of nutrients since their safety issues are different from other additives. (Chapter III, Efficacy of substances added to foods) | FDA does not appear to consider nutrients differently from other additives.
Carefully consider reproducibility and coherence with other data in the overall pattern of the biological response. (Chapter V, Credibility of data) Be cautious when animal consumption scenarios are not similar to human scenarios and effects on animals may be manifested differently in humans. (Chapter V, Particularization of judgment) | FDA closely scrutinizes all available studies. However, its analysis is often based on professional judgment and does not make use of the more rigorous methods that have been developed, such as the Cochrane system, to compare various studies in a transparent or reproducible manner. See Weight of the evidence section for additional analysis. Current guidance does not call for testing at doses relevant to consumers.
Rely on independent evaluations by nongovernment scientists without conflicts of interest where there is a public controversy. (Chapter III, Selection of scientific panel) Ensure experts are sensitive to their personal bias regarding safety, adequacy of data, conventional wisdom, and unconfirmed studies. (Chapter V, Extra-scientific factors) | As a result of various federal initiatives, FDA conscientiously manages personal bias in its own decisions through ethics standards, as well as requirements for peer review of its decisions and conflicts of interests for advisory panel members. However, it does not have standards for conflicts of interest for scientists making safety determinations for GRAS substances despite situations where the person is an employee or consultant for the company marketing the product. The problem is especially significant when FDA is not informed of the determination. See Personal bias and conflicts of interest section for additional analysis.
Topic 7. Consistency in approaches
Phase out the GRAS list. Assess potential health hazards of chemicals using a single system. (Chapter VI, Specific suggestions, Phaseout of GRAS list) Develop consistent approach between food, drugs, and environmental contaminants. (Chapter VI, Future priorities) | The GRAS list still exists. FDA has not resolved issues raised by SCOGS for 18 of the chemicals on the GRAS list. FDA has harmonized its analysis of the additives it regulates: color additives, food additives and nonflavor GRAS substances. However, the system takes a significantly different approach than EPA for pesticides and chemicals in consumer products. As a result, the agencies may reach and, in some cases, have reached different outcomes for the same chemical. See Weight of the evidence section for additional analysis.
Topic 8. Definition of safety
For nutrients, potential health benefits should be compared to risks. (Chapter V, Risk/Benefits) | FDA's definition of safety at 21 CFR 170.3 does not allow the agency to consider the benefits of a substance or its potential efficacy.
Require demonstration of efficacy of GRAS-listed substances. (Chapter VI, Specific suggestions, Efficacy of substances added to foods) |
Seek modification of Delaney Clause to provide flexibility. (Chapter VI, Specific suggestions, Modification of Delaney Clause, Page 34) | FDA narrowly interprets the Delaney Clause.
Topic 9. Revisiting assessments
Assign its conclusions assessing the evidence of safety to one of 5 conclusions that cover the majority of situations. These statements identify situations where the use is not clearly safe or unsafe, such as when postmarket studies or monitoring is needed because of limited data. The statements are most useful as a trigger for reassessing safety. (Chapter III, Formulation of questions and answers) | FDA does not assign its safety conclusions to one of the 5 SCOGS conclusions. It believes that it approves only chemicals that SCOGS would conclude are safe and unlikely to warrant future review. Chemicals that SCOGS would assign to one of the 4 other conclusion categories are rejected. FDA does not have a system to reassess the safety of existing chemicals. The authors reviewed FDA's database tracking the toxicology data on almost 4,000 chemicals allowed to be added directly to food. They found that OFAS's own comments on almost 25% of these chemicals stated that there were no or insufficient toxicology data. See Formulation of questions and answers and Reassessment and consistency across substances sections for additional analysis.
Stay current as rapidly advancing science develops, and be prepared to change prior decisions. Adapt to changing public perceptions of acceptable degree of risk. (Chapter V, Dynamics of safety evaluation) Develop consistent approach to treat new information, reversing standing approvals, and burden of proof in borderline cases. (Chapter VI, Future priorities) Develop mechanism to provide temporary clearance followed by monitoring of health complaints and appropriate epidemiological surveys. (Chapter VI, Future priorities) | FDA reassesses prior safety decisions when prompted by stakeholders through citizen petitions, food additive petitions, and, to a limited extent, notifications. In 2011, it revoked the GRAS status of a chemical combination. The U.S. Government Accountability Office reported in 2010 on FDA's poor response to various citizen petitions involving GRAS decisions. FDA has not established a consistent or systematic approach to prioritizing or conducting its review of existing safety decisions. In the 1970s, it adopted standards granting interim approval of 4 chemicals but has not used this approach since the SCOGS report. By contrast, for pesticides and chemicals in consumer products, EPA and the European Food Safety Authority have reassessment programs well underway. Both agencies have significantly greater authority than FDA to require additional testing, obtain use and exposure information, and be alerted to unpublished toxicology studies. See Reassessment and consistency across substances sections for additional analysis.

It is important to note that SCOGS was not tasked with evaluating FDA's food additives decision making; rather, its final report contained the expert opinions of scientists who spent 10 y reassessing the safety of chemicals already on the market. The agency was not obligated to implement the committee's suggestions. However, FDA has acted on several of them. For example, the agency:

  • Developed sophisticated methods to use food consumption data provided by the National Health and Nutrition Examination Survey (NHANES) to more accurately estimate human exposure to food additives;
  • Created methods to estimate migration from food contact materials, including use of modeling to complement its test protocols;
  • Embraced the use of in vitro testing for genotoxicity and computational methods to gather biological data on related substances;
  • Established methods to routinely consider differences between animal models and human scenarios and recommended animal feeding studies; and
  • Recommended a battery of tests in the Redbook and integrated predictive models into the toxicity analysis.

However, a number of the SCOGS suggestions that FDA did not follow are similar to the concerns raised by the Pew workshop participants 30 y later.

Comparison of SCOGS suggestions to workshop proceedings

We reviewed the workshop materials and determined that 29 of the 35 suggestions were related to issues on the agenda for discussion; the remaining 6 were not raised. Of the 29, participants did not raise concerns with 1, a SCOGS suggestion involving estimating migration of chemicals from packaging. Excluding the 6 suggestions not raised and the 1 not raising concerns, we identified 28 SCOGS suggestions that were similar to concerns raised by Pew workshop participants in 2011. Table 2 provides a side-by-side comparison between these suggestions and what the participants said.
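
The counts in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable and function names are hypothetical and chosen for illustration.

```python
# Counts taken from the text; names are illustrative only.
TOTAL_SUGGESTIONS = 35      # suggestions SCOGS made in its 1982 report
NOT_ON_AGENDA = 6           # suggestions not related to workshop agenda issues
ON_AGENDA_NO_CONCERN = 1    # on the agenda, but participants raised no concern


def count_matches(total: int, not_on_agenda: int, no_concern: int) -> tuple[int, int]:
    """Return (suggestions on the agenda, suggestions matching workshop concerns)."""
    on_agenda = total - not_on_agenda
    matched = on_agenda - no_concern
    return on_agenda, matched


on_agenda, matched = count_matches(
    TOTAL_SUGGESTIONS, NOT_ON_AGENDA, ON_AGENDA_NO_CONCERN
)
print(f"on agenda: {on_agenda}, similar to workshop concerns: {matched}")
# prints "on agenda: 29, similar to workshop concerns: 28"
```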

Table 2. Similarities between the suggestions SCOGS made to FDA in 1982 and the concerns raised by Pew workshops participants in 2011
What did SCOGS say?aWhat did workshop participants say?
  1. a

    The workshop participants and its organizers were not aware of the 1982 SCOGS report. The authors reviewed the workshop materials and determined that 29 of the 35 suggestions were related to issues on the agenda for discussion. Participants did not raise concerns with one issue related to a SCOGS suggestion involving estimating migration of chemicals from packaging. The 6 suggestions not raised and the 1 not raising concerns were excluded from this table.

Topic 1. Substance identification 
Fully characterize chemicals to ensure material actually used in food is the same as the product on which toxicity tests were performed. Use Food Chemicals Codex (FCC) specifications to facilitate process. (Chapter III, Identity and characterization of substances added to foods)At November 2011 workshop, concerns were raised with the agency's ability to ensure compliance when substances are poorly characterized.
Clearly establish the identity of all food ingredients. Include in specification the levels of toxicologically relevant components. (Chapter VI, Specific suggestions, Identity of substances added to foods)At April 2011 workshop, concerns were raised about proper characterization of nanomaterials.
Topic 2. Exposure assessment 
Develop accurate estimates of consumption of food products using data from those who eat the food and determine upper percentiles of intake. (Chapter III, Food consumption data)Need to know total quantity added to foods and per capita disappearance when chemical is widely distributed in food supply. (Chapter VI, Specific suggestions, Consumer exposure data)At November 2011 workshop, participants generally acknowledged the significant improvements FDA has made in exposure estimation; in particular, the use of the National Health and Nutrition Examination Survey (NHANES).
Consider subpopulations who:• Are particularly susceptible;• Consume substantial quantities of a type of food; and• May benefit from nutrient supplementation.(Chapter VI, Specific suggestions, Consumer exposure data)Concerns were raised with:• Cumulative exposure to all dietary sources;• Analysis for sensitive subpopulations such as children and pregnant women; and• Standards based on protecting 90th percentile of people who may consume the additive.
Topic 3. General toxicology testing 
Do not assume there is no hazard below an arbitrary threshold. (Chapter IV, Toxicological insignificance) At April 2011 workshop, concerns were raised about FDA's use of thresholds to determine the needed toxicology testing or to exempt some chemicals from any testing, especially in cases where scientific evidence indicates that chemicals may have adverse effects at very low doses.
Know how chemicals are absorbed, distributed, metabolized and excreted (ADME) by the body. In studies, use doses at levels relevant to human exposure. Develop tools to measure relevance of different test methods. (Chapter V, Relevancy) At April 2011 workshop, participants generally agreed that ADME data were important for all chemicals but particularly for nanomaterials and endocrine disruptors. While acknowledging challenges, concerns were raised that NHANES biomonitoring data are not used to inform ADME, and that toxicology guidance does not call for using doses relevant to human exposure.
Use health statistics to set priorities to develop and implement test methods considering cardiovascular disease, cancer, hyperactivity, neurological effects, immune system impacts, and reversibility of potential danger. (Chapter VI, Future priorities) At April 2011 workshop, concerns were raised that FDA needs to develop a prioritization system for validating new test guidelines to focus FDA's limited resources on the most pressing public health concerns and with known relationships to the national rates of morbidity and mortality such as diabetes, obesity and high blood pressure.
 At November 2011 workshop, participants generally agreed that FDA should use the NHANES biomonitoring data to set reassessment priorities.
Develop reliable test methods to evaluate the significant health hazards relevant to food ingredients. (Chapter VI, Future priorities) At April 2011 workshop, concerns were raised that the Redbook has not kept pace with scientific developments and needs tests and endpoints that serve as early markers of health problems: for example, diabetes, obesity and high blood pressure.
Maintain a balance between a comprehensive battery of tests and simpler, less time-consuming and less expensive model systems to evaluate toxicity. (Chapter VI, Future priorities) At April 2011 workshop, concerns were raised about the need for priorities between limited resources and benefits of additional tests.
Topic 4. Specific testing needs 
As tissue hormone receptor tests become available, use them to identify potential toxicity concerns for further evaluation. Test at doses relevant to actual exposure, combined with potentially interactive chemicals. (Chapter IV, Nonhuman mammalian systems) At April 2011 workshop, concerns were raised with:
 • Whether current methods were sufficiently sensitive to identify endocrine disruption;
 • Lack of screening methods for potential hormonally active chemicals; and
 • No definition of harm for endocrine impacts.
Develop animal tests of relative simplicity to provide quantifiable and reproducible information on behavioral effects at levels relevant to human exposure. (Chapter VI, Specific suggestions, Behavioral tests) At April 2011 workshop, concerns were raised with:
 • The lack of definition of harm for behavioral impacts; and
 • The existing animal tests, which may not capture the subtle yet complex human behaviors that may be of concern.
Develop methods to detect and study hypersensitivity. (Chapter VI, Specific suggestions, Hypersensitivity) At November 2011 workshop, concerns were raised about methods to evaluate severe allergic reactions.
Topic 5. Role of human studies 
Take special care to consider subpopulations for nutrients and consider human prospective studies when possible. Do not ignore behavioral tests to assess safety. (Chapter IV, Observations on human subjects) Evaluate special risks of subpopulations, especially for nutrients. (Chapter IV, Subpopulations) Require controlled evaluation in human subjects. (Chapter VI, Specific suggestions, Human data) If hypersensitivity is possible, conduct human prospective studies considering subpopulations with different sensitivities. (Chapter IV, Subpopulations) At November 2011 workshop, concerns were raised that FDA does not appear to consider subpopulations that rely on nutrients from dietary supplements when evaluating additives that are similar. At April 2011 workshop, some participants observed that guidance is needed for clinical behavioral studies. At November 2011 workshop, concerns were raised about severe allergic reactions.
Topic 6. Evaluating evidence 
Carefully consider reproducibility and coherence with other data in the overall pattern of the biological response. (Chapter V, Credibility of data) Be cautious when animal consumption scenarios are not similar to human scenarios and effects on animals may be manifested differently in humans. (Chapter V, Particularization of judgment) Rely on independent evaluations by nongovernment scientists without conflicts of interest where there is a public controversy. (Chapter III, Selection of scientific panel) Ensure experts are sensitive to their personal bias regarding safety, adequacy of data, conventional wisdom, and unconfirmed studies. (Chapter V, Extra-scientific factors) At April 2011 workshop, concerns were raised about reproducibility and transparency of the safety assessment when the assessor evaluates multiple studies with conflicting results. Participants also suggested that testing should include doses relevant to human exposure. At April 2011 workshop, concerns were raised about impacts of the safety assessor's professional judgment, especially in comparing hypothesis- and guideline-based studies.
Topic 7. Consistency in approaches 
Phase out the GRAS list. Assess potential health hazards of chemicals using a single system. (Chapter VI, Specific suggestions, Phaseout of GRAS list) Develop consistent approach between food, drugs, and environmental contaminants. (Chapter VI, Future priorities) At November 2011 workshop, participants generally agreed on the need for a harmonized system between agencies with respect to the science. At April 2011 workshop, participants raised similar concerns.
Topic 9. Revisiting assessments 
Assign its conclusions assessing the evidence of safety to one of 5 conclusions that cover the majority of situations. These statements identify situations where the use is not clearly safe or unsafe, such as when postmarket studies or monitoring is needed because of limited data. The statements are most useful as a trigger for reassessing safety. (Chapter III, Formulation of questions and answers) Stay current as rapidly advancing science develops, and be prepared to change prior decisions. Adapt to changing public perceptions of acceptable degree of risk. (Chapter V, Dynamics of safety evaluation) Develop consistent approach to treat new information, reversing standing approvals, and burden of proof in borderline cases. (Chapter VI, Future priorities) Develop mechanism to provide temporary clearance followed by monitoring of health complaints and appropriate epidemiological surveys. (Chapter VI, Future priorities) At November 2011 workshop, participants generally agreed that methods were needed to identify and fill gaps in exposure and toxicology data. At April 2011 workshop, concerns were raised about gaps in toxicology data. At both workshops, concerns were raised about the lack of a systematic reassessment of prior safety decisions and the slow rate of incorporation of scientific knowledge into guidance documents in particular, and safety assessment in general.

Areas of concern still relevant 3 decades later

After reviewing Table 1 to identify those suggestions not implemented by FDA and Table 2 to identify those issues that were still relevant in 2011, we consolidated the common issues into 9 broad areas of concern. They are:

  • Behavioral impacts;
  • Endocrine systems;
  • Subpopulations;
  • Toxicological insignificance;
  • Absorption, distribution, metabolism and excretion (ADME);
  • Classifying assessment decisions for consistency and clarity;
  • Personal bias and conflicts of interest;
  • Reassessment and consistency across substances; and
  • Weight of evidence.

Because science has substantially advanced since 1982, the workshops’ participants also raised concerns about FDA's considerations of new scientific developments such as nanotechnology, the interagency Tox21 program designed to screen chemicals for potential toxicity, and biomonitoring. However, we did not include these issues in our analysis since they did not exist at the time of SCOGS.

Additionally, there was an overarching issue that concerned workshop participants: the lack of definitions of “harm” or “adverse effects” in FDA rules and public guidance documents. Participants noted that this gap resulted in FDA interpreting adverse effects on a case-by-case basis, an approach considered less than predictable (Maffini and others 2011). In contrast, other agencies dealing with chemical safety, such as the U.S. Environmental Protection Agency (EPA) (EPA 2013) and the Joint World Health Organization/Food and Agriculture Organization Expert Committee on Food Additives (JECFA) have formal definitions of adverse effects (IPCS 2011). We did not include this issue as an area of concern; rather, we briefly mentioned discussions about harm or adverse effects in the context of the areas of concern Behavioral impacts and Endocrine systems.

Below is a detailed discussion of the 9 areas of concern listed above.

Behavioral impacts

At the time SCOGS was reviewing GRAS substances, a pediatrician proposed that some children may have heightened susceptibility for certain additives in the diet that are manifested by hyperactive behavior and some brain dysfunction (Feingold 1975). The proposal prompted scientists and physicians to gather evidence suggesting that artificial colors and flavors, as well as other additives, in the diet influence the behavior of some children (Weiss 2012).

With this scientific debate as a backdrop, SCOGS recommended that FDA develop guidelines for behavioral testing, concluding that

“[m]uch effort is needed in the development of animal tests of relative simplicity that may provide quantifiable and reproducible information on behavioral effects in animals at the levels of intake relevant to human exposure.” It stated that to answer whether “foods and food ingredients” may cause or aggravate some behavioral disorders, the development of such tests “should command a far more aggressive attack than it has up to now,” concluding that “a firm foundation can then be achieved within the next decade or two.”

Participants at the Pew workshops (Maffini and others 2011) raised similar issues stating that:

  • A definition for what constitutes harm in the context of behavioral impacts is needed;
  • Existing screens do not detect more subtle effects on the structural or functional integrity of the nervous system, such as learning, memory, anxiety, or hyperactivity; and
  • Animal tests need to be better designed to reflect complex human behaviors.

They also noted that, although many endpoints are listed in the Redbook, these capture “obviously abnormal behavior” and “will not detect more subtle effects.”

The current Redbook Neurotoxicity Studies guidance (FDA 2000) recommends a flexible, tiered approach, based on a case-by-case assessment of the available toxicity information of a given compound. FDA recommends performing a systematic clinical evaluation of animals used for basic toxicology testing to include endpoints such as seizure, tremor, paralysis, or other signs of neurological disorder; the level of motor activity and alertness; and any other signs of abnormal behavior or nervous system toxicity. It also recommends conducting a pathological examination of the brain, spinal cord, and peripheral nervous system (FDA 2000). Finally, it suggests that “[a]s appropriate, more sensitive and objective indices of neurotoxicity, such as tests of learning and memory, and quantitative measures of sensory function and motor behavior, could be included as part of the screen” (FDA 2000); however, there are no endpoints or tests recommended.

During the Pew workshops (Maffini and others 2011), FDA scientists noted that “the cost and efficiency of some of these studies have precluded the inclusion of testing for some of the more subtle aspects of behavior” into the guidelines. However, others have reached different conclusions.

During the 1980s, EPA promulgated regulations describing how to conduct behavioral and developmental neurotoxicity testing and adopted specific behavioral tests for learning and memory. In the late 1990s, EPA updated its toxicology data requirements for pesticides used on food (40 CFR Part 158, Subpart F, § 158.500). It established screening tests that rely on a semiquantitative evaluation of a functional observational test. It includes evident behavior endpoints and requires developmental neurotoxicity tests.

Similarly, the Organization for Economic Co-operation and Development (OECD) has published a guidance document for neurotoxicity testing (OECD 2004) as well as guidelines (OECD 1997, 2007) to be used in chemical testing programs. For OECD, behavioral testing and endpoints provide “one of the most sensitive strategies to reveal subtle functional deficits,” adding that “behavioral endpoints can uncover alterations in neural or extraneural substrates for which no compensatory alternate behavioral response is available.” Its guideline (OECD 2007) provides details of study design, frequency of observations, and endpoints, including a section on learning and memory tests.

In summary, FDA has not aggressively pursued the development of test methodologies for behavioral impacts. It has not incorporated into its Redbook methods that EPA and OECD adopted years ago.

Endocrine systems

Hormones have been known for hundreds of years to play fundamental roles in basic physiological functions. This understanding led to the development of drugs to manipulate the endocrine system either by correcting problems such as low levels of hormones or by blocking natural hormones from acting in target organs. Furthermore, scientists have shown that some man-made chemicals, not specifically designed to affect human health, can also bind to hormone receptors and trigger agonistic or antagonistic biological effects. Similarly, these chemicals can also interfere with the synthesis or the breakdown of hormones. Chemicals with such actions are called endocrine disruptors (Wingspread Conference 1992).

Recognizing the potential for additives to affect the endocrine system, SCOGS said:

“Recent advances of knowledge on specific tissue receptors disclose additional potential targets for chemical effects or competitive interactions of an added substance with endogenous messengers. For example, in the case of estrogenic hormone receptors and an agent with a high-binding affinity for the receptor, modification of hormonal response might occur at very low concentrations of the agent. While reliable tests of such effects are not currently available, it can be expected that a number will be developed in the future. These tests might indicate unusual binding affinity of agents that cause certain teratogenic effects or influence reproductive or growth patterns by interactions with the relevant hormonal receptors regulating these physiological processes.”

As predicted by SCOGS, in the last 2 decades, scientists have produced a large body of research results on endocrine disrupting chemicals (Arbuckle and others 2008; Woodruff and others 2011; Vandenberg and others 2012).

Participants at the Pew workshops (Maffini and others 2011) identified several issues regarding FDA's handling of endocrine disruptors, including:

  • Disagreement on what should be considered an adverse effect and its relevance to human health;
  • Difficulty in selecting health-related endpoints such as biomarkers that could predict disease outcomes;
  • Disagreement over whether current Redbook tests and endpoints are sufficiently sensitive and encompass significant modes of action, including those important during early life or that become apparent long after exposure; and
  • Agreement on the need to understand ADME for endocrine disruptors.

We did not find evidence that FDA acted on or attempted to address SCOGS’ suggestion. Its scientists have maintained that the studies recommended in the Redbook (such as multigenerational reproductive and developmental testing) can and do provide valuable insights into potential endocrine activity due to the possible manifestations of adverse effects (Lorentzen and Hattan 2010). Although open to the possibility of using alternative methods, FDA has yet to recommend that stakeholders use available screening tests for endocrine disruption.

By comparison, other offices at FDA have acted on endocrine disruptors. In the mid-1990s, the agency initiated the Endocrine Disruptor Knowledge Base (EDKB) project (FDA 2010b) with the intention “to serve as a resource for research and regulatory scientists to foster the development of computational predictive toxicology models and reduce dependency on slow and expensive animal experiments” (Ding and others 2010). The project resulted in the development of the EDKB database, which is based on quantitative structure-activity relationships coupled with an integrated system of experimentation and modeling that predicts biological activities. These core features underwent a “rigorous validation” (Tong and others 2002) via the interagency agreement with EPA. The free database currently includes more than 1,800 chemicals, of which more than 200 are allowed in foods.

EPA has also developed and validated screening tests for endocrine disruptors. In response to a 1996 Congressional mandate (Food Quality Protection Act of 1996, P.L. 104-170, 21 U.S.C. 346a(p); Amendment to the Safe Drinking Water Act of 1974, 42 U.S.C. § 300j-17, 1996), EPA launched its Endocrine Disruptor Screening Program in 2009 (EPA 2012), making available test guidelines for validated in vitro and in vivo assays. FDA has expressly rejected the use of at least one of these assays despite the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) determining the assay was validated as “a screening test to identify substances with in vitro ER [estrogen receptor] agonist or antagonist activity” (Birnbaum 2012). FDA responded that “ICCVAM test recommendations are not acceptable for satisfactorily fulfilling the test needs for FDA regulated products” and that “FDA does not envision a use for this method in its current regulatory framework” (Goodman 2012). EPA (Sanders 2012) and the Consumer Product Safety Commission (Hinson 2012) have accepted the new test, and OECD added it to its guidelines (OECD 2012).

Lastly, a battery of new tests and predictive tools to identify potential endocrine disruptors is being developed and undergoing validation within the Tox21 program. The program is a collaboration of FDA, EPA, and the National Institutes of Health “to use robotics technology to screen thousands of chemicals for potential toxicity, use screening data to predict the potential toxicity of chemicals, and develop a cost-effective approach to prioritizing the thousands of chemicals that need toxicity testing” (EPA 2012). The Office of Food Additives Safety joined the intergovernmental project when it was underway; it has nominated chemicals to be tested, and provided toxicological data from its files.

In summary, FDA has not taken a leadership role in the development and validation of new technologies to identify and evaluate additives for potential endocrine disrupting activity. Unlike EPA, it has not adopted or made use of validated screening tests and predictive models.

Subpopulations

SCOGS made it clear that, during the safety evaluation of additives, regulators must pay particular attention to susceptible populations. The committee identified these groups based on age, gender, and physiologic state (adolescence, pregnancy, lactation), and noted that groups with specific food preferences and individuals with chronic diseases should also be taken into consideration. It said:

“When the benefits are related not to health, but to organoleptic, technologic, or economic considerations, substantial risk even to a relatively small subgroup of the population is not generally acceptable. A major need in future food safety evaluations is the identification of the population subgroups at special risk, and the extent of the risk.”

Participants at a Pew workshop (Alger and others 2013) noted that, while FDA currently considers one subpopulation (young children), it should include more groups on a routine basis. Participants mentioned that EPA regularly assesses exposure for a wide variety of subpopulations.

FDA routinely assesses exposure for all people aged 2 y and older as one group, as well as children between ages 2 and 5 y as another. It calculates exposure by identifying all of the foods in which the substance will be added, the amount in each food, and the quantity of food consumed by individuals in the United States. The estimated daily intake of an additive is then calculated based on the amount consumed by the so-called “high” consumer represented by the 90th percentile of the people who eat the food containing the chemical (FDA 2006a). For infant formula, the agency uses an approach tailored to infants likely to be fed the formula (FDA 2004). It evaluates other subpopulations on a case-by-case basis.
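The calculation described above can be sketched in a few lines of Python. This is only an illustration of the general idea (sum the additive across foods for each person, keep only people who ate a food containing it, then take the 90th percentile), not FDA's actual procedure. The function names, the nearest-rank percentile method, the use levels, and the food diaries are all hypothetical assumptions introduced for the example.

```python
import math

def consumer_intakes(use_level_ug_per_g, diaries_g_per_day):
    """Return additive intake (ug/day) for each person who ate any food containing it."""
    intakes = []
    for diary in diaries_g_per_day:
        total = sum(use_level_ug_per_g[food] * grams
                    for food, grams in diary.items()
                    if food in use_level_ug_per_g)
        if total > 0:  # keep only consumers of foods that contain the additive
            intakes.append(total)
    return intakes

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical use levels (ug additive per g food) and one-day diaries (g food/day).
use_level = {"soda": 100, "candy": 500}
diaries = [
    {"soda": 500},
    {"soda": 250, "candy": 50},
    {"candy": 20},
    {"bread": 200},                 # eats no food containing the additive
    {"soda": 1000, "candy": 100},
]

intakes = consumer_intakes(use_level, diaries)
high_consumer_edi = nearest_rank_percentile(intakes, 90)
print(intakes, high_consumer_edi)
```

Because the percentile is taken only among people who eat the food, the result describes a "high" consumer rather than a population-wide figure; a subpopulation analysis of the kind workshop participants asked for would repeat the same steps on the diaries of that group alone.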

SCOGS was especially concerned about individuals hypersensitive to food and food additives and described nonspecific cutaneous, gastrointestinal, respiratory, immunologic, and neurologic manifestations as adverse reactions to additives. Although the most common allergens are proteins and peptides found in common foods (for example, soy, nuts, milk, and seafood, as well as gluten-containing food products), certain chemicals commonly used as food additives (such as sulfites) also cause adverse reactions in some individuals. SCOGS wanted FDA to take a proactive role, saying that “advances [in cell biology and clinical immunology] may aid in the development of simpler and more reliable procedures for the detection of hypersensitivity.”

FDA has dealt with hypersensitivity only in the context of genetically engineered plants. The 1992 policy (FDA 1992) encouraged developers of new plants to consult with FDA early in the genetic engineering process.

EPA also lacks guidance to test for hypersensitivity. In 2011, the European Food Safety Authority released draft guidance underscoring the need for hypersensitivity testing (EFSA 2011).

In summary, FDA has not systematically considered the exposures to sensitive populations except for infants. For hypersensitivity, it has not developed any guidelines to screen or test for potential impacts or offered an effective system for consumers to report health impacts.

Toxicological insignificance

SCOGS said:

“[T]he arbitrary establishment of a concentration at or below which no hazard exists is scientifically untenable. Failure to observe an adverse effect when a substance is widely used for a long time in uncontrolled, casual human applications is insufficient reason to pronounce it safe even at very low levels. The concept of toxicological insignificance ignores the possibility of accumulation in tissues of slowly excreted compounds that may be carcinogenic, teratogenic, or mutagenic. The concept also fails to take into account the possibility of a slow irreversible functional alteration in vital organs.”

Pew workshop participants raised similar concerns (Maffini and others 2011).

Contrary to SCOGS’ suggestion, in 1995 FDA created the Threshold of Regulation rule, which exempts substances used in food contact materials from regulation as food additives if the dietary concentration is below 0.5 parts per billion (ppb) (FDA 1995) and if:

  • The chemical has not been shown to be a carcinogen;
  • There is no reason to suspect that it is a carcinogen; and
  • There is no evidence that it presents other health or safety concerns.

This threshold was calculated based on the carcinogenic potency of known carcinogens (Gold and others 1984), and “the assumption that carcinogenicity is ordinarily the most sensitive toxic endpoint” (Cheeseman and others 1999).
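The exemption conditions listed above amount to a simple conjunction, sketched below in Python. This is a reading of the summary given in this article, not of the regulatory text itself; the function name, parameter names, and the strict "below" comparison are assumptions made for illustration.

```python
THRESHOLD_PPB = 0.5  # dietary concentration cited in the text (FDA 1995)

def may_qualify_for_exemption(dietary_conc_ppb, shown_carcinogen,
                              suspected_carcinogen, other_concerns):
    """Hypothetical check mirroring the conditions as summarized in the article.

    The strict '<' follows the article's wording ("below 0.5 ppb"); how the
    rule itself treats the exact boundary is not addressed here.
    """
    return (dietary_conc_ppb < THRESHOLD_PPB
            and not shown_carcinogen
            and not suspected_carcinogen
            and not other_concerns)

print(may_qualify_for_exemption(0.2, False, False, False))
print(may_qualify_for_exemption(0.2, False, True, False))
```

Note that concentration is the only quantitative input: the three remaining conditions depend on what evidence already exists, so a chemical with no toxicity data at all can satisfy them.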

FDA makes extensive use of thresholds for food contact substances including a 4-tier scheme based on estimated cumulative exposures to define minimum recommended toxicology studies (FDA 2002). Except for the 1st tier, which is based on the rule discussed above, it is unknown how the other tiers were developed and whether those thresholds have been reviewed to ensure that they are sufficiently protective of public health.

FDA scientists (Cheeseman and others 1999) and supporters of the concept of the threshold of toxicological concern (Kroes and others 2000) (as it is known in the European Union and JECFA) claim that the “safe dose” based on cancer endpoints provides an “adequate margin of safety” for noncancer endpoints such as neurotoxicity, developmental, or reproductive toxicity, and endocrine disruption. Interestingly, FDA scientists recommended that “members of the endocrine disruptors” class that tested positive in the Ames assay be excluded from threshold of regulation exceptions because “the broad class of endocrine disruptors includes many structures most closely identified with carcinogenicity through a mechanism involving hormone modification” (Cheeseman and others 1999).

Although the concept of thresholds is widely supported by regulators (Cheeseman 2005; EFSA 2012) and the regulated community (Felter and others 2009), the reasoning behind the levels set by FDA does not reflect current scientific understanding (CDC 2012; Vandenberg and others 2012) and relies heavily on expert judgment (Munro and others 1996; Cheeseman and others 1999).

In summary, contrary to SCOGS’ suggestion, FDA has adopted thresholds in rules and guidance below which industry is not expected to develop toxicity data when evaluating the safety of a chemical.

Absorption, distribution, metabolism, and excretion (ADME)

Understanding the fate of a chemical that enters the human body should be the logical 1st step in assessing its safety. SCOGS stated that:

“[I]f a substance is to be used in food for human consumption, controlled evaluation in human subjects is necessary. To assure the safety of such evaluation, a stepwise approach is required: initial testing in animals to establish a safe level for limited testing in humans, determination of the profile of metabolism and pharmacokinetics in man, choice of animal models most appropriate to the human and testing these models, and final controlled observations on human subjects consuming the substance under the proposed conditions of use in the food supply.”

Participants at the Pew workshop noted that ADME data are necessary, especially for nano-sized additives and endocrine disruptors (Maffini and others 2011). Experts have also pointed out that ADME information can improve the characterization of potential health risks (McLanahan and others 2012).

Despite this evidence, FDA does not require ADME data for substances assigned to the lowest concern level (FDA 1993). It expects ADME data for the 2 higher levels. Although the agency's public statements seem to support the importance of ADME (Aungst 2012), it has not updated its ADME guidance since 1993. Because GRAS has been the primary mechanism for allowing direct additives in food over the last decade (Neltner and others 2011), we checked whether the notices submitted to FDA contained ADME data. We found that 50% of 22 randomly selected GRAS notifications about which FDA did not raise questions contained no ADME data.

EPA and the WHO's International Programme on Chemical Safety (IPCS) have highlighted the benefits of using physiologically based pharmacokinetic (PBPK) modeling (EPA 2006) to “facilitate more scientifically sound extrapolations across studies, species, routes, and dose levels” (WHO 2010).

In summary, FDA's guidance allows industry to make safety decisions without the detailed ADME data necessary to understand how the human body handles and eliminates chemicals that may be in food.

Classifying assessment decisions for consistency and clarity

SCOGS boiled down almost all of its GRAS safety assessment conclusions into 5 types. Paraphrasing the committee's language, they are:

  • Type 1: Safe and unlikely to warrant future review
  • Type 2: Safe, but warrants monitoring for significant increases in consumption
  • Type 3: Uncertainties exist that require additional studies
  • Type 4: Adverse effects reported with insufficient evidence to be found safe
  • Type 5: Insufficient evidence to be found safe

The committee acknowledged that assigning conclusions to these 5 types was a challenge for scientists but said it was necessary to avoid “the tendency of cautious scientists to qualify and ‘write around’ rather than make hard choices.”

Workshop participants expressed concerns about the limited amount of data to make informed decisions and generally agreed that methods are needed to identify and fill gaps in toxicology and exposure data (Maffini and others 2011; Alger and others 2013).

In 1981, FDA scientists (Smith and Rulis 1981) acknowledged SCOGS’ approach and paired the 5 types with possible FDA regulatory action. However, we could not find evidence in the rules, publicly available notices, guidance documents, and Web pages that the agency considered using SCOGS’ conclusion types beyond the Smith and Rulis article.

Because FDA lacks a means to efficiently track significant increases in consumption (Neltner and others 2011), it may have rejected Type 2. In contrast, the Flavor and Extract Manufacturers Assn. essentially puts all of its decisions in Type 2 requiring members to report every 5 y their production levels of GRAS flavors; if the level doubles, the association's expert panel reassesses its decision and may remove the chemical from its GRAS list (Hallagan and Hall 2009).

Other science-based agencies such as the National Toxicology Program and EPA also use a classification system to rate scientific evidence used in chemical evaluation and risk assessment.

Types 3 and 4 would require additional testing. In the 1970s, FDA developed regulations that approved chemicals on an “interim basis pending additional study” (21 CFR Part 180) and FDA conditionally approved 4 chemicals under these regulations. No chemicals have been approved on an interim basis since 1982.

In summary, FDA does not assign its safety conclusions to one of the 5 SCOGS conclusions. It maintains that it approves only chemicals that SCOGS would conclude are safe and unlikely to warrant future review and rejects chemicals that SCOGS would assign to one of the 4 other conclusion categories. The agency does not have a system to reassess the safety of existing chemicals.

Personal bias and conflicts of interest

SCOGS’ members understood that personal leanings and scientific perspectives play an important role during the assessment and warned FDA to consider what it called “extra-scientific factors.” Today, we would call it personal bias. SCOGS said that the “principal sources of subjective variability among evaluators are:

  • Personal leanings concerning what constitutes ‘safety;’
  • Differences in perception of what constitutes adequacy of data by the same individual for different situations;
  • The degree to which scientific popularity (the ‘conventional wisdom’) is an influence; [and]
  • Personal weighting of the significance of adverse findings based on unconfirmed studies and/or less than rigorous experimentation.”

SCOGS recognized that even when personal bias is carefully managed, there are advantages to using 2 additional methods to minimize it: peer review and transparency.

Pew workshop participants were also concerned about the lack of transparency on how FDA “makes safety determinations, the data it uses and does not use, and how regulatory decisions are made” (Maffini and others 2011). They noted that “greater transparency in FDA processes would improve predictability and access to information” while acknowledging that companies that invest in safety studies may want a period of competitive advantage before data are made publicly available (Alger and others 2013).

FDA scientists in attendance pointed out that greater transparency could bog down the approval process, but perhaps more important is “the need for independent review and to shelter reviewers from influences outside and inside the agency. This works against transparency but is absolutely required for a science-based process” (Maffini and others 2011).

Regarding advisory panels, FDA uses 2:

  • Science Board (FDA 2013b): In recent years, committees of the Board have addressed 2 specific issues that were relevant to food safety at FDA: the safety of bisphenol-A (FDA 2008) and the agency's scientific capacity (FDA 2007). The latter led FDA to launch its Advancing Regulatory Science initiative (FDA 2011).
  • Food Advisory Committee (FDA 2012): From 2002 to 2004, the committee considered food additives issues such as acrylamide, allergens, and biotechnology. Since 2004, the committee has met only twice (FDA 2013a).

FDA can also make use of peer reviewers. In 2004, the White House's Office of Management and Budget established that “important scientific information shall be peer reviewed by qualified specialists before it is disseminated by the federal government” (Bolten 2004). Safety assessments are scientific information. The policy states that reviewers must comply with conflict of interest requirements, the review process must include public participation, and the agency must prepare a written response to the peer-reviewed report.

Despite these requirements for agency staff, there are no comparable ones for scientists conducting GRAS safety assessments for food manufacturers, especially where the firm does not notify FDA of the decision. This issue was underscored in 2010 by the U.S. Government Accountability Office (GAO), which concluded that “FDA's oversight process does not help ensure the safety of all new GRAS determinations” (GAO 2010). GAO also recommended that FDA adopt a rule or guidance to prohibit conflicts of interest for the assessors. Later that year, the agency requested comments on whether it should issue guidance on the subject (FDA 2010c).

In summary, FDA has largely implemented SCOGS’ suggestions on personal bias and conflict of interest for its employees and the safety assessments it makes. However, the agency has not addressed the issue for food manufacturers that make their own determinations.

Reassessment and consistency across substances

Looking to the future, SCOGS posed 2 related questions that are still relevant:

“How should a regulatory agency treat new information in its review of food ingredients, especially involving a potential reversal of a standing approval? Should there be any differences with respect to the burden of proof of safety in borderline cases from that employed in the original approval for use in foods?”

Pew workshop participants noted that chemicals need to be reassessed in light of new scientific knowledge and changes in exposure over time. They acknowledged that “it is not practical to reassess all substances and uses immediately” but suggested that FDA should “develop a science-based framework to prioritize and reassess prior safety decisions” (Alger and others 2013).

FDA reassesses additive safety on a case-by-case basis when it identifies a potential public health concern. A recent example involved its decision to ban the use of caffeine in alcoholic beverages (FDA 2010a) because “the combined ingestion of caffeine and alcohol may lead to hazardous and life-threatening situations.” In instances where the public health risk is not that obvious, it is unclear how the agency uses new information to review safety decisions. FDA may also initiate reassessment in response to a citizen petition (Dorsey 2012) or a manufacturer's notification to expand uses of an existing chemical.

Regarding consistency of decisions across substances and agencies, SCOGS advocated for “consistency in rationale,” which suggests “comparability in approach in safety evaluation beyond all food ingredients to all substances ingested by human beings, including drugs and environmental pollutants.”

Pew workshop participants generally agreed “on the importance of including all dietary sources in the exposure assessment so that it accurately represents what a person may actually be exposed to” (Alger and others 2013).

FDA's approach to dietary exposure is to consider all dietary sources to which the additives are added. However, it does not consider tap (drinking) water and pesticides, and there is limited coordination between agencies that regulate the same chemical for different uses (Alger and others 2013). Also unclear is whether it considers other sources such as dietary supplements and naturally occurring substances. In 2009, the National Research Council's (NRC) “Science and Decisions: Advancing Risk Assessment” report (NRC 2009) recommended that exposure assessments should include all sources from which a chemical enters the human body. Although the report was addressed to EPA, it broadly applies to substances regulated across agencies and reflects the spirit of SCOGS’ “consistency in rationale.”

In summary, FDA has not developed a system to prioritize its review of previous safety decisions. Instead, it relies on a case-by-case approach. In addition, it does not appear to closely coordinate its hazard or exposure assessment with EPA when a chemical is regulated by both agencies.

Weight of the evidence

SCOGS understood the varying quality among studies, whether published or not, and noted that assessors should be cautious. It said that “[t]he credibility of a given set of data is increased by its reproducibility and by its coherence with other data within the overall pattern of the organismal response.”

Pew workshop participants (Maffini and others 2011) noted that:

  • It is important to incorporate multiple endpoints and tests in a weight-of-evidence determination;
  • Consistency of evidence across different studies and laboratories should be favorably compared to the reproducibility of individual studies; and
  • Assessment of the weight of the evidence should consider the evidence for harm and no harm across all available studies.

When confronted with multiple studies, FDA weighs the evidence using 8 criteria derived from a compilation of Redbook, OECD, EPA, and WHO guidelines (Maffini and others 2011).

How the criteria are applied depends upon professional judgment. FDA does not seem to use a systematic review framework in which the assessor documents and justifies each decision, such as the one used for Cochrane Reviews. That framework is designed to facilitate decision making using stringent guidelines to establish whether or not there is conclusive evidence about a particular question (Cochrane Collaboration 2013). As a result, FDA's analysis raises concerns of reproducibility and predictability.

In summary, FDA maintains it closely scrutinizes all available studies. However, its analysis is often based on professional judgment without using the available methods to compare various studies in a more rigorous, transparent, and reproducible manner.

Discussion

In an effort to understand the origin of the current controversies and criticisms about the safety evaluation of chemical additives to food, we looked at the only available structured analysis of the science and decision making used in food additives safety: the 1982 SCOGS final report.

We found that, although FDA acted on some of SCOGS’ suggestions, a significant portion remains unresolved.

The analysis presented here and the research we conducted during the last few years led us to conclude that FDA's food additives program has, in several respects, not kept pace with the scientific developments of the last 20 y. Although it is difficult to pinpoint the origin of the situation, we believe the key point was in 1997, when the agency proposed the voluntary GRAS notification program (FDA 1997).

Under the GRAS notification program, FDA's role fundamentally changed: it shifted from one of a “judge” making final decisions and adopting regulations to an “auditor” or “peer reviewer” providing a critique of decisions made by food manufacturers. The agency's de facto role became “punching holes” in a firm's decision instead of making one itself. FDA's conclusion is that it either has “no questions” or it disagrees with the manufacturer's GRAS determination.

The agency's relationship with industry also changed. Instead of a food additive petition's public process that invited competitors, academics, and the public to weigh in, the GRAS notification program became a discussion between FDA and the firm submitting the safety assessment. Scientists can read the notice after the agency receives it, but there is no formal process to submit comments (Neltner and others 2011). The public learns the agency's conclusions when FDA posts its letter to the firm on its Web site.

Determining why the change occurred is also difficult. FDA has always had too few resources and staff to ensure the safety of the more than 10,000 chemicals allowed in food. In the 1990s, the agency was confronted with a massive backlog of GRAS affirmation petitions (Kahl 2010). It concluded that the time-consuming and resource-intensive process was deterring “many persons from petitioning the agency to affirm their independent GRAS determinations” (FDA 2010c) since food manufacturers did not need FDA's review or affirmation to market their products. The 1958 law contains an exception (intended for commonly used chemicals such as vinegar and oil) enabling manufacturers to determine a substance is GRAS without informing FDA.

The GRAS exception effectively ties FDA's hands when it comes to receiving information about chemical safety. If FDA asks too many questions about the assessment or requests additional information, the manufacturer may choose not to submit its GRAS notice, or may withdraw it, thus effectively leaving the agency in the dark.

While establishing the GRAS notification program may have been a pragmatic and efficient approach, it runs contrary to Congress’ original plan. Today, virtually all new chemicals added directly to food rely on the GRAS program rather than the food additives petition process established by Congress (FDA 2013b). Without a transparent process that engages the broader scientific community, the checks and balances normally present in a regulatory program have been insufficient to prevent scientific stagnation.

Although updating regulatory science generally is a slow process, there are examples demonstrating that it can occur more rapidly. FDA's drug evaluation division and EPA are such cases. Their advances were primarily in response to Congressional mandates, increased resources, or independent analysis of their use of science in decision making. For instance, the Safe Drinking Water Act Amendments of 1996 (P.L. 104-182, 110 Stat. 1613) and the Food Quality Protection Act of 1996 (P.L. 104-170, 110 Stat. 1489) gave EPA the mandate and resources to address endocrine disruption and to give special consideration to children in its chemical risk assessment. Except for the FDA Food Safety Modernization Act of 2011 (P.L. 111-353, 124 Stat. 3885), which primarily deals with pathogens, food additives safety has not been the focus of any particular mandate or review that gave the agency the incentive to implement scientific advances to keep pace with new developments.

An example of independent analysis is the 2007 “FDA Science and Mission at Risk” report (FDA 2007) by a subcommittee of FDA's Science Board, which recommended fixes to the science used in determining the safety of the agency's regulated products. In August 2011, the agency released a strategic plan aimed at addressing those recommendations (FDA 2011). Although it does not contain specific plans for food additives, the science priority areas include:

  • Modernize toxicology to enhance product safety by:
    ∘ Developing better models of human adverse response;
    ∘ Identifying and evaluating biomarkers and endpoints that can be used in nonclinical and clinical evaluations; and
    ∘ Using and developing computational methods and in silico modeling.
  • Ensure FDA readiness to evaluate innovative emerging technologies, including nanotechnology.

The 2007 report noted the need for a stronger scientific workforce (FDA 2007) by stating that “[i]nadequately trained scientists are generally risk-averse, and tend to give no decision, a slow decision or, even worse, the wrong decision on regulatory approval or disapproval.” FDA's strategic plan addresses this issue by strengthening the regulatory science culture and infrastructure, academic collaborations, and fellowships.

Other examples of independent reviews that initiated changes in regulatory science include NRC's 2007 report that led to the creation of the interagency Tox21 program (NRC 2007), its report on phthalates (NRC 2008), and its report on advancing risk assessment (NRC 2009).

With this in mind, FDA scientists took an important step on the road to advancing regulatory science by participating in the Pew workshops and supporting an effort to better measure the absorption of nanoengineered particles from food (ILSI 2012). Although no policy changes have been announced, the food additives program leadership has shown openness to working on incorporating scientific advances into its food additives safety assessments.

Conclusion: The Way Forward

Moving forward, there are several pragmatic and realistic steps that FDA could take to modernize its decision-making process and the science it uses to assess food additives safety, and, at the same time, boost public confidence in our food supply.

In our opinion, there are 2 overarching actions that FDA should take to enhance its program. FDA should:

  • Develop and implement a strategy to fix the GRAS process by ensuring that the agency has the opportunity to review and have a final say in all safety assessments that allow use of a chemical in food whether new or already on the market.
  • Ask its Food Advisory Committee, supported by a knowledgeable workgroup of independent scientists, to evaluate the science and decision-making procedures used to assess the safety of food additives, and support the committee's deliberations and recommendations if problems are identified.

In addition to these 2 actions, Table 3 describes specific proposals for the agency's consideration. Some of our suggestions have been previously put forward by other scientific organizations such as the American Heart Assn. (AHA) and the American Academy of Pediatrics (AAP). In 2012, AHA (Tomaselli 2012) expressed its concerns about the weaknesses of the GRAS process in general and the GRAS status of one substance in particular. In its policy statement about chemical management (Council on Environmental Health 2011), AAP listed a series of recommendations targeted to reform the Toxic Substances Control Act but universally applicable to any chemical safety assessment, including evidence-based regulation, postmarket surveillance of chemical effects, and the same evidence requirements for new and previously approved chemicals.

Table 3. Recommendations for FDA to modernize its food additives safety assessment process
Number | Recommendation | Topic from Table 1
Rec #1 | Develop and implement a strategy to fix the GRAS process by ensuring that the agency has the opportunity to review and have final say in all safety assessments that allow a use of a chemical in food whether new or already on the market. | All
Rec #2 | Ask the agency's Food Advisory Committee, supported by a knowledgeable workgroup of independent scientists, to evaluate the science and decision-making procedures used to assess the safety of food additives, support the committee's deliberations and make recommendations if problems are identified. | All
Rec #3 | Define harm and adverse effects. Revise guidance to apply the definition to common situations such as behavior, endocrine disruption and hypersensitivity. Presume 10-fold safety factor for children and pregnant women when data are limited. | Topics 3, 4, 7
Rec #4 | Revise Redbook to define a framework for testing of additives using validated combinations of computational, in vitro and in vivo methods that consider potential endocrine disruption, behavioral impacts, and developmental neurotoxicity at all life stages. Consider methods already adopted by EPA and OECD, as well as FDA's Endocrine Disruptor Knowledge Base. Use ICCVAM as preferred means to validate methods other than in vivo. | Topics 3, 4
Rec #5 | Revise guidance to expect the following information for all chemicals: cumulative exposure assessment from all dietary sources, including pesticides, drinking water and dietary supplements; evaluation of exposure and toxicity to children and pregnant women; and ADME information or use of physiologically relevant pharmacokinetic models. | Topics 1, 2, 3, 4
Rec #6 | Seek authority to efficiently get information needed to prioritize and assess chemicals already allowed in foods. Develop a strategy to reassess existing chemicals setting science-based priorities considering potential risk, lack of toxicity testing data, Tox21 test results, NHANES biomonitoring data, chemicals with similar uses or biological effects and years since previous review. Challenge industry to voluntarily submit all health and safety studies that have not been peer-reviewed and published. | Topics 3, 4, 9
Rec #7 | Develop guidance to limit what SCOGS called “personal bias” in scientific evaluation by issuing policies aimed at curtailing conflicts of interest, especially financial ones, between scientists making safety determinations and the companies whose products they are evaluating. Engage legal and ethics experts from other scientific disciplines such as medicine and environmental health to guide the effort. | Topics 3, 6
Rec #8 | Use SCOGS’ 5 conclusions to summarize safety conclusions to more clearly distinguish risk assessment from risk management. Make analysis publicly available for review and comment. | Topics 6, 9
Rec #9 | Coordinate closely with EPA where the agency is conducting a risk assessment on a pesticide or a toxic substance that is also added to food so that it can ensure that all sources of exposure are considered and the science is consistently used. | Topics 2, 4, 6, 7, 9
Rec #10 | Eliminate Threshold of Regulation rule as unnecessary. If not eliminated, exempt chemicals that may be endocrine disruptors or harmful to the developing nervous system. | Topic 3

A stronger, modern food additive safety regulatory program will enhance FDA's ability to more effectively fulfill its mission to protect public health.

Acknowledgments

The authors thank Lani Sinclair, Erin Bongard, Sean Cranshaw, and Neesha Kulkarni for their assistance. This work was supported by The Pew Charitable Trusts. The authors declare that they have no actual or potential conflict of interest. They certify that their freedom to design, conduct, interpret, and publish this research is not compromised by any controlling sponsor.

Author Contributions

MVM, EDO, and TGN developed the conceptual framework; EDO and TGN supervised the execution; MVM and TGN interpreted the analysis and wrote the article; MVM and HMA executed the research.