Tier 1 of the U.S. EPA Endocrine Disruptor Screening Program comprises 11 studies: five in vitro assays, four in vivo mammalian assays, and two in vivo nonmammalian assays. The battery is designed to detect compounds with the potential to interact with the estrogen, androgen, or thyroid signaling pathways. This article examines the procedures, results, and data interpretation for the five Tier 1 in vitro assays: estrogen receptor (ER) and androgen receptor binding assays, an ER transactivation assay, an aromatase assay, and a steroidogenesis assay. Data are presented from two laboratories that have evaluated approximately 11 compounds in the Tier 1 in vitro assays. Generally, the ER and androgen receptor binding assays and the aromatase assay showed good specificity and reproducibility. As described in the guideline for the ER transactivation assay, a result is considered positive when the test compound induces a reporter gene signal that reaches 10% of the response seen with 1 nM 17β-estradiol (positive control). In the experience of these laboratories, this cutoff criterion may result in false-positive responses. For the steroidogenesis assay, there is variability in the basal and stimulated production of testosterone and estradiol by the H295R cells. This variability in responsiveness, coupled with potential cell stress at high concentrations of test compound, may make it difficult to discern whether hormone alterations are specific steroidogenesis alterations (i.e., endocrine active). Lastly, both laboratories had difficulty meeting some recommended performance criteria for each Tier 1 in vitro assay. Data with only minor deviations were deemed valid.
In 2009, the U.S. EPA officially launched the Endocrine Disruptor Screening Program (EDSP) by requiring screening for compounds on the first Priority List for Tier 1 screening (U.S. Federal Register, 2009). The EDSP Tier 1 screening battery comprises 11 assays—five in vitro assays, four in vivo mammalian assays, and two in vivo nonmammalian assays. The objective of the Tier 1 battery is to determine whether compounds have potential endocrine activity by evaluating their ability to interact with the estrogen, androgen, or thyroid signaling pathways. Results of the Tier 1 screening battery, along with other scientifically relevant information, are to be used in a weight-of-evidence (WoE) determination of a substance's potential to interact with these systems. The fact that a substance may interact with a hormone system, or result in hormonal perturbations, does not suggest that when the substance is used it will cause adverse effects in humans or wildlife species.
In vitro assays in the Tier 1 battery include estrogen receptor (ER) and androgen receptor (AR) binding, an ER transactivation assay (ERTA), a steroidogenesis assay, and an aromatase inhibition assay. These assays can provide useful information on the mode of action of a test compound, including interactions with ER or AR, inhibition or stimulation of testosterone (T) or estradiol (E2) synthesis (i.e., altered steroidogenesis), or inhibition of the aromatase enzyme that converts androgens to estrogens. Data from these in vitro assays can aid in the interpretation of multimodal in vivo assays, which rely on more apical endpoints. The performance of these five in vitro assays, as conducted in two separate laboratories, will be examined in the present investigation.
Preparation of In Vitro Dosing Solutions
Before initiation of the in vitro assays, solubility of the test chemical was determined in most cases by the unaided eye, although microscopy was used occasionally (detailed below). For most in vitro EDSP assays, the test guidelines recommend that chemicals be tested up to a maximum concentration of 1 μl/ml, 1 mg/ml, or 1 mM, whichever is lowest. For the steroidogenesis assay, the maximum concentration for test compounds was 0.1 μl/ml, 0.1 mg/ml, or 1 μM. Some compounds were evaluated at the limit of solubility. Test chemicals were first diluted in an appropriate vehicle (e.g., dimethylsulfoxide [DMSO] or ethanol), then diluted in the buffer solution used for each assay, with vehicle concentrations kept within recommended limits that were determined not to interfere with the assays. Log serial dilutions from the highest acceptable test concentration were then prepared. The pH and osmolality of the stock or dosing solutions in the respective buffer for each assay were not evaluated, although it is known that alterations in these parameters may affect endpoints in the ER, AR, and aromatase inhibition assays.
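The selection of the highest test concentration and the log-spaced dilution series can be sketched as follows. This is a minimal illustration and not part of the test guidelines: the function names, the molecular-weight argument, and the default of eight concentrations are assumptions. The 1 μl/ml limit for liquids is omitted because converting it to a molar value requires the density of the compound.

```python
def top_concentration_molar(mol_weight_g_per_mol,
                            mass_limit_mg_per_ml=1.0,
                            molar_limit=1e-3):
    """Return the lower of the mass-based and molar limits, in mol/L."""
    # 1 mg/ml equals 1 g/L, so dividing by the molecular weight gives mol/L.
    mass_limit_molar = mass_limit_mg_per_ml / mol_weight_g_per_mol
    return min(mass_limit_molar, molar_limit)


def log_dilution_series(top_molar, n_points=8):
    """Return n_points concentrations, each tenfold below the previous one."""
    return [top_molar / 10 ** i for i in range(n_points)]


# Hypothetical compound with a molecular weight of 200 g/mol:
# 1 mg/ml corresponds to 5 mM, so the 1 mM limit applies.
series = log_dilution_series(top_concentration_molar(200.0))
```

For a compound of high molecular weight (e.g., 2000 g/mol), the 1 mg/ml limit would be the lower of the two and would set the top concentration instead.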
Analytical Verification of Dosing Solutions for Tier 1 In Vitro Assays
Although the test guidelines do not state that the dose solution concentration must be verified, laboratories participating in this report chose to perform dose verification for all test chemicals used in each of the EDSP in vitro assays as the studies were conducted under Good Laboratory Practices where dose confirmation is required. Dosing solutions were analyzed down to the quantification limit for the test compound (i.e., concentrations of 10−8 to 10−10 M may have been below the lower limit of quantification and could not be analyzed). In in vitro/biochemical assays, dose verification helps to avoid false negatives and can assist in trying to interpret the data. The dose verification was critical for one test chemical in particular because it was determined to be unstable in DMSO, yielding dose analysis results that did not match the prepared concentrations. In this case, it was determined that water or acetonitrile was the better choice of solvents. Analytical verification was not routinely performed on dosing solutions for the positive or negative control compounds; however, the assay results for these materials met expectations and were consistent in replicate in vitro assays.
Despite analytical verification of the concentration of test compounds in dose solutions, some test compounds adhere to plastic, which lessens the concentration available to interact with the cell-free (ER, AR) or cell-based assay systems. In these cases, the requirement to test high concentrations may ensure that sufficient test compound is available to generate a response if the compound has potential endocrine activity. Alternatively, if nonspecific interaction of the test compound is suspected, experiments could be conducted using “siliconized” glassware or concentrations could be evaluated after addition of the test compound to a mock test system to determine available test material concentrations.
ESTROGEN RECEPTOR BINDING ASSAY
Overview of Assay Rationale and Design
ER binding assays have been used for over 40 years to evaluate 17β-E2 binding as well as the competitive binding of other substances to the ER (Noteboom and Gorski, 1965; Toft et al., 1967; Clark and Gorski, 1969; ICCVAM/NICEATM, 2002). Thus, there is extensive history with this methodology, including early studies showing a reasonable correlation between ER binding and uterotrophic responses in vivo (e.g., Korenman, 1970).
Compounds exhibiting estrogenic and antiestrogenic effects may operate, respectively, by binding to the ER and triggering transcription (agonist), or by binding to ER and thereby blocking the binding of the endogenous ligand, resulting in attenuated signal transduction (antagonist). Displacement of the ligand from the receptor is presumably due to binding of the test material with the ER ligand binding site; however, one cannot exclude direct interference of the test compound with the ligand or interaction of the test compound with an alternate site on the receptor that affects the conformation of the ligand binding pocket. While the ER binding assay can detect both receptor agonists and antagonists, it does not differentiate between the two. Once ER binding has occurred, responses are ligand-dependent and cell-context dependent (i.e., related to ER subtypes, ligand-specific receptor conformations, and/or recruitment of different coregulators to the ER; Gronemeyer et al., 2004; Zhang et al., 2008; Ai et al., 2009).
For EDSP Tier 1, the potential of the test compound to displace [3H]17β-E2 is assessed in three independent assays at concentrations with log spacing generally ranging from 10−10 to 10−3 M. The binding of the test chemical (i.e., its displacement of [3H]17β-E2) is analyzed by a standard curve and 4-parameter (Hill) nonlinear regression analysis, then the relative binding affinity (RBA) of the test material is calculated relative to that of 17β-E2.
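As an illustration of the quantities involved, the sketch below evaluates a 4-parameter Hill curve and computes an RBA. It is a minimal sketch under stated assumptions: the function names are hypothetical, the curve is written in the inhibitory form with a negative Hill slope, and the RBA is taken as the conventional ratio of IC50 values expressed as a percentage, which the text above does not spell out.

```python
def hill_percent_bound(log_conc, top, bottom, log_ic50, hill_slope):
    """Percent radioligand bound predicted by a 4-parameter Hill model.

    With a negative hill_slope the curve falls from top to bottom as the
    competitor concentration (log10 M) increases.
    """
    return bottom + (top - bottom) / (1 + 10 ** ((log_ic50 - log_conc) * hill_slope))


def relative_binding_affinity(ic50_reference_molar, ic50_test_molar):
    """RBA (%) of a test material relative to the reference ligand (assumed definition)."""
    return 100.0 * ic50_reference_molar / ic50_test_molar


# Hypothetical values: reference IC50 of 1 nM, test material IC50 of 1 uM.
rba = relative_binding_affinity(1e-9, 1e-6)
```

Under this definition, a test material whose IC50 is 1000-fold higher than that of the reference ligand has an RBA of approximately 0.1%.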
In addition to evaluating the test compound, reference compounds are included in the assay to verify assay sensitivity; these compounds include 17β-E2 (strong positive control, radioinert), 19-norethindrone (weak positive control), and octyltriethoxysilane (negative control; Supplemental Fig. S1). A solvent control (SC) is included to determine maximum binding capacity for calculation of RBA. Solvent selection is based on maximum solubility of the test chemical with the goal to test 10−3 M as the highest concentration; commonly used solvents are ethanol, which has a maximum permissible concentration of 3%, DMSO (10% limit), or water. The test guideline provides performance criteria for the control compounds, as well as other assay parameters, to ensure that experimental conditions are adequate and the assay performs as expected (U.S. EPA, 2009a).
To conduct the ER binding assay, laboratories must isolate cytosol from uteri of rats that have undergone ovariectomy before collection of uterine tissue. Sprague-Dawley rats, approximately 85 to 100 days of age, are ovariectomized approximately 7 to 10 days before the harvest of uterine tissue. In one laboratory, approximately 150 Sprague-Dawley female rats were needed to conduct the in vitro ER binding assay. Approximately 50 uteri were used for initial cytosol preparation, binding assays, radiometric detection, and other methodological optimization before screening test compounds. A further 100 uteri were required to yield sufficient amounts of ER-containing cytosol to generate the definitive cytosol pool that was used to conduct the ER saturation binding assay and assess eight test compounds run in triplicate. Thus, the animal use for the conduct of this in vitro assay was significant, which illustrates that, in the case of the rat uterine cytosol ER binding assay, in vitro assays do not always mean the complete avoidance of animal use in testing.
Once uteri were harvested, the tissue was pooled to generate one lot of cytosol to conduct the required assays. Uterine tissues were homogenized on ice in buffer (10 ml/g) and differential centrifugation was used to isolate ER-containing cytosol. The amount of protein in the rat uterine cytosol that provides the optimal level of functional receptors in the ER binding assay was determined by testing serial amounts of protein per assay tube and using 0.03 nM radiolabeled E2. The test guideline specifies that the protein concentration used should bind 25 to 35% of the total radioactivity added at the 0.03 nM concentration of [3H]17β-E2, which is generally in the range of 0.035 to 0.100 mg protein/assay tube (U.S. EPA, 2009a). Interestingly, parameters for freezing prepared cytosol were not included in the ER guideline, although a period of 90 days was noted in the AR guideline (see below).
Before conducting the ER binding assay, it is recommended that a saturation radioligand binding assay is performed on each new batch of cytosol to demonstrate that ER is present at a concentration that is sufficient to perform the assay, and to confirm that the receptor is functioning with appropriate affinity for the endogenous ligand. The saturation binding assay uses eight concentrations of [3H]17β-E2 without the ER to determine total radioactivity in the assay tubes, and eight concentrations of [3H]17β-E2 with the ER in both the absence and presence of unlabeled 17β-E2 (at 100× the concentration of the radiolabeled ligand) to examine total ER binding and nonspecific binding, respectively. The saturation binding assay results give information regarding the concentration of active receptor sites (measured as the maximum specific binding number [Bmax]) and the affinity of the harvested ER for [3H]17β-E2 (reflected in the dissociation constant [Kd]).
Representative experimental results for the saturation binding experiment conducted in one laboratory are presented in Supplemental Table S1 and Figure S2. The Kd for [3H]17β-E2 was 0.1520 nM and the Bmax was 88.92 fmol/100 μg protein for the prepared rat uterine cytosol used in these experiments. The Kd for the run was within the expected range of 0.03 to 1.5 nM. The Bmax was also within the expected range of 10 to 150 fmol/100 μg protein. The data from the saturation binding experiment indicated that the specific binding reached a plateau, nonspecific binding was less than 20% of total binding at all concentrations (range 1.7–8.6%), and the data were consistent with a linear Scatchard plot (Supplemental Fig. S2). Confidence in these numbers is high due to the goodness of fit (R2 = 0.9617), reproducibility between this and another batch of prepared rat uterine cytosol (data not shown), and performance of the strong positive, weak positive, and negative controls from Laboratories 1 and 2 in the competitive binding assays (Table 1; Supplemental Fig. S1).
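The relationship between specific binding, Kd, and Bmax in a saturation experiment can be illustrated with a Scatchard transformation, in which bound/free is regressed on bound (slope = −1/Kd, x-intercept = Bmax). The sketch below is illustrative only: the function names, the free-ligand concentrations, and the nonspecific binding values are hypothetical, and the data are synthetic values generated from a one-site model rather than the laboratories' measurements. Curve-fitting software would typically estimate these parameters by nonlinear regression instead.

```python
def one_site_specific_binding(free_nM, bmax, kd_nM):
    """Specific binding predicted by a one-site saturation model."""
    return bmax * free_nM / (kd_nM + free_nM)


def scatchard_estimates(free_nM, total_bound, nonspecific_bound):
    """Estimate (Kd, Bmax) by linear regression of bound/free on bound."""
    specific = [t - n for t, n in zip(total_bound, nonspecific_bound)]
    ratios = [b / f for b, f in zip(specific, free_nM)]
    count = len(specific)
    mean_x = sum(specific) / count
    mean_y = sum(ratios) / count
    sxx = sum((x - mean_x) ** 2 for x in specific)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(specific, ratios))
    slope = sxy / sxx                     # equals -1/Kd
    intercept = mean_y - slope * mean_x   # equals Bmax/Kd
    kd_nM = -1.0 / slope
    bmax = intercept * kd_nM
    return kd_nM, bmax


# Synthetic data: eight hypothetical free-ligand concentrations (nM),
# generated with Kd = 0.152 nM and Bmax = 88.92 fmol/100 ug protein.
free = [0.03, 0.06, 0.1, 0.3, 0.6, 1.0, 2.0, 3.0]
nonspecific = [2.0 * f for f in free]  # hypothetical linear nonspecific binding
total = [one_site_specific_binding(f, 88.92, 0.152) + ns
         for f, ns in zip(free, nonspecific)]
kd_est, bmax_est = scatchard_estimates(free, total, nonspecific)
```

Because the synthetic data contain no noise, the regression recovers the generating parameters; with real data, curvature in the Scatchard plot would suggest departure from the one-site model.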
Table 1. Performance of the ER Binding Assay in Two Laboratories
|[3H]17β-E2 displacement by increasing concentrations of inert 17β-E2 matches one site competitive bindinga||9||Yes||100||3||Two of three assays||67|
|Ratio of total binding in absence of competitor to total amount of [3H]17β-E2 (ligand depletion) ≤ 15%||9||100||3||100|
|loge (residual SD) of 17β-E2 ≤ 2.35||9||−0.150 to 1.06 (0.370 ± 0.460)|
|Slope of 17β-E2 standard curve between −1.1 and −0.7||9||−1.08 to −0.950 (−1.00 ± 0.0400)||−1.0 to −0.9 (−0.933 ± 0.058)|
|Top (%) of 17β-E2 standard curve between 94 and 111||9||100||3||67|
|Bottom (%) of 17β-E2 standard curve between −4 and 1||9||−1.30 to −0.100 (−0.790 ± 0.350)||−1.7 to −1.1 (−1.37 ± 0.306)|
|loge (residual SD) of 19-norethindrone ≤ 2.60||9||−0.520 to 1.10 (0.380 ± 0.530)||1.45 to 2.14 (1.81 ± 0.346)|
|Slope of 19-norethindrone curve between −1.1 and −0.7||9||−1.22 to −0.980 (−1.09 ± 0.100)||−0.800 to −1.00 (−0.900 ± 0.100)|
|Top (%) of 19-norethindrone curve between 90 and 110||9||89||3||100|
|Bottom (%) of 19-norethindrone curve between −5 and +1||9||−1.50 to 2.30 (0.100 ± 1.43)|
|SC did not alter assay sensitivity or reliability (ethanol ≤ 3%, DMSO ≤ 10%)||9||No||100||3||No||100|
|Negative control (octyltriethoxysilane) did not displace > 25% (average) of radioligand from the ER across all concentrations||9||100||3||100|
|Test chemical was tested over a concentration range that defined the top of the curve (≤25% SC or the lowest concentration of 17β-E2)||9||Yes||100|| ||Yes||100|
The large range of test chemical concentrations (10−10 to 10−3 M, or up to the limit of solubility) used in the ER binding assay according to the test guideline (U.S. EPA, 2009a) is expected to provide sufficient data to allow full characterization of the competitive binding curve, determination of the IC50 (inhibitory concentration that decreases radioligand binding by 50%), calculation of the RBA (relative to the natural ER ligand, 17β-E2), and overall classification of the interaction (e.g., binder, nonbinder, equivocal). The high concentration limit of 10−3 M was altered to 10−4 M in the case of one test material because detailed toxicokinetic data indicated that in vitro tests conducted at 10−3 M would be equivalent to in vivo doses producing plasma concentrations substantially above the inflection point for nonlinear toxicokinetics, and thus would have no relevance. The highest concentration tested in the assay can also be lowered if insolubility in the test system is an issue. If the compound is not soluble at concentrations of 10−6 M or above and the compound does not interact with ER at testable concentrations, the compound is deemed “equivocal up to the limit of concentrations tested.”
Description of Assay Conduct
For proper conduct of the ER binding assay, laboratories and personnel must be approved to work with radioisotopes. The conduct of the assay in our laboratories followed the U.S. EPA OPPTS 890.1250 guideline. Briefly, the methods included adding [3H]17β-E2, uterine cytosol, and test agent (either radioinert 17β-E2, 19-norethindrone, octyltriethoxysilane, solvent, or test material) to each sample tube. All samples were run in triplicate within each run. Tubes were incubated overnight (16–20 hr) at 4 to 8°C in the dark to allow the reaction to reach binding equilibrium. Following incubation, hydroxyapatite (60%) was added to each tube on ice to bind the receptor protein, the hydroxyapatite slurry was washed three times, and ER-bound molecules were eluted using ethanol. Radiolabel in the eluent was determined by liquid scintillation counting to measure the amount of [3H]17β-E2 retained in each sample. Compounds that interacted with the ER displaced bound [3H]17β-E2, which was then lost during the washing steps, resulting in lower radioactivity counts.
The following measured variables were determined, as appropriate, for sample tubes: “total binding” (radioactive counts in tubes containing [3H]17β-E2 and receptor in the absence of competitor), “nonspecific binding” (radioactive counts in tubes containing [3H]17β-E2, receptor, and 1 × 10−7 M radioinert 17β-E2), and “specific binding” ([3H]17β-E2 binding [in the presence of a given concentration of test material] minus nonspecific binding, expressed as percentage of total binding in the absence of a competitor). The competitive binding curve was plotted as [3H]17β-E2 binding (as percent of total binding) versus the concentration (log10 units) of the competitor, and where appropriate, the logIC50 was estimated using nonlinear curve fitting software (e.g., GraphPad Prism, version 5.0). An absolute IC50 value (concentration corresponding to an exact 50% response) was calculated from the data using the following formula: Y = Bottom + (Top − Bottom)/(1 + 10 ^ ((logIC50 − X) × Hill slope + log((Top − Bottom)/(50 − Bottom) − 1))), where Y = 50% of the total binding and the equation is solved for X (i.e., the concentration at which Y = 50%). If the curve did not cross the Y = 50% line, the absolute IC50 could not be defined. The classification of a chemical as a binder or nonbinder was based on three independent, valid runs and on the criteria specified in the test guideline. The RBA of the test material, relative to the binding affinity of 17β-E2, was determined for any material that was considered positive for ER binding.
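The idea of an absolute IC50 can be illustrated by solving a fitted 4-parameter curve for the concentration at which binding equals exactly 50%. The sketch below is not the curve-fitting software's implementation: the function names are hypothetical, and it assumes the standard form Y = Bottom + (Top − Bottom)/(1 + 10^((log_midpoint − X) × Hill slope)), where log_midpoint is the curve's relative midpoint and the Hill slope is negative for displacement. Rearranging that form for Y = 50 gives the log term that appears in the formula above.

```python
import math


def hill_percent_bound(log_conc, top, bottom, log_midpoint, hill_slope):
    """Percent bound from the standard 4-parameter Hill model."""
    return bottom + (top - bottom) / (1 + 10 ** ((log_midpoint - log_conc) * hill_slope))


def absolute_log_ic50(top, bottom, log_midpoint, hill_slope, level=50.0):
    """log10 concentration at which the fitted curve equals `level` percent.

    Returns None when the curve never crosses `level`, in which case the
    absolute IC50 cannot be defined.
    """
    if not (bottom < level < top):
        return None
    shift = math.log10((top - bottom) / (level - bottom) - 1.0)
    return log_midpoint - shift / hill_slope


# Hypothetical fit: top 100%, bottom 20%, midpoint at 1 uM, Hill slope -1.
x50 = absolute_log_ic50(100.0, 20.0, -6.0, -1.0)
```

When the fitted bottom is 0% and the top is 100%, the absolute and relative values coincide. When the bottom lies above 0%, the 50% crossing shifts to a higher concentration than the midpoint, and when the bottom is at or above 50%, no absolute IC50 exists.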
To the extent possible, two or more test compounds were run in the same ER binding assay to allow the use of the same positive and negative control samples. This approach maximized the data attained from each batch of cytosol and limited animal usage to the extent possible.
Factors to Consider in Assay Interpretation and Performance
The criteria for assay interpretation are well described in the test guidelines (U.S. EPA, 2009a). Interpretation is based largely on the percent binding of [3H]17β-E2 with binding ≥75% considered “not interactive,” binding between 50 and 75% considered “equivocal,” and binding of ≤50% considered “interactive” (i.e., an IC50 for the test material was obtained), although model fit for a “one-site competitive binding” model also should be considered. The ER competitive binding assay is considered to be both specific and sensitive, particularly when combining criteria for minimum [3H]17β-E2 displacement with Hill slope requirements.
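The percent-binding thresholds described above can be expressed as a simple decision rule. This is a sketch only: the function name and its argument (the lowest percent binding reached on the curve) are hypothetical, and the rule deliberately omits the model-fit and Hill slope considerations that also enter the guideline classification.

```python
def classify_er_binding(lowest_percent_bound):
    """Classify a run from the lowest percent [3H]17β-E2 binding observed."""
    if lowest_percent_bound >= 75.0:
        return "not interactive"
    if lowest_percent_bound > 50.0:
        return "equivocal"
    # Binding of 50% or less means an IC50 was obtained.
    return "interactive"
```

A compound that reduces binding only to 60% of total would be classed as equivocal under this rule, pending review of curve fit and Hill slope.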
A commonly encountered problem in the ER binding assay was meeting the performance criteria specified for saturation binding, where the recommended range for protein concentration is 0.035 to 0.100 mg protein/assay tube to bind 25 to 35% of the total radioactivity at 0.03 nM [3H]17β-E2. In one laboratory, a protein amount within the recommended range (0.04 mg protein/assay tube) was suitable for all assay performance parameters and yielded the required radioligand binding (i.e., 27% of the total radioactivity bound at 0.03 nM [3H]17β-E2). In the other laboratory, the protein content of the cytosol was slightly higher than test guideline recommendations (0.1191 mg protein/assay tube), which resulted in the binding of approximately 40% of the total radioactivity at 0.03 nM [3H]17β-E2 (Supplemental Table S1). This protein concentration was used as it yielded acceptable performance in all other assay values, including minimal ligand depletion. For example, levels of nonspecific binding were within the limit (<20%) recommended for acceptable assay performance, and the cytosol preparations showed reproducible performance for positive and negative control compounds across multiple assays, indicating that assay performance was acceptable (Table 1). In total, appropriate responsiveness and highly reproducible results of the strong positive (17β-E2), weak positive (19-norethindrone), and negative (octyltriethoxysilane) controls in the competitive binding assays indicated robust performance of the assay. These results raise the question of whether the protein ranges specified in the criteria for saturation binding could be expanded based on competitive binding assay performance, as the critical endpoint relates to acceptable performance criteria of the controls in the competitive binding assay.
Performance criteria and results for each competitive ER binding assay are outlined in Table 1. Across 12 ER binding assay runs conducted in two separate laboratories, performance criteria were largely attainable and reproducible across assays. Generally, the ER binding curves met the required performance criteria for positive and negative controls. The quality control (QC) criteria for the standard curves with the strong positive and weak positive (i.e., calculated parameter values of the top, bottom, and Hill slope of the sigmoidal, variable-slope model) indicated that the ER binding assays had acceptable responsiveness.
In cases where the performance criteria for the positive controls were not entirely met (6 of 12 assays; Table 1), this was generally related to the weak positive control, 19-norethindrone. In 5 of the 12 assays, the calculated bottom of the 19-norethindrone curve was slightly higher than recommended values, and in one assay the bottom of the 19-norethindrone curve was lower (Table 1). In one assay, the top of the curve for 19-norethindrone was slightly lower than recommended. In 3 of the 12 assays, the Hill slope was slightly lower than specified in the performance criteria. However, the suggested ranges for these parameters in the test guideline were derived from norethynodrel, which was not commercially available for use in these assays. While some values were slightly outside the recommended ranges, the assay responsiveness and the majority of the data indicated that the assay was functioning appropriately (i.e., appropriate IC50 and RBA values for 17β-E2 and 19-norethindrone). The calculated R2 value for the goodness of fit to the sigmoidal, variable-slope binding model was >0.98 in all cases for 17β-E2 and >0.95 in all but one case for 19-norethindrone.
Interpretation of ER assay results relies on both [3H]17β-E2 displacement and Hill slope parameters. For example, at the limit concentration (i.e., 10−3 M), data for one test substance did not adequately fit the nonlinear regression model and had Hill slope values outside of the range for the weak positive control. This compound was deemed “equivocal” in the ER binding assay as the model showed poor fit, suggesting that the response was not due to specific binding at the receptor site. An excessively steep Hill slope, coupled with poor model fit, likely indicates nonspecific displacement of the radioligand.
ESTROGEN RECEPTOR TRANSCRIPTIONAL ACTIVATION
Overview of Assay Rationale and Design
The ERTA assay (U.S. EPA, 2009b) identifies chemicals that bind and activate the ER in vitro and produce a reporter gene product (EDSTAC, 1998; ICCVAM, 2003; Kuiper et al., 1998; Escande et al., 2006). The interaction of estrogens with the ER is known to regulate the expression of specific genes that cause downstream endocrine effects. In vivo and in vitro transcription is induced when estrogen translocates into the target cell and binds to the ER resulting in dimerization of two estrogen-bound receptors. This homodimer complex can then interact with and activate specific DNA sequences called estrogen responsive elements that regulate the transcription of estrogen responsive genes. In vitro models to quantify transcriptional activation, such as the ERTA assay, mimic this action by using cells that have been specially designed to contain DNA constructs with a promoter linked to a reporter gene that produces an easily measured gene product. In some cases, a DNA construct for the specific receptor also may be stably expressed in the cell.
The ERTA assay uses the stably transfected, human cervical cancer hERα-HeLa-9903 cell line obtained from the Japanese Collection of Research Bioresources Cell Bank. The hERα-HeLa-9903 cell line contains two stably transfected DNA constructs. An hERα expression construct encodes the full-length human ERα and is expressed at supraphysiologic levels, which increases the sensitivity for detecting an agonist response. A firefly luciferase reporter construct containing five tandem repeats of a vitellogenin estrogen responsive element driven by a mouse metallothionein promoter TATA element regulates luciferase expression. In contrast to the EDSP ER binding assay (U.S. EPA, 2009a), the transcriptional activation assays utilize a cell-based model. Uterine cytosol used in the ER binding assay contains the cofactors that are responsible for ligand binding; however, the intact cell recapitulates nuclear translocation that is essential for transcriptional regulation of this nuclear receptor. In addition, the ERTA, as described in the test guideline, utilizes a human ERα construct instead of the rat ERα used in the binding assay.
The ERTA assay was performed under standard cell-culture conditions in 96-well plates. For each plate prepared to test reporter gene activation, another plate was prepared with the equivalent treatments to test for cytotoxicity. Following a 3-hr attachment period, cells were exposed for approximately 24 hr to the vehicle control, positive control (1 nM 17β-E2), or multiple concentrations of the test chemical. In parallel, cells also were exposed to reference chemicals (described below) to demonstrate that the test system is responding to ER agonists as expected. Following treatment, cell culture medium was removed from each well and a standard luciferase assay was performed. Results were expressed as relative transcriptional activity for each well compared with the response for 1 nM 17β-E2. To test for cytotoxicity, standard cytotoxicity assays were used; one laboratory used propidium iodide staining, whereas the other laboratory used a mitochondrial-based tetrazolium dye 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) colorimetric assay.
Plating and Preincubation of Cells
hERα-HeLa-9903 cells were cultured following the specific protocol described in U.S. EPA test guideline OPPTS 890.1300 (U.S. EPA, 2009b). Upon receipt from Japanese Collection of Research Bioresources, cells were expanded in culture and cryogenically preserved for future use. When propagated for use in the assay, the cells were cultured for at least one passage from the expanded frozen stock at approximately 37°C with 5% CO2 before use in an assay. During routine maintenance of the cells, they were passaged approximately two to three times per week. To maintain integrity of the gene reporter system, cells were cultured for no more than 40 passages. For testing, cells were seeded at a density of approximately 10,000 cells per well in 96-well plates and incubated under the same conditions for 3 hr before treating with test chemicals. Following 3 hr in culture, cells were observed for attachment and morphology. While the cells did not spread out on the cell culture plates at this time, they were fully attached and could undergo gentle media removal and addition of dosing media.
It was observed that the use of fetal bovine serum treated with dextran-coated charcoal to remove serum hormones was required to achieve sufficient induction of the reporter gene following treatment with 17β-E2. With dextran-coated charcoal–stripped serum, an average fold induction of 12.8 ± 4.5 compared with the vehicle control was obtained. Appendix 3 of the test guideline (U.S. EPA, 2009b) outlines the procedure.
To conduct the ERTA assay, hERα-HeLa-9903 cells were plated as described above (see Plating and Preincubation of Cells). Following attachment, the medium was removed and cells were exposed to dosing media. Cells were then incubated 20 to 24 hr to induce the reporter gene products. In some cases, the first test run was used as a range-finding experiment. If similar results were obtained from the second run, the range-finding run was deemed acceptable for analysis. If subsequent adjustment of concentrations had to be performed to address solubility and/or cytotoxicity issues, runs 2 and 3 were performed as definitive experiments. At the end of the exposure period, each plate was prepared for luciferase reporter gene expression or cytotoxicity analysis as described below.
Luciferase Reporter Gene Expression
As the test guideline recommends, the Steady-Glo Luciferase Assay System (Promega; Madison, WI) was used according to the manufacturer's recommendations. Briefly, after incubation, 50 μl of the supernatant from each well was removed and 100 μl of luciferase assay reagent was added to all wells. The microplates were then mixed on a plate mixer for approximately 5 min. One hundred seventy-five microliters of the lysed cell suspension was then transferred to an opaque microplate and luminescence was measured on a luminometer.
Cell Viability/Cytotoxicity Measurements
The cell viability/cytotoxicity testing was conducted immediately after termination of the exposure experiments using a duplicate 96-well plate. In addition to quantitative viability/cytotoxicity testing, cells were checked for the degree of confluence, homogeneity from well-to-well, and any signs of cytotoxicity or altered morphology. The plate layout and seeding density were either identical to the luciferase expression plate (Laboratory 1) or included 125 μM digitonin as a positive control for cell death in two rows of the 96-well plate (Laboratory 2). The test guideline does not specify a recommended cytotoxicity assay for the ERTA. One laboratory used a PI/Triton-X Cytotoxicity Assay using commercially available reagents (Sigma-Aldrich, St. Louis, MO). The other laboratory measured cytotoxicity by examining cellular metabolic activity through the reduction of MTT (CellTiter 96 Aqueous One Solution Cell Proliferation Assay Kit; Promega; Madison, WI). Both assays are easy to perform, do not require lengthy incubation times, and are relatively inexpensive to conduct. A ≥20% reduction in cell viability was considered evidence of cytotoxicity. The luciferase data were not normalized for occasional and/or random changes in cell viability.
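The 20% viability criterion can be sketched as a simple calculation. The function names and the choice to express viability relative to the mean vehicle-control signal are assumptions made for illustration; the actual readout depends on the cytotoxicity assay used.

```python
from statistics import mean


def percent_viability(treated_signal, vehicle_signals):
    """Viability of a treated well as a percentage of the mean vehicle control."""
    return 100.0 * treated_signal / mean(vehicle_signals)


def is_cytotoxic(treated_signal, vehicle_signals, reduction_cutoff=20.0):
    """Flag a well when viability is reduced by at least the cutoff (in percent)."""
    reduction = 100.0 - percent_viability(treated_signal, vehicle_signals)
    return reduction >= reduction_cutoff
```

With hypothetical vehicle readings of 200 signal units, a treated well reading 150 units (75% viability) would be flagged, whereas a well reading 190 units (95% viability) would not.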
Acceptability Criteria for Positive and Negative Reference Chemicals
ERTA assay results with a comparison against the performance criteria are shown in Table 2. With each assay, performance requirements include a dose-response curve for 17β-E2 (potent estrogen), 17α-E2 (weak estrogen), 17α-methyltestosterone (very weak estrogen), and corticosterone (negative compound; Supplemental Fig. S3). The mean luciferase expression of the 17β-E2 is then normalized to the vehicle control and used to derive the relative transcriptional activity for each well. The luciferase activity of each reference chemical must fall into an acceptable range for logPC50 (where PC is positive control), logPC10, logEC50 (where EC is effective concentration), and Hill slope values for the assay to be deemed valid.
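The normalization to the positive control, the fold-induction check, and the 10% cutoff for a positive response can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and subtracting the mean vehicle-control signal before scaling to the 1 nM 17β-E2 response is an assumed form of the normalization that the text does not spell out.

```python
from statistics import mean


def relative_transcriptional_activity(well_signals, vehicle_signals, pc_signals):
    """Express each well as a percentage of the vehicle-corrected PC response."""
    vehicle = mean(vehicle_signals)
    span = mean(pc_signals) - vehicle
    if span <= 0:
        raise ValueError("positive control must exceed the vehicle control")
    return [100.0 * (signal - vehicle) / span for signal in well_signals]


def fold_induction(vehicle_signals, pc_signals):
    """Ratio of the mean positive control signal to the mean vehicle signal."""
    return mean(pc_signals) / mean(vehicle_signals)


def reaches_cutoff(rta_percent, cutoff=10.0):
    """True when any tested concentration reaches the cutoff (percent of PC)."""
    return max(rta_percent) >= cutoff


# Hypothetical luminescence readings (arbitrary units).
vehicle = [100.0, 100.0]
positive_control = [1100.0, 1100.0]
rta = relative_transcriptional_activity([100.0, 200.0, 600.0], vehicle, positive_control)
```

In this hypothetical plate the positive control gives an 11-fold induction, and a well reading 200 units sits exactly at the 10% cutoff, which shows how small absolute changes in signal can be enough to reach the positive criterion.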
Table 2. Performance of the Estrogen Receptor Transactivation Assay in Two Laboratories
|logPC50 for 17β-E2||−11.4 ∼ −10.1||13||−10.5 to −11.2 (−10.8 ± 0.2)||−10.5 to −11.9 (−11.6 ± 0.4)|
|logPC10 for 17β-E2||−11||13||−11.4 to −12.9 (−12.3 ± 0.5)||−11.0 to −13.0 (−12.0 ± 1.0)|
|logEC50 for 17β-E2||−11.4 ∼ −10.1||13||−9.4 to −11.2 (−10.5 ± 0.4)||−9.8 to −10.8 (−10.2 ± 0.4)|
|Hill slope for 17β-E2||0.7 ∼ 1.5||13||69||7||100|
|logPC50 for 17α-E2||−9.6 ∼ −8.1||13||−8.6 to −10.1 (−9.1 ± 0.4)||−8.5 to −10.0 (−9.8 ± 0.2)|
|logPC10 for 17α-E2||−10.7 ∼ −9.3||13||−9.6 to −10.9 (−10.4 ± 0.4)||−9.0 to −10.6 (−9.7 ± 0.8)|
|logEC50 for 17α-E2||−9.6 ∼ −8.4||13||100||7||57|
|Hill slope for 17α-E2||0.9 ∼ 2.0||13||54||7||100|
|logPC50 for 17α-methyltestosterone||−6.0 ∼ −5.1||13||15||7||0|
|logPC10 for 17α-methyltestosterone||−8.0 ∼ −6.2||13||−3.0 to −10.8 (−8.5 ± 1.9)|
|logEC50 for 17α-methyltestosterone||–||13||−3.7 to −8.5||–||7||−5.8||86|
|Hill slope for 17α-methyltestosterone||–||13||–||–||7||–||–|
|logPC50 for corticosterone||–||13||–||–||7||–||–|
|logPC10 for corticosterone||–||13||–||–||7||43|
|logEC50 for corticosterone||–||13||–||–||7||–||–|
|Hill slope for corticosterone||–||13||–||–||7||–||–|
|Fold induction||≥4-Fold for PC relative to vehicle control||32||97||28||100|
|Fold induction||PC10 for 1 nM E2 is >1 + 2SD of vehicle control||23||Yes||100||28|| ||39|
|Results are reproducible||Yes||13||Yes||100||7||Yes|| |
Table 2 summarizes average values and ranges for assay criteria from two laboratories, including the frequency with which the performance criteria were achieved. There was considerable interlaboratory variability in which performance criteria were met. For example, with the positive control compounds, Laboratory 1 routinely met the PC10, PC50, and EC50 values for 17β-E2 and 17α-E2, but had some difficulty achieving the recommended Hill slope values for these reference compounds. Laboratory 2 routinely met the Hill slope values for 17β- and 17α-E2, but had some difficulty achieving the logPC50 values for 17β-E2 and 17α-methyltestosterone. Both laboratories had difficulty achieving the PC10 and PC50 values for 17α-methyltestosterone, but both laboratories generally met the criterion for fold induction. It is important to note that when laboratories missed the performance criteria, the deviations from the recommended values were generally minimal and considered inconsequential.
Following treatment with the negative reference compound corticosterone, three of seven experiments in Laboratory 2 showed induction above 10% of the response of the positive control, which is used as the criterion for a positive response (Table 2). This induction was observed at 10⁻⁴ and/or 10⁻⁵ M, and cytotoxicity did not exclude these results. No corticosterone responses above the PC10 were seen in Laboratory 1. This result indicates some differences in cell responses across laboratories and suggests that a cutoff value of 10% of the positive control may not be appropriate.
A proficiency assay also must be performed before initiating testing under the EDSP guideline. Ten chemicals were used to demonstrate proficiency in performing the assay, and a summary of findings is included in Supplemental Table S2. Observations of cytotoxicity and solubility are included. As explained in the guideline, 1,3,5-tris(4-hydroxyphenyl)benzene only exhibits a PCmax of 15% (PCmax is the maximum response of a test chemical relative to the response induced by 1 nM E2). In one laboratory, it was difficult to achieve this parameter in repeated experiments, which shows the limitation of this assay in identifying weakly positive chemicals. In contrast, the other laboratory occasionally detected the negative control compound, dibutyl phthalate, as weakly positive. Justification was provided for accepting experiments with slight deviations from the criteria listed in the guideline.
In addition to testing reference and proficiency chemicals, a vehicle control with solvent matching that of the reference or test chemical must be included on each plate. If the positive control and test chemical use a different solvent, a SC for each must be included on the assay plate. The solvent concentration must not induce cytotoxicity. DMSO is a recommended solvent (final concentration must not exceed 0.1% [v/v]). Water 0.1% (v/v) and acetonitrile 0.1% (v/v) also were acceptable solvents for this model system; however, in both laboratories the reference chemicals were only evaluated in DMSO.
The test guideline shows an example plate layout that includes all four reference chemicals, positive control, and vehicle control in triplicate on one plate (U.S. EPA, 2009b). However, one laboratory found that there was considerable intraexperimental variation in luciferase expression and chose to adopt a plate format that includes one reference chemical at eight dilutions, and the positive control and vehicle control with six replicates for each per plate. The pure antagonist ICI 182,780 was also included on each plate in duplicate to aid in the assessment of false positives that may be the result of nonreceptor-mediated luminescence signals. Both laboratories evaluated the potential for plate and edge effects in the standard layout and none were noted.
Overall, it is unclear whether the PC10 is the correct criterion for a positive response. In various runs, the negative control compounds (corticosterone and dibutyl phthalate) yielded positive results using this induction criterion. Conversely, the weakly positive compounds (17α-methyltestosterone and 1,3,5-tris(4-hydroxyphenyl)benzene) sometimes yielded negative results with the PC10 criterion. In addition, there is marked interexperimental variability in fold induction for the hERα-HeLa-9903 cells that contributes to inconsistent responses with these proficiency compounds.
To evaluate ER agonist activity, the relative transcriptional activity as compared with the positive control (1 nM 17β-E2) must be determined. The mean value for the vehicle control was calculated, and the result for each well was normalized to it. Next, the mean value for the positive control was calculated. The normalized value for each well was then divided by the mean value of the normalized positive control wells (with the normalized mean of the positive control wells being defined as 100% relative transcriptional activity). The final value for each well was the relative transcriptional activity for that well compared with the mean normalized positive control response. Any cytotoxic concentrations (viability ≤ 80%) were excluded from data analysis.
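The normalization steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how wells are "normalized" to the vehicle control, so subtraction of the vehicle-control mean is assumed here, and the function names and luminescence readings are hypothetical.

```python
from statistics import mean


def relative_transcriptional_activity(wells, vehicle_wells, pc_wells):
    """Express each well as a percentage of the positive-control response.

    Assumption: normalizing means subtracting the mean vehicle-control
    signal from every well; the text does not state the operation.
    """
    vc_mean = mean(vehicle_wells)
    # The normalized positive-control mean is defined as 100% activity.
    pc_norm_mean = mean([w - vc_mean for w in pc_wells])
    return [100.0 * (w - vc_mean) / pc_norm_mean for w in wells]


def drop_cytotoxic(rta_values, viabilities, cutoff=80.0):
    """Keep only concentrations whose viability (%) is above the cutoff."""
    return [r for r, v in zip(rta_values, viabilities) if v > cutoff]


# Hypothetical luminescence readings (arbitrary units).
vehicle = [100.0, 110.0, 90.0]
positive = [600.0, 620.0, 580.0]
test = [150.0, 350.0, 600.0]

rta = relative_transcriptional_activity(test, vehicle, positive)
kept = drop_cytotoxic(rta, viabilities=[98.0, 91.0, 75.0])
```

In this made-up example the third concentration is dropped because its viability is at or below 80%, matching the exclusion rule in the text.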
The mean value for the relative transcriptional activity for each concentration was calculated. Statistical software, such as GraphPad Prism or Origin, can be used to calculate the EC50 and Hill slope from a full concentration-response curve. For positive chemicals, the PC values can be calculated by interpolation between the two data points, with coordinates (a, b) and (c, d), that lie immediately above and below the PCx value, respectively. The RPCmax, the maximum level of response induced by the test chemical relative to the positive control (1 nM 17β-E2), should also be calculated.
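As an illustration of the PCx calculation, the sketch below interpolates between the two bracketing data points. It assumes linear interpolation on the log10 concentration axis, which is one natural reading of the definitions of (a, b) and (c, d) but is not confirmed by the text; the function name and data points are hypothetical.

```python
import math


def log_pcx(x, above, below):
    """Interpolate log10 of the concentration giving x% of the PC response.

    above = (a, b) and below = (c, d) are (concentration in M, response in %)
    for the data points immediately above and below x.
    Assumption: linear interpolation on the log10 concentration axis.
    """
    a, b = above
    c, d = below
    if b == d or not d <= x <= b:
        raise ValueError("x must be bracketed by two distinct responses")
    fraction = (x - d) / (b - d)
    return math.log10(c) + fraction * (math.log10(a) - math.log10(c))


# Hypothetical points: 4% response at 1e-12 M and 16% response at 1e-11 M.
log_pc10 = log_pcx(10.0, above=(1e-11, 16.0), below=(1e-12, 4.0))
```

With these made-up points, a 10% response lies halfway between the two responses, so the interpolated logPC10 falls halfway between −12 and −11.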
Data Interpretation and Decision Logic for Assessing ER Agonist Activity
Positive and negative decision criteria are provided in the test guideline. A test chemical is determined to be positive if the maximum response (RPCmax) induced by the test chemical is equal to or exceeds 10% of the response of the positive control in at least two of two or two of three runs. Considering that some negative chemicals sometimes showed a response greater than the PC10, it may be difficult to determine whether a chemical is negative or a weak positive. This limitation is likely a result of the dynamic range of the PC induction compared with the SC. Other cell lines, or an alternate cutoff value, may be more suitable for discerning these subtle differences.
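The positive decision rule can be written out as a short Python sketch. The positive branch follows the text; the negative branch and the request for a third run after a split result are assumptions that mirror the positive rule, since the text does not spell them out. The function name is hypothetical.

```python
def er_agonist_call(rpc_max_by_run, cutoff=10.0):
    """Classify a chemical from the RPCmax (%) observed in two or three runs.

    Positive: RPCmax >= cutoff in at least two runs (two of two, or two of
    three). Assumption: negative mirrors this rule, and a one-to-one split
    after two runs calls for a third run.
    """
    if len(rpc_max_by_run) not in (2, 3):
        raise ValueError("expected two or three runs")
    hits = sum(1 for value in rpc_max_by_run if value >= cutoff)
    misses = len(rpc_max_by_run) - hits
    if hits >= 2:
        return "positive"
    if misses >= 2:
        return "negative"
    return "third run needed"
```

For example, `er_agonist_call([12.0, 8.0, 15.0])` returns `"positive"`, whereas `er_agonist_call([12.0, 8.0])` asks for a third run. A chemical hovering near the 10% cutoff can flip between calls from run to run, which is the difficulty described above.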
How Predictive Are the Models: ER Binding and ERTA as Part of the Tier 1 EDSP Battery
While the ER binding assay and ERTA provide information on a compound's ability to interact with the ER, the potential of that compound to interact with the endocrine system is determined across all of the assays in the Tier 1 battery by using a WoE approach that includes “other scientifically relevant information” that may already be available on the test substance. Across 11 compounds evaluated in both laboratories, the ER binding assay results were negative (10 of 11) or equivocal (1 of 11; Table 3). In 4 of 11 cases, the ERTA results did not agree with ER binding, predominantly due to weakly positive results in the ERTA (i.e., ≤25% induction of the response of 1 nM 17β-E2) at high concentrations of test material (e.g., 10⁻⁴ M). For one compound, nonguideline ERTAs were used by the U.S. EPA to deem this compound “positive.” Across these examples, it is difficult to reconcile the inability of a test material to bind specifically to ERs at concentrations up to 10⁻³ M (negative ER binding assay) with positive results in the ERTA at slightly lower concentrations. Given previous results with corticosterone and dibutyl phthalate, it is possible that the ERTA generated nonspecific increases in signal that are unrelated to specific ER binding. Alternatively, there may be meaningful differences in cell context of the in vitro assays that produced these disparate results (e.g., species differences in receptor responsiveness—rat vs. human, coregulator availability, etc.).
Table 3. Responsiveness across Select EDSP Tier 1 ER- and AR-Related Assays
|ERTA||Weak + (≤ 19%)||–||Weak + (≤ 25%)||–||Weak + (≤ 18%)||–||–||+ (Per OSRIb)||–||–||–|
|AR binding||–||–||± (Equivocal)||–||–||–||–||+||–||–||NA|
|Uterotrophic||– (Oral)||– (SC)||– (Oral)||– (Tier 2c)||– (Oral)||– (Oral)||– (Oral)||– (Oral)||– (SC)||– (SC)||– (SC)|
|Hershberger||+ (Metabolism)||–||–||– (Tier 2d)||+ (Metabolism)||–||+ (Metabolism)||–||–||–||NA|
Despite some difficulty reconciling results of the in vitro ER assays, data from other assays in the Tier 1 battery contribute to the WoE assessment. In the case of estrogenicity/antiestrogenicity, the ER binding assay and ERTA are two of five assays that examine potential estrogenic and/or antiestrogenic activity. The remaining assays include the uterotrophic assay, the pubertal female assay, and the fish short-term reproduction assay. Of these in vivo assays, the uterotrophic assay is considered specific and sensitive for estrogenic compounds, whereas the pubertal female assay and the fish short-term reproduction assay detect multiple modes of endocrine activity. In contrast to the ERTA, the uterotrophic assay yielded negative results for the same 10 compounds that tested negative in the ER binding assay; the compound that was equivocal for ER binding was negative in the uterotrophic assay when administered via the oral route (Table 3). Overall, the ER binding assay and the uterotrophic assay showed good concordance for estrogenic activity for the compounds tested in both assays, whereas the ERTA differed more often from the in vivo findings in the uterotrophic assay (Table 3).
ANDROGEN RECEPTOR BINDING ASSAY
Overview of Assay Rationale and Design
Compounds may exhibit androgenic or antiandrogenic effects by binding to the AR. The EDSP Tier 1 AR binding assay characterizes the ability of a test substance to compete with R1881, a synthetic androgen, for binding to the AR. Rat ventral prostate contains progesterone receptors (PRs) that can bind R1881; therefore, a PR antagonist (the synthetic glucocorticoid triamcinolone acetonide) was included to eliminate the potential contribution of PR to the binding potential of the test system. The resulting AR interaction either triggers receptor-mediated transcription (in the case of an agonist) or blocks the ability of the endogenous ligand to bind to the receptor (in the case of an antagonist) thereby attenuating signal transduction; the AR binding assay cannot differentiate between these two activities. Responses are ligand-dependent and cell-context dependent (i.e., related to AR subtypes, ligand-specific receptor conformations, and/or recruitment of different coregulators to the AR; Anderson and Liao, 1968; Bruchovsky and Wilson, 1968; Fang et al., 1969; Mainwaring, 1969a,1969b; Unhjem and Tveter, 1969). While displacement of the ligand is presumably due to binding of the test substance to the receptor, direct interference with the ligand or interaction of the test compound with an alternate site on the receptor that affects the conformation of the ligand binding pocket cannot be excluded.
The ability of a test compound to compete with [3H] ligand for binding in rat ventral prostate tissue homogenate is assessed in three independent assays at concentrations with log spacing ranging from 10⁻¹⁰ to 10⁻³ M. The high concentration limit of 10⁻³ M was altered to 10⁻⁴ M in the case of one test material because detailed toxicokinetic data indicated that in vitro tests conducted at 10⁻³ M would be equivalent to in vivo doses producing plasma concentrations substantially above the inflection point for nonlinear toxicokinetics, and would thus have no relevance. A justification should be provided if concentrations other than those specified in the test guideline are used in the assay; the most likely reason is limited solubility of the test substance (U.S. EPA, 2009c). The binding of the test substance is analyzed by a standard curve and a 4-parameter (Hill) nonlinear regression analysis, and RBA for the test material is compared with that of R1881. To verify sensitivity of the assay, a weak positive, dexamethasone, is also included (Supplemental Fig. S4). Unlike the ER binding assay, no negative control chemical is recommended in this test guideline (U.S. EPA, 2009c). A SC is required to determine maximum binding capacity for the calculation of RBA. Solvent selection is based on maximum solubility of the test chemical in an attempt to reach the highest concentration of 10⁻³ M. The preferred solvent is ethanol, followed by water or DMSO. Performance criteria for the control compound and other assay parameters are included in the test guideline to verify that the assay performs as expected and that experimental conditions are adequate during each test (U.S. EPA, 2009c).
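The 4-parameter (Hill) curve and the relative binding affinity can be illustrated with a minimal Python sketch. Neither formula is written out in the text, so both are assumptions: the curve uses the common log-inhibitor form with a negative slope for displacement, and RBA is taken as the ratio of the reference IC50 to the test-chemical IC50, expressed as a percentage. Function names and values are hypothetical.

```python
def hill_4p(log_conc, top, bottom, log_ic50, slope):
    """Percent radioligand bound at a log10 molar competitor concentration.

    Assumed form: bottom + (top - bottom) / (1 + 10**((logIC50 - x) * slope)),
    where a negative slope gives a curve that falls as concentration rises.
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ic50 - log_conc) * slope))


def relative_binding_affinity(ic50_reference, ic50_test):
    """RBA (%) of a test chemical relative to the reference ligand.

    Assumption: RBA = 100 * IC50(reference) / IC50(test chemical).
    """
    return 100.0 * ic50_reference / ic50_test


# Hypothetical fitted parameters and IC50 values (M).
midpoint = hill_4p(-9.0, top=100.0, bottom=0.0, log_ic50=-9.0, slope=-1.0)
high_dose = hill_4p(-6.0, top=100.0, bottom=0.0, log_ic50=-9.0, slope=-1.0)
rba = relative_binding_affinity(ic50_reference=1e-9, ic50_test=1e-6)
```

At the IC50 the curve sits halfway between top and bottom, and a test chemical whose IC50 is 1,000-fold higher than that of the reference has an RBA of about 0.1%.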
The rat prostate cytosol should be prepared following the protocol specified by the OPPTS 890.1150 Test Guideline (U.S. EPA, 2009c). Briefly, the ventral prostate tissues are collected from castrated Sprague-Dawley rats (60–90 days old; 90-day-old is preferred to provide optimal AR protein expression). Castration eliminates most endogenous androgen biosynthesis, resulting in a transient increase in AR that peaks after approximately 24 hr; therefore, it is important to collect the ventral prostate tissue as close to 24 hr after castration as possible. This ensures that the maximal concentration of AR is present in the cytosol. Similar to the preparation of uterine cytosol for the ER binding assay, the prostate cytosol preparations are then pooled, aliquoted, and stored at −80°C until use. According to the guideline, if it is necessary to use a cytosol that is more than 60 days old (AR) or 90 days old (ER), it is recommended that a saturation binding experiment be conducted to ensure that the receptor is performing as expected. In addition, the ER guideline notes that frozen uteri should not be stored for more than 6 months, although no specific recommendation was given for prostate. Revalidation of the cytosol by saturation binding and short storage limits for tissue/cytosol increase the overall animal usage; hence, it is recommended to coordinate the evaluation of test chemicals. The protein concentration of the cytosol preparation is determined for each batch of cytosol.
In one laboratory, the maximum number of animals used for each ventral prostate collection was 50. The number of rats used for each collection depended on the number of anticipated requests for the assay as it was recommended that the cytosol be stored frozen for no more than 6 months at −80°C before use. When the castration surgery was not being performed in-house, sufficient lead time had to be included for the supplier to schedule surgeries and delivery of the castrated animals to meet the guideline requirement for cytosol preparation postsurgery. In the other laboratory, approximately 150 Sprague-Dawley male rats were needed to conduct the in vitro AR binding assay, including 50 animals for assay set-up and optimization and 100 animals to yield sufficient amounts of AR-containing cytosol to conduct the AR saturation binding assays and assess eight test compounds, in triplicate, for AR binding. Once again, the number of animals used to conduct this in vitro assay is significant.
For the saturation assay, the optimal protein concentration in the cytosol should bind <25 to 35% of the total radiolabeled R1881 at any concentration evaluated in the assay. To ensure that not more than this percent range of radioligand is bound, the lowest concentration of radioligand (0.25 nM [3H]-R1881) is used to determine the optimal protein concentration. The undiluted prostate cytosol collected in both laboratories contained a concentration of approximately 6 mg protein/ml, which was consistent with the expected range of 5.5 to 8 mg/ml in undiluted cytosol listed in the test guideline (U.S. EPA, 2009c). In both laboratories, the cytosol was shown to bind <25 to 35% of [3H]-R1881 when assayed with 0.25 nM [3H]-R1881, meeting the acceptance criteria specified in the guideline (U.S. EPA, 2009c).
The saturation binding assay uses eight concentrations of [3H]-R1881 in the presence of radioinert R1881 to confirm that the AR activity is sufficient for use in the competitive binding assay (Supplemental Fig. S4). AR activity is evaluated by determining the concentration of active receptor sites, maximum specific binding number (Bmax), and affinity of the harvested AR for [3H]-R1881. The dissociation constant (Kd) also is determined. Three independent saturation binding runs were performed in one laboratory and two were completed in the other laboratory. The Kd values for the runs were 0.883, 1.015, and 0.811 nM [3H]-R1881 with an average of 0.903 nM [3H]-R1881 in one laboratory and 0.436 and 0.492 nM [3H]-R1881 with an average of 0.464 nM [3H]-R1881 in the other laboratory. In one laboratory, the Kd values were slightly below the range in the EPA validation studies, which was 0.685 to 1.57 nM (U.S. EPA, 2009c). While slightly below the suggested range, the confidence in these numbers is high according to the goodness of fit of the data in this laboratory (R² = 0.9871–0.9937) and the small variation among runs. The Bmax values were 4.7, 4.6, and 4.4 fmol/100 μg protein with an average of 4.5 fmol/100 μg protein in one laboratory, and 3.1 and 3.3 fmol/100 μg protein with an average of 3.3 fmol/100 μg protein in the other laboratory. Confidence in the assay results was high as the adjusted coefficient of determination (adjusted R²) was 0.865, 0.872, and 0.821 in one laboratory and 0.987 and 0.994 in the other laboratory, with small variations between runs.
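The comparison of the reported Kd values with the EPA validation range can be reproduced with a short Python sketch. The Kd values and the range are taken from the text; the laboratory labels "A" and "B" are placeholders, since the text does not say which laboratory produced which set, and the helper name is hypothetical.

```python
from statistics import mean

# Kd range from the EPA validation studies, in nM (from the text).
VALIDATION_KD_RANGE_NM = (0.685, 1.57)

# Reported Kd values per run, in nM; the "A"/"B" labels are placeholders.
kd_runs_nm = {
    "laboratory A": [0.883, 1.015, 0.811],
    "laboratory B": [0.436, 0.492],
}


def summarize_kd(runs, kd_range=VALIDATION_KD_RANGE_NM):
    """Return the mean Kd (rounded to 3 decimals) and whether it is in range."""
    low, high = kd_range
    average = mean(runs)
    return round(average, 3), low <= average <= high


summary = {lab: summarize_kd(runs) for lab, runs in kd_runs_nm.items()}
```

The averages come out at 0.903 and 0.464 nM, matching the text, and only the second falls below the validation range.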
The final concentration of [3H]-R1881 used in all of the competitive binding assay tubes is 1 nM, which is different from the concentration used for the protein determination for the saturation binding assay. Thus, the protein concentration is optimized separately for the competitive binding assays. For the competitive binding assays, one laboratory diluted the prostate cytosol to approximately 4 mg protein/ml, which was shown to bind approximately 11% of 1 nM [3H]-R1881, providing a final concentration of 1.2 mg/ml protein/assay tube. The other laboratory did not dilute the prostate cytosol (6.6 mg protein/ml), which was shown to bind approximately 5.7% of 1 nM [3H]-R1881, providing a final concentration of 1.97 mg/ml protein/assay tube.
The large range of test chemical concentrations (10⁻¹⁰ to 10⁻³ M, or up to the limit of solubility) used in the AR binding assay, based on the test guideline (U.S. EPA, 2009c), is expected to provide sufficient data to allow full characterization of the competitive binding curve, determination of the IC50, calculation of the RBA (relative to R1881), and classification of the interaction (e.g., binder, nonbinder, equivocal). The concentration of solvent in the assay tubes should not alter the sensitivity or the reliability of the assay. Evidence of insolubility and the method to evaluate solubility (e.g., microscopy, nephelometer) should be reported. One laboratory relied on gross visual inspection of the dosing solutions and assay tubes to evaluate solubility, whereas the other laboratory evaluated solubility by microscopy just after dosing. If solubility is an issue, the highest concentration may be lowered by an amount that yields half-log decreases in concentration as opposed to a full log interval between lower dose levels. If the test chemical does not reach 50% reduction in binding and is not soluble at concentrations of 10⁻⁶ M, the test chemical is deemed untestable.
Test material dosing solutions were prepared on the day of treatment. The solvent used for the test substances evaluated by one laboratory was deionized water, whereas the other laboratory used ethanol. The final solvent concentration for the reference standard, positive control, and the test substances was 3% for deionized water and 1 to 3% for ethanol; neither laboratory had solvent effects in their binding assays. The initial solutions for the reference standard, positive control, and test substance were prepared by serial dilution from a stock solution; in addition, one laboratory included two additional dilutions for the positive control and the test substance, a high concentration of 3.0 × 10⁻² M and a low concentration of 3.0 × 10⁻⁹ M, which were added to the binding assays to provide additional information over a larger range of exposure concentrations. No precipitation of the reference standard, positive control, or the test substances was observed during the assays. The reference standard and the positive control were not evaluated for concentration; however, the assay results met expectations and were consistent across AR binding assays conducted in this laboratory.
Description of Assay Conduct
The ability of the test substance to competitively bind to AR in rat prostate cytosol was evaluated in three independent runs using 3H-R1881 as the radioligand, according to the EPA guideline (U.S. EPA, 2009c). Similar to the ER binding assay, laboratories and personnel conducting the AR binding assay must be approved to work with radioisotopes. To maximize the data produced from each assay, up to three test chemicals could be run by one technician in the same AR binding assay. Including more than one test chemical in an assay allows the same control samples to be shared, thus limiting the amount of cytosol required and ultimately using fewer animals. We do not recommend running more than three test chemicals unless additional technicians are available, as the receptor in the thawed cytosol will degrade with time.
Results and Assay Interpretation
The performance parameters for radioinert R1881, as listed in Table 4, were within the acceptable ranges as specified in the test guideline, with the exception of the bottom of the curves (percent), which was slightly lower than the suggested range in both laboratories on occasion (n = 4 of 14 assay runs). In addition, the slopes of the R1881 curves were slightly greater than the suggested range in one laboratory (n = 2 of 14 assay runs). The performance parameters for dexamethasone were within the acceptable ranges as specified in the test guideline, with the exception of the top of the curve (percent) of 112 for one run in Laboratory 2, which was slightly above the recommended range. However, this did not warrant performing additional runs since these values were only slightly outside of the suggested range and the range of acceptance criteria listed in the test guideline are suggested ranges, not mandatory limits. All runs were within or very close to the suggested performance criteria, and therefore the assays were considered valid. In both laboratories, confidence in the assay results was high due to the small variation between runs. In addition, there was no observed precipitation at any of the concentrations tested. The SC responses indicated no drift in any of the runs.
Table 4. Performance of the Androgen Receptor Binding Assay in Two Laboratories
|Increasing unlabeled R1881 displaces [3H]R1881 from the receptor consistent with one-site competitive binding||11||Yes||100||3||Yes||100|
|Ratio of total binding in absence of competitor to total amount of 3H-R1881 (Ligand Depletion) ≤ 15%||11||100||3||0.69 – 0.86 (0.77 ± 0.085)|
|Slope of R1881 curve between −1.2 and −0.8||11||100||3||−0.800 to −0.600 (−0.700 ± 0.100)|
|Top (%) of R1881 standard curve between 82 and 114||11||96.7 – 102.9 (100.0 ± 2.0)|
|Bottom (%) of R1881 standard curve between −2 and +2||11||82||3||−3.30 to −0.300 (−2.30 ± 1.73)|
|Slope of dexamethasone curve between −1.4 and −0.6||11||100||3||−1.2 to −0.8 (−1.07 ± 0.231)|
|Top (%) of dexamethasone curve between 87 and 106||11||100||3||67|
|Bottom (%) of dexamethasone curve between −12 and +12||11||100||3||−5.5 to 1.8 (−1.20 ± 3.82)|
|Solvent does not alter assay sensitivity or reliability||11||No effect||100||3||No effect||100|
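The curve-fit criteria in Table 4 lend themselves to a simple range check, sketched below in Python. The ranges are copied from the table; the helper name and the example fits are hypothetical, apart from the dexamethasone top of 112%, which echoes the single out-of-range run mentioned in the text. Because the guideline ranges are suggested rather than mandatory, a flagged parameter prompts review rather than automatic rejection of the run.

```python
# Suggested ranges from Table 4 (slope is unitless; top and bottom in %).
R1881_CRITERIA = {"slope": (-1.2, -0.8), "top": (82.0, 114.0), "bottom": (-2.0, 2.0)}
DEX_CRITERIA = {"slope": (-1.4, -0.6), "top": (87.0, 106.0), "bottom": (-12.0, 12.0)}


def out_of_range(fit, criteria):
    """Return the names of fitted parameters outside their suggested range."""
    return [
        name
        for name, (low, high) in criteria.items()
        if not low <= fit[name] <= high
    ]


# Hypothetical fitted parameters for one run.
dex_flags = out_of_range({"slope": -1.0, "top": 112.0, "bottom": 0.5}, DEX_CRITERIA)
r1881_flags = out_of_range({"slope": -1.0, "top": 100.0, "bottom": -2.3}, R1881_CRITERIA)
```

Here the dexamethasone run is flagged only for its top value and the R1881 run only for its bottom value.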
Interpretation of AR assay results relies on both 3H-R1881 displacement and Hill slope parameters. An excessively steep Hill slope, coupled with poor model fit, likely indicates nonspecific displacement of the radioligand. Under these conditions, the test compound is considered “negative” for AR binding.
How Predictive Are the Models: AR Binding as Part of the Tier 1 EDSP Battery
As discussed previously, while competitive binding assays provide information on a compound's affinity for the endogenous receptor under in vitro conditions, the potential of that compound to interact with the receptor and subsequently to modulate the endocrine system is determined by using a WoE approach that takes into consideration data from all of the assays in the Tier 1 battery and other scientifically relevant information. For detection of androgen/antiandrogens, there are four Tier 1 assays included in the U.S. EPA EDSP that are capable of detecting the effect of chemicals on the AR signaling pathway: the in vitro AR binding assay, the Hershberger assay (castrated rat; in vivo), and the intact rat male pubertal and fish short-term reproduction assays (both in vivo assays capable of detecting multiple modes of action). Of these in vivo assays, the Hershberger assay is considered specific and sensitive for androgenic compounds operating through the AR. Across the 10 compounds evaluated in both laboratories, the AR binding assay results were negative (nine compounds) or equivocal (one compound) without any compounds designated as positive for AR binding (Table 3). Seven of the 10 compounds yielded similar, negative results in the in vivo Hershberger assay, including the compound that produced an equivocal response for AR binding. Three compounds yielded positive results for antiandrogenicity in the Hershberger assay, but this was judged to be due to hepatic enzyme induction and enhanced T clearance, not due to interaction of the test compound with the AR (see accompanying article on the Uterotrophic and Hershberger assays for more details). Thus, with the exception of compounds inducing T metabolism, the in vitro AR binding assay and the in vivo Hershberger assay results were reasonably consistent for this data set. Notably, the Hershberger assay also can detect 5α-reductase inhibitors that would not be detected with the AR binding assay.
AROMATASE ASSAY
Overview of Assay Rationale and Design
Aromatase, also known as CYP19, is a member of the P450 superfamily of monooxygenase enzymes, and it plays an important role in catalyzing the conversion of androgens to estrogens during steroidogenesis. The recombinant aromatase inhibition assay uses human CYP19 and P450 reductase supersomes (microsomes derived from baculovirus infected insect cells; BD Biosciences Catalog #456260) combined with β-nicotinamide adenine dinucleotide phosphate (NADPH), sodium phosphate buffer, and propylene glycol as an in vitro test system. The substrate supplied to this system is [1β-3H]androstenedione (ASDN), which is converted to estrone and 3H2O by the catalytic activity of aromatase. In the radiometric method, the amount of 3H2O is measured as an indicator of aromatase activity (U.S. EPA, 2009d).
As a positive control in the assay, eight concentrations of 4-hydroxyandrostenedione (4-OH-ASDN), a potent inhibitor of aromatase activity, are run in duplicate in parallel to the test chemical. Eight concentrations of the test chemical spanning seven orders of magnitude (10⁻¹⁰ M to 10⁻³ M) are tested in triplicate in each run of the aromatase assay. Also included in each run of the aromatase assay are a total of four background activity controls and four full activity controls. The full activity control includes substrate, NADPH, propylene glycol, phosphate buffer, vehicle (solvent used in the preparation of the test substance solutions), and microsomes. The background activity controls consist of all components of the full activity control except for NADPH. Two replicates of the background activity control and the full activity control are dosed at the beginning of the assay, while the remaining two replicates of the background and full activity controls are dosed after all other test vessels have been dosed with assay components. This is done to assess the variability in the response of the assay over the entire order of dosing.
For each test chemical, the aromatase assay was repeated a total of three times in three different “runs” of the assay. Thus, three concentration response curves were generated and plotted for each chemical tested, together with the average of the three curves. The average response curve was used to draw conclusions from the aromatase assay results. Chemicals that reduced enzyme activity, on average, by 50% or more were considered to be inhibitors of aromatase activity. Chemicals that fit the inhibition curve, but reduced activity by 25 to 50% on average, were considered equivocal aromatase inhibitors. Chemicals that did not fit the model or that fit the model but reduced activity by <25% on average were considered to be noninhibitors (U.S. EPA, 2009d).
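The classification rules above can be expressed as a small Python function. This is a sketch only: the function name is hypothetical, and one ambiguity is resolved by assumption, namely that a chemical whose data do not fit the model is treated as a noninhibitor regardless of the observed reduction.

```python
def classify_aromatase(mean_inhibition_pct, fits_model):
    """Classify a chemical from the average curve of three runs.

    mean_inhibition_pct: average reduction in aromatase activity (%).
    fits_model: whether the data fit the inhibition model.
    Assumption: data that do not fit the model give "noninhibitor"
    whatever the observed reduction.
    """
    if not fits_model or mean_inhibition_pct < 25.0:
        return "noninhibitor"
    if mean_inhibition_pct >= 50.0:
        return "inhibitor"
    return "equivocal"
```

For instance, a fitted curve with an average 30% reduction is called equivocal, while an average 80% reduction yields an inhibitor call.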
When preparing to conduct the aromatase assay according to the U.S. EPA guideline (U.S. EPA, 2009d), personnel conducting the assay were appropriately trained for radioisotope work since 3H-ASDN is used in the test system. The conduct of the assay was confined to radioactive-approved laboratory space and used only designated radioactive equipment. Also of note for the conduct of the aromatase assay is the fact that the nonradiolabeled ASDN (used to dilute the 3H-ASDN) is a controlled substance. Thus, its distribution and waste disposal were tracked throughout the assay.
To internally validate the aromatase assay, the response of 4-OH-ASDN, the positive control for aromatase inhibition, as well as the response of additional test chemicals with varying capacities to inhibit the activity of the aromatase enzyme were evaluated in both laboratories (Supplemental Table S3). Test chemicals in the validation effort of the aromatase assays included nitrofen, fenarimol, prochloraz (PRO), econazole, ronidazole, and atrazine. The results of the human recombinant aromatase assay with these validation chemicals indicated that ronidazole and atrazine are noninhibitors of aromatase, while nitrofen, fenarimol, econazole, and PRO are inhibitors of aromatase. The Hill slope and IC50 values for the validation set of test chemicals in the aromatase assay were consistent with the findings reported in the U.S. EPA aromatase Integrated Summary Report (Supplemental Table S3; U.S. EPA, 2007).
Description of Assay Conduct
During the conduct of the assay, test tubes were dosed with the appropriate components as described in the guideline (U.S. EPA, 2009d). A total of 20 μl of the test material, carrier solvent, or positive control (4-OH-ASDN) were dosed into a 2 ml aromatase test system, which equates to 1% of the total assay volume. Ethanol or DMSO were the solvents used in accordance with the recommendations in the aromatase guideline (U.S. EPA, 2009d). DMSO was used when the test material was not completely soluble in ethanol at the desired concentrations for dosing into the test system. Solubility was judged by the naked eye following dosing into the 2 ml test system. If precipitates or a cloudy appearance were noted when the test material was added into the system, then the top dose was lowered in accordance with the guideline until there were no signs of insolubility in the assay (U.S. EPA, 2009d). Test tubes containing the various reagents, test materials, and/or positive controls were incubated for a total of 15 min at 37°C in a gently shaking water bath. Following incubation, the reaction was terminated by adding methylene chloride and placing the samples on ice. Following a total of three extractions with methylene chloride as described in the guideline (U.S. EPA, 2009d), the water fraction of the assay was quantified for 3H2O in a liquid scintillation counter.
Logistic Experimental Considerations
When assessing a test chemical in the aromatase assay, the three runs of the assay were generally conducted on separate days. In our experience, each run of the assay generally required six to eight person-hours. Each run of the assay used fresh dilutions of the stock solutions for the positive control, 4-OH-ASDN, and the test material. Microsomal and NADPH preparations in phosphate buffer were also made fresh with each run of the assay. Undiluted microsomes were stored in 100 μl aliquots at −80°C for no longer than 12 months from the date of release in accordance with the aromatase guideline (U.S. EPA, 2009d). As there was no specific guidance on whether to make fresh substrate solutions with each run of the aromatase assay, the radioactive substrate (3H-ASDN diluted with nonradiolabeled ASDN) was often, but not always, made fresh for each run of the assay. Stock dosing solutions were assessed for concentrations of the test material on at least 1 day of the aromatase assay. Positive control stock solutions were, as a general rule, not analytically verified. The quantification of the positive control was not deemed scientifically necessary since the assay performance criteria for the positive control were generally met, demonstrating that the assay system was working as expected at the nominal concentrations of 4-OH-ASDN added into the system.
A typical dose–response curve for 4-OH-ASDN (relative to the full activity control response) is illustrated in Supplemental Figure S5. Aliquots of the human recombinant microsomes were diluted in the assay phosphate buffer to achieve the desired final working stock concentration of approximately 0.008 mg/ml. The final protein concentration in the incubation mixture was targeted to be approximately 0.004 mg/ml in accordance with the aromatase guideline (U.S. EPA, 2009d). Generally, the protein content of a particular microsomal aliquot was determined before use in the aromatase assay; however, due to logistic constraints in one laboratory, the aromatase assay was occasionally conducted before the protein content of the individual microsomal aliquot was quantified. In these cases, the amount of microsomes added to each reaction vessel was estimated from previous protein quantifications for the same batch of microsomes. Following protein quantification for the specific microsomal aliquot, adjustments were made so that the actual final protein concentration used in the assay was accurately reflected in the aromatase activity calculations. Only minor adjustments were necessary in these cases, as protein content differed only very slightly between individual microsomal aliquots within a batch.
Another practice that was instituted in one of the laboratories was to first run full activity control samples (generally n = 2, full activity control test tubes) before running the full assay to assess whether the full activity of the reaction met with the required performance criteria of ≥0.100 nmol/mg protein/min. This procedure was adopted when it was found that several batches of microsomes from the supplier resulted in less than optimal full activity responses, and time and materials were wasted running the test material and positive control in the assay with suboptimal microsomes.
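As a rough illustration of how the protein adjustment and the full activity prescreen fit together, the sketch below computes aromatase activity in nmol/mg protein/min from the amount of product formed. It assumes the 2 ml assay volume and 15 min incubation described in the text; the function names, the use of a simple mean over the control tubes, and the product amounts are hypothetical.

```python
from statistics import mean

ASSAY_VOLUME_ML = 2.0      # total assay volume (from the text)
INCUBATION_MIN = 15.0      # incubation time (from the text)
MIN_FULL_ACTIVITY = 0.100  # nmol/mg protein/min performance criterion

def aromatase_activity(product_nmol, protein_mg_per_ml):
    """Activity in nmol product per mg protein per minute."""
    protein_mg = protein_mg_per_ml * ASSAY_VOLUME_ML
    return product_nmol / (protein_mg * INCUBATION_MIN)

def microsome_lot_ok(control_product_nmol, protein_mg_per_ml):
    """Prescreen: does the mean full activity control meet the criterion?"""
    activities = [aromatase_activity(p, protein_mg_per_ml) for p in control_product_nmol]
    return mean(activities) >= MIN_FULL_ACTIVITY

# Hypothetical product amounts (nmol) from two full activity control tubes
print(aromatase_activity(0.018, 0.004))           # nominal protein concentration
print(aromatase_activity(0.018, 0.0042))          # after protein requantification
print(microsome_lot_ok([0.018, 0.017], 0.004))
```

Recomputing with the measured protein concentration, as in the second call, is the kind of minor adjustment described above.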
In general, the performance criteria of the aromatase assay were largely attainable in both laboratories; however, it was a common instance that one or more performance criteria were not met in any given run of the aromatase assay (Table 5). Aromatase assay run results were critically assessed by an experienced study director if performance criteria were not entirely met. A given run was generally considered to be valid if the performance criteria not met were few in number and/or if the 95% confidence intervals (CIs) of the response were still within range of the performance criteria. If two or more performance criteria in a given run were not met, a run was generally discarded and an additional run of the assay was performed at the discretion of the study director. A commonly encountered problem in a set of the assays in Laboratory 1 was the poor performance of the manufacturer's microsomal lot, likely due to low aromatase activity per unit of protein. This could be the result of poor purification of the microsomal fraction and a relatively higher concentration of nonspecific protein. Subsequently, new microsomal lots were screened for full activity in the aromatase assay, which eliminated the problem of poor performance of the full activity controls. In both laboratories, performance criteria associated with the Top% of the 4-OH-ASDN curve and the logIC50 for the 4-OH-ASDN curve were most frequently not met (Table 5). However, when the performance criteria were not met, the values were generally only slightly outside of the recommended ranges for these parameters as outlined in the aromatase assay guideline (U.S. EPA, 2009d).
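The Top, Bottom, Hill slope, and logIC50 parameters referred to here are those of a four-parameter concentration–response curve. The sketch below assumes the standard four-parameter logistic form; the exact equation should be confirmed against the guideline (U.S. EPA, 2009d). The function name is hypothetical, and the example slope and logIC50 are the Laboratory 1 means reported in Table 5.

```python
def percent_of_control(log_conc, top, bottom, log_ic50, hill_slope):
    """Four-parameter logistic response, as percent of full activity control.

    Assumed form:
        Y = bottom + (top - bottom) / (1 + 10**((log_ic50 - X) * hill_slope))
    With a negative Hill slope, activity falls as concentration rises.
    """
    exponent = (log_ic50 - log_conc) * hill_slope
    return bottom + (top - bottom) / (1.0 + 10.0 ** exponent)

# Idealized 4-OH-ASDN curve: Top = 100%, Bottom = 0%, logIC50 = -7.2, slope = -0.95
for log_conc in (-10, -8, -7.2, -6, -5):
    activity = percent_of_control(log_conc, 100.0, 0.0, -7.2, -0.95)
    print(log_conc, round(activity, 1))
```

At a log concentration equal to the logIC50, the curve sits halfway between Top and Bottom, which is why small shifts in the fitted Top can move a run outside the recommended ranges.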
Table 5. Performance of the Recombinant Aromatase Inhibition Assay in Two Laboratories
|Minimum level of mean aromatase activity ≥ 0.100 nmol/mg protein/min||36||83||10||91|
|Mean background control activity ≤15% full activity control||36||100||10||91|
|Slope of 4-OH ASDN response curve between −1.2 and −0.8||36||−1.5 to −0.47 (−0.95 ± 0.20)||−0.06 to 1.5 (−0.99 ± 0.23)|
|Top (%) of 4-OH ASDN response curve between 90 and 110||36||81||7||64|
|Bottom (%) of 4-OH ASDN response curve between −5 and 6||36||89||10||91|
|logIC50 of 4-OH ASDN between −7.3 and −7.0||36||−7.7 to −6.6 (−7.2 ± 0.24)|
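The performance criteria listed in Table 5 can be checked for a single run with a sketch like the one below. The disposition labels are a simplification of the practice described in the text (a run with two or more unmet criteria was generally discarded); confidence intervals and study director judgment are not modeled, and the dictionary keys and example run values are hypothetical.

```python
# Acceptance ranges taken from the criteria listed in Table 5
CRITERIA = {
    "full_activity": lambda v: v >= 0.100,     # nmol/mg protein/min
    "background_pct": lambda v: v <= 15.0,     # % of full activity control
    "slope": lambda v: -1.2 <= v <= -0.8,
    "top_pct": lambda v: 90.0 <= v <= 110.0,
    "bottom_pct": lambda v: -5.0 <= v <= 6.0,
    "log_ic50": lambda v: -7.3 <= v <= -7.0,
}

def unmet_criteria(run):
    """Names of the performance criteria that a run fails."""
    return [name for name, passes in CRITERIA.items() if not passes(run[name])]

def suggested_disposition(run):
    """Simplified rule: 0 unmet = valid, 1 = review, 2 or more = likely discard."""
    n_unmet = len(unmet_criteria(run))
    if n_unmet == 0:
        return "valid"
    if n_unmet == 1:
        return "review by study director"
    return "likely discard and repeat"

# Hypothetical run in which only the Top of the 4-OH-ASDN curve is out of range
run = {"full_activity": 0.15, "background_pct": 5.0, "slope": -0.95,
       "top_pct": 112.0, "bottom_pct": 1.0, "log_ic50": -7.2}
print(unmet_criteria(run), suggested_disposition(run))
```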
Factors to Consider in Assay Interpretation
The aromatase guideline (U.S. EPA, 2009d) requires that the Hill slope and IC50 values (or EC50 values if the dose-response curve did not result in ≥ 50% inhibition) of the three test chemical dose–response curves be analyzed via the one-way ANOVA to test for significant differences between runs of the assay. The utility of this statistical evaluation is unclear as it is not factored into decision making for the aromatase assay. In our experience, when there were instances of statistical differences between response curves, this was due to very low assay variability within a given run and there were no biologically meaningful differences between runs (Supplemental Fig. S6).
The aromatase assay is designed to detect inhibition of aromatase activity. While the scope of the assay is quite specific, it is important to bear in mind that inhibition of aromatase activity may occur indirectly. For instance, denaturation of the enzyme could result in aromatase inhibition that is not specifically due to interaction with a binding site on the enzyme (U.S. EPA, 2007). In addition, some test substances could physically interfere with the assay matrix or react with other components in the assay mixture (NADPH, etc.) such that overall aromatase activity is lowered. Therefore, it is possible to measure inhibition of the aromatase enzyme even when the test substance is not specifically acting on the enzyme itself. A limitation of the assay, as with many in vitro systems, is the lack of metabolizing capability. The assay cannot assess the effects of a metabolically transformed test chemical; thus, only the parent compound's ability to inhibit aromatase activity is assessed in the current form of the U.S. EPA's Tier 1 aromatase assay (U.S. EPA, 2009d).
The steroidogenesis assay is an in vitro method for detecting xenobiotics that may affect the steroidogenic pathway, beginning with the sequence of reactions occurring after binding to the gonadotropin hormone receptors (follicle stimulating hormone receptor and luteinizing hormone receptor) through the production of T and E2. The assay measures changes in the production of T and E2 in a human adrenocortical carcinoma cell line (H295R; Hecker et al., 2006a; U.S. EPA, 2009e). This in vitro steroidogenesis assay was included in Tier 1 of the U.S. EPA's program to screen for potential endocrine disrupting chemicals, and it is the only in vitro assay included to investigate the broad spectrum of enzymatic reactions within the steroidogenesis pathway (U.S. EPA, 2009e).
The H295R cell line (American Type Culture Collection [ATCC] CRL-2128, Manassas, Virginia) is capable of expressing genes that encode for key enzymes required for steroidogenesis (Rainey et al., 1993). Although H295R cells have physiologic characteristics of undifferentiated human fetal adrenal cells, the cells have the ability to produce the steroid hormones found in the adult adrenal cortex and the gonads, which represents a unique in vitro system and allows testing for effects on both corticosteroid synthesis and the production of sex steroid hormones, such as androgens and estrogens (Gazdar et al., 1990). Evaluation of 5α-reductase activity (i.e., the formation of dihydrotestosterone [DHT] from T), however, is specifically excluded from analysis. The assay is performed under standard cell-culture conditions in 24-well plates, and after an acclimation period of 24 hr, cells are exposed for approximately 48 hr to SC, positive control chemicals (forskolin [FOR] or PRO), or multiple concentrations of test chemical in triplicate. In parallel, a QC plate is run with prescribed concentrations of known inducers and inhibitors of hormone production. At the end of the exposure period, the medium is removed from each well for analysis, and cell viability is evaluated immediately. In both laboratories, the concentrations of T and E2 hormones in the medium were measured using liquid chromatography with tandem mass spectrometry (LC/MS/MS or, more specifically, LC-positive atmospheric pressure photoionization MS/MS [LC/APPI-MS/MS]; Zhang et al., 2009). Because the cells produce a basal level of hormones, both induction and inhibition of steroidogenesis can be measured. Data were expressed as fold change relative to the SC and are based on three independent and acceptable test runs. The first test run may function as a range-finding run with subsequent adjustment of concentrations for runs 2 and 3 if relevant solubility and/or cytotoxicity issues are encountered.
Plating and Preincubation of Cells
H295R cells were cultured following the specific protocol described in the U.S. EPA EDSP guideline OPPTS 890.1550 (Hecker and Giesy, 2008; U.S. EPA, 2009e). In preparation for conducting exposure experiments, H295R cells were cultured for five passages from the original ATCC source cells. Cells from passage number five were cryogenically stored in liquid nitrogen. Cells from these frozen batches were cultured for at least four additional passages before initiation of testing to allow frozen cells to recover (Hecker et al., 2006b). Although divergent from the prescribed passage numbers in the guideline, cells were used for up to 10 passages or beyond if QC criteria were met (see below). For testing, H295R cells in growth medium (DMEM/F12 Ham) were seeded at a density of 200,000 to 300,000 cells per well in 24-well plates (50–60% confluence) and incubated for 24 hr before treating with test chemicals to allow for attachment and reestablishment of normal cellular morphology.
Quality Control Experiment for Cell Performance
A QC H295R cell performance test was conducted along with each run of the assay to verify that the performance of H295R cells met the QC requirements. According to the U.S. EPA EDSP guideline, the QC plate was dosed with FOR (an inducer of T and E2 synthesis) at 10 and 1 μM; PRO (an inhibitor of T and E2 synthesis) was dosed at 1 and 0.1 μM. The cells were refed with fresh media and then exposed for 48 hr with vehicle (DMSO, 0.1%) or the positive control chemicals, FOR or PRO. At the end of the exposure period, the medium was removed from each well and prepared for analysis as described below. The recommended criteria to be met on each QC plate are given as general guidance and are listed in Table 6 along with the QC sample values. LC/APPI-MS/MS methodology, which had a lower detection limit than other methods, allowed for some modification in the performance criteria. For example, the guideline listed a minimum basal production of E2 as 40 pg/ml or ≥2.5× the minimal detection limit; the detection limit with LC/APPI-MS/MS and LC/MS/MS was 10 and 5 pg/ml, setting the requirement for basal E2 production at 25 and 12.5 pg/ml, respectively. This change in basal E2 production resulted in 92 and 100% acceptance of assay results (vs. 53 and 25% with 40 pg/ml).
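The adjusted basal E2 requirement follows directly from the 2.5× multiple of the minimal detection limit (MDL). The short sketch below reproduces that arithmetic for the two detection limits given in the text; the function names and the example basal values are hypothetical.

```python
E2_MDL_MULTIPLE = 2.5  # guideline multiple of the MDL for basal E2 production

def basal_e2_requirement(mdl_pg_per_ml):
    """Minimum basal E2 production (pg/ml) implied by the MDL-based criterion."""
    return E2_MDL_MULTIPLE * mdl_pg_per_ml

def meets_basal_e2(basal_pg_per_ml, mdl_pg_per_ml):
    """True if basal E2 production reaches the MDL-based requirement."""
    return basal_pg_per_ml >= basal_e2_requirement(mdl_pg_per_ml)

# Detection limits reported for the two analytical methods
for method, mdl in (("LC/APPI-MS/MS", 10.0), ("LC/MS/MS", 5.0)):
    print(method, basal_e2_requirement(mdl), "pg/ml")

# A hypothetical basal E2 value of 30 pg/ml fails the fixed 40 pg/ml criterion
# but passes the MDL-based requirement for either method.
print(meets_basal_e2(30.0, 10.0), meets_basal_e2(30.0, 5.0))
```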
Table 6. Performance of the Hormone Response in the Steroidogenesis Assay in Two Laboratories
- Minimum basal production
- For T: 500 pg/ml or ≥ 5-times MDL
- For E: 40 pg/ml or ≥2.5-times MDL
- (1548.4 ± 275.2)
- Induction (10 μM FOR)
- For T: ≥ 2-times SC
- For E: ≥ 7.5-times SC
- Inhibition (1 μM PRO)
- For T: ≤0.5-times SC
- For E: ≤ 0.5-times SC
A summary of the basal attributes and responsiveness to the positive control of the steroidogenesis assay system is presented in Table 6, including the range, mean, and SD. The data indicate the highly variable nature of the hormone production in the H295R cell line, even while controlling for the passage number as indicated in the guideline. Justification was provided in accepting experiments with slight deviations from the performance criteria; however, if variance was deemed too large for confidence in assay performance, the assay was repeated.
The test chemical experiment was conducted in the same manner as described for the QC experiment. After an acclimation period of 24 hr, H295R cells were exposed to medium containing test material in triplicate at the appropriate concentrations (up to the guideline limit concentration of 10⁻⁴ M or the limit of solubility, whichever is less, and including up to 7 log dilutions). The cells were exposed for approximately 48 hr and the medium was removed from each well and analyzed as described below. The study consisted of three independent, acceptable experiments that met performance standards as deemed appropriate (see QC section above).
Cell Viability/Cytotoxicity Measurements
The cell viability/cytotoxicity testing was conducted in the QC plate and in the test chemical exposure plate immediately after termination of the exposure experiments. In both laboratories, the CellTiter 96 Aqueous One Solution Cell Proliferation Assay Kit was used for assessment of cell viability testing (see Section ERTA). The minimum cell viability required per well was ≥80%; wells with lower viability were excluded in the final data analysis. Hormone production data were not normalized for occasional and/or random changes in cell viability. In addition to viability/cytotoxicity testing, a subjective visual examination of cells was conducted to examine attachment, the degree of confluence, homogeneity from well-to-well, and any signs of cytotoxicity or altered morphology.
There was some difficulty using the Live/Dead Cell Viability/Cytotoxicity Assay described in the test guideline. In both laboratories’ experience, the Live/Dead assay did not provide a clear output for determination of cytotoxicity and was not consistent with the morphologic appearance or hormone production of the cultured cells. After a direct comparison of multiple cytotoxicity measures, it was determined that the MTT assay performance was superior to the alternative methods.
After 48 hr of continuous treatment, the medium was removed from each well, transferred into two vials (two 0.45 ml aliquots) and stored at −80°C until measurement of hormone concentrations in the medium. In the laboratory using LC/APPI-MS/MS, stability of frozen samples was accounted for by the inclusion of spiked internal samples in each sample set. In the other laboratory using LC/MS/MS, internal standards were added before sample extraction. These analytical methods preclude the need for a separate chemical interference assay, as may be required with an ELISA, because interference with LC/MS/MS is unlikely, and extraction and quantification efficiency could be determined with internal standards. Validation of the analytical measurement system required conformance with the QC criteria defined in Table 6. The large dynamic range for linearity of hormone detection and quantification was 0 to 2500 pg/ml (ppt) for both T and E2 with the LC/MS/MS method, and this ensured validity of the hormone quantification results.
To prepare the culture medium for hormone analysis by LC/APPI-MS/MS, a liquid–liquid extraction system was utilized (Zhang et al., 2009). Briefly, standards and samples were vortexed with methylene chloride containing both T (CAS No. 58–22–0) and E2 (CAS No. 50–28–2) internal standards, and the organic phases were transferred to clean vials. The aliquots were evaporated to dryness, derivatized by adding 100 μl sodium bicarbonate buffer (100 mM, pH 10.5) and 100 μl dansyl chloride solution (2.0 mg/ml in acetone), heated (60°C for 15 min), then cooled. Derivatized samples were analyzed and quantified by reverse-phase LC/APPI-MS/MS using an Applied Biosystems Sciex API 4000 tandem mass spectrometer (Concord, Ontario, Canada) equipped with an Agilent HP 1100 HPLC system. The mass spectrometer was equipped with a TurboIonspray interface. Similar methodology was used in the laboratory using LC/MS/MS. However, a protein precipitation plate was used for T extraction using acetonitrile.
QC Parameters during Testing
In addition to meeting the criteria for the QC plate, other quality criteria such as variation between replicate wells, replicate experiments, linearity and sensitivity of hormone measurement systems, and variability between replicate hormone measurements of the same sample were also examined (Table 6). These criteria were used as a general guidance in evaluating the adequacy of the experimental conditions. As with any biologic system, slight variations from experiment-to-experiment were to be expected. Justification was provided in accepting experiments with slight deviations from the criteria.
A commonly encountered deviation from the guideline recommendation in the assays was poor basal production of E2 (Table 6). The substantial variation in basal E2 production was unrelated to passage number or treatment conditions. Several different aliquots of cells and source cells from ATCC were tried without a clear determination of the cause of basal hormone variability. One potential contributor to the variation of basal E2 and T production could be the variable composition of Nu-Serum between lots; however, the majority of testing in Laboratory 1 used the same lot of serum and still showed substantial variability. It is also conceivable that analytical characterization of the hormone production more accurately distinguished hormone analogues when compared with the ELISA methods described in the publications and test guideline (U.S. EPA, 2009e); that is, the ELISA method may have grouped a few E2 and/or T variants as being the parent hormone.
The literature and guideline suggest 22-R-hydroxycholesterol can be added to the media to increase basal production; however, in Laboratory 1 this only served to slightly increase the basal production (and only in some cases) and limited the dynamic range of fold induction of hormone levels after FOR treatment. A full, six-point dose–response (0–40 μg/ml) of 22-R-hydroxycholesterol was evaluated and was dismissed in Laboratory 1 due to lack of benefit. Laboratory 2 used 10 μM 22-R-hydroxycholesterol to ensure that criteria for basal production of E2 were attained. However, this may account for the difference in T fold induction following FOR treatment compared with Laboratory 1.
In addition, incomplete inhibition of E2 synthesis with PRO treatment was occasionally identified in Laboratory 1. While 10 μM FOR induced T and E2 levels to acceptable values in 100% of the assays performed, PRO inhibited E2 production by 2.0-fold (i.e., 0.5 times) only in 42% of the assays (Table 6). Typically, inhibition by PRO was slightly outside the recommended range (e.g., 0.6 times); however, the runs were considered valid when all other parameters were within recommended ranges. This was not observed in Laboratory 2, where PRO treatment often reduced E2 levels to below the limit of detection (5 pg/ml).
Laboratory Proficiency Test
Experiments were conducted with positive control chemicals to demonstrate laboratory proficiency and assay validation (Supplemental Fig. S7). These experiments, which used the same analytical and treatment procedures as previously described, were conducted before the initiation of test material screening. The positive control data, in general, met the criteria as listed in section g (iii) of U.S. EPA OPPTS 890.1550 (U.S. EPA, 2009e). The EC50 values of proficiency tests with FOR and PRO are shown in Table 7.
Table 7. Steroidogenesis Assay Laboratory Proficiency Study—Determination of EC50 Range for FOR and PRO
|EC50 for FOR (μM)||0.2–2.0||0.3–3.0||0.36–1.02||3||0.64–0.85||3||0.29–2.78||3||1.65–3.91||3|
|EC50 for PRO (μM)||0.01–0.1||0.03–0.3||0.06–0.58||3||0.07–0.16||3||0.23–0.50||3||0.08–0.30||3|
In multiple laboratory proficiency tests (runs) in Laboratories 1 and 2, the calculated EC50 for PRO inhibition of T did not meet the suggested range of values. In these cases, the calculated value was slightly higher than recommended as a result of the curve fit of the concentration response; however, all other cell responses indicated acceptable performance (e.g., magnitude of decrease [more than twofold] and concentration response). The data for the proficiency studies, which included full concentration–response data for each of the positive control chemicals, indicated acceptable EC50 values and other response parameters when compared with the guideline recommendations (data not shown). When taken in entirety, the cells and the method were determined to be adequate for identification of alterations in steroidogenesis as described in the guideline.
In general, the performance criteria for the steroidogenesis assay, both for the proficiency assays as well as the QC plates run simultaneously with each compound, were largely attainable; however, it was a common occurrence that one or more performance criteria were not met in any given run. Steroidogenesis assay results were critically assessed by experienced study personnel if performance criteria were not entirely met.
Data Processing and Statistics
To evaluate the relative increase or decrease of hormone production after test chemical exposure, the results were normalized to the mean SC value for each assay and expressed as fold change relative to the SC in each exposure plate. All data were expressed as mean ± SD, and doses that exhibited cytotoxicity ≥20% were omitted from further evaluation. Relative changes were calculated as follows:
Relative change = (hormone concentration in each well) / (mean hormone concentration in all SC wells)
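The normalization described above can be sketched as follows, assuming that each treated well has a measured hormone concentration and a viability value, and that wells below the 80% viability minimum are dropped before summarizing. The function name and all numeric values are hypothetical.

```python
from statistics import mean, stdev

MIN_VIABILITY_PCT = 80.0  # wells below this viability are excluded

def fold_changes(sc_wells, treated_wells, viability_pct):
    """Fold change of each acceptable treated well relative to the mean SC value."""
    sc_mean = mean(sc_wells)
    return [
        conc / sc_mean
        for conc, viability in zip(treated_wells, viability_pct)
        if viability >= MIN_VIABILITY_PCT
    ]

# Hypothetical T concentrations (pg/ml) on one exposure plate
sc_wells = [1000.0, 1100.0, 900.0]
treated_wells = [1500.0, 1600.0, 1400.0]
viability_pct = [95.0, 92.0, 70.0]   # third well fails the viability minimum

fc = fold_changes(sc_wells, treated_wells, viability_pct)
print(fc, round(mean(fc), 2), round(stdev(fc), 3))
```

Because hormone data were not normalized for viability, excluding cytotoxic wells is the only viability-related adjustment in this sketch.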
Before conducting statistical analyses, the assumptions of normality (Shapiro–Wilk's test, α = 0.01) and variance homogeneity were evaluated (Bartlett's test, α = 0.01). If the data were not homogeneous or normally distributed, then the data were transformed to approximate homogeneity or a normal distribution. If the data were homogeneous and approximately normally distributed, differences between chemical treatments and SC were analyzed using a parametric ANOVA followed by Dunnett's test, if significant. If the data were not homogeneous or normally distributed, a nonparametric test was used (Kruskal–Wallis), which if significant, was followed by the Wilcoxon rank sum test with a Bonferroni–Holm correction. Differences were considered significant at p ≤ 0.05.
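For the nonparametric branch, the Bonferroni–Holm correction adjusts the p-values from the pairwise Wilcoxon rank sum comparisons against the SC. A minimal sketch of the step-down adjustment is shown below; the p-values are hypothetical, and the rank sum tests themselves would come from a statistics package.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for three test concentrations versus the SC
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
significant = [p <= 0.05 for p in adj]
print(adj, significant)
```

In this example, only the first comparison remains significant at p ≤ 0.05 after adjustment, although all three raw p-values were below 0.05.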
Data Interpretation and Decision Logic for Assessing Induction and/or Inhibition
Data from the assay were used to classify chemicals according to their ability to induce and/or inhibit T and E2 production. A dose-related, statistically significant, and reproducible increase/decrease in fold change in the hormone production relative to the SC would indicate the chemical was an inducer/inhibitor of one or more enzymes in the steroid synthesis pathway. Statistically significant differences at concentrations that did not follow a dose–response curve and/or differences that were not reproducible were considered to be random effects; such results were considered to be equivocal at best. Results exceeding the limits of solubility or at cytotoxic concentrations were not included when interpreting the results (U.S. EPA, 2009e).
A statistically significant increase in E2 levels (1.2- to 1.3-fold) was seen with three of eight test compounds evaluated in Laboratory 1 at the highest test concentration. In Laboratory 2, three of three compounds evaluated showed decreased T levels (1.1- to 1.2-fold) at the highest test concentration. The apparent increases in E2 may have been due to (1) enhanced aromatase activity; however, this would require an upregulation of other steroid biosynthesis hormones to maintain T levels (i.e., there was no concomitant decrease in T in the cases where E2 was increased); (2) increased stability of synthesized E2; or (3) decreased catabolism of endogenous (nonsynthesized) E2, which is present at significant levels within the Nu-Serum-supplemented cell culture medium. The apparent decreases in T may have been due to (1) decreased stability of synthesized T (i.e., due to increased aromatase activity; however, there was no concomitant increase in E2 concentrations when decreases in T were observed); or (2) decreased synthesis of T due to decreased activity of steroidogenic enzymes upstream of T synthesis; however, there were no concomitant decreases in E2 concentrations when decreases in T were observed. Thus, while slight increases in E2 production (1.2- to 1.3-fold) or slight decreases in T concentrations (1.1- to 1.2-fold) were demonstrated at the top (limit) concentration, these slight fold changes were of questionable biologic relevance. These statistical alterations were consistent with the guideline and standard evaluation procedure criterion for an alteration in steroidogenesis, but appeared to be nonspecific alterations.
During development and validation of the H295R steroidogenesis assay, at least a 1.5-fold alteration in T or E2 production, accompanied with statistical significance, was required to demonstrate altered steroidogenesis (Hecker and Giesy, 2008). Notably, none of the compounds that increased E2 production in the steroidogenesis assay in Laboratory 1 yielded estrogenic responses in the uterotrophic or female pubertal assays or induced vitellogenin production in the fish short-term reproduction assay. Thus, a change in the test guideline criteria for positive results (from the 1.5-fold alteration in T or E2 that was originally proposed to any statistically significant change in T or E2) may have resulted in the identification of additional compounds at high concentrations as the cells approach a cytotoxicity or general stress–response threshold.
How Predictive Are the Models
While the aromatase and steroidogenesis assays provide information on a compound's ability to alter steroidogenesis in vitro, the potential of that compound to interact with the endocrine system is determined across all of the assays in the Tier 1 battery by using a WoE approach. In the case of inhibitors of steroidogenesis, the aromatase and steroidogenesis assays are two of five assays that examine altered steroid biosynthesis; the remaining assays are the pubertal male and female assays and the fish short-term reproduction assay. The pubertal female and male assays and the fish short-term reproduction assay detect multiple modes of endocrine activity, but have potential signatures of effects that could indicate effects on the steroidogenesis pathways, including aromatase inhibition.
Across the 11 test compounds evaluated, the aromatase assay results were negative (8 of 11), equivocal (1 of 11), or positive (2 of 11). The compound with equivocal results in the aromatase assay was negative in the steroidogenesis assay. In the case of the two positive results in the aromatase assay, both compounds produced positive results in the steroidogenesis assays. One compound inhibited aromatase and decreased E2 production in the steroidogenesis assay. These data present a consistent result for altered in vitro steroid biosynthesis; however, one compound inhibited aromatase, but increased E2 production in the steroidogenesis assay. Thus, there was imperfect concordance between the aromatase assay and the steroidogenesis assay, which may be understandable given the greater scope of the steroidogenesis assay. Overall, there was agreement between the aromatase and steroidogenesis assays, particularly when the aromatase positive results showed ≥50% aromatase inhibition.
In the two cases where both the aromatase and steroidogenesis assays were positive, the first compound also had positive results in the higher tier pubertal and fish short-term reproduction assays, whereas for the second compound, the in vitro positive findings were not confirmed in vivo in either the pubertal assays or the fish short-term reproduction assay. This highlights that in vitro results in the aromatase and steroidogenesis assays are not necessarily indicative of in vivo results, likely due to several factors, including the complexity of intact systems (i.e., having the capability of responding to alterations in steroidogenesis to maintain homeostasis) and/or kinetics (absorption/distribution/metabolism/elimination) of the test compound in the in vivo test system.
The five in vitro assays that have been incorporated into the EDSP Tier 1 screening battery can provide valuable information on the mode of action of the test compound under study; however, there are some caveats in the conduct of these in vitro assays and use of the resulting data. The ER and AR binding assays appear to be reproducible and specific; however, more efficient use of technician time and resources (including cell cytosols that require the use of animals) can be achieved by evaluating multiple chemicals at one time, whenever possible. While involving the same initial molecular event (ER binding), the ERTA can yield different results than the ER binding assay; these discrepancies may be legitimate differences due to cell context-dependent responses, but there is some indication that use of the PC10 cutoff criterion in the ERTA may contribute to the mischaracterization of some chemicals as positive. The aromatase assay appears to be reproducible and reliable, provided that the microsomes used in the assay have sufficient CYP19 activity to yield valid assay results. In the steroidogenesis assay, the H295R cells showed interexperimental variability in basal and stimulated T and E2 production. This aspect, coupled with the potential impact of cell stress/subclinical cytotoxicity, makes interpretation of steroidogenesis results more complex. When considering WoE, it is important to consider that the in vitro assay systems have little or no metabolic capability; therefore, activity is primarily evaluated for the parent compound. Differences with in vivo studies may indicate an important role for metabolism or another kinetic component present in whole animals. Lastly, it was extremely difficult to meet all of the performance criteria for each assay.
Some minor excursions from the performance criteria should be expected and permitted if the assay appears to function appropriately, and there may be some utility in modifying the performance criteria as more experience is gained with these assays.