Impact of chloroform exposures on reproductive and developmental outcomes: A systematic review of the scientific literature

We assessed the animal and epidemiological data to determine if chloroform exposure causes developmental and/or reproductive toxicity.


| INTRODUCTION
Chloroform, also known as trichloromethane or CHCl3, is a colorless, sweet-smelling, volatile liquid at room temperature (Agency for Toxic Substances and Disease Registry (ATSDR), 1997; National Toxicology Program (NTP), 2016; Office of Environmental Health Hazard Assessment (OEHHA), 2016a; World Health Organization (WHO), 2004). In the past, chloroform was used as an inhaled anesthetic, but this application has been largely phased out worldwide. Today, chloroform is commonly used in the production of various other chemicals, some of the most notable being fluorinated refrigerants and Teflon coatings. Chloroform also occurs as a byproduct of paper bleaching and is the most prevalent trihalomethane produced in water disinfection processes, which are regulated by the U.S. Environmental Protection Agency (EPA). As described by the EPA, the routine use of chlorination to disinfect public water supplies since the early 1900s has led to a substantial decline in the incidence of waterborne diseases in the United States and worldwide (EPA, 1999). The levels of chloroform present in water as a result of disinfection processes depend on the existing chlorine and organic matter concentrations as well as on the pH and temperature of the water. Because of its physicochemical properties (particularly its high volatility) and varied uses, human exposure to chloroform can occur via multiple routes. Releases to the air and water may result from the chemical's use in various industrial manufacturing processes. Oral exposures may occur through consumption of chlorinated or chloraminated drinking water, and inhalation (and minor dermal) exposures may occur during showering and bathing. In the United States, the normal concentration of chloroform in air ranges from 0.02 to 0.05 parts per billion (ppb), and that in treated drinking water ranges from 2 to 44 ppb (ATSDR, 1997).
Chloroform has been previously evaluated for its potential to cause developmental and reproductive toxicity. In 2001, the EPA evaluated the available database of studies for chloroform and concluded that although effects on reproduction and development were observed in animals, "these effects occur at the same or higher doses as those that cause effects on the dam … suggesting that most of the effects are secondary to maternal toxicity" (EPA, 2001). Likewise, the WHO in 2004, evaluating much of the same data as the EPA, concluded that there were no indications of teratogenicity associated with chloroform exposure, and that effects on the developing organism were seen only at doses shown to be maternally toxic. As such, neither organization has expressed significant concerns regarding developmental or reproductive toxicity due to chloroform exposure. In contrast, the State of California's OEHHA has listed chloroform as a reproductive toxicant under Proposition 65 since 2009 (OEHHA, 2016a).
The purpose of this analysis is to conduct a systematic assessment for persons of reproductive potential (population of interest) exposed to chloroform via either ingestion or inhalation (exposures of concern), comparing those highly exposed to those either not exposed or minimally exposed (comparators), for developmental and/or reproductive toxicity (outcomes). We thus conducted an assessment of the animal and epidemiological data for chloroform to determine if the available evidence indicates that chloroform causes developmental and/or reproductive toxicity. As a first step in this effort, a scoping exercise was undertaken to identify the primary developmental/reproductive concern(s) for chloroform (i.e., female reproductive toxicity, male reproductive toxicity, or developmental toxicity). Based on the results of this scoping effort, a more focused analysis was conducted on the developmental toxicity potential of chloroform. This evaluation involved separate analyses of the animal and epidemiological studies, followed by an integrated assessment of the data. We also compared the results of our assessment with the conclusions of the EPA, WHO, and OEHHA.

| Literature search
We conducted a search of the published scientific literature for human or animal studies that involved chloroform exposures and assessment of developmental or reproductive outcomes (male or female) using the PubMed database, available through the National Center for Biotechnology Information (NCBI) at the U.S. National Library of Medicine (NLM) (https://www.ncbi.nlm.nih.gov/pubmed). Human epidemiological studies examining the relationship between chloroform exposure and developmental toxicity, male reproduction, or female reproduction were identified on May 2, 2017 using the following PubMed query: (chloroform OR disinfection OR trihalomethane*) AND ("birth defect" OR "birth defects" OR hypospadia* OR "birth weight" OR birthweight OR birth* OR fetal OR fetus OR preterm OR pre-term OR spontaneous OR abortion* OR stillbirth* OR miscarriage* OR gestation* OR fertil* OR fecund* OR menstrual OR sperm OR semen).
Animal toxicity studies of chloroform exposure on developmental outcomes, male reproduction, or female reproduction were identified on May 23, 2017, using the following PubMed query: (chloroform) AND ("reproduction" OR "gestation" OR "lactation" OR "teratogen" OR "prenatal" OR "postnatal").
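The two queries quoted above share one shape: an OR-joined group of exposure terms combined by AND with an OR-joined group of outcome terms. The following is a minimal sketch of that structure, assuming nothing beyond the terms listed in the text; the helper names `or_group` and `build_query` are hypothetical and are not part of any PubMed tooling.

```python
def or_group(terms):
    """Join search terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(exposure_terms, outcome_terms):
    """Combine an exposure group and an outcome group with AND."""
    return or_group(exposure_terms) + " AND " + or_group(outcome_terms)


# Terms copied from the two queries quoted in the text.
EPI_EXPOSURE = ["chloroform", "disinfection", "trihalomethane*"]
EPI_OUTCOMES = [
    '"birth defect"', '"birth defects"', "hypospadia*", '"birth weight"',
    "birthweight", "birth*", "fetal", "fetus", "preterm", "pre-term",
    "spontaneous", "abortion*", "stillbirth*", "miscarriage*", "gestation*",
    "fertil*", "fecund*", "menstrual", "sperm", "semen",
]
ANIMAL_EXPOSURE = ["chloroform"]
ANIMAL_OUTCOMES = [
    '"reproduction"', '"gestation"', '"lactation"',
    '"teratogen"', '"prenatal"', '"postnatal"',
]

epi_query = build_query(EPI_EXPOSURE, EPI_OUTCOMES)
animal_query = build_query(ANIMAL_EXPOSURE, ANIMAL_OUTCOMES)

print(epi_query)
print(animal_query)
```

Keeping the term lists separate from the query string makes it easier to see that the epidemiologic search was deliberately broad (three exposure terms, twenty outcome terms), whereas the animal search was restricted to chloroform itself.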
Search results were winnowed to a body of potentially relevant studies through a review of titles and abstracts. These were then compared against the body of studies reviewed in the August 2016 OEHHA assessment of chloroform for listing as a reproductive toxicant under the State of California Proposition 65 legislation (OEHHA, 2016a, 2016b) to ensure that no relevant studies previously reviewed by OEHHA were excluded or missed. The outcome of this literature search is discussed in greater detail below in the Results section.

| Scoping exercise
To narrow our assessment to those outcomes of greatest concern for a more extensive analysis, we undertook a scoping exercise whereby we reviewed the full-text articles of the epidemiologic and animal studies identified in our literature search to assess the primary developmental and reproductive adverse outcomes associated with chloroform exposure. The results of this scoping effort, which identified developmental outcomes as the area of primary focus, are described below under the Results section.

| Data quality assessment
The animal and human studies relevant to an assessment of developmental toxicity were subjected to assessments of data quality.
To assess the quality of the animal developmental toxicity studies, the Toxicological data Reliability Assessment Tool (ToxRTool) was utilized (Schneider et al., 2009; https://eurl-ecvam.jrc.ec.europa.eu/about-ecvam/archive-publications/toxrtool). The ToxRTool, developed by the European Union Reference Laboratory for Alternatives to Animal Testing (EURL ECVAM), is a software-based tool that allows one to apply a set of assessment criteria to a study to determine the reliability of the study's data in terms of both relevance and adequacy for regulatory decision-making. The tool was established based on the Klimisch reliability categorization system (Klimisch, Andreae, & Tillmann, 1997) and assigns a study to 1 of 3 possible categories (1 = reliable without restrictions; 2 = reliable with restrictions; or 3 = not reliable) based on the reported study methods and level of data documentation. Studies were evaluated by two authors in unison to assign a preliminary score. These scores were then reviewed by a third author and any discrepancies in scoring were discussed among the three authors to come to an agreement on the overall score assigned to each study.
To assess the quality of human epidemiologic studies of developmental outcomes, we considered the potential for selection, information, and confounding bias and their possible roles in influencing reported study results. This assessment was guided by the use of the revised Research Triangle Institute (RTI) item bank (Viswanathan, Berkman, Dryden, & Hartling, 2013). As an assessment tool, the RTI item bank provides a set of 13 questions used to identify confounding and evaluate the risk of bias in observational study designs; these questions were answered for the 30 observational studies considered (see Supporting Information Table S1). Independent evaluations were performed by two authors, and any rating discrepancies were resolved through discussion.

| Data synthesis
With regard to the laboratory animal data, our analysis emphasized the strengths and weaknesses of the studies, the influence of potential confounding factors on the study results (e.g., the appetite-suppressive effect of chloroform inhalation and the influence of excessive maternal toxicity), and the consistency of the findings both across the animal toxicity database and with results from the epidemiological literature.
In our overall assessment, the selected epidemiologic studies were evaluated to determine whether they indicated a consistent pattern of association between exposure to chloroform and adverse developmental outcomes in humans. The quality and risk of bias in each epidemiologic study were evaluated based on the following criteria: study population definition and recruitment approach; validity and reliability of exposure, outcome, and confounder assessment; potential for confounding and other types of bias, including selection and information bias; appropriateness of the statistical analysis; and presentation, reporting, and interpretation of results. In addition, the overall weight of the epidemiologic evidence regarding a causal association was considered in light of the Bradford Hill guidelines (Hill, 1965), with particular emphasis on strength, consistency, temporality, biological gradient, and plausibility. The Hill guidelines were used with the understanding that none of the guidelines, except for a temporal sequence whereby an exposure precedes an outcome, is strictly required to establish causality, and that plausibility "depends on the biological knowledge of the day" (Hill, 1965) and may sometimes be controversial.
The search of the literature for animal studies of chloroform exposures and developmental or reproductive outcomes produced 227 results as of May 23, 2017. A review of titles and abstracts identified 6 potentially relevant articles in addition to the 17 articles already reviewed in the 2016 OEHHA assessment that listed chloroform as a reproductive toxicant under the State of California Proposition 65 legislation (OEHHA, 2016a); these 23 articles underwent a full-text review. No relevant articles published after the OEHHA review (post-2015) were identified. Following full-text review, one article from the OEHHA review was excluded because it did not report specific doses.
The remaining 16 studies from the OEHHA review were included; a single study not previously considered by OEHHA (Brown-Woodman et al., 1998) was also included. In total, 17 of 23 articles that underwent full-text review were included and 6 were excluded. Areas of focus for each of the studies identified in the literature search and reasons for exclusion of the six studies removed from assessment are listed in Table 1. In particular, studies that assessed exposure to trihalomethanes in general, but not exposure specifically to chloroform, and articles that provided no original or additional information, such as perspective, commentary, or state-of-the-science articles, were excluded.

| Epidemiologic studies
The search of the literature for human epidemiologic studies of chloroform exposures and developmental or reproductive outcomes produced 1,126 results as of May 2, 2017. From the 2016 OEHHA assessment of chloroform for listing as a reproductive toxicant under the State of California Proposition 65 legislation (OEHHA, 2016a), 100 references were identified, 92 of which were duplicates of those found in the PubMed literature search. From the two sources, 1,134 articles were identified. A review of titles and abstracts identified 84 potentially relevant articles that underwent a full-text review. Following full-text review, 35 studies considered in the 2016 OEHHA report were included; 2 articles eligible for but omitted from the OEHHA report were added; and 5 articles were included that were first published after the completion of the OEHHA review. In total, 42 of 84 articles that underwent full-text review were included and 42 were excluded (Table 2). In particular, studies that assessed exposure to trihalomethanes in general, but not exposure specifically to chloroform, were excluded, with the exception of two studies (also included by OEHHA) that specified that chloroform comprised 89% of the total trihalomethanes (Lewis et al., 2006;Lewis et al., 2007).
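The screening counts reported above can be checked with a few lines of arithmetic. This is only a bookkeeping sketch using the numbers stated in the text; the variable names are illustrative.

```python
# Counts taken directly from the paragraph above.
pubmed_hits = 1126          # PubMed results as of May 2, 2017
oehha_references = 100      # references identified from the 2016 OEHHA assessment
oehha_duplicates = 92       # OEHHA references already found in PubMed

# Unique records across the two sources.
unique_records = pubmed_hits + (oehha_references - oehha_duplicates)

full_text_reviewed = 84
included_by_source = {
    "considered in OEHHA 2016": 35,
    "eligible but omitted by OEHHA": 2,
    "published after OEHHA review": 5,
}
total_included = sum(included_by_source.values())
total_excluded = full_text_reviewed - total_included

print(unique_records, total_included, total_excluded)
```

The totals reproduce the 1,134 unique articles and the 42 included and 42 excluded full-text articles stated in the text.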
Three of the studies that addressed developmental toxicity also examined female reproductive endpoints (Balster & Borzelleca, 1982; Murray et al., 1979; NTP, 1988). Additionally, one subchronic study (Jorgenson & Rushbrook, 1980) and one carcinogenicity study (Heywood et al., 1979) were reviewed because these studies examined female reproductive organ histopathology following treatment. A two-generation reproductive study reported reduced mating indices; in contrast, an NTP reproductive toxicity study (NTP, 1988) did not report adverse effects on female reproductive parameters. In addition, no toxicity to the female reproductive organs (e.g., ovaries and uterus) was reported among the subchronic/chronic studies reviewed (Heywood et al., 1979; Jorgenson & Rushbrook, 1980). Overall, the studies reporting these data were limited in number, but provided generally consistent findings.
Five studies investigated male reproductive outcomes (e.g., sperm quality and/or motility, fertility, and reproductive organ weights). One study reported a significant increase in abnormal spermatozoa following 5 days of inhalation exposure (Land et al., 1981), but the fertility of these animals was not investigated. Another study reported increased absolute epididymal weights and degeneration of the epididymal ductal epithelium, but sperm motility, count, and morphology were unaffected (NTP, 1988). Independent calculation of the relative (to body weight) epididymal weights found no difference between these two groups. No toxicity to the male reproductive organs (e.g., testes) was reported among the subchronic/chronic studies reviewed (Heywood et al., 1979; Jorgenson & Rushbrook, 1980) or the other reproductive toxicity study. Overall, these studies generally reported equivocal or negative findings.
Table 1 (fragment recovered from the extracted text): Schwetz, Leong, and Gehring (1974), include; Thompson, Warner, and Robinson (1974), include; Newell and Dilley (1978), include; Burkhalter and Balster (1979), entry truncated.
Of the included epidemiologic articles, the majority of studies (30 of 42) focused on developmental toxicity endpoints (e.g., low birth weight, fetal growth restriction, and/or birth defects). These endpoints are similar to those observed in the animal studies. Of the epidemiologic studies that examined female reproduction, outcomes assessed included altered menstruation, female fertility (Dahl et al., 1999), and pregnancy loss/stillbirth/spontaneous abortion (Waller et al., 1998; Savitz et al., 2006; King et al., 2000; Dodds et al., 2004; Wennborg et al., 2000; Savitz et al., 2005; Toledano et al., 2005; Iszatt et al., 2014). All studies of male reproduction examined sperm concentration, count, motility, and/or quality (Chang et al., 2001; Iszatt et al., 2013; Yang et al., 2016; Zeng et al., 2013). Although some positive statistical associations for male or female reproductive outcomes were detected in some (but not all) epidemiologic studies of chloroform exposure, the literature on these endpoints is relatively sparse compared with that for developmental outcomes. Thus, the preponderance of epidemiologic studies on chloroform exposure, like that of the animal data, is focused on developmental endpoints. Based on the results of this scoping exercise, we determined that a more detailed analysis was warranted of the animal and human studies of chloroform exposure as they pertain to developmental outcomes. The decision to focus primarily on developmental outcomes is further supported by OEHHA's recent determination to classify chloroform under Proposition 65 based on developmental toxicity.
The remainder of the analysis below focuses on detailed assessment of the animal studies and human epidemiologic studies of chloroform exposure as they pertain to developmental outcomes.

| Data Synthesis
3.3.1 | Evaluation of the animal data
A total of 11 animal studies involving exposure to chloroform and assessment of developmental outcomes were identified for inclusion in this evaluation. All of these studies were previously considered in the assessment by OEHHA with the exception of Brown-Woodman et al. (1998). Seven of the studies involved in utero-only exposures and assessment of offspring (Table 3); dosing was either by oral gavage (Ruddick et al., 1983; Thompson et al., 1974) or via inhalation (Baeder & Hofmann, 1988, 1991; Murray et al., 1979; Newell & Dilley, 1978; Schwetz et al., 1974). Most of these studies were conducted in the rat; however, Thompson et al. (1974) conducted separate experiments in both rats and rabbits, and Murray et al. (1979) conducted their study in the mouse. Another four studies involved exposure of parental animals to chloroform prior to mating through gestation and lactation with postnatal assessment of offspring (Table 4); chloroform exposure was either by oral gavage or via drinking water. The data for one of these studies were reported in two separate publications (Balster & Borzelleca, 1982; Burkhalter & Balster, 1979). Lim et al. (2004) assessed rat pups after prenatal exposure only as well as after prenatal plus postnatal exposure. Information regarding the preparation of dosing solutions or analysis of dosing formulations or air concentrations of chloroform is provided in the tables. However, because of the age of these studies, few provided detailed information on dosing preparations. Only NTP (1988) reported analysis of oral dosing formulation concentrations. For the purposes of cross-study comparisons, inhalation and drinking water exposures were converted to mg/kg/day doses using standard body weights, water consumption values, and inhalation minute volumes. These calculated doses are shown in parentheses in Tables 3 and 4.
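The conversion of an inhalation exposure to a mg/kg/day dose can be sketched as follows. This is a minimal illustration, not the authors' calculation: the article does not state which body weights or minute volumes were used, so the rat values below (0.35 kg body weight, 0.2 L/min minute volume) are hypothetical placeholders, as are the function names. The sketch also assumes an ideal-gas molar volume at 25 °C and 1 atm, 100% uptake of inhaled chloroform, and the 7 hr/day exposure duration mentioned elsewhere in the text for most of the inhalation studies.

```python
CHLOROFORM_MW_G_PER_MOL = 119.38  # molecular weight of CHCl3
MOLAR_VOLUME_L_PER_MOL = 24.45    # ideal gas at 25 degrees C and 1 atm (assumed conditions)


def ppm_to_mg_per_m3(ppm):
    """Convert an air concentration from ppm (by volume) to mg/m3."""
    return ppm * CHLOROFORM_MW_G_PER_MOL / MOLAR_VOLUME_L_PER_MOL


def inhaled_dose_mg_per_kg_day(ppm, minute_volume_l_per_min,
                               hours_per_day, body_weight_kg):
    """Estimate the daily inhaled dose, assuming complete uptake (an assumption)."""
    concentration_mg_per_m3 = ppm_to_mg_per_m3(ppm)
    # Volume of air breathed during the daily exposure window, in m3.
    inhaled_air_m3 = (minute_volume_l_per_min / 1000.0) * hours_per_day * 60.0
    return concentration_mg_per_m3 * inhaled_air_m3 / body_weight_kg


# Hypothetical rat parameters, chosen for illustration only.
RAT_MINUTE_VOLUME_L_PER_MIN = 0.2
RAT_BODY_WEIGHT_KG = 0.35
EXPOSURE_HOURS_PER_DAY = 7

dose_10ppm = inhaled_dose_mg_per_kg_day(
    10, RAT_MINUTE_VOLUME_L_PER_MIN, EXPOSURE_HOURS_PER_DAY, RAT_BODY_WEIGHT_KG
)
dose_300ppm = inhaled_dose_mg_per_kg_day(
    300, RAT_MINUTE_VOLUME_L_PER_MIN, EXPOSURE_HOURS_PER_DAY, RAT_BODY_WEIGHT_KG
)
print(round(dose_10ppm, 1), round(dose_300ppm, 1))
```

With these placeholder values, 10 ppm works out to roughly 11.7 mg/kg/day, in the neighborhood of the approximately 11.8 mg/kg/day reported for 10 ppm in the rat. Because the calculation is linear in concentration, the estimate for 300 ppm is 30 times that for 10 ppm.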
Finally, two additional studies reported the results of experiments involving in vitro chloroform exposures on the development of either whole rat embryos (Brown-Woodman et al., 1998) or zebrafish (Teixidó et al., 2015) (Table 5). All of the studies included in this evaluation are individually summarized in Supporting Information to this article.
The overall quality scores derived by application of the ToxRTool (Schneider et al., 2009; https://eurl-ecvam.jrc.ec.europa.eu/about-ecvam/archive-publications/toxrtool) are presented in Tables 3-5. The in utero-only exposure studies typically scored on the borderline between "1" (reliable without restrictions) and "2" (reliable with restrictions). The two studies by Baeder and Hofmann (1988, 1991) were judged to be the most reliable studies, meeting 20 and 21 of the 21 reporting criteria assessed, respectively. In contrast, the study by Newell and Dilley (1978) scored the poorest, meeting only 13 of the 21 reporting criteria. The most reliable reproductive toxicity studies that addressed developmental outcomes were deemed to be the NTP (1988) and  studies. In contrast, the studies by Burkhalter and Balster (1979), Balster and Borzelleca (1982), and Lim et al. (2004) were downgraded to "3" because the study designs were not appropriate for obtaining some of the substance-specific data needed for our analysis. Therefore, these studies were considered supporting information only.
Scoring of the in vitro studies used a different set of criteria from those used to assess the in vivo studies. Using these criteria, both in vitro studies were given scores of 1. However, despite these data being deemed reliable without restriction, the authors of the present analysis considered the data from these studies to be less informative in the overall assessment because they were obtained from organisms devoid of the placenta and maternal environment, the pharmacokinetics of which contribute greatly to the levels and time course of metabolites that may be experienced by a fetal organism. This issue is discussed in more detail below.

In utero-only developmental toxicity studies
The studies involving in utero-only exposures are listed in Table 3; specifics regarding dosing and duration are detailed therein. The most common maternal toxicity observations reported were significantly decreased maternal body weights and/or body weight gains in association with significantly decreased food consumption. Among the studies that used gavage dosing, these findings were observed at doses of ≥50 mg/kg/day in rats (Ruddick et al., 1983; Thompson et al., 1974). In studies involving inhalation (Baeder & Hofmann, 1988, 1991; Murray et al., 1979; Newell & Dilley, 1978; Schwetz et al., 1974), these effects were observed at exposures of ≥10 ppm (approximately equivalent to 11.8 mg/kg/day in the rat). The reason for the greater sensitivity to the anorexic effects of chloroform via inhalation versus oral bolus dosing is not known but may reflect reduced first-pass metabolism in the liver with inhalation compared to oral dosing. Alternatively, it is possible that chloroform volatilized from the oral dosing solutions if they were not made up fresh daily. Clinical signs of toxicity in the chloroform-dosed maternal animals were reported rather inconsistently across the developmental toxicity studies. For example, in the oral gavage studies, Thompson et al. (1974) reported anorexia and/or diarrhea at all doses in the rabbit and hair loss and/or "rough appearance" in rats at 126 mg/kg/day, whereas Ruddick et al. (1983) did not report any clinical observations at doses of ≤400 mg/kg/day. Reductions in food consumption were also observed among the inhalation studies. Schwetz et al. (1974) reported marked anorexia in rats exposed to 300 ppm; Newell and Dilley (1978) also reported sleepiness and/or lethargy at exposures of ≥10.9 mg/L (2,700 ppm). Consequently, both of these studies included food-restricted control groups to account for any effects related to reduced food consumption. In both studies, the food-restricted groups had fetuses of reduced body weights. In Newell and Dilley (1978), the food-restricted group had a more severe retardation in fetal body weights than the high-dose chloroform group and also showed a greater amount of reduced ossification. In contrast, Baeder and Hofmann (1988, 1991) and Murray et al. (1979) did not report any clinical signs of toxicity associated with chloroform exposure. The reason that no clinical signs were noted is not clear, as clinical signs were likely to have been present at the exposures administered in these studies. The absence of clinical signs from these reports could simply reflect the relative age of the studies, which were conducted at a time when studies often did not report on such parameters. Finally, although some of the studies reported maternal mortalities (Newell & Dilley, 1978; Ruddick et al., 1983; Thompson et al., 1974), none were attributed to treatment, except those in rabbits dosed with 50 mg/kg/day of chloroform, which showed significant signs of hepatotoxicity (Thompson et al., 1974).
Table 3 (partial; only the following study notes are recoverable from the extracted text)
Murray et al. (1979): When empty uteri were stained, total pregnancies were 74% vs. 44% (GD 1-7), 91% vs. 43% (GD 6-15), and 65% vs. 60%. In the GD 8-15 treatment group, 6 cleft palates occurred in a single litter. 9 of 10 fetuses displaying cleft palate had decreased BW (≥25% decrease) compared to controls (1.0 ± 0.12). Data on skeletal variations and ossification delays not shown.
Schwetz et al. (1974): Rats in the food-starved control group and in the 30 and 100 ppm groups were exposed in a different experimental phase from the regular controls and 300 ppm group rats. Food-restricted control group allowed 3.7 g food/day. Number of fetuses per litter examined viscerally and skeletally not reported; skeletal staining with Alizarin red only. Due to the large number of resorptions at 300 ppm, numbers of fetuses (~12) and litters (3) for evaluation were extremely low.
Newell and Dilley (1978) (2): Rat; remainder of entry truncated.
Evaluation of the in utero-only exposure studies also revealed that high-dose chloroform exposures (≥200 mg/kg/day) were associated with very early (peri-implantation) total litter losses. These findings were generally characterized as reduced pregnancy incidences in the affected studies. Ruddick et al. (1983) reported only 10 and 8 surviving pregnancies (out of 15 females per group) at oral gavage doses of 200 and 400 mg/kg/day, respectively; experiments done via gavage dosing at lower doses (Thompson et al., 1974) did not show similar effects. However, in the inhalation study by Schwetz et al. (1974), only 3 of 20 rats in the 300 ppm (~354 mg/kg/day) group were reported as pregnant (one of which had a completely resorbed litter); the lower exposure of 100 ppm (~118 mg/kg/day) had no effect on the number of pregnant dams. Similarly, Baeder and Hofmann (1988) reported only 12 surviving pregnant dams (of 20 dams in the dose group) at 300 ppm (~354 mg/kg/day); however, lower inhalation exposures in both this and their subsequent study (Baeder & Hofmann, 1991) were without substantial effect. Murray et al. (1979) reported reduced pregnancy incidences in the mouse at 100 ppm (~303 mg/kg/day) with either mid-gestation (GD 6-15) or early gestation (GD 1-7) exposures; however, exposures begun after implantation (GD 8-15) were without effect. In contrast, although resorptions were reported to be increased at the highest exposure in the study by Newell and Dilley (1978) (20.1 mg/L; ~839 mg/kg/day), no substantial effect on pregnancy incidences was reported at this or lower doses. However, exposure in this study was for only 1 hr per day (versus 7 hrs per day in the other inhalation studies of chloroform), and exposure started one day later than in the other inhalation studies (GD 7 versus GD 6). These data suggest that two factors determined whether early total litter losses occurred as a result of chloroform exposure.
First, peri-implantation losses were only evident at extremely high doses, well above those at which other indications of maternal toxicity (clinical signs of toxicity, reduced food consumption, and decreased body weight gains) were observed. Second, peri-implantation losses were only observed when dosing started on GD 6 or earlier. Studies for which dosing was initiated after GD 6 did not report reduced pregnancy incidences, indicating that the reduced pregnancy indices were likely due to disruption of the implantation process. Further, studies that included staining of the uterus to determine pregnancy status often found indications that the implantation process had begun, but was disrupted (Baeder & Hofmann, 1988, 1991; Murray et al., 1979). For example, Baeder and Hofmann (1988, 1991) describe 8 of 20 high-dose impregnated dams as having only "empty implantation sites" at necropsy. The mechanism by which chloroform mediates this effect, however, is not known, although chloroform has been shown to reach the uterine compartment shortly after inhalation exposure (Danielsson, Ghantous, & Dencker, 1986). Other studies have shown that chloroform affects vascular function (McCarthy, 1969), causing vasodilation. Thus, we hypothesize that chloroform is likely affecting blood vessels in the endometrium. Although the process of implantation in the rat is associated with increased uterine blood flow to the areas of implantation (McRae & Heap, 1988; Tawai & Rogers, 1992), how the increased uterine blood flow associated with chloroform treatment might affect the process of implantation is not yet known.
Fetal body weights were reduced at doses as low as 300 ppm (35.4 mg/kg/day) in rats (Baeder & Hofmann, 1988; Schwetz et al., 1974). Reductions in crown-rump length were also observed, sometimes at lower doses, but this is a less sensitive measure and typically did not show a clear dose response. Fetal weight reductions were also observed in mice at the single exposure level tested, 100 ppm (approximately 303 mg/kg/day; Murray et al., 1979). In rabbits, reduced fetal weights were reported at 20 and 50 mg/kg/day, but not at 35 mg/kg/day (Thompson et al., 1974); thus, the relation of reduced fetal body weight to treatment in rabbits is unclear. It is important to note, however, that the lowest dose at which fetal weight reductions were observed in rats (300 ppm) (Baeder & Hofmann, 1988; Schwetz et al., 1974) was higher than the lowest dose at which maternal toxicity was reported (10 ppm) (Baeder & Hofmann, 1991). Further, this observation was consistent across the available rat studies. Reductions in fetal size have been reported as a result of maternal food restriction (Nitzsche, 2017) and were also observed in the food-restricted control groups of Schwetz et al. (1974) and Newell and Dilley (1978). This suggests that the fetal weight reductions are likely a consequence of maternal toxicity.
Alterations or delays in bone ossification were also observed in most, but not all, of these studies. In rats, these included delayed ossification of the skull bones and sternebrae, lumbar ribs, wavy ribs, and "interparietal deviations" (Baeder & Hofmann, 1988, 1991; Ruddick et al., 1983; Schwetz et al., 1974; Thompson et al., 1974). Delayed ossification of skull bones was observed in rabbits at the low and mid doses (but not the high dose) in Thompson et al. (1974) and in mice (in combination with sternebral ossification delays) at 100 ppm (Murray et al., 1979). None of these findings are considered skeletal malformations; rather, they are developmental delays commonly seen in fetuses with reduced body weights and considered transient findings (Carney & Kimmel, 2007; Kimmel, Garry, & DeSesso, 2014). Indeed, these skeletal findings were seen at the same or higher doses than those at which reductions in fetal weights were observed and are commonly associated with reductions in fetal weight due to reduced maternal food consumption (Nitzsche, 2017).
Finally, only the studies by Schwetz et al. (1974) and Murray et al. (1979) reported any malformations in offspring. In Schwetz et al. (1974), some malformations related to caudal dysplasia were reported at 100 ppm, although the number of affected fetuses is not reported. Similar findings were reported neither at the next lower exposure of 30 ppm nor at 300 ppm, although the number of fetuses available for evaluation at 300 ppm was severely limited. These findings were not confirmed in the other rat inhalation studies of chloroform (Baeder & Hofmann, 1988, 1991). In Murray et al. (1979), offspring from the group of pregnant mice exposed to chloroform on GD 8-15 exhibited an increased incidence of cleft palate in combination with a substantial reduction in mean fetal body weight. Specifically, 10 fetuses from 4 litters (6 in a single litter) in the chloroform-treated group presented with cleft palate. One cleft palate occurred in a concurrent control fetus and three cleft palates occurred in fetuses from one of the other control groups; however, no cleft palates occurred in fetuses from either of the other chloroform-exposed groups (treatment on GD 1-7 or GD 6-15). Further, no cleft palate was seen in any of the rat developmental toxicity studies, nor in , which is a multigeneration study with a teratology screen. The latter study was conducted in mice at multiple doses, some above those administered in Murray et al. (1979). Thus, the cleft palate seen in Murray et al. (1979) was not confirmed by other studies at the same or higher exposures. Further, the malformations seen in Schwetz et al. (1974) and Murray et al. (1979) occurred only at high, maternally toxic doses.
The no observed adverse effect levels (NOAELs) tended to be substantially higher in the gavage dosing studies than in the inhalation studies. In the inhalation studies, the NOAEL for maternal toxicity is 4 mg/kg/day, based on a chloroform inhalation exposure concentration of 3 ppm in the rat study of Baeder and Hofmann (1991). At the next higher exposure concentration (10 ppm; 12 mg/kg/day) and at exposures above this level, maternal toxicity in the form of reduced food consumption, maternal body weights, and body weight gains was consistently observed. Other indications of maternal toxicity, including clinical signs and reduced pregnancy incidences, were seen only at the highest exposure levels used in these studies. The lowest LOAEL for fetal toxicity, from the studies of Schwetz et al. (1974) and Baeder and Hofmann (1988, 1991), is 35 mg/kg/day based on an inhalation exposure concentration of 30 ppm. Thus, the next lower dose at which no fetal effects were observed was 10 ppm. This dose, which is approximately equivalent to 12 mg/kg/day based on minute volume and breathing rate, would be considered the developmental NOAEL. For human health risk assessment, it is important to note that these fetal effects occurred at doses higher than those at which maternal toxicity was observed.
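The kind of conversion from an inhalation concentration (ppm) to an approximate daily dose (mg/kg/day) referred to above can be sketched as follows. This is a minimal illustration and not the calculation used in this review: the minute volume, exposure duration, body weight, and absorbed fraction are hypothetical placeholder inputs, so the outputs should not be expected to reproduce the mg/kg/day values quoted in the text. Only the molar mass of chloroform (about 119.38 g/mol) and the molar volume of an ideal gas at 25 °C and 1 atm (about 24.45 L/mol) are standard constants.

```python
CHLOROFORM_MOLAR_MASS_G_PER_MOL = 119.38  # CHCl3
MOLAR_VOLUME_L_PER_MOL = 24.45            # ideal gas at 25 degrees C and 1 atm


def ppm_to_mg_per_m3(ppm: float) -> float:
    """Convert a chloroform air concentration from ppm (v/v) to mg/m^3."""
    return ppm * CHLOROFORM_MOLAR_MASS_G_PER_MOL / MOLAR_VOLUME_L_PER_MOL


def inhaled_dose_mg_per_kg_day(
    ppm: float,
    minute_volume_l_per_min: float,  # hypothetical input, species-specific
    hours_per_day: float,            # hypothetical input, study-specific
    body_weight_kg: float,           # hypothetical input
    absorbed_fraction: float = 1.0,  # assumption: complete absorption
) -> float:
    """Approximate inhaled dose: concentration x air breathed / body weight."""
    air_breathed_m3 = minute_volume_l_per_min * 60.0 * hours_per_day / 1000.0
    absorbed_mg = ppm_to_mg_per_m3(ppm) * air_breathed_m3 * absorbed_fraction
    return absorbed_mg / body_weight_kg


if __name__ == "__main__":
    # Placeholder values only; they are not taken from any of the cited studies.
    for ppm in (3.0, 10.0, 30.0):
        dose = inhaled_dose_mg_per_kg_day(ppm, 0.2, 7.0, 0.25)
        print(f"{ppm:g} ppm -> {dose:.1f} mg/kg/day (illustrative inputs)")
```

Because the dose is linear in concentration under these assumptions, the ratio between dose levels follows the ratio between exposure concentrations; the absolute values depend entirely on the physiological inputs chosen.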

Reproductive toxicity studies
Four reproductive toxicity studies of chloroform were identified (Table 4). Two of these studies treated pregnant animals with chloroform and investigated potential effects in offspring; however, standard parameters regarding reproductive or developmental effects were not assessed (e.g., no parental toxicity data reported; limited pup data reported); as such, these studies were assigned ToxRTool scores of only 2/3, which restricts their use to supporting information only. The first study assessed developmental neurobehavior in pups following oral gavage dosing of the parental animals with 31.1 mg/kg/day of chloroform beginning prior to mating and continuing through gestation and lactation (Balster & Borzelleca, 1982; Burkhalter & Balster, 1979); no adverse effects were reported. Further, endpoints related to parental effects were not assessed. The other study (Lim et al., 2004) focused on the assessment of diabetogenic parameters in offspring following exposure of parental animals to drinking water containing 75 μg/L chloroform (equivalent to 0.02 mg/kg/day) beginning prior to mating and continuing either through gestation only or through both gestation and lactation. Again, standard parameters relating to fertility, reproduction, and offspring development were not examined and parental systemic toxicity was not assessed. Although significantly decreased pup body weights were observed on PND 21 in both treated groups, the number of animals per dose group was very low (only 3 pups per litter from only 4 dams per group). Further, the investigators did not report whether these data were assessed based on the litter rather than the fetus. Thus, these data are not considered reliable for assessing the developmental toxicity of chloroform.
Two other studies investigated reproductive toxicity in mice after administration of chloroform via either gavage or drinking water exposure NTP, 1988); in both of these studies, exposure began prior to mating, continued through generation of multiple litters, and included assessment of multiple developmental parameters in the offspring. In , although a few spurious findings were observed at a drinking water exposure of 0.1 mg/mL (equivalent to 24 mg/kg/day), this exposure level is considered the study NOAEL for both parental toxicity (based on enlarged livers) and developmental effects (based on reduced F2b postnatal weights, reduced F1b viability index, and reduced F1a lactation index). It should be further noted that the mouse pups in this study likely began consuming the drinking water prior to weaning. Due to their low body weights, the resulting direct doses to the offspring likely were considerably higher than those received by the parental animals; further, these direct exposures were in addition to those received by the offspring via lactation. Thus, the NOAEL for developmental effects established in this study is considered to be a conservative estimate.
In the NTP (1988) study at the nominal dose of 50 mg/kg/day (actual dose of 41 mg/kg/day), significantly increased epididymal weights were reported in F1 offspring of treated mice. Mild vacuolar degeneration of the epididymal epithelium was reported in some of these animals. However, no effects on fertility of the F1 offspring were reported at this dose. As such, the study investigators considered this finding to be possibly unrelated to chloroform treatment. Other changes at this dose were not considered adverse, and therefore not toxicologically relevant. Also, no parental toxicity was reported. Consequently, the NOAEL for parental toxicity was considered to be the highest dose tested, 50 (41) mg/kg/day. The NOAEL for offspring effects was also considered to be 50 (41) mg/kg/day. This NOAEL for developmental effects is similar to that determined in the study of .

In vitro developmental assays
Only two studies that used in vitro alternative assays to address the potential developmental effects of chloroform were identified (Table 5). Brown-Woodman et al. (1998) reported effects of chloroform on the development of Sprague Dawley rat embryos in culture at 2.06 μmol/mL. Teixidó et al. (2015) reported altered development of zebrafish embryos in the presence of chloroform, albeit at effective concentrations (≥0.63 mM) significantly lower than that identified in the study by Brown-Woodman et al. (1998). Additionally, Teixidó et al. (2015) reported potential teratogenic effects associated with exposure.
While these studies are potentially useful from the perspective of generating hypothesis-driven mechanistic research data, they are of little use for the purposes of conducting a human health risk assessment to understand the potential for developmental toxicity due to chloroform exposure. For one, these investigations were conducted using in vitro systems devoid of the maternal organism and placenta, both of which play important roles in regulating the disposition of chloroform that is experienced by the developing organism. For example, studies have shown that, following a single oral exposure, most of the chloroform is eliminated from the human body via exhalation within 8 hours (Fry, Taylor, et al., 1972), and in the pregnant animal, chloroform only reaches the placenta and fetuses for short intervals following an inhalation exposure (Danielsson et al., 1986). Further, chloroform is extensively metabolized in rodents (Brown et al., 1974). Thus, because the metabolic processes and pathways of elimination for chloroform are almost completely absent from these in vitro systems, it cannot be concluded that the results obtained using these systems are necessarily indicative of what would be observed in vivo. In fact, the "teratogenic effects" reported in the study of Teixidó et al. (2015) are not seen in the whole animal studies.
The reductions in crown-rump length and somite number observed in Brown-Woodman et al. (1998) occurred in the absence of the maternal animal and might falsely suggest that these effects are mediated directly on the offspring, independent of the maternal animal. However, it is more likely that the embryotoxic effects observed in Brown-Woodman et al. (1998) are due to a toxic effect of chloroform on vasculogenesis in the yolk sac, which would interfere with the delivery of nutrients to and removal of waste products from the developing organism in vitro. These reductions in yolk sac blood vessel development in the presence of chloroform are consistent with our previously stated hypothesis that chloroform is affecting the blood vessels in the endometrium, resulting in reduced implantation success in the whole animal studies.

Summary of the animal data
A total of 14 animal studies that addressed developmental endpoints were included for consideration in our data evaluation. Most of the in utero developmental toxicity studies were conducted in the rat and involved the administration of chloroform either via gavage dosing or inhalation exposure. The most common findings of maternal toxicity reported were significantly decreased maternal body weights and/or body weight gains in association with significantly decreased food consumption. These effects were consistently observed at the same or lower doses than those at which fetal toxicity was reported. Higher exposures (≥200 mg/kg/day) were associated with very early (peri-implantation) losses, which may be due to the vasoconstrictive effects of chloroform. Based on data from multiple inhalation studies, we considered the NOAEL for maternal toxicity to be 3 ppm (4 mg/kg/day).
Fetal body weight reductions, as well as alterations or delays in fetal bone ossification, were consistently observed in the in utero developmental toxicity studies of chloroform. In all of the studies, such effects were observed at doses equal to or above those at which maternal toxicity was seen. Ultimately, the offspring effects of chloroform exposure occur at doses also associated with maternal toxicity. Based on review of these data, we consider the NOAEL for offspring effects to be 10 ppm or 12 mg/kg/day (Baeder & Hofmann, 1991).
The findings from the reproductive studies are generally consistent with those from the in utero developmental toxicity studies in that the offspring effects occurred at the same or higher doses than those at which maternal/parental toxicity was observed. The NOAELs identified in these studies are generally higher than those identified in the in utero developmental toxicity inhalation studies. The exception is the NOAELs from Lim et al. (2004), which are not considered reliable for assessing the developmental toxicity potential of chloroform because standard parameters related to offspring development were not assessed. Finally, the findings from the two in vitro developmental toxicity assays provide little to no value to the overall evaluation due to the absence of the maternal organism and placenta in the experimental system. In conclusion, the data from these studies show that the developmental effects of chloroform exposure occur at higher doses than those that cause maternal toxicity.

| Evaluation of the epidemiologic data
Thirty human epidemiologic studies that investigated the potential relation between chloroform exposure and developmental outcomes underwent a detailed scientific assessment. Twenty-five had been reviewed previously by OEHHA (2016a). There were five studies not included in the OEHHA review (Brender et al., 2014; Cao et al., 2016; Kogevinas et al., 2016; Levallois et al., 2016; Wright et al., 2017). Eleven studies were prospective cohort studies, 11 studies were retrospective cohort studies, and 8 were case-control studies (7 population-based and 1 hospital-based case-control study). The design and results of these studies are summarized in Tables 6-9 and Supporting Information Table S1.
To gauge the overall quality of the available epidemiologic data on developmental outcomes associated with chloroform exposure, the identified studies were assessed using the RTI revised item bank on risk of bias and precision of observational studies (Viswanathan et al., 2013). The RTI item bank provides a framework for evaluating the validity of observational epidemiologic studies based on 13 questions that are designed to assess potential areas of bias (selection bias, information bias, and confounding bias) and error that can limit the value of such studies as a basis for causal inference. Numerical grading and scoring systems to evaluate epidemiologic literature tend to provide inconsistent weighting and poorly predict study results (Greenland & O'Rourke, 2001;Sanderson, Tatt, & Higgins, 2007). The RTI item bank is not a grading or scoring system for epidemiologic studies. Rather, it is a systematic way of qualitatively examining each study that allows for the identification of key flaws and shortcomings of each study as well as patterns of bias across studies. These patterns are highlighted in this review. The results of this evaluation are shown in Supporting Information Table S1. Some of the factors considered to have the most impact on the reliability of the epidemiologic studies, and therefore of greatest relevance for this assessment, are discussed in more detail below. In this section, the term "significant" is used in reference to statistical significance (by convention, defined as p-value <.05), and the term "association" is used in reference to any relationship between chloroform and developmental outcomes that was found to be statistically significant, unless otherwise specified.

Integrated exposure assessment
Across the epidemiologic literature on chloroform exposure and developmental outcomes, studies can be distinguished based on their method of exposure assessment. One of the major methodological shortcomings of many of the studies included in this evaluation is the lack of adequate exposure assessment. Only 13 (Botton et al., 2015;Cao et al., 2016;Costet et al., 2012;Danileviciute et al., 2012;Grazuleviciene et al., 2011;Grazuleviciene et al., 2013;Iszatt et al., 2011;Kogevinas et al., 2016;Levallois et al., 2012;Levallois et al., 2016;Savitz et al., 2005;Smith et al., 2016;Villanueva et al., 2011) of the 30 studies assessed in this review incorporated individual-level exposure information (e.g., frequency and duration of showering, use of a water filter, and bottled versus tap water consumption) when estimating a mother's exposure to chloroform during pregnancy; the other 17 studies relied solely on chloroform sampling data in local or regional water, and in one case, in air (Brender et al., 2014), for chloroform exposure estimation. Thus, these studies were semi-ecologic in nature, meaning that they relied on group-level geographic exposure data. Ecologic studies are generally considered by epidemiologists to be among the weakest of epidemiologic study designs because associations observed for groups cannot be assumed to hold for individuals (Morgenstern, 1995). That is, average exposure levels across a group may differ substantially from those for individuals within that group, leading to gross misclassification of exposure, and producing substantial bias of estimated associations in either direction.
As stated by the authors of one study that used ecologic data on chloroform exposure, "A potential limitation [is] the lack of individual exposure data. THMs are volatile, and exposure can occur from activities such as showering and bathing. The THM exposure for a 10-minute shower may be equivalent to drinking 2 L of water" (Summerhayes et al., 2012). Thus, failure to account for showering/bathing and water consumption habits could substantially reduce the accuracy of estimated chloroform exposure. Furthermore, only one study assessed individualized exposure to chloroform via a biomarker; the rest estimated exposure based on models that have not been validated by comparison with physiological biomarkers of chloroform exposure.
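To illustrate why a water-concentration-only measure can misclassify individuals, the sketch below contrasts a group-level index with a simple individualized index. It is a hedged toy model, not the exposure model of any cited study: the only quantitative input taken from the text is the quoted rule of thumb that a 10-minute shower may be equivalent to drinking 2 L of water, and the class name, function names, and the two example mothers are hypothetical.

```python
from dataclasses import dataclass

# From the Summerhayes et al. (2012) quotation: a 10-minute shower may be
# equivalent to drinking 2 L of water, i.e. about 0.2 L-equivalents per minute.
SHOWER_LITRE_EQUIV_PER_MIN = 2.0 / 10.0


@dataclass
class WaterUse:
    """Self-reported daily water-use habits (hypothetical structure)."""
    tap_water_l_per_day: float
    shower_min_per_day: float


def water_only_index(conc_ug_per_l: float) -> float:
    """Semi-ecologic index: every mother on a supply gets the same value."""
    return conc_ug_per_l


def integrated_index(conc_ug_per_l: float, use: WaterUse) -> float:
    """Individualized index in ingestion-equivalent micrograms per day."""
    litre_equiv = (
        use.tap_water_l_per_day
        + SHOWER_LITRE_EQUIV_PER_MIN * use.shower_min_per_day
    )
    return conc_ug_per_l * litre_equiv


if __name__ == "__main__":
    supply_conc = 20.0  # ug/L, illustrative value
    mother_a = WaterUse(tap_water_l_per_day=0.5, shower_min_per_day=5.0)
    mother_b = WaterUse(tap_water_l_per_day=2.0, shower_min_per_day=20.0)
    for name, use in (("A", mother_a), ("B", mother_b)):
        print(name, water_only_index(supply_conc), integrated_index(supply_conc, use))
```

In this toy example both mothers receive the same water-only index, yet their individualized indices differ several-fold, which is the kind of within-group variation that group-level exposure data cannot capture.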
Overall, studies that incorporated individualized information into chloroform exposure assessment were considered more likely to obtain valid results because they had a more accurate exposure estimate. Thus, studies that included this information should be considered more pertinent for making any causal inferences. To this point, we considered all studies collectively as well as studies with individualized exposure assessment independently. Throughout the remainder of this evaluation, we refer to exposure measures that incorporate individual data on water use or biomarker information as "integrated chloroform exposure." Because exposure assessment is so fundamental to the validity of epidemiologic research, we classified study quality based largely on whether exposure assessment was integrated using individualized data. That is, we classified the studies with integrated exposure estimates as being relatively strong methodologically (Botton et al., 2015; Cao et al., 2016; Costet et al., 2012; Danileviciute et al., 2012; Grazuleviciene et al., 2011; Grazuleviciene et al., 2013; Iszatt et al., 2011; Kogevinas et al., 2016; Levallois et al., 2012; Levallois et al., 2016; Savitz et al., 2005; Smith et al., 2016; Villanueva et al., 2011)
Lewis et al., 2006; Lewis et al., 2007; Porter, Putnam, Hunting, & Riddle, 2005; Rivera-Núñez & Wright, 2013; Summerhayes et al., 2012; Toledano et al., 2005; Wennborg et al., 2000; Wright et al., 2004; Wright et al., 2017; Zhou et al., 2010). While exposure assessment is one of many aspects that influence the quality of any study, given the critical importance of exposure assessment in this subject matter, we determined that the use of an integrated (i.e., individualized) exposure assessment method should be required of any study to be considered informative on the potential causal relationship between chloroform exposure and developmental outcomes (Koepsell & Weiss, 2003; Morgenstern, 1995). Of note, outcome misclassification is seldom a major concern in epidemiologic research on birth outcomes, because such measures (e.g., preterm birth, small-for-gestational age, and low birth weight) are routinely and easily measured based on standard practices with little potential for systematic error (i.e., bias). When mismeasurement does occur, it is typically in estimating gestational age based on the date of last menstrual period (Dietz et al., 2007; Howards, Hertz-Picciotto, Weinberg, & Poole, 2006). The use of ultrasound measures taken early in pregnancy can minimize this mismeasurement (Howards et al., 2006). Even if this correction does not occur, misclassification of gestational age is unlikely to differ systematically based on chloroform exposure level, as prior research has shown chloroform not to be associated with menstrual cycle length and timing (Howards et al., 2006; Windham et al., 2003).

TABLE 8 Summary of epidemiological evidence for associations between integrated chloroform exposure and developmental outcomes. Bold font indicates adverse associations; underlined font indicates favorable associations. "Integrated" refers to exposure assessment that incorporates water concentration and behaviors to account for dermal absorption, ingestion, and inhalation.

Biomarker study
Estimation of chloroform exposure in 29 of the 30 studies included in this review was highly dependent upon self-report of behaviors such as showering, bathing, hand-washing, use of a water filter, amount of tap water consumption, and/or non-individualized regional data on chloroform concentrations in water. Biological markers (biomarkers), by contrast, have the advantage of being independent of self-reporting error or bias and can potentially provide an objective measure of average exposure during a given time period. However, biomarkers can be limited for exposure assessment, for example, if they provide information on only recent exposure when long-term exposure is of etiologic interest; if they do not capture all sources or routes of exposure; or if they are not specific to the exposure of concern (Grandjean, 1995;Rothman, Greenland, & Lash, 2008).
Only one epidemiologic study, Cao et al. (2016), assessed exposure to chloroform using a biomarker. This study used maternal whole blood samples taken at the time of delivery prior to the infant's birth as a basis for estimating chloroform exposure. The median concentration of chloroform in the blood was 50.7 ng/L. Higher levels of chloroform in maternal prenatal blood were not significantly associated with birth weight, birth length, or gestational age at birth in this study (adjusted mean difference in birth weight between highest and lowest tertiles of chloroform = −48.23 g; 95% confidence interval (CI) = −103.64, 7.19).
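The reading of the interval estimate above can be made concrete with a short sketch. Assuming the reported 95% CI is a symmetric, normal-approximation (Wald) interval, which is an assumption and not something the source study is known to state, the standard error and an approximate two-sided p-value can be backed out from the interval limits. The function names are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower: float, upper: float) -> float:
    """Standard error implied by a symmetric 95% Wald confidence interval."""
    return (upper - lower) / (2.0 * Z_95)


def two_sided_p(estimate: float, lower: float, upper: float) -> float:
    """Approximate two-sided p-value for the null hypothesis of no difference."""
    z = estimate / se_from_ci(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2.0))


def excludes_null(lower: float, upper: float, null: float = 0.0) -> bool:
    """True when the interval does not contain the null value."""
    return not (lower <= null <= upper)


if __name__ == "__main__":
    # Birth weight difference (g), highest versus lowest tertile, as quoted above.
    est, lo, hi = -48.23, -103.64, 7.19
    print("interval excludes 0:", excludes_null(lo, hi))
    print("approximate p-value:", round(two_sided_p(est, lo, hi), 3))
```

Under this approximation the interval includes zero and the implied p-value is somewhat above .05 (on the order of 0.09), consistent with the description of the result as not statistically significant.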

Occupational exposure
Studying the risk of developmental outcomes among mothers exposed to chloroform as part of their occupation can be particularly insightful, because occupational exposure levels tend to be higher than those encountered in nonoccupational settings. Health effects may be more readily detected in highly exposed populations than in populations with lower levels of exposure, where any effects may be too subtle to be overt. However, the occupational context is not always generalizable to the public and may better inform health and safety measures to be taken only for those exposed at the workplace. The single study that examined occupational chloroform exposure in relation to developmental outcomes (Wennborg et al., 2000) was conducted in Sweden. Study investigators identified potentially chloroform-exposed mothers using workplace information from government employment records, and contacted them with a request to complete a questionnaire to assess occupational exposure. Specifically, mothers who worked in biomedical research laboratories at major Swedish universities, along with a comparison group of mothers who worked in non-laboratory departments at the same universities, were asked whether they performed laboratory work with chloroform during pregnancy. This study found no significant difference in mean birth weight between the infants of mothers exposed to chloroform at work and those that were not exposed to chloroform at work (adjusted mean difference = 27 g; 190).

Summary of epidemiological evidence for associations between chloroform exposure and developmental outcomes among studies conducted in the United States. Bold font indicates adverse associations; underlined font indicates favorable associations. "Integrated" refers to exposure assessment that incorporates water concentration and behaviors to account for dermal absorption, ingestion, and inhalation.
No epidemiologic study to date has examined the relationship between paternal occupational or environmental exposure to chloroform and developmental outcomes. Additional research may be warranted to address this gap in the scientific literature.

| Summary of the epidemiologic data by developmental endpoint

Birth defects
Five epidemiologic studies investigated the risk of birth defects in association with chloroform exposure (Brender et al., 2014; Dodds & King, 2001; Grazuleviciene et al., 2013; Iszatt et al., 2011; Wright et al., 2017) (Table 7). Of the 30 associations evaluated in these studies, 25 associations were statistically null, and only 5 statistically significant associations were detected. Dodds and King (2001) observed a significantly increased risk of chromosomal anomaly when comparing the second highest level of exposure to the lowest level (risk ratio (RR) = 1.9; 95% CI = 1.1, 3.3). No significant increase in risk was evident when comparing the highest exposure category to the lowest category, such that no positive exposure-response gradient was observed. In light of the absent exposure-response trend, and because chloroform has not been found to be genotoxic (EPA, 2001), this association is likely spurious.
Brender et al. (2014) is unique among the epidemiologic studies, as it examined ambient air chloroform exposure attributed to industrial sources. This study estimated associations of airborne chloroform exposure with various limb deficiencies, oral clefts, neural tube defects, and cardiovascular defects that were ascertained from the Texas Birth Defects Registry. Exposure assessment used a modified emission weighted proximity model that took into consideration emissions from all sources within a 10-km buffer around each maternal residence, and the amounts of chemical released from each source during each pregnancy. The model produced a risk value for each mother-infant pair. Having a risk value greater than 0 for ambient chloroform exposure was significantly associated with a 10% increase in the odds of having an infant with a septal heart defect (odds ratio (OR) = 1.10; 95% CI = 1.01, 1.19). A risk value greater than 0 for ambient chloroform exposure was associated with 1.40 times the odds of having an infant with any form of a neural tube defect (95% CI = 1.04, 1.87) and with 1.55 times the odds of having an infant with spina bifida (95% CI = 1.10, 2.20). When parsing risk values into intensity levels (four categories of the risk value), Brender et al. (2014) found a significant positive association of mild intensity ambient chloroform exposure with spina bifida occurrence (OR = 1.74; 95% CI = 1.02, 2.99); however, moderate and high levels of exposure were not significantly associated with spina bifida occurrence, indicating the absence of a positive exposure-response trend. Wright et al. (2017) examined seven different congenital heart defects, none of which were associated with chloroform concentrations in water. In contrast, two relatively high-quality studies of birth defects used integrated exposure measures of chloroform (Grazuleviciene et al., 2013; Iszatt et al., 2011); both showed no significant association between chloroform exposure and cardiovascular, musculoskeletal, or urogenital birth defects (Table 8).

Wright et al. (2004): No association (T2 water concentration). Suggestive evidence of adverse association: OR = 1.04 (1.00, 1.10) for fourth quintile; fifth quintile is similar (T3 water concentration). Adverse association: OR = 1.11 (1.04, 1.17) for highest decile versus below median, possible trend (T3 water concentration).
Abbreviations: CI = confidence interval; OR = odds ratio; NA = not applicable; P = pregnancy (entire); RR = relative risk; T1, T2, T3 = trimesters 1, 2, and 3; TCM = trichloromethane, i.e., chloroform.
a When results are available for continuous and categorical classifications of the same exposure, continuous results are prioritized.
b Most studies used a 10 μg/L increase in exposure as the unit of analysis, so a 25 μg/L (Summerhayes et al., 2012) or 59 μg/L (Wright et al., 2004) increase may show more pronounced associations. Summerhayes et al. (2012) also compared the highest decile with the lowest decile, which showed similarly adverse associations of a slightly stronger magnitude (ORs between 1.07 and 1.12). Wright et al. (2004) also compared the highest decile with below-median values, resulting in similarly adverse associations (−18 [95% CI: −26, −10]).
c Small for gestational age has cut-off at 5th percentile of adjusted birth weight, whereas most other studies use 10th percentile.
Considering the available published epidemiologic studies on chloroform exposure and birth defects, we conclude that the weight of the evidence does not demonstrate an association between chloroform exposure during pregnancy and a risk of birth defects. This conclusion is based on the paucity of statistically significant associations in light of the number of associations tested, the lack of evidence for a positive exposure-response relationship, and the lack of a significant association in both studies with integrated exposure. Given that birth defects comprise a highly heterogeneous group of outcomes that should be examined separately for etiologic effects, additional insight could potentially be gained from larger studies with integrated exposure assessment that examine specific birth defects individually.

Age: Preterm birth or gestational age at birth
Preterm birth (birth prior to 37 weeks' completed gestation) is a commonly studied outcome measure in developmental epidemiology. Variations of this outcome measure include very preterm birth (birth prior to 32 weeks' completed gestation) and gestational age (a continuous measure based on time since last menstrual period or on ultrasound-guided estimation). Ten epidemiologic studies, representing seven independent populations, investigated the risk of preterm birth or shorter gestational age in association with exposure to chloroform (Table 7). Three of these studies were based on separate, but overlapping, samples of Massachusetts residents during different study periods (Lewis et al., 2007; Rivera-Núñez & Wright, 2013; Wright et al., 2004). Five of the independent studies found no significant association between chloroform exposure and preterm birth or shorter gestational age Hinckley et al., 2005; Kogevinas et al., 2016; Kramer et al., 1992; Villanueva et al., 2011). Rivera-Núñez and Wright (2013) observed a significant association with higher risk of preterm birth when comparing the 3rd quintile of chloroform levels in water to the 1st (lowest) quintile (OR = 1.08; 95% CI = 1.02, 1.14), but no clear exposure-response trend was observed across the five quintiles. Four studies showed a significantly reduced risk of preterm birth associated with higher chloroform exposure (Wright et al., 2004 [OR = 0.90; 95% CI = 0.84, 0.97 comparing >90th to <50th percentile of exposure]; Savitz et al., 2005 [OR = 0.56; 95% CI = 0.32, 0.96 comparing 3rd to 1st quintile of exposure]; Lewis et al., 2007 [hazard ratio (HR) = 0.95; 95% CI = 0.91, 0.99 per 10 μg/L increase]; Costet et al., 2012 [OR = 0.5; 95% CI = 0.3, 0.9]), including possible inverse exposure-response trends in three of these studies (Lewis et al., 2007; Savitz et al., 2005; Wright et al., 2004).
Of the two studies that examined gestational age as a continuous measure, one found no significant association with chloroform exposure and another found a significant positive association with the >90th percentile of exposure, but not with continuous exposure (Wright et al., 2004). No studies examined associations with post-term birth (≥42 weeks) explicitly.
Among the five relatively high-quality studies of preterm birth or continuous gestational age in association with integrated chloroform exposure (i.e., incorporates water concentration and behaviors from dermal absorption, ingestion, and inhalation; see Table 8), no study found a significant positive (i.e., adverse) association with preterm birth or gestational age, and one study (Savitz et al., 2005) found a significant inverse (i.e., protective) association.
The preponderance of the epidemiologic evidence on chloroform exposure and preterm birth suggests no association, or even a potential protective association, between chloroform exposure and preterm birth. Evidence in favor of a protective effect includes a significant inverse association detected in four of nine studies, including three studies (one with integrated exposure assessment) that provided evidence in support of an inverse exposure-response relationship. A biologically plausible mechanism to explain a beneficial effect of chloroform exposure on preterm birth or gestational age is lacking, although the absence of such a mechanism does not preclude a causal effect. Therefore, the observed inverse association detected in multiple studies could be due to confounding by a protective factor (e.g., higher socioeconomic status and healthier lifestyle) associated with chloroform exposure or due to a beneficial effect of chloroform itself (e.g., decreased risk of maternal waterborne infection).
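The way ratio estimates for preterm birth are read in this section (adverse, inverse, or null according to whether the 95% CI excludes 1) can be sketched as below, together with a helper for re-expressing a per-increment ratio at a different exposure increment. This is an illustrative sketch only: the function names are hypothetical, and the rescaling assumes a log-linear exposure-response relationship, which the underlying studies may or may not support.

```python
import math


def classify_ratio(lower: float, upper: float) -> str:
    """Label a ratio estimate (OR, RR, HR) by whether its 95% CI excludes 1."""
    if lower > 1.0:
        return "adverse"
    if upper < 1.0:
        return "inverse"
    return "null"


def rescale_ratio(ratio: float, from_increment: float, to_increment: float) -> float:
    """Re-express a per-increment ratio for another increment (log-linear assumption)."""
    return math.exp(math.log(ratio) * to_increment / from_increment)


if __name__ == "__main__":
    # Interval limits as quoted in this section for preterm birth.
    quoted = {
        "Rivera-Nunez & Wright (2013), 3rd vs 1st quintile": (1.02, 1.14),
        "Savitz et al. (2005), 3rd vs 1st quintile": (0.32, 0.96),
        "Lewis et al. (2007), per 10 ug/L": (0.91, 0.99),
    }
    for label, (lo, hi) in quoted.items():
        print(label, "->", classify_ratio(lo, hi))
    # HR of 0.95 per 10 ug/L re-expressed per a hypothetical 25 ug/L increment.
    print(round(rescale_ratio(0.95, 10.0, 25.0), 3))
```

The rescaling helper shows why estimates reported per different exposure increments are not directly comparable in magnitude until they are placed on a common increment.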
Fetal growth: Low birth weight or continuous birth weight
Low birth weight, defined as weight at birth less than 2,500 g, has historically been used as a straightforward and well-recorded indicator of poor health at birth and increased risk of infant mortality. Nine studies from eight independent populations examined chloroform exposure as it relates to the risk of low birth weight (including low birth weight at term, i.e., restricted to full-term births) or very low birth weight (<1,500 g) (Table 7). Six studies showed no significant association of low birth weight (in various forms) with the level of chloroform exposure during pregnancy (Danileviciute et al., 2012; Hinckley et al., 2005; Kogevinas et al., 2016; Kramer et al., 1992; Lewis et al., 2006; Villanueva et al., 2011). Three studies, in contrast, found a significantly increased risk of low birth weight in association with chloroform exposure (Grazuleviciene et al., 2011; Iszatt et al., 2014; Toledano et al., 2005); all three of these studies provided evidence for a positive exposure-response relationship. Toledano et al. (2005) found an increased risk of low birth weight in association with higher chloroform water concentrations in a U.K. sample of mothers (OR = 1.10; 95% CI = 1.07, 1.13). The association with very low birth weight was somewhat attenuated and of borderline statistical significance (OR = 1.07; 95% CI = 0.99, 1.15). Iszatt et al. (2014) observed that mothers exposed to an enhanced coagulation water treatment intervention to reduce disinfection by-products in water (i.e., those classified as having lower chloroform exposure) were less likely to have an infant born with very low birth weight, including after the analysis was restricted to infants born at term (16% decrease; 95% CI = 24, 8%); similar results were observed among low-birth-weight infants (9% decrease; 95% CI = 12, 5%). Grazuleviciene et al. (2011), a study based in Lithuania, observed a statistically significant 10% increase in the odds of low birth weight for an increase in chloroform exposure of 0.1 μg/day during pregnancy (1.10 per 0.1 μg/d increase; 95% CI = 1.01, 1.19) across all routes of exposure, that is, drinking, showering, and so forth. By contrast, a study using a subset of data from the same population (Danileviciute et al., 2012), including participants who had provided samples for genetic analysis, did not observe a significant association between chloroform exposure and risk of low birth weight overall. However, Danileviciute et al. (2012) observed increased odds of low birth weight in the absence of a certain maternal polymorphism (GSTM1) that could render an individual more susceptible to environmental exposures.
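The distinction drawn above between a significant and a borderline odds ratio can be checked from the reported intervals alone. The following is a minimal sketch, assuming that each published 95% CI is a Wald-type interval symmetric on the log scale (a common convention, but not something the cited studies are confirmed to have used). The function name `approx_p_from_ci` is hypothetical and introduced only for illustration.

```python
import math

def approx_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value for a ratio estimate from its 95% CI.

    Assumes a Wald-type interval that is symmetric on the log scale.
    """
    # Standard error of log(ratio), recovered from the width of the interval
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail probability
    return math.erfc(abs(z) / math.sqrt(2))

# Odds ratios quoted above from Toledano et al. (2005)
p_lbw = approx_p_from_ci(1.10, 1.07, 1.13)   # low birth weight
p_vlbw = approx_p_from_ci(1.07, 0.99, 1.15)  # very low birth weight

print(f"low birth weight: p ~ {p_lbw:.2g}")
print(f"very low birth weight: p ~ {p_vlbw:.2g}")
```

Under these assumptions, the interval that includes 1 yields a p-value somewhat above 0.05, in line with the "borderline" description. Because the published limits are rounded, the result is only approximate.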
Considering the available epidemiologic studies on chloroform exposure and low birth weight, three studies found a positive association between chloroform exposure and increased risk of low birth weight, along with a possible exposure-response trend. However, two of the three studies had core limitations: Toledano et al. (2005) had insufficient control for confounding factors in their analysis and Iszatt et al. (2014) was an ecologic study. The majority of studies, including three of four studies with integrated exposure assessment, did not find any clear association. Given the methodological limitations of most of the available studies, and the small number of epidemiologic studies with higher quality exposure assessment, no firm conclusion can be drawn regarding any causal effect of chloroform on low birth weight.
Six studies observed no significant association between chloroform exposure and birth weight when it was considered as a continuous variable (Hoffman et al., 2008a; Kogevinas et al., 2016; Savitz et al., 2005; Villanueva et al., 2011; Wennborg et al., 2000), whereas seven studies found some significant associations of chloroform exposure with decreases in birth weight (Grazuleviciene et al., 2011; Rivera-Núñez & Wright, 2013; Smith et al., 2016; Summerhayes et al., 2012; Wright et al., 2004; Zhou et al., 2010). Most of these studies assumed a linear relationship between chloroform exposure and birth weight; assuming this to be the correct form of the relationship, a significant association implies an exposure-response trend.
Of the 13 studies, 6 had integrated exposure measures (Table 8). Of these, four studies showed no significant association (Kogevinas et al., 2016; Savitz et al., 2005; Villanueva et al., 2011) and two showed a significant inverse association between chloroform exposure and birth weight (Grazuleviciene et al., 2011; Smith et al., 2016). Grazuleviciene et al. (2011) found decrements of slightly more than 50 g in birth weight per 0.1 μg/d increase in chloroform exposure, whereas Smith et al. (2016) showed a decrement in birth weight of 26.5 g in association with an increase in chloroform exposure from tertile one (<0.91 μg/d) to tertile two (0.91-1.56 μg/d). Other studies with less rigorous exposure assessment generally showed relatively small decrements (3.4-19 g) in birth weight when comparing the higher to lower categories of chloroform exposure; these decreases would likely be considered clinically insignificant. The larger decrement detected by Grazuleviciene et al. (2011) thus appears to be an outlier in terms of the magnitude of the association between chloroform exposure and birth weight. Further, compared with other integrated studies, Grazuleviciene et al. (2011) had a lower average chloroform exposure level (mean = 7.8 μg/L; SD = 10.2).
Considering the epidemiologic evidence on chloroform exposure and continuous birth weight, the majority of studies with weaker exposure assessment methods suggest an inverse association, but the majority of studies with higher quality exposure assessments do not. In light of the reliance on poor-quality ecological exposure estimation in most of the available studies, as well as the paucity of studies with more rigorous integrated exposure assessment, no reliable conclusion can be reached on the potential causal relationship between chloroform and birth weight.

Infant growth: Postnatal weight gain
A single study by Botton et al. (2015) examined the relationship of chloroform exposure with differences in postnatal weight gain during the first six months of life. The study showed no significant association of chloroform exposure, whether estimated based only on ingestion or based on integrated exposure data, with postnatal weight gain. Thus, as the only available study, it does not provide evidence in favor of a causal relationship between chloroform exposure and reduced postnatal weight gain.

Growth: Small for gestational age
Small for gestational age is defined as an infant below the tenth percentile of weight for gestational age at birth, specific to the population into which the child is born. The United States, Canada, and some other countries have population-specific designations for the tenth percentile cut-off. These designations can be further specified by the infant's sex and race, and potentially by the parity of the mother.
More than half (17) of the epidemiologic studies examined the association between chloroform exposure and risk of small for gestational age, including either all infants collectively or only those born at term (Table 7). These studies represent 13 separate populations, with 43 associations estimated across different levels and periods of exposure. Thirteen of the 17 studies (37 associations) observed no clear increased or decreased risk of small for gestational age infants with increasing chloroform exposure (Costet et al., 2012; Danileviciute et al., 2012; Grazuleviciene et al., 2011; Hinckley et al., 2005; Hoffman et al., 2008a; Infante-Rivard, 2004; Kogevinas et al., 2016; Levallois et al., 2012; Levallois et al., 2016; Rivera-Núñez & Wright, 2013; Savitz et al., 2005; Villanueva et al., 2011). The other four studies (six associations) found a significant increase in risk of small for gestational age with increased exposure to chloroform during pregnancy (Kramer et al., 1992; Porter et al., 2005; Summerhayes et al., 2012; Wright et al., 2004). Wright et al. (2004) showed a possible positive exposure-response trend across three categories of exposure (<50th percentile, 50th-90th percentile, >90th percentile). Mothers with chloroform exposure above the 90th percentile had 1.11 times the odds of having an infant born small for gestational age compared to mothers with chloroform exposure below the median value (95% CI = 1.04, 1.17) (Wright et al., 2004). The evidence observed in Kramer et al. (1992) (OR = 1.8 comparing highest to lowest categories of chloroform exposure; 95% CI = 1.1, 2.9) and Summerhayes et al. (2012) (RR = 1.03 for a 25 μg/L increase in chloroform water concentration; 95% CI = 1.02, 1.05) is also consistent with a positive exposure-response trend, whereas that from Porter et al. (2005) is not (OR = 1.24 comparing 2nd to 1st quintile; 95% CI = 1.02, 1.50). It should be noted that Kramer et al. (1992) used a slightly different small for gestational age definition (below the 5th instead of 10th percentile weight for gestational age). The use of this more extreme category for the outcome would result in a larger odds ratio than expected for the standard definition if the modeled relationship were a positive monotonically increasing trend between chloroform exposure and the risk of small for gestational age. Summerhayes et al. (2012) also reported a relatively large relative risk, but this was due in part to the fact that they used a relatively large incremental increase in chloroform exposure (25 μg/L) as the unit for analysis, compared with other studies that examined chloroform as a continuous variable (typically 10 μg/L).
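To compare estimates reported per different exposure increments, a ratio can be re-expressed on a common increment. The sketch below assumes a log-linear exposure-response model, which is the usual form when exposure enters a regression as a continuous term but is an assumption here rather than something confirmed for the cited study. The helper `rescale_ratio` is a hypothetical name used only for illustration.

```python
import math

def rescale_ratio(ratio, from_increment, to_increment):
    """Re-express a per-increment ratio for a different exposure increment.

    Assumes a log-linear exposure-response model, so log(ratio) scales
    in proportion to the size of the increment.
    """
    return math.exp(math.log(ratio) * to_increment / from_increment)

# Summerhayes et al. (2012): RR = 1.03 (95% CI = 1.02, 1.05) per 25 ug/L
rr_per_10 = rescale_ratio(1.03, 25, 10)
ci_per_10 = (rescale_ratio(1.02, 25, 10), rescale_ratio(1.05, 25, 10))

print(f"RR per 10 ug/L ~ {rr_per_10:.3f}")
print(f"95% CI ~ ({ci_per_10[0]:.3f}, {ci_per_10[1]:.3f})")
```

Rescaling changes only the magnitude of the estimate, not its statistical significance: the lower confidence limit stays above 1 on either increment.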
Not a single study of small for gestational age that used an integrated approach for assessing chloroform exposure (of which there were nine) found a statistically significant association between chloroform exposure and risk of small for gestational age (Costet et al., 2012; Danileviciute et al., 2012; Grazuleviciene et al., 2011; Kogevinas et al., 2016; Levallois et al., 2012; Levallois et al., 2016; Savitz et al., 2005; Villanueva et al., 2011). In the absence of significant associations among studies with methodologically rigorous chloroform exposure assessment, combined with the inconsistency of observed associations across all available studies, the preponderance of the evidence does not establish a clear association between chloroform exposure during pregnancy and small for gestational age infants.

U.S.-based studies
For understanding the impact of disinfection by-products such as chloroform in the U.S. water supply, it is important to take transportability of study results into consideration by giving particular attention to studies conducted in the United States (Table 9). Of the 30 studies included in this systematic evaluation, 11 were conducted in the United States: 5 in Massachusetts, 1 in Arizona, 1 in Iowa, 1 in Maryland, 1 in Texas, and 2 in three U.S. communities of North Carolina, Tennessee, and Texas; none were conducted in California.
Two of the 11 U.S. studies examined congenital anomalies (Brender et al., 2014;Wright et al., 2017), and neither study integrated information on individual behaviors for more accurate exposure assessment estimates. Most models yielded no significant association with any type of birth defect, and no exposure-response relation was seen between chloroform exposure and the risk of spina bifida, which was significantly elevated only in the second versus first quartile of exposure.
Preterm birth and gestational age at birth were inconsistently associated with chloroform exposure in U.S. studies. Wright et al. (2004), Savitz et al. (2005), and Lewis et al. (2007) all observed significant inverse associations of gestational age with chloroform exposure, including one study (Savitz et al., 2005) that used integrated exposure assessment. The one significant positive association with preterm birth in a U.S. study showed no positive exposure-response trend (Rivera-Núñez & Wright, 2013). Seven U.S. studies considered small for gestational age as an outcome in association with chloroform exposure. Four studies found no significant association between chloroform exposure and small for gestational age births. Wright et al. (2004) found a significant positive association between median chloroform concentration in water and the odds of small for gestational age, with some evidence of a positive exposure-response gradient. Kramer et al. (1992) and Porter et al. (2005) also observed significant positive associations, but no clear exposure-response trends. Both of these studies were considered to be lower quality based on the use of ecologic exposure assessments.
Significant decreases in mean birth weight were observed in association with chloroform exposure in three U.S. studies (Lewis et al., 2006;Rivera-Núñez & Wright, 2013;Wright et al., 2004). However, all of these studies had limited exposure assessment that lacked individual-level behavioral data.
Overall, U.S.-based studies indicate inconsistent associations of chloroform exposure at levels typically detected in various U.S. communities with the risks of adverse developmental outcomes. The one U.S.-based study with higher quality exposure assessment using individual-level behavioral data (Savitz et al., 2005) found a significant inverse (protective) effect of chloroform exposure during the third trimester on gestational age at birth. Thus, U.S. studies with high-quality exposure assessment are sparse, revealing a gap in the scientific database; nevertheless, the available U.S. studies show no consistent relationship between exposure to chloroform and adverse developmental outcomes in humans.

| CONCLUSIONS
The purpose of this analysis is to conduct a systematic assessment for persons of reproductive potential (population of interest) exposed to chloroform via either ingestion or inhalation (exposures of concern), comparing those highly exposed to those either not exposed or minimally exposed (comparators), for developmental and/or reproductive toxicity (outcomes). Thus, we conducted an assessment of the animal and epidemiologic data for chloroform to determine if the available weight of evidence showed that chloroform exposure causes developmental and/or reproductive toxicity. An initial scoping exercise was undertaken, the results of which identified developmental toxicity as the primary area of concern for chloroform. Both the animal and epidemiology data for male and female reproductive endpoints were rather limited, but generally equivocal or negative for potential effects in association with exposure. The one exception is the finding of total litter loss, which is discussed in more detail below. This conclusion, that male and female reproductive toxicity are not areas of significant concern due to chloroform exposure, is consistent with that of OEHHA (2016c); therefore, the bulk of our analysis is focused on the potential for developmental toxicity in association with chloroform exposure.
Schwetz et al. (1974) reported malformations related to caudal dysplasia at 100 ppm, but these findings were not confirmed in the other rat inhalation studies of chloroform at similar exposures (Baeder & Hofmann, 1988, 1991). A single animal inhalation study (Murray et al., 1979) reported an increased incidence of cleft palate in mice exposed during late gestation to 100 ppm chloroform in air. This finding was not replicated in the oral reproductive mouse study by , which included a teratogenic screen and used a higher oral dosage of chloroform than the inhalation dosage administered in Murray et al. (1979). Cleft palate also was not observed with chloroform exposure in rats. Further, the corresponding epidemiologic evidence does not demonstrate an association between chloroform exposure at environmentally relevant exposures and the occurrence of birth defects in humans (Brender et al., 2014; Dodds & King, 2001). A single epidemiologic study (Brender et al., 2014) suggested possible associations of chloroform exposure with septal heart defects and/or spina bifida/neural tube defects more broadly in humans. However, these findings did not demonstrate clear exposure-response trends and were not supported by the other epidemiologic studies evaluated, including those that used integrated exposure measures of chloroform.
The animal data showed consistent evidence of reduced fetal body weight (along with expected delays in fetal bone ossification) due to chloroform exposure. These findings were observed at the same doses at which indications of maternal toxicity (i.e., reduced food consumption, body weight, and/or body weight gains) were seen or at doses above which maternal toxicity was first observed. Fetal weight reductions and delays in ossification have been shown in other studies to be secondary to reduced maternal food consumption (Nitzsche, 2017). EPA (2001) considered the fetal effects observed after chloroform exposure to be secondary to maternal toxicity. In contrast, OEHHA's Developmental and Reproductive Toxicant Identification Committee (DARTIC) considered the fetal body weight reductions to be independent of maternal toxicity (OEHHA, 2016c).
In contrast to the relatively consistent nature of the animal data, the epidemiology data are generally equivocal on the issue of fetal growth. Multiple studies have evaluated parameters related to fetal growth, including low birth weight or weight at birth evaluated as a continuous variable. Approximately half of the epidemiology studies herein reviewed reported no association of chloroform exposure with birth weight. However, the other half of the studies suggested a possible inverse relationship between increasing chloroform exposure and birth weight; no association of chloroform exposure with increased birth weight was reported. The equivocal nature of the epidemiologic data on birth weight was also evident when the analysis was limited to those studies that employed integrated measures of chloroform exposure. Related epidemiologic data are available on the risk of small for gestational age. These data are less variable than those on birth weight: 13 of the 17 studies that examined this parameter showed no significant relationship between chloroform exposure and small for gestational age. The less-equivocal nature of these data may be due to the more specific definition of fetal growth. The remaining four studies found an increase in risk of small for gestational age with greater chloroform exposure, but the results were only marginally statistically significant, and some of these studies tested numerous hypotheses. Based on the equivocal results overall, including the lack of significant findings in the higher quality studies with integrated assessment of chloroform exposure, a causal relationship cannot be established between chloroform exposure and fetal growth.
The animal data also showed indications of very early (peri-implantation) total litter losses related to high-dose chloroform exposures. This finding was only observed at exposures well above those at which indications of maternal toxicity were seen. Further, because chloroform affects vascular function and blood flow in the uterus (which is important to the process of implantation) and the outcome affected the complete litter rather than individual fetuses, we believe that this effect was mediated on the maternal animal rather than directly on the offspring. In OEHHA's decision to classify chloroform as a developmental toxicant (OEHHA, 2016c), the DARTIC discussed this finding of early litter losses in the animal studies. They considered that the finding could be indicative of either a female reproductive effect or mediated at the level of the developing offspring. For the purposes of classification, however, they appear to have categorized the finding as a developmental effect.
Finally, the epidemiologic data for chloroform suggest no or even an inverse association of exposure with preterm birth. These findings were also noted in OEHHA's evaluation of the data (OEHHA, 2016c). The available animal data did not support the presence of an association of chloroform exposure with early birth; however, the available data to assess this endpoint were limited to only two reproductive toxicity studies (NTP, 1988). Further, the biological plausibility of a protective effect is not established, but could relate to cleaner drinking water sources with reduced bacterial or viral contamination.
In general, human epidemiology studies and animal toxicology studies have complementary strengths and limitations for assessing the reproductive toxicology of chloroform. The main strength of epidemiologic studies is the evaluation of human health outcomes in relation to real-world exposure levels and scenarios in a range of settings, making these studies directly relevant to public health. Epidemiologic studies of birth outcomes can take advantage of routinely collected objectively measured birth data that are available for nearly all newborns. Limitations include imprecise and probably inaccurate exposure assessment, especially in studies without integrated exposure assessment; confounding and other forms of bias, which can be present in all observational studies to a varying extent; and a paucity of information on reproductive health outcomes at the highest levels of human exposure such as those seen in occupational settings.
Strengths of animal studies include precise control over exposure doses and circumstances, and the potential for minimization of confounding and other forms of bias through randomization, allocation concealment, and blind outcome assessment. However, limitations include uncertainties about interspecies extrapolation of results to humans, possible evaluation of doses and exposure routes that are not relevant to most or all humans, and typically small sample sizes. For example, in some studies, the laboratory animals were exposed to air concentrations as low as 3 ppm, whereas people are exposed to air concentrations that are 60,000- to 150,000-fold lower (0.02-0.05 ppb; ATSDR, 1997). In the oral dosing studies, the highest dose administered was 400 mg/kg/day. By comparison, water concentrations have been measured at 2-44 ppb (ATSDR, 1997). Assuming 2 L of water consumed per day and a default body weight of 70 kg, these concentrations translate to human chloroform exposures of <2 μg/kg/day (200,000-fold lower) due to water consumption.
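The exposure comparisons above can be reproduced with simple arithmetic. The sketch below uses the ATSDR (1997) ranges of 0.02 to 0.05 ppb in air and 2 to 44 ppb in treated drinking water, and it assumes that 1 ppb in water is equivalent to 1 μg/L (a reasonable approximation for dilute aqueous solutions, stated here as an assumption). The function names are hypothetical and serve only to illustrate the calculation.

```python
def fold_difference(animal_level, human_level):
    """Ratio of an animal test level to a human exposure level (same units)."""
    return animal_level / human_level

def water_dose_ug_per_kg_day(conc_ug_per_l, intake_l_per_day=2.0, body_weight_kg=70.0):
    """Daily chloroform dose from drinking water, in ug/kg/day."""
    return conc_ug_per_l * intake_l_per_day / body_weight_kg

# Inhalation: lowest animal concentration (3 ppm = 3,000 ppb) versus ambient air
animal_air_ppb = 3 * 1000
air_fold_low = fold_difference(animal_air_ppb, 0.05)
air_fold_high = fold_difference(animal_air_ppb, 0.02)

# Oral: treated water at 2-44 ppb, taken as 2-44 ug/L (assumption)
dose_low = water_dose_ug_per_kg_day(2)
dose_high = water_dose_ug_per_kg_day(44)

# Highest animal dose (400 mg/kg/day = 400,000 ug/kg/day) versus the 2 ug/kg/day bound
oral_fold = fold_difference(400 * 1000, 2.0)

print(f"air: {air_fold_low:,.0f}- to {air_fold_high:,.0f}-fold lower")
print(f"water dose: {dose_low:.3f} to {dose_high:.2f} ug/kg/day")
print(f"oral: at least {oral_fold:,.0f}-fold lower")
```

Because the upper end of the drinking water range gives a dose below 2 μg/kg/day, the 200,000-fold figure is a conservative lower bound on the gap between the highest animal dose and typical human intake from water.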
In conclusion, based on our detailed analysis of the animal and epidemiology data associated with developmental chloroform exposure, we conclude that chloroform, if teratogenic, is only so at high maternally toxic doses. At levels sufficient to cause maternal toxicity chloroform may also cause decrements in fetal weights and associated delays in ossification. The available animal data related to this outcome are relatively consistent; the related epidemiologic data, in contrast, are generally equivocal, preventing any firm conclusions that such effects of chloroform exposure are seen in humans. The animal data also show evidence of early (peri-implantation) total litter losses, which we consider indicative of an effect that is mediated on the maternal animal rather than directly on the offspring. Based on this analysis, we find the available animal data demonstrate that chloroform perturbs development only at doses that typically cause maternal toxicity. Further, the available data in humans do not demonstrate a causal role for chloroform on the induction of adverse developmental effects at concentrations experienced generally in the environment.
conflicts of interest to declare related to the analysis of chloroform and developmental outcomes, as presented herein. The authors are solely responsible for the analyses and preparation of this article; the opinions and conclusions are those of the authors and are not those of their employer or the sponsor.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of the article.