Derivation of a benchmark for freshwater ionic strength

Authors


Abstract

Because increased ionic strength has caused deleterious ecological changes in freshwater streams, thresholds for effects are needed to inform resource-management decisions. In particular, effluents from surface coal mining raise the ionic strength of receiving streams. The authors developed an aquatic life benchmark for specific conductance as a measure of ionic strength that is expected to prevent the local extirpation of 95% of species from neutral to alkaline waters containing a mixture of dissolved ions in which the mass of SO₄²⁻ + HCO₃⁻ ≥ Cl⁻. Extirpation concentrations of specific conductance were estimated from the presence and absence of benthic invertebrate genera from 2,210 stream samples in West Virginia. The extirpation concentration is the 95th percentile of the distribution of the probability of occurrence of a genus with respect to specific conductance. In a region with a background of 116 µS/cm, the 5th percentile of the species sensitivity distribution of extirpation concentrations for 163 genera is 300 µS/cm. Because the benchmark is not protective of all genera and protects against extirpation rather than reduction in abundance, this level may not fully protect sensitive species or higher-quality, exceptional waters. Environ. Toxicol. Chem. 2013;32:263–271. © 2012 SETAC

INTRODUCTION

Ionic strength is a key physiological determinant of the distribution of aquatic organisms. In most studies of the physiological adaptation of organisms to different concentrations of dissolved ions, Na⁺ and Cl⁻ are the predominant environmental ions 1–3. However, the constituents can be quite different when land disturbance increases ionic strength 4. Surface coal mining involves blasting and crushing the surface layers of sandstone, shale, limestone, and dolomite. The surface runoff and leachate from the crushed rock are neutral to mildly alkaline but contain much higher levels of HCO₃⁻/CO₃²⁻, SO₄²⁻, Cl⁻, Ca²⁺, and Mg²⁺ than occur in undisturbed stream systems 5, 6. This effluent is of particular concern because the amount of dissolved ions entering streams below surface coal-mining operations can be very high and the areal extent of mining may have exceeded the assimilative capacity of streams and entire drainage basins 5, 7–10. A protective benchmark is needed to inform decision making because there is currently no regulatory criterion to protect aquatic life from ionic stress.

We chose to develop the benchmark using a field-based method because all life stages are exposed and sensitive taxa are adequately sampled in the field, whereas they have not been tested in the laboratory. Furthermore, the mixture of dissolved ions addressed in this case presents a particular challenge for testing. It contains a large proportion of HCO₃⁻, which at times is at saturation levels and interacts with other ions in the mixture affected by atmospheric, hydrological, geological, and biological processes. These processes cannot be faithfully replicated in the laboratory. Furthermore, because many genera are absent at or near HCO₃⁻ saturation, simulation of exposure with this ionic matrix may be difficult in the laboratory. Finally, the organisms that are most sensitive to the ionic mixture, in particular the Ephemeropterans, are not available as cultured animals for toxicity tests; their sensitive life stages are unknown, and life-cycle effects are suspected.

The present study demonstrates the use of field data to develop a protective benchmark for ionic strength using a method developed by Cormier and Suter 11. This method is adapted from the U.S. Environmental Protection Agency's (U.S. EPA's) standard method for deriving water–quality criteria 12. Because field data are used, the many decisions that influence the final data set are explained and the data set is characterized with respect to background exposure levels and to the composition of the ionic matrix. This benchmark assessment also illustrates ways to evaluate uncertainty and validates the benchmark with an independent data set.

Because field data are used, analyses that are not usually performed with a laboratory method are needed to ensure that the benchmark is reasonable and valid. These include an assessment to determine that the observed association between this specific mixture of dissolved ions and the absence of benthic invertebrates is indeed causal and not confounded. These methods and detailed assessments of causation and confounding are described separately 13, 14.

APPROACH

Measure of exposure

The mixture of ions measured in West Virginia streams contains Ca²⁺, Mg²⁺, SO₄²⁻, Cl⁻, and HCO₃⁻ at a circumneutral to alkaline pH (Table 1). Because the toxicity is related to the ionic mixture and not to a single ion 13, a measure of ionic strength was selected as the measure of exposure, rather than measures of individual ions. For freshwaters, there are several methods for measuring ionic concentration 4. With practical use of the benchmark in mind, specific conductance, hereafter referred to as conductivity, was selected as the exposure measurement of ionic strength for the following reasons: (1) it measures all ions; (2) the technology has become fast, inexpensive, accurate, precise, and reliable; (3) it can provide continuous monitoring records with deployed systems; (4) it is less influenced by other nonfilterable material such as oils and carbohydrates that may be dissolved in water; and (5) many monitoring programs routinely include a conductivity measurement.

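Table 1. Summary statistics of the measured water-quality parametersᵃ

| Parameter | Units | Min | 25th percentile | Median | 75th percentile | Max | Mean | Valid n |
|---|---|---|---|---|---|---|---|---|
| Conductivity | µS/cm | 15.4 | 146 | 261 | 563 | 11,646 | 281.5 | 2,210 |
| Hardness | mg/L | 0.5 | 50.2 | 91.1 | 188 | 1,492 | 97.1 | 1,148 |
| Alkalinity | mg/L | 0.2 | 30.5 | 66.7 | 117 | 560 | 55 | 1,425 |
| SO₄²⁻ | mg/L | 1 | 17 | 37 | 159 | 6,000 | 51.6 | 1,428 |
| Cl⁻ | mg/L | 1 | 3 | 5.2 | 11.95 | 1,153 | 6.5 | 1,118 |
| Ca, total | mg/L | 0.002 | 13.6 | 25.1 | 49.2 | 430 | 25.5 | 1,154 |
| Mg, total | mg/L | 0.05 | 3.7 | 6.3 | 14 | 204 | 7.3 | 1,150 |
| TSS | mg/L | 1 | 3 | 4 | 6 | 190 | 4.3 | 1,442 |
| Fe, total | mg/L | 0.005 | 0.123 | 0.26 | 0.51 | 100 | 0.26 | 1,433 |
| NO2–NO3 | mg/L | 0.01 | 0.1 | 0.2 | 0.37 | 30 | 0.20 | 1,178 |
| Al, total | mg/L | 0.01 | 0.09 | 0.11 | 0.23 | 12 | 0.15 | 1,436 |
| Al, dissolved | mg/L | 0.01 | 0.02 | 0.05 | 0.06 | 0.93 | 0.04 | 1,287 |
| Fe, dissolved | mg/L | 0.001 | 0.02 | 0.042 | 0.06 | 11.8 | 0.05 | 1,259 |
| Mn, total | mg/L | 0.003 | 0.02 | 0.04 | 0.1 | 7.25 | 0.05 | 1,430 |
| Mn, dissolved | mg/L | 0.01 | 0.03 | 0.07 | 0.22 | 1.06 | 0.07 | 20 |
| Total phosphate | mg/L | 0.01 | 0.02 | 0.02 | 0.03 | 2.36 | 0.03 | 1,181 |
| Se, dissolved | mg/L | 0.001 | 0.001 | 0.001 | 0.001 | 1.26 | 0.001 | 313 |
| Se, total | mg/L | 0 | 0.001 | 0.001 | 0.005 | 1.26 | 0.002 | 496 |
| Fecal coliform | Counts/100 mL | 0 | 36 | 170 | 600 | 250,000 | 151 | 2,035 |
| DO | mg/L | 1.02 | 8.2 | 9.2 | 10.3 | 18.35 | 9.3 | 2,182 |
| pH | Standard units | 6.02 | 7.27 | 7.62 | 7.96 | 10.48 | 7.59 | 2,210 |
| Catchment area | km² | 0.173 | 2.311 | 6.965 | 25.836 | 153.014 | 7.644 | 717 |
| Temperature | °C | −0.28 | 15.1 | 18.4 | 21.3 | 31.9 | 17 | 2,210 |
| Habitat | RBP score | 49 | 115 | 130 | 145 | 192 | 127.8 | 2,186 |

ᵃ K⁺ and Na⁺ not measured; all means are geometric means except pH, DO, temperature, and habitat score.

DO = dissolved oxygen; catchment area = delimited from highest elevation to sampling pour point; RBP = rapid bioassessment protocol; TSS = total suspended solids.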

Measure of biological effect and threshold effect levels

Extirpation is the depletion of a population to the point that it is no longer a viable resource or is unlikely to fulfill its function in the ecosystem 15. In the present study, extirpation is operationally defined for a genus as the conductivity value below which 95% of the observations of the genus occur. In other words, the probability is 0.05 that an observation of a genus occurs above its extirpation concentration (XC95). The proportion of extirpated genera was selected as the effect for the benchmark. The laboratory-based method uses 5% of affected genera, so the 5th percentile was also used in this field-based method to identify the hazardous levels of ionic strength (HC05).

Data sets

The Central Appalachia (69) and Western Allegheny Plateau (70) ecoregions (Fig. 1) were selected for development of a benchmark for conductivity because available data were of sufficient quantity and quality and because conductivity has been implicated as a cause of biological impairment in these ecoregions 5, 8, 13, 16, 17. These two regions were judged to be similar in terms of water quality, including resident biota and sources of mineral ions. Confidence in the quality of reference sites in West Virginia was relatively high owing to the extensively forested areas of the region and a well-documented process by which the West Virginia Department of Environmental Protection (WVDEP) assigns reference status. The WVDEP uses a tiered approach. We used only tier 1 when analyses involved the use of reference sites, thus avoiding the use of conductivity as a characteristic of reference condition. Conductivity values from WVDEP's reference sites were low and similar in different months collected over several years (Fig. 2a), providing evidence that they were reasonable reference sites. The 75th percentiles of reference sites were <200 µS/cm in most months. The 25th percentiles from samples from randomly selected sites and from the full data set were <200 µS/cm in most months (Fig. 2b and c). Also, a wide range of conductivity levels was sampled, which is useful for modeling the response of organisms to different levels of ionic strength.

Figure 1.

Points are sampling locations used to develop the benchmark from level III spatial scale for ecoregions 69 (dark gray) and 70 (light gray) in West Virginia.

Figure 2.

(A) Box plot showing seasonal variation of conductivity (µS/cm) in the reference streams of ecoregions 69 and 70 in West Virginia from 1999 to 2006. A total of 97 samples from 70 reference stations were used for this analysis. The 75th percentiles were below 200 µS/cm in all months except June. (B) Box plot showing seasonal variation of conductivity (µS/cm) from a randomly selected sample of streams of ecoregions 69 and 70 in West Virginia from 1997 to 2007. A total of 1,271 samples were used for this analysis. The 25th percentiles were below 200 µS/cm (horizontal dashed line) except in the September and October samples. (C) Box plot showing seasonal variation of conductivity (µS/cm) from the data set used to develop the benchmark. A total of 2,210 samples from 2000 to 2007 from ecoregions 69 and 70 in West Virginia are represented. The 25th percentiles were less than 200 µS/cm except in the August and November (n = 2) samples. The wide range of conductivities allows the 95th percentile extirpation concentration to be well characterized.

All data used for benchmark derivation were taken from the WVDEP's in-house water analysis database (WABbase) from 1999 to 2007. The WABbase contains data from the level III spatial scale for ecoregions 66, 67, 69, and 70 in West Virginia 18, 19. In this assessment, only data from ecoregions 69 and 70 were used (Fig. 1). Chemical, physical, and/or biological samples were collected from 2,542 distinct locations (2,668 samples) during the sampling years 1999–2007. The WVDEP uses a tiered sampling design that collects measurements from long-term monitoring stations, targeted sites within watersheds on a rotating basin schedule, randomly selected sample sites 20, and sites chosen to further define impaired stream segments in support of total maximum daily load development 21. Most sites are sampled once during an annual sampling period, but most total maximum daily load sites are sampled monthly for water-quality parameters. Some targeted sites represent least-disturbed or reference sites that have been selected by a combination of screening values and best professional judgment 22. Water quality, habitat, watershed characteristics, macroinvertebrate data (both raw data and calculated metrics), and supporting information are used by the state to develop U.S. Clean Water Act–mandated reports to the U.S. EPA 21. All sites were in perennial reaches of streams.

The WVDEP collects macroinvertebrates from a 1-m² area of a 100-m reach at each site. When using a 0.5-m-wide rectangular kicknet (595-µm mesh), four 0.25-m² riffle areas are sampled. In narrow or shallow water, nine areas are sampled with a 0.33-m-wide D-frame dipnet of the same mesh size. Composited samples are preserved in 95% denatured ethanol. A random subsample of 200 individuals (± 20%) is identified in the laboratory. All contracted analyses for chemistry and macroinvertebrate identification follow WVDEP's internal quality-control and quality-assurance protocols 23, 24. We judged the quality assurance to be excellent, based on the database itself and supporting documentation.

Multiple biological samples from the same location were not excluded from the data set. Summary statistics for ion concentrations and other parameters for the data set are provided in Table 1. The benchmark applies to waters with a similar composition to those in Table 1. We used a total of 2,210 samples from ecoregions 69 and 70 to determine the conductivity benchmark (Fig. 1 and Table 2). The data set resulted from a larger data set with some sites excluded. We excluded 10 sampling sites that lacked a conductivity measurement. We excluded 295 samples from large rivers (>155 km²) because the sampling methods differed 25. We excluded four sites that had an ionic mixture with more Cl⁻ than SO₄²⁻ + HCO₃⁻ (conductivity > 1,000 µS/cm, SO₄²⁻ < 125 mg/L, and Cl⁻ > 250 mg/L). This ionic mixture is expected to have a different toxicity 11, 13. Because Cl⁻ was not measured at all sites, some sites with a different ionic composition may still occur in the data set.

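Table 2. Number of samples with reported genera and conductivity meeting our acceptance criteria for calculating the benchmark valueᵃ

| Region | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 69 | 8 |  | 41 | 63 | 187 | 103 | 79 | 269 | 232 | 54 | 0 | 6 | 1,006 |
| 70 | 4 | 3 | 34 | 187 | 232 | 179 | 194 | 237 | 120 | 8 | 2 | 4 | 1,204 |
| Total | 12 | 3 | 75 | 250 | 419 | 282 | 273 | 506 | 352 | 62 | 2 | 10 | 2,210 |

ᵃ Presented for each month and ecoregion.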

The effects of low pH were eliminated by excluding 147 sites with pH <6. This prevented confounding of conductivity effects by acid mine drainage 8, 14. An existing freshwater chronic criterion already requires waters to be maintained between pH 6.5 and 9 26. The conductivity benchmark was derived from waters having pH between 6.0 and 10. Thus, the circumneutral range of the data encompasses pH levels that are seldom toxic to freshwater organisms.
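The exclusion rules described in this section can be sketched as a simple filter. This is an illustration only, not the code used in the study: the record field names (`conductivity`, `catchment_km2`, `so4`, `cl`, `ph`) are hypothetical, and keeping a sample whose Cl⁻, SO₄²⁻, catchment area, or pH is missing is an assumption of the sketch.

```python
def keep_sample(rec):
    """Return True if a sample record passes the exclusion rules in the text.

    `rec` is a dict with hypothetical keys; missing values are None.
    """
    cond = rec.get("conductivity")
    if cond is None:                       # no conductivity measurement
        return False
    area = rec.get("catchment_km2")
    if area is not None and area > 155:    # large rivers, sampled differently
        return False
    so4, cl = rec.get("so4"), rec.get("cl")
    if (so4 is not None and cl is not None
            and cond > 1000 and so4 < 125 and cl > 250):
        return False                       # Cl-dominated ionic mixture
    ph = rec.get("ph")
    if ph is not None and ph < 6:          # acidic sites
        return False
    return True


# Hypothetical records for illustration only.
samples = [
    {"conductivity": 261, "catchment_km2": 7.6, "so4": 37, "cl": 5.2, "ph": 7.6},
    {"conductivity": None, "ph": 7.1},
    {"conductivity": 400, "catchment_km2": 300, "ph": 7.4},
    {"conductivity": 1500, "so4": 100, "cl": 400, "ph": 7.8},
    {"conductivity": 900, "ph": 5.2},
]
kept = [r for r in samples if keep_sample(r)]
```

Because Cl⁻ was not measured at all sites, a filter of this kind cannot remove every Cl⁻-dominated sample, which matches the caveat stated in the text.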

A taxon was excluded from calculations if it was not identified to the genus level, and a genus was excluded if it was never observed at reference sites or it was observed in fewer than 25 samples. Invertebrate genera that did not occur at WVDEP tier 1 reference sites represented 11.4% of all genera 27. They were excluded so that the data would be relevant to potentially unimpaired conditions and to reduce the influence of nonnative and opportunistic salt–tolerant organisms. Genera observed at fewer than 25 sampling locations in the composited ecoregions were excluded to ensure reasonable confidence in the evaluation of the relationship between conductivity and the presence or absence of a genus.
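As a minimal sketch of the genus-selection rules above, the following function keeps a genus only if it occurs in at least one reference sample and in at least 25 samples overall. The input layout, a list of `(is_reference, genera)` pairs, and the function name are assumptions made for illustration; taxa not identified to genus are assumed to have been dropped before this step.

```python
from collections import Counter


def select_genera(samples, min_samples=25):
    """Return the sorted genera that meet both inclusion rules.

    samples: iterable of (is_reference, genera) pairs, where `genera`
    is the collection of genus names recorded in one sample.
    """
    occurrences = Counter()
    seen_at_reference = set()
    for is_reference, genera in samples:
        present = set(genera)              # count each genus once per sample
        occurrences.update(present)
        if is_reference:
            seen_at_reference |= present
    return sorted(g for g, n in occurrences.items()
                  if n >= min_samples and g in seen_at_reference)


# Hypothetical data: Epeorus is common and occurs at a reference site,
# Rarus is too rare, and Tolerans never occurs at a reference site.
demo = ([(True, ["Epeorus", "Rarus"])]
        + [(False, ["Epeorus", "Tolerans"])] * 30)
selected = select_genera(demo)
```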

In the WABbase, 497 benthic invertebrate genera were identified in Ecoregions 69 and 70. Those ecoregions had 308 genera in common. Of these, 220 genera occurred at least once at one of the 70 reference sites in the two ecoregions. Greater than 95% of genera observed at reference sites as defined by the WVDEP occur in both ecoregions 69 and 70. This indicates that the same sensitive genera exist in both ecoregions, which is one of the reasons it was reasonable to combine the two regions for analysis. Of the 220 genera, 163 occurred at 25 or more sampling locations in ecoregions 69 and 70. Of the genera occurring at 25 or more sampling sites, 162 occurred in ecoregion 69 and 163 in ecoregion 70.

We verified the benchmark value using a data set from the coal–producing regions in eastern Kentucky, USA 28. The ionic composition and conductivity range were similar to the West Virginia data set, but the relative number of samples across the range was more uniform. Similar genera were collected in both states. The Kentucky data set represents fewer sites (n = 282); however, 105 genera were identified in at least 25 samples in the Kentucky data set. The actual number of 105 genera is greater than the 59 genera predicted to arise from a similarly sized data set from West Virginia (Fig. 3). This occurs because more invertebrate specimens are identified from each Kentucky sample (all specimens identified per sample) than in the WVDEP protocol (200 specimens identified per sample).

Figure 3.

Adequacy of the number of samples used to model the 5th percentile hazardous concentration (HC05) based on the West Virginia data set. As sample size increases, the number of genera included in the species sensitivity distribution increases (triangles). The HC05 stabilizes, reaching an asymptote at approximately 800 sites sampled (circles) and 120 genera evaluated. The 95% confidence intervals are indicated by vertical bars.

METHODS

The approach used to derive the benchmark 11 is based on an adaptation of the standard method for U.S. EPA's published Section 304(a) Ambient Water–Quality Criteria 12. We used the statistical package R, Version 2.12.1 (December 2010), for all statistical analyses 29.

The calculation of HC05 involved four steps: First, the relationship between conductivity and the probability of observing each genus was modeled using a weighted cumulative distribution function. Second, the XC95 conductivity value for each genus was identified as the 95th percentile using two-point interpolation. Third, the XC95 values for all genera were ordered from lowest to highest conductivity value. Fourth, the HC05 was determined as the 5th percentile of the distribution of genera.

RESULTS

Calculating extirpation concentrations

Observed conductivity values were nonuniformly distributed across a range of possible values 27, and therefore, we were more likely to observe a genus at certain conductivity values simply because more samples were collected at those values. To correct for the uneven sampling frequency, a weighted cumulative distribution function was used to estimate the XC95 values for each genus. Each XC95 value was estimated from the cumulative distribution of probabilities of observing a genus at a site with respect to the concurrently measured conductivity at that site. An example of a weighted cumulative distribution function is shown in Figure 4 for the mayfly Epeorus.
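The weighting idea can be illustrated with a short sketch. The text does not specify the weighting scheme in detail, so the version below makes assumptions: conductivity is divided into 60 equal-width bins on a log10 scale, each sample is weighted by the inverse of the number of samples in its bin, and the XC95 is found by two-point interpolation on the weighted cumulative distribution of samples in which the genus was present. The function name and the bin count are hypothetical.

```python
import math
from collections import Counter


def weighted_xc95(conductivity, present, n_bins=60, p=0.95):
    """Weighted p-th percentile of conductivity where a genus is present.

    conductivity: conductivity of every sample (µS/cm, all > 0)
    present: True/False per sample, whether the genus was observed
    """
    log_c = [math.log10(c) for c in conductivity]
    lo, hi = min(log_c), max(log_c)
    width = (hi - lo) / n_bins or 1.0      # guard against identical values
    bins = [min(int((x - lo) / width), n_bins - 1) for x in log_c]
    counts = Counter(bins)                 # samples per conductivity bin

    # Weight each occurrence by 1 / (number of samples in its bin).
    obs = sorted((c, 1.0 / counts[b])
                 for c, b, seen in zip(conductivity, bins, present) if seen)
    if not obs:
        raise ValueError("genus was never observed")
    total = sum(w for _, w in obs)

    cum, prev_c, prev_f = 0.0, obs[0][0], 0.0
    for c, w in obs:
        cum += w
        f = cum / total                    # weighted cumulative proportion
        if f >= p:
            if c == prev_c:
                return c
            # Two-point interpolation between the bracketing observations.
            return prev_c + (p - prev_f) * (c - prev_c) / (f - prev_f)
        prev_c, prev_f = c, f
    return obs[-1][0]


# Hypothetical example: 20 evenly spaced samples, genus present up to 1,000.
cond = [100 * k for k in range(1, 21)]
seen = [c <= 1000 for c in cond]
xc95 = weighted_xc95(cond, seen)
```

With evenly spread samples every weight is equal, so the example reduces to an ordinary interpolated percentile; the weights matter only when some conductivity ranges are sampled more heavily than others.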

Figure 4.

Examples of weighted cumulative distribution functions and the associated 95th percentile extirpation concentration (XC95) values. The step function shows weighted proportion of samples for (A) Epeorus and (B) Nigronia present at or below the indicated conductivity value (µS/cm). Horizontal dashed line indicates the point of extirpation where F(x) = 0.95 intersects the cumulative distribution functions. Vertical dashed line indicates the XC95 conductivity value on the x axis. (A) Genera that are affected by increasing conductivity (e.g., Epeorus) show a steep slope, whereas (B) genera unaffected by increasing conductivity (e.g., Nigronia) have a steady increase and do not reach a clear asymptote.

Not all 95th percentiles correspond to extirpation, and some imprecisely estimate the extirpation threshold. To examine the trend of occurrence along the conductivity gradient, we used a nonparametric function (generalized additive model with 3 degrees of freedom) to model the likelihood of a taxon being observed with increasing conductivity (Fig. 5) (27 Appendix E 1–29). Results for individual genera are available from the U.S. EPA (27 Appendix D 2–7). If the generalized additive model mean curve at maximum conductivity was approximately equal to 0 (defined as <1% of the maximum modeled probability), then the XC95 was listed without qualification. If the generalized additive model mean curve at maximum conductivity was >0 but the lower confidence limit approximated 0 (<1% of the maximum mean modeled probability), the value was listed as approximate. If the generalized additive model lower confidence limit was >0, then the XC95 was listed as greater than the 95th percentile. For example, the XC95 for Cheumatopsyche (an extremely salt-tolerant genus) is >9,180 µS/cm (Fig. 5c). We also visually inspected all model fits and the scatter of points for anomalies, and if the model poorly fit the data, the uncertainty level was increased to either the approximately or the greater than designation. A list of XC95 results for individual genera is available in the Supplemental Data. The values designated as approximately and greater than do not affect the HC05 because most cases occur well above the 5th percentile; but the qualified values indicate the uncertainty of some XC95 values for other uses such as comparison with toxicity test results or with results from other geographic regions 13.
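The three-way qualification rule can be written as a small decision function. This sketch assumes the generalized additive model has already been fitted elsewhere and that its mean and lower confidence limit at the maximum observed conductivity are supplied as numbers; the function name, argument names, and the returned labels are hypothetical. It does not cover the visual inspection step, which was a matter of judgment.

```python
def qualify_xc95(mean_at_max, lower_at_max, max_mean, tol=0.01):
    """Classify an XC95 from fitted model values at maximum conductivity.

    mean_at_max:  modeled mean probability at the maximum conductivity
    lower_at_max: lower confidence limit at the maximum conductivity
    max_mean:     maximum modeled mean probability over the gradient
    tol:          fraction of max_mean treated as "approximately 0" (1%)
    """
    threshold = tol * max_mean
    if mean_at_max < threshold:
        return "unqualified"      # genus effectively absent at the high end
    if lower_at_max < threshold:
        return "approximate"      # mean above 0, lower limit near 0
    return "greater than"         # extirpation not reached in measured range


# Hypothetical fitted values for three response shapes.
declining = qualify_xc95(mean_at_max=0.001, lower_at_max=0.0, max_mean=0.6)
uncertain = qualify_xc95(mean_at_max=0.05, lower_at_max=0.002, max_mean=0.6)
tolerant = qualify_xc95(mean_at_max=0.7, lower_at_max=0.5, max_mean=0.7)
```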

Figure 5.

Three typical distributions of observation probabilities. Open circles are the probabilities of observing the genus within a range of conductivities. Circles at zero probability indicate no individual at any site was found at these conductivities. Solid line is the mean smoothing spline fitted to the probabilities. Vertical dashed line indicates the 95th percentile extirpation concentration (XC95) from the weighted cumulative distribution. Genera respond differently to increasing ionic strength: (A) Epeorus declines, (B) Nixe has an optimum, and (C) Cheumatopsyche increases. The XC95 for genera like Cheumatopsyche is reported as “greater than” because extirpation did not occur in the measured range.

Calculating the HC05 and benchmark

The exposure–response model is a species sensitivity distribution (SSD) that characterizes the proportion of genera that are extirpated with increasing conductivity. This relationship can be plotted as a cumulative distribution plot of XC95 values for each genus relative to conductivity (Fig. 6).

Figure 6.

Species sensitivity distribution (SSD). Each point is a 95th percentile extirpation concentration value for a genus (total 163 genera). The 5th percentile hazardous concentration (HC05; 295 µS/cm) is the conductivity at the intercept of the SSD with the horizontal line at the 5th percentile.

The HC05 is the conductivity at which 5% of genera are extirpated. The cumulative proportion for each genus P is calculated as P = R ÷ (N + 1), where R is the rank of the genus's XC95 value and N is the number of genera. The HC05 was derived using a two-point interpolation to estimate the centile between the XC95 values bracketing P = 0.05 (i.e., the 5th percentile of modeled genera). The benchmark of 300 µS/cm is obtained by rounding the HC05 to two significant figures 12.
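The calculation just described can be sketched as follows, using P = R ÷ (N + 1) and linear interpolation between the two XC95 values that bracket P = 0.05. The function names are hypothetical and the input values in the example are synthetic, not the study's XC95 values. With N = 163 genera, P = 0.05 corresponds to rank 0.05 × 164 = 8.2, so the HC05 lies one fifth of the way from the 8th to the 9th lowest XC95.

```python
import math


def hc05(xc95_values, p=0.05):
    """Interpolated p-th centile of XC95 values using P = R / (N + 1)."""
    xs = sorted(xc95_values)
    n = len(xs)
    props = [rank / (n + 1) for rank in range(1, n + 1)]
    for i in range(1, n):
        if props[i - 1] <= p <= props[i]:
            frac = (p - props[i - 1]) / (props[i] - props[i - 1])
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])
    raise ValueError("p is outside the range of cumulative proportions")


def round_sig(x, sig=2):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    digits = sig - 1 - math.floor(math.log10(abs(x)))
    return round(x, digits)


# Synthetic XC95 values for 163 genera: 100, 200, ..., 16,300 µS/cm.
synthetic = [100.0 * k for k in range(1, 164)]
example_hc05 = hc05(synthetic)      # between the 8th and 9th values
benchmark = round_sig(295.0)        # the reported HC05 of 295 µS/cm
```

Note that Python's built-in `round` resolves exact ties toward the even digit; a different tie rule could be substituted if a rounding convention requires it.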

Confidence bounds

Because the XC95 values were estimated from field data and then the HC05 values were derived from those XC95 values, we used a method that generated distributions and confidence bounds in the first step and propagated the statistical uncertainty of the first step through the second step 14.

Bootstrap estimates of the XC95 were made for each genus used in the derivation of the benchmark by resampling 2,210 times (the number of observations in the data set) with replacement 14. From each bootstrap sample, the XC95 was calculated for each genus by the same method applied to the original data. That process was repeated 1,000 times to create a distribution of XC95 values for each genus. These distributions were used to calculate a two–tailed 95% confidence interval on the XC95 for each genus 11, 27.

Uncertainty in the HC05 value was evaluated by generating an HC05 from each of the 1,000 sets of bootstrapped XC95 estimates. The distribution of 1,000 HC05 values was used to generate two–tailed 95% confidence bounds on these bootstrap–derived values. The estimated two–tailed 95% lower confidence bound of the HC05 point estimate is 228 µS/cm and the upper bound is 303 µS/cm. (See Figure 5 in 11 for a graphed illustration.)
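The two-step bootstrap can be outlined in a few lines. This is a simplified sketch under stated assumptions, not the study's implementation: it uses an unweighted, linearly interpolated percentile as a stand-in for both the weighted XC95 and the R ÷ (N + 1) centile, it does not apply the 25-occurrence rule within each resample, and the data layout and function names are hypothetical.

```python
import random


def percentile(values, p):
    """Linearly interpolated percentile (p in 0..1) of a non-empty list."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def bootstrap_hc05_bounds(samples, n_boot=1000, seed=1):
    """Two-tailed 95% bounds on the HC05 from resampled data.

    samples: list of (conductivity, set of genera present) per sample.
    Each bootstrap replicate resamples len(samples) samples with
    replacement, recomputes an XC95 per genus, then one HC05.
    """
    rng = random.Random(seed)
    n = len(samples)
    hc05_values = []
    for _ in range(n_boot):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        by_genus = {}
        for cond, genera in resample:
            for g in genera:
                by_genus.setdefault(g, []).append(cond)
        xc95 = [percentile(conds, 0.95) for conds in by_genus.values()]
        hc05_values.append(percentile(xc95, 0.05))
    return percentile(hc05_values, 0.025), percentile(hc05_values, 0.975)


# Synthetic data: 200 samples; genus k disappears above 300 + 100*k µS/cm.
data = []
for i in range(200):
    cond = 50 + 10 * i
    genera = {"g%d" % k for k in range(20) if cond <= 300 + 100 * k}
    genera.add("tolerant")
    data.append((cond, genera))
lower, upper = bootstrap_hc05_bounds(data, n_boot=50, seed=1)
```

Because the HC05 is recomputed from each full set of resampled XC95 values, the uncertainty of the first step carries through to the second, which is the point of the procedure described above.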

Defining the region

For the present study, we chose two adjoining regions that have abundant data, >95% of genera in common, and a common dominant source of the stressor of concern. Ecoregions 69 (Central Appalachia) and 70 (Western Allegheny Plateau) in the eastern United States are very similar, including having similar bedrock types; but the relative abundances of rock types differ. The coal-producing subregions of ecoregion 69 are 69a (forested hills and mountains) and 69d (Cumberland Mountains). According to Woods et al. 19, “Ecoregion 69 . . . is a high, dissected, and rugged plateau made up of sandstone, shale, conglomerate, and coal of Pennsylvanian and Mississippian age. The plateau is locally punctuated by a limestone valley (the Greenbrier Karst; subregion 69c) and a few anticlinal ridges.” Ecoregion 70 has more heterogeneous bedrock formations than subregions 69a and 69d. It is underlain by shale, siltstone, limestone, sandstone, and coal, including the interbedded limestone, shale, sandstone, and coal of the Monongahela group and the Pennsylvanian sandstone, shale, and coal of the Conemaugh and Allegheny groups 19.

Individual analyses of ecoregions 69 and 70 result in a somewhat lower HC05 value for ecoregion 69 and a somewhat higher value for 70 (254 µS/cm in ecoregion 69 and 345 µS/cm in ecoregion 70). This difference might be attributed to the background water chemistry, but this did not seem to be the case. If the genera were adapted to high conductivity in ecoregion 70 and low conductivity in 69 or if they were represented by more resistant species in 70 and more sensitive species in 69, it would be expected that the XC95 values would consistently go up in ecoregion 70 and down in ecoregion 69 relative to the values in the combined data set. However, XC95 values go up and down in both ecoregions when they are analyzed individually.

The differences in HC05 values appear to result from random differences in which rarer genera do not meet the minimum sample size of 25 occurrences in a smaller data set. When the data set is split by ecoregion, the SSD model is reduced by 31 genera for ecoregion 69 and 35 genera for ecoregion 70. Furthermore, the two ecoregions had similar genera; and although ecoregion 70 had a slightly higher estimated background, there were sites that had conductivity below 100 µS/cm, suggesting that the truly undisturbed background would be low. Hence, we did not derive benchmarks for individual ecoregions because the evidence did not justify the increase in uncertainty associated with the reduced sample size and number of genera.

Replicate samples

Although most sites in the WABbase were sampled only once, 3.5% of sites were sampled twice and 0.7% more than twice. Inverse weighting sites sampled more than once did not materially change the result (HC05 = 293 µS/cm). Therefore, we have not deleted or differentially weighted the replicate samples. In future applications of this method, however, if there is a potential for bias due to replication of some samples, an appropriate weighting scheme could be applied. It was not necessary in this case.
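One simple form of inverse weighting, shown here only as an assumed illustration of the idea, gives each sample a weight of 1 divided by the number of samples taken at its site, so that every site contributes a total weight of 1 regardless of how often it was visited. The site identifiers and the function name are hypothetical.

```python
from collections import Counter


def replicate_weights(site_ids):
    """Weight each sample by 1 / (number of samples at the same site)."""
    counts = Counter(site_ids)
    return [1.0 / counts[s] for s in site_ids]


# Hypothetical site IDs: A sampled once, B twice, C three times.
sites = ["A", "B", "B", "C", "C", "C"]
weights = replicate_weights(sites)
```

Weights of this kind could be multiplied into the sample weights used for the cumulative distribution function before the XC95 values are recalculated.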

Evaluating adequacy of number of samples

Bootstrapping was also used to evaluate the effect of sample size on the HC05 values and their confidence bounds. Means and confidence bounds on HC05 values were calculated, as described previously, for selected sample sizes ranging from 100 to 2,210 samples (Fig. 3). The HC05 is consistent for SSDs composed of more than 123 genera for this data set using this method. The HC05 values stabilize at approximately 800 to 1,000 samples, suggesting that 800 is a minimum sample size for this data set using this method.

Treatment of potential confounders

Potentially confounding variables for the relationship of conductivity with the extirpation of stream invertebrates were evaluated in several ways (see Suter and Cormier 14, this issue, for a description of evaluation methods). We evaluated habitat, organic enrichment, nitrates and phosphates, deposited sediments, pH, selenium, water temperature, lack of headwaters, catchment area, settling ponds, dissolved oxygen, and metals. These variables do affect species in the region, but their effects do not alter the relationship with conductivity or the benchmark value. The signal from conductivity was strong, and other potential confounders that were not strongly influential could be ignored with reasonable or greater confidence. However, one potential confounder, low pH, was known to cause effects and was controlled by removing sites with pH < 6.

Estimating background conductivity

In general, a benchmark should be greater than natural background. The background conductivities of streams were estimated using the portion of the WABbase that consists of probability-based samples (i.e., samples from locations selected to represent streams within a stream order with equal probability). We selected the 25th percentile of these randomly selected samples to estimate the upper limit of background because disturbed and even impaired sites are included in the sample 30. A total of 1,271 randomly selected samples were collected from ecoregions 69 and 70. The background values were 72 µS/cm for ecoregion 69, 153 µS/cm for ecoregion 70, and 116 µS/cm when samples from ecoregions 69 and 70 were combined (Fig. 2b).

We also estimated the background conductivity using reference sites in the WABbase (Fig. 2a). Sampling locations were among the least disturbed based on the WVDEP's best professional judgment 21, 24. It is conventional to use the 75th percentile of reference sites to estimate background based on precedent and on the collective experience of U.S. EPA field ecologists 30. The 75th percentiles from 43 sites in ecoregion 69 and 27 sites in ecoregion 70 are 66 and 214 µS/cm, respectively. When samples from ecoregions 69 and 70 were combined, the 75th percentile was 150 µS/cm.
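Both background estimates are plain percentiles of conductivity: the 25th percentile of probability-based samples and the 75th percentile of reference samples. The sketch below assumes a linearly interpolated percentile, since the text does not state which percentile definition was used, and the conductivity values in the example are invented for illustration.

```python
def percentile(values, p):
    """Linearly interpolated percentile (p in 0..1) of a non-empty list."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def background_from_random_sites(conductivity):
    """Upper limit of background: 25th percentile of random samples."""
    return percentile(conductivity, 0.25)


def background_from_reference_sites(conductivity):
    """Background estimate: 75th percentile of reference samples."""
    return percentile(conductivity, 0.75)


# Invented conductivity values (µS/cm), not data from the study.
random_sites = [60, 90, 120, 300, 800]
reference_sites = [40, 55, 70, 95, 140]
bg_random = background_from_random_sites(random_sites)
bg_reference = background_from_reference_sites(reference_sites)
```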

Background between ecoregions 69 and 70 appears to be different; however, none of these values exceeds the benchmark of 300 µS/cm. The higher estimates of background conductivity in ecoregion 70 relative to ecoregion 69 may be attributed to the variable occurrence of limestone and limestone–derived soils. The higher level of development and population density in ecoregion 70 may also contribute, but this was not evaluated.

Selection of invertebrate genera

Only genera observed in at least one reference site were included in the SSD. In this particular case, using all genera, including invasive species, would increase the HC05 by <2%. Mussels were not represented because genera did not occur in a minimum of 25 samples, probably owing to the WVDEP sampling methods. Genera were also selected for statistical reasons. We restricted genera used in the analyses to those recorded at a minimum of 25 sampling sites to reduce the chance that an apparent extirpation is due to sampling variance and to increase the likelihood that the models and quantitative analyses for potential confounding were reasonably strong. This decision was made because an analysis showed that the benchmark varied within <5% when SSD models were constructed from 20 or more occurrences of each genus, whereas the benchmark steadily decreased when XC95 values were derived from fewer than 15 occurrences (Supplemental Data, Fig. S1).

Inclusion of sensitive taxa

Only benthic macroinvertebrates sampled by a kicknet method were included in the SSD. Fish were not included because their occurrence is strongly affected by stream size, making it difficult to determine XC95 values. Indeed, some of the affected streams naturally have no fish. In addition, the WABbase data set used to derive the benchmark does not contain data for fish. Other data sets that do contain fish are not as large and do not contain as great a range of conductivity values. An SSD might be developed for fish once these technical issues are resolved. Data for plants and amphibians are not available. To date, no evidence has been presented that fish, amphibians, or plants are more sensitive than benthic invertebrates.

Seasonality, life history, and sampling methods

The seasonality of life-history events such as emergence of aquatic insects can affect the probability of detecting a species because eggs and early instars are not captured by the sampling methods. As a result, annual insects that emerge in the spring, although present, are less likely to be detected in the summer, when conductivities increase in some streams.

We evaluated the effects of seasonality and life history by comparing HC05 values partitioned into spring and summer based on seasonal patterns of conductivity in the full data set (Fig. 2c). The spring season was March through June, and the summer season was July through October. The HC05 values were 317 µS/cm for spring (132 genera) and 415 µS/cm for summer (120 genera). The greater summer HC05 resulted from the loss of sensitive taxa from the SSD. The lower ends of the SSDs for the full data set and for the spring samples are fairly similar 27. Lower effect levels in the spring are not the result of an insufficient test range of conductivities because exposures as high as 5,200 µS/cm occurred in the spring samples. Because the spring data set included both sensitive genera and a full range of exposures, we judged it more reliable than the summer model.
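
The seasonal partition can be sketched as follows, using the month boundaries stated above. The function names and the sample layout are hypothetical, and how samples from November through February were handled is not stated here, so this sketch simply leaves them out of both partitions.

```python
from datetime import date

SPRING_MONTHS = range(3, 7)    # March through June
SUMMER_MONTHS = range(7, 11)   # July through October

def season_of(sample_date):
    """Return 'spring', 'summer', or None for dates outside both seasons."""
    if sample_date.month in SPRING_MONTHS:
        return "spring"
    if sample_date.month in SUMMER_MONTHS:
        return "summer"
    return None

def partition_by_season(samples):
    """Split (sample_date, payload) pairs into spring and summer lists."""
    parts = {"spring": [], "summer": []}
    for sample_date, payload in samples:
        season = season_of(sample_date)
        if season is not None:
            parts[season].append(payload)
    return parts

# Made-up samples; the payload would normally hold taxa and conductivity.
samples = [
    (date(2005, 4, 12), "sample-1"),
    (date(2005, 8, 3), "sample-2"),
    (date(2005, 12, 1), "sample-3"),
]
parts = partition_by_season(samples)
```

An SSD would then be built separately from each partition and the two HC05 values compared.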

Selection of the effects end point

We used the extirpation concentration as the effects end point because it is easy to understand that an adverse effect has occurred when a genus is lost from an ecosystem. However, for the same reason, it may not be considered fully protective, because reductions in abundance that fall short of extirpation are not counted as effects. Because this end point is based on full life-cycle exposures and responses of populations to multigenerational exposures, it is considered a chronic benchmark.

Treatment of mixtures

In natural waters, salinity is a result of mixtures of ions. A metric is required to express the strength of that mixture. We use conductivity because it is a measure of the ionic strength of the solution, because it is related to biological effects, and because it is readily measured accurately. However, conductivity per se is not the cause of toxic effects, and waters with different mixtures of ions but the same conductivity may have different toxicities 31. In this case, the benchmark value was calculated for a relatively uniform mixture of ions in Appalachian streams with Ca²⁺, Mg²⁺, SO₄²⁻, Cl⁻, and HCO₃⁻ ions at circumneutral to mildly alkaline pH (pH 6–10). Recent increases in drilling for natural gas may change the toxicity of ionic strength in this region, and monitoring should be designed to evaluate differences. The relative contributions of individual ions from large-scale surface coal mining are described by Pond et al. 5. Whereas Ca²⁺, Mg²⁺, SO₄²⁻, and HCO₃⁻ are the four most abundant ions to drain from surface coal mines 32, ions of Na⁺ and Cl⁻ are the two most common in seawater and brines from Marcellus shale drilling operations 13. Because the few sites with very elevated Cl⁻ were found to be outliers in the distributions of occurrence, they were deleted from the data set used to derive the XC95 values. Hence, the use of the benchmark value in other regions or in waters that are contaminated by other sources, such as road salt or irrigation return waters, may not be appropriate. However, for the circumneutral to alkaline drainage from similar geological sources these four primary ions are highly correlated with conductivity (Table 3).

Table 3. Spearman rank correlation (|r|) of water-quality parametersᵃ

Parameter     Conductivity  Alkalinity  Sulfate  Chloride  Hardness  Mg    Ca
Conductivity  1.00          0.78        0.89     0.64      0.95      0.93  0.92
Alkalinity    0.78          1.00        0.60     0.56      0.78      0.70  0.79
Sulfate       0.89          0.60        1.00     0.41      0.85      0.90  0.80
Chloride      0.64          0.56        0.41     1.00      0.50      0.43  0.50
Hardness      0.95          0.78        0.85     0.50      1.00      0.96  0.99
Mg            0.93          0.70        0.90     0.43      0.96      1.00  0.91
Ca            0.92          0.79        0.80     0.50      0.99      0.91  1.00

ᵃ HCO₃⁻ measured as alkalinity.
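
For readers who want to compute a statistic like those in Table 3 from their own monitoring data, the following is a minimal dependency-free sketch of the absolute Spearman rank correlation, with tied values given their average rank. It is not the authors' code, the function names are hypothetical, and the input values are made up rather than taken from the data set.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_abs(x, y):
    """Absolute Spearman rank correlation |r|: Pearson on the ranks."""
    return abs(pearson(average_ranks(x), average_ranks(y)))

# Made-up paired measurements (e.g., conductivity and sulfate).
conductivity = [120, 250, 400, 800, 1500]
sulfate = [15, 40, 90, 310, 260]
r_abs = spearman_abs(conductivity, sulfate)
```

Because the statistic depends only on ranks, it is unaffected by the different units of the parameters (µS/cm versus mg/L).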

Forms of exposure–response relationships

The diversity of the forms of the exposure–response relationships (i.e., decreasing, unimodal, increasing, and no relationship) (Fig. 5; see also 27, Appendix E) required some methodological decisions. The forms are expected, given the nature of ionic regulation and the variance in sensitivity. The ionic mixture includes nutrients and essential elements, and like other pollutants that are essential at low exposure levels (e.g., copper and selenium), the response to this mixture is expected to have a unimodal distribution (Fig. 5b). In the ascending (left) limb, nutrient and essential element needs are increasingly being met. In the descending (right) limb, toxicity is increasing. However, many of the empirical exposure–response relationships do not display both limbs. They may show the descending portion of the curve, because none of the observed conductivity levels is sufficiently low to show deficiency for the taxon (Fig. 5a); the entire unimodal curve, because the optimum is near the center of the observed conductivity levels and the range from deficiency to toxicity is relatively narrow (Fig. 5b); the ascending portion, because none of the observed conductivity levels is sufficiently high to show toxicity for the taxon (Fig. 5c); or no trend, because the optimum is more of a plateau than a peak, so it extends across the range of observed conductivities (see Nigronia in 27, Appendix E-26).

To estimate effects on sensitive genera, it may be necessary to exclude genera favored by the pollutant if the region is highly modified. This was not done with the Appalachian data set. All genera, regardless of the exposure–response form, were included in the SSD. However, the XC95 values for those that do not descend to zero in the observed range, such as Cheumatopsyche, are treated as “greater than” values. Because the 5th percentile of the SSD is derived by interpolation, it is not necessary to provide point estimates of the XC95 values for resistant taxa. The setting of the benchmark in a conductivity range in which the occurrence of some genera is increasing might suggest that the benchmark could result in the extirpation of some genera. However, that is not the case: all but one of the 163 genera occur in sites with low conductivity (<100 µS/cm). Even if it were otherwise, the concern for resistant taxa would be unwarranted, because this benchmark is designed to protect taxa that occur in unpolluted streams, not taxa that require pollution.
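
The reason that “greater than” values need no point estimates can be seen in a small sketch of percentile interpolation on an SSD. This is an illustration, not the authors' implementation: it assumes that each genus receives the plotting position R/(N + 1), where R is its rank by XC95 and N is the number of genera, and that the 5th percentile is found by linear interpolation between the two bracketing XC95 values. The function name and all numbers are hypothetical.

```python
def hc05(xc95_values, proportion=0.05):
    """Interpolate the given percentile of a species sensitivity distribution.

    xc95_values: list of (value, is_greater_than) tuples, one per genus.
    Assumes plotting positions R / (N + 1); this is an assumption of the
    sketch, not a detail confirmed by the text above.
    """
    n = len(xc95_values)
    ordered = sorted(xc95_values, key=lambda item: item[0])
    points = [
        (rank / (n + 1), value, censored)
        for rank, (value, censored) in enumerate(ordered, start=1)
    ]
    for (p_lo, v_lo, c_lo), (p_hi, v_hi, c_hi) in zip(points, points[1:]):
        if p_lo <= proportion <= p_hi:
            if c_lo or c_hi:
                # A "greater than" value has no usable point estimate.
                raise ValueError("percentile is bracketed by a 'greater than' value")
            fraction = (proportion - p_lo) / (p_hi - p_lo)
            return v_lo + fraction * (v_hi - v_lo)
    raise ValueError("percentile lies outside the plotting positions")

# Made-up SSD of 29 genera: two sensitive genera, 24 intermediate genera,
# and three resistant genera recorded only as "greater than" values.
ssd = (
    [(200.0, False), (300.0, False)]
    + [(400.0 + 10.0 * i, False) for i in range(24)]
    + [(5000.0, True)] * 3
)
benchmark = hc05(ssd)
```

The resistant genera contribute only to N and to the ranks of the upper tail, so their exact XC95 values never enter the interpolation as long as they lie above the bracketing pair.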

Validation of the benchmark

The aquatic life benchmark was validated with an independent data set. Application of the same methodology to data from the state of Kentucky gave a very similar result, 282 µS/cm with a lower confidence bound of 169 µS/cm and an upper bound of 380 µS/cm (27, Appendix G).

Characterization of the benchmark

The aquatic life benchmark of 300 µS/cm is appropriate for year-round application. This level is expected to prevent the extirpation of 95% of invertebrate genera in this region. The estimated two-tailed 95% lower confidence bound of the HC05 point estimate is 228 µS/cm and the upper bound is 303 µS/cm.

The method used to develop the benchmark is an adaptation of the standard method for deriving water-quality criteria for aquatic life (i.e., Stephan et al. 12), so it is supported by precedent. Because the organisms are exposed throughout their life cycle, this is a chronic value. Acute exposures were not evaluated.

The aquatic life benchmark for conductivity is provided as scientific advice for reducing the increasing loss of aquatic life associated with an ionic mixture of Ca²⁺, Mg²⁺, SO₄²⁻, Cl⁻, and HCO₃⁻ at circumneutral pH. Because there are well-documented studies of the physiological role of anions in the function of chloride cells, a reasonable characterization of the mixture on a mass basis is SO₄²⁻ + HCO₃⁻ greater than or equal to Cl⁻. The aquatic life benchmark for conductivity is applicable to the parts of West Virginia that provided the data for its derivation and to Kentucky, which gave essentially the same result. It may be relevant to ecoregions 68, 69, and 70 outside the sampled area 33. This is because the ionic matrix and background are expected to be similar throughout the ecoregions. Note that ecoregion 68, Southwestern Appalachia, does not occur in West Virginia and is not included in the derivation of the benchmark value, but it is included in the validation data set from Kentucky. The aquatic life benchmark may also be appropriate for other nearby regions. However, this benchmark level may not be relevant when the relative concentrations of dissolved ions are different.
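
The mass-basis characterization of the mixture lends itself to a simple screening check. The sketch below is only an illustration: the function name and parameter names are hypothetical, concentrations are assumed to be in mg/L, and the default pH range of 6 to 10 is taken from the range given for the derivation data under "Treatment of mixtures". It does not address the other conditions on applicability, such as region and background conductivity.

```python
def mixture_matches_benchmark(sulfate_mg_l, bicarbonate_mg_l, chloride_mg_l,
                              ph, ph_range=(6.0, 10.0)):
    """Screen a water sample against the ionic mixture the benchmark assumes.

    Returns True when, on a mass basis, sulfate + bicarbonate >= chloride
    and the pH lies within ph_range (inclusive).
    """
    anions_ok = sulfate_mg_l + bicarbonate_mg_l >= chloride_mg_l
    ph_ok = ph_range[0] <= ph <= ph_range[1]
    return anions_ok and ph_ok

# Made-up samples: one sulfate/bicarbonate dominated, one chloride dominated.
mine_drainage_like = mixture_matches_benchmark(300.0, 120.0, 10.0, ph=7.8)
road_salt_like = mixture_matches_benchmark(20.0, 60.0, 400.0, ph=7.4)
```

A sample that fails this check is one for which a different benchmark may be needed; passing the check does not by itself establish that the benchmark applies.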

DISCUSSION AND CONCLUSIONS

The derivation of this aquatic life benchmark using conductivity illustrates the practical use of the field-based method for developing water-quality benchmarks for pollutants that are not amenable to laboratory methods 11. The method is credible because it is adapted from methods that have been successfully used for nearly 30 years to develop water-quality criteria using laboratory data and because the field-based method has withstood extensive public and peer review. The derived benchmark is credible because it has been validated and has withstood tests of the models, causation, and potential confounding.

Ecological relationships depend on environmental conditions. Data from regions where many genera are already extirpated would yield a benchmark that protects only the remaining species. Field data collected after susceptible taxa have emerged as terrestrial insects (in this case, in the summer after spring emergence) will be based on more tolerant taxa, and a benchmark derived from them could result in the extirpation of many genera. Inclusion of other ionic mixtures may also lead to higher XC95 values that are not protective against the ionic mixture evaluated in the present study.

For these reasons, when we used the method, we restricted the case to a well-defined region and a relatively homogeneous set of streams with a common type of source. It will be important to develop experience and guidelines for using the method in more complex situations.

The sensitivity distribution is a model of how representative species in general respond to a stressor and does not require that the species or genera be the same in all applications or at all locations. In this example, the SSD represents genera inhabiting naturally dilute waters. Therefore, the conductivity benchmark may be relevant outside of the region tested with the data set if there is no contradictory information, such as evidence that undisturbed background is naturally greater than the benchmark, and if the ionic composition on a mass basis contains SO₄²⁻ + HCO₃⁻ ≥ Cl⁻. Based on these restrictions for the extrapolation of the benchmark outside the tested area, we speculate that the conductivity benchmark developed in the present study will be applicable to all naturally low-conductivity streams affected by leachates from calcareous minerals but that it is not applicable to streams affected by coastal saltwater intrusion or road salt application in winter. We expect that a different benchmark would be needed where the ionic composition is primarily Na⁺ and Cl⁻. Those hypotheses must be tested by further research.

Some situations require field data to develop a protective or remedial benchmark. This method worked well in this case and might be useful for other environmental agents that have measurable deleterious effects in the field, such as dissolved oxygen, nitrates and phosphates, suspended and deposited sediment, organic enrichment, and hydrologic flow.

SUPPLEMENTAL DATA

Table S1.

Figure S1.

Acknowledgements

We thank the West Virginia Department of Environmental Protection, which provided data for analysis. Reviewers, both anonymous and named, improved the quality of the content and presentation: L. Yuan, M. Griffith, D. Petersen, C. Delos, M. Passmore, J. VanSickle, P. White, C. Schmitt, C. Menzie, and C. Hawkins, and members of the U.S. EPA Biological Advisory Committee. We also thank the members of the U.S. EPA Science Advisory Board for their careful review, interdisciplinary insights, and encouragement: D. Patten, E. Boyer, W. Clements, J. Dinger, G. Geidel, K. Hartman, R. Hilderbrand, A. Huryn, L. Johnson, T.W. La Point, S.N. Luoma, D. McLaughlin, M.C. Newman, T. Petty, E. Rankin, D. Soucek, B. Sweeney, P. Townsend, and R. Warner. The work was supported by the U.S. EPA. The views expressed in the present study are those of the authors and do not necessarily represent the views or policies of the U.S. EPA.
