Using dose–response functions to improve calculations of the impact of anthropogenic noise

2. Regulators often prefer a single threshold to a full dose–response function, but much of the variability observed in the threshold at which different individuals respond to a stressor is an inherent characteristic of populations that needs to be taken into account to predict the effects of stressors. When selecting an exposure threshold, regulators need information on the proportion of the population that will be protected.

KEYWORDS acoustic threshold, behavioural response, behavioural take, dose:response, dose-response, stressor exposure

1 | INTRODUCTION

When regulators want to protect a population from a hazard, they often aim to find a single threshold that constrains the risk to an acceptable level. For example, motorways may have a maximum speed limit, or exposure to a chemical may be limited to a maximum safe dosage.
Selection of an appropriate limit often depends on a decision about what level of risk is permissible. For example, noise in US workplaces is regulated (under 29 CFR 1910.95, n.d.) by a permissible noise exposure limit of 90 dBA (a dB scale weighted for human hearing) averaged over an 8 h workday. However, the US National Institute for Occupational Safety and Health recommends a noise exposure limit of 85 dBA because it is associated with an 8% excess risk of developing hearing loss over a 40 year working life, a risk that is preferred to the 25% excess risk expected under the 90 dBA permissible limit (NIOSH, 1998).
Acoustic thresholds are also used to estimate the potential impact of noise on wildlife. Some thresholds are applied with respect to the sound source, and others are applied to the sound level as experienced by the animal. As an example of the source-based approach, the German government aims to protect the hearing of marine mammals by limiting noise from impact pile driving to a single-strike sound exposure level L E,p of 160 dB re 1 μPa² s and a peak pressure level L p,pk of 184 dB re 1 μPa measured 750 m away from the sound source (Dähne, Tougaard, Carstensen, Rose, & Nabe-Nielsen, 2017). Most approaches to estimating the impacts of noise focus on the sound as received at the animal as opposed to at the source. Faulkner, Farcas, and Merchant (2018) review the process for environmental impact assessments of noise according to European and US regulations. In most jurisdictions, the environmental impact assessment process assesses environmental risk by comparing the distribution of sensitive receivers with that of the potential hazard. In the case of noise, hearing is used to identify which receivers are sensitive to a particular noise source and the sound field around the source is estimated using propagation models. Thresholds for noise exposure are then selected depending upon the characteristics of the source and receiver, and the relevant regulatory criteria.
In the US, the Marine Mammal Protection Act (MMPA; Marine Mammal Commission, 2015) prohibits the killing, injury and harassment of marine mammals. The US National Marine Fisheries Service (NMFS) has established specific levels of underwater sound exposure that are expected to injure or harass most marine mammals. NMFS (2016) provides acoustic thresholds for effects of noise on hearing, using different sound exposure levels for different taxa and sound types. The US criteria for behavioural harassment are root-mean-square sound pressure levels (L p,rms ) of 120 dB re 1 μPa for most continuous sounds such as vessel noise and 160 dB re 1 μPa for impulsive sounds such as pile-driving or airguns used in seismic surveys (NOAA Fisheries West Coast Region, 2018). More sophisticated analyses that weight exposure levels by hearing capabilities of different species and that rank severity of response have been developed in Europe (e.g. Verboom, 2002) and in the US (Southall et al., 2007). These are described in NMFS (2018) technical guidance but have not yet been incorporated into regulations.
Here emphasis is placed on the importance of quantifying variability in responsiveness to sound in order to estimate the number of animals impacted, a topic that has been overlooked in most reviews of environmental impact assessment. Evaluations of the impact of chemical pollutants often use a dose-response function to estimate impact, but evaluations of the effects of noise often use a single number to estimate impact, assuming that no animals are affected below that number and that all animals exposed above that number are affected.
Recent environmental assessments of seismic surveys illustrate how regulators in the US use step function thresholds to estimate the number of animals impacted by a sound source. The Bureau of Ocean Energy Management (BOEM) leases offshore areas of the US for energy development and is responsible for assessing the environmental impact of these developments, including seismic surveys. Their environmental impact statements are required to estimate the number of animals taken by killing, by potential for injury (which is called level A harassment in the MMPA) and by disruption of behaviour (which is called level B harassment in the MMPA). BOEM (2014) states 'The NMFS considers behavioral response criteria as a step-function (all-or-none) threshold based solely on the rms value of received levels' and the threshold likely to cause 'behavioral disruption for impulsive sounds [Level B harassment] is 160 dB re 1 μPa (rms). For non-impulsive sound sources, such as those associated with vessel traffic, aircraft, and drilling and dredging activities, the sub-injurious threshold is 120 dB re 1 μPa (rms).' The terminology in this quote refers to root-mean-square sound pressure levels, which we refer to in this paper as L p,rms , following ISO (2017). Thus, US regulations require an estimate of the number of animals 'taken' by level B harassment, which is defined as a received sound pressure level L p,rms above 160 dB re 1 μPa for impulsive sounds. Calculation of these take estimates by BOEM (2014) uses sophisticated modelling of acoustic sources, sound propagation and marine mammal distribution and abundance. However, the step function criterion assumes that no animals exposed below 160 dB re 1 μPa are impacted and that all animals exposed above 160 dB re 1 μPa are impacted.
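The contrast between the two approaches can be written compactly. The notation below is introduced here purely for illustration and does not come from the regulations or from BOEM (2014): L is the received level L p,rms, L_thr is the regulatory threshold (for example 160 dB re 1 μPa for impulsive sounds), and p(L) is the probability that an exposed animal responds. Treating an exposure exactly at the threshold as a response is a convention assumed for this sketch.

```latex
p_{\text{step}}(L) =
\begin{cases}
0, & L < L_{\text{thr}} \\
1, & L \ge L_{\text{thr}}
\end{cases}
\qquad\text{versus}\qquad
p_{\text{dose}}(L) \in [0, 1] \ \text{varying smoothly with } L
```

Under the step function every animal is counted as either certainly affected or certainly unaffected, whereas a dose-response function assigns intermediate probabilities across the range of received levels.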
The NMFS acoustic criteria for behavioural harassment were based upon studies of reactions of marine mammals to anthropogenic sounds that document a range of received levels associated with response. For example, Malme, Miles, Clark, Tyack, and Bird (1984) generated dose-response functions for avoidance responses of migrating gray whales to continuous noises and impulsive noises associated with the offshore energy industry. For several different continuous noise sources, avoidance started at received levels L p,rms of 110-119 dB re 1 μPa, with >80% of animals responding at received levels L p,rms of 130 dB re 1 μPa or more. The received level of continuous sound avoided by 50% of migrating gray whales, a criterion called the RLp50 here, was L p,rms = 120 dB re 1 μPa. US regulators use this RLp50 of 120 dB re 1 μPa as a threshold for level B 'takes' by disruption of behaviour (Green et al., 1994; NOAA Fisheries West Coast Region, 2018). In contrast, much higher levels were required to evoke similar avoidance responses for impulsive noises, which in the Malme et al. (1984) study were generated by air guns used for seismic surveys: 10% of whales avoided exposures of L p,rms = 164 dB re 1 μPa, with 90% of animals responding at L p,rms = 180 dB re 1 μPa and an RLp50 of L p,rms = 170 dB re 1 μPa. Malme, Würsig, Bird, and Tyack (1987) also investigated the response of feeding gray whales to airgun impulses, and found an RLp50 of L p,rms = 173 dB re 1 μPa (68% confidence limits of L p,rms = 170-175 dB re 1 μPa), slightly higher than that for migrating whales. US regulators use a behavioural disruption threshold of L p,rms = 160 dB re 1 μPa for response to impulsive sounds, which is not only below the RLp50 but is even lower than the 10% probability of avoidance, perhaps because other studies have demonstrated responses of other species at lower received levels (High Energy Seismic Survey, 1999; Richardson, Greene, Malme, & Thomson, 1995).
Methods have recently been developed to estimate probabilistic functions relating acoustic exposure to behavioural responses of marine mammals (e.g. Miller et al., 2014) and to integrate data to estimate dose-response functions from different behavioural response studies (Harris et al., 2015). Here a dose-response function from Miller et al. (2014) is used to illustrate how use of an RLp50 step function, as currently employed for environmental impact assessment, leads to substantial underestimates of how many animals will be impacted. The details of how impact is calculated depend on specifics of the dose-response function and how sound attenuates as it travels through the ocean, but the general point is relevant for estimating impact of exposure to all stressors for which there is variation in sensitivity within the population.
The Miller et al. (2014) dose-response function is used as an example to show how the number of impacted animals can be estimated in a way that accounts for the spatial distribution of the hazard and the subjects, and further we show how an appropriate threshold, which we call the effective received level (ERL), can be calculated. This threshold, when used as if it was a step function, gives the same number of impacted animals for specific sound propagation conditions as would be obtained from the full dose-response function, and so is an appropriate threshold if regulators prefer a single-step function to estimate the number of animals affected by a noise source in a particular site. Sometimes, a full dose-response function is not available, but estimates exist of the proportion of animals responding at various levels of dose. This summary information can be used to give a good approximation to the correct ERL.
BOEM (2017, 26287) explains why take estimates in their environmental assessments ignore known sources of uncertainty: 'confidence intervals were not developed for the exposure estimate results, in part because calculating confidence limits for numbers of Level B harassment takes would imply a level of quantification and statistical certainty that does not currently exist'. Many of the elements used to estimate takes, including specification of acoustic sources, sound propagation modelling, and estimates of density and abundance of marine mammals, estimate the distribution of values to be expected.
New methods have been developed to quantify the uncertainty of the relationship between acoustic dosage and the probability that animals will respond, which enables quantification of uncertainty in estimates of impact. Risk assessments developed by the US Navy (2016; Moretti et al., 2014) and the methods described by Miller et al. (2014) and Harris et al. (2015) all estimate continuous functions of acoustic dosage and the probability of response. Here we discuss how simulation can estimate uncertainty in take estimates using probabilistic dose-response functions and estimates of the distribution of relevant parameters.

2 | EXAMPLE: USING A DOSE-RESPONSE FUNCTION TO ESTIMATE THE NUMBER OF ANIMALS AFFECTED
Estimating the number of animals that would be affected by transmissions of an anthropogenic sound requires combining the relationship between acoustic dosage and the probability of response with the function that predicts how received sound level decreases with range from the source and overlaying this on an estimate of the spatial distribution of animals in the region of interest. Figure 1 shows the relationship between acoustic dosage and probability of response estimated by Miller et al. (2014) for avoidance responses of killer whales, Orcinus orca, exposed to sonar sounds. The analysis assumed that no whales would respond to sonar below a level of L p,rms = 60 dB re 1 μPa, which is near the limit of hearing sensitivity of killer whales at this frequency, and that all whales would respond at a received level of L p,rms = 200 dB re 1 μPa.

FIGURE 1 Dose-response function derived from experiments performed on free-swimming killer whales exposed to a steadily increasing level of sonar sounds (Miller et al., 2014). The x-axis shows the received level (root-mean-square sound pressure level, L p,rms ) of sonar sounds, and the y-axis shows the probability of whales responding as a function of received level. The dotted lines show the 95% posterior credible interval, illustrating important uncertainty owing to the small sample of whales in the study. The received level at which the most sensitive 50% of the population are expected to respond (RLp50) for this function is 141 dB re 1 μPa, illustrated by the red lines. The received level at which the most sensitive 10% of the population are expected to respond is 100 dB re 1 μPa, illustrated by the blue arrows.
The dose-response function shown in Figure 1 uses data from eight controlled exposure experiments to predict the probability of a killer whale showing an avoidance response to received levels of sonar between L p,rms = 60 and 200 dB re 1 μPa. The blue arrows show that the most sensitive 10% of whales are expected to respond at a received level of L p,rms = 100 dB re 1 μPa and the red arrows show that half of the whales are expected to respond at a received level of L p,rms = 141 dB re 1 μPa, i.e. that in this example RLp50 = L p,rms = 141 dB re 1 μPa.

2.1 | Using the RLp50 threshold greatly underestimates number impacted
To estimate how many whales would be impacted by sonar transmissions, it is necessary to calculate how the intensity of the sonar sound decreases with range from the sound source. For the purposes of our example, the sonar sound is assumed to spread equally in all directions, following an inverse-square 1/r² spherical spreading (where r is the distance from source to receiver). The Miller et al. (2014) dose-response function was developed for sonar signals at 1-2 and 6-7 kHz; statistical modelling provided little support for differentiating response by frequency, so here, when modelling frequency-dependent sound propagation, a nominal frequency of 3 kHz is used, splitting the difference between the two frequencies tested. For a sonar producing a sound source level of L S = 210 dB re 1 μPa m at an assumed frequency of [...]

[...] this paper is that sound can propagate so efficiently underwater that noise may cause impacts at greater ranges than is intuitive to humans with experience of sound in air. This can cause a mismatch between regulations and actual effects. For example, the German limitations on source levels of piling as measured at 750 m are designed to protect porpoise hearing at close ranges. However, even with mitigation measures to reduce source level, porpoises showed significant avoidance out to 12 km for up to 5 h after piling stopped (Dähne et al., 2017). Given such pronounced avoidance, habitat exclusion of many animals at large ranges is probably of greater concern than hearing damage of a few animals at close ranges. The US Navy has calculated numbers of takes using methods similar to the ones recommended here but has recently added cut-off distances beyond which they truncate the probability of responses to zero (Navy, 2017) [...] such interactions can readily be incorporated into the approach presented here.

FIGURE 2 Received level as a function of range (distance from the sound source) for a sonar signal with a source level of 210 dB re 1 μPa m and a frequency of 3 kHz. Red arrows indicate the 2.7 km range at which 50% of whales with the dose-response function shown in Figure 1 are estimated to respond, while blue arrows indicate the 71 km range at which 10% of whales are estimated to respond.
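The conversion from range to received level can be sketched in a few lines of Python. This is a minimal illustration, not the propagation model used for Figure 2: it combines the spherical spreading stated in the text with a linear absorption term, and the absorption coefficient of 0.177 dB/km is an assumption chosen here as a plausible value near 3 kHz. The function name `received_level` is likewise introduced only for this example.

```python
import math

SOURCE_LEVEL_DB = 210.0        # source level from the text, dB re 1 uPa m
ABSORPTION_DB_PER_KM = 0.177   # ASSUMED absorption near 3 kHz; not stated in the text


def received_level(range_m: float) -> float:
    """Received level (dB re 1 uPa) at a given range in metres."""
    spreading_loss = 20.0 * math.log10(range_m)              # spherical (1/r^2) spreading
    absorption_loss = ABSORPTION_DB_PER_KM * range_m / 1000.0
    return SOURCE_LEVEL_DB - spreading_loss - absorption_loss


# Ranges highlighted in Figure 2
for range_km in (2.7, 71.0):
    level = received_level(range_km * 1000.0)
    print(f"{range_km:5.1f} km -> {level:.1f} dB re 1 uPa")
```

Under these assumptions the two ranges give received levels close to the 141 and 100 dB re 1 μPa quoted for the 50% and 10% response points. A site-specific assessment would replace this simple formula with a propagation model suited to the local environment.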

2.2 | Using the dose-response function to improve estimates of the number of whales impacted by a stressor
To calculate the expected number of animals responding using a dose-response function, we simply multiply the number of animals expected to be at each distance from the source by the probability that these animals will respond. The number of animals at each distance is obtained from our assumption about animal density. The probability of response at each distance is obtained from the dose-response function and sound propagation model. Mathematically, the way to do this accurately is through integration; a simple approach is to divide the area around the source into a large number of equally spaced range bins between zero and the distance at which probability of response becomes equivalent to zero for regulatory purposes, then to calculate the number of expected responses in each bin and to add them up.
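Written out, the calculation just described is an integral over range that is approximated by a sum over ring-shaped bins. The symbols are introduced here for illustration and assume a uniform animal density: N is the expected number of animals responding, D the density, L(r) the received level at range r, p the dose-response function, r_max the cut-off range, and r_0 = 0 < r_1 < ... < r_n = r_max the bin edges.

```latex
N \;=\; \int_{0}^{r_{\max}} 2\pi r \, D \, p\big(L(r)\big)\,\mathrm{d}r
\;\approx\; \sum_{i=1}^{n} D\,\pi\left(r_{i}^{2} - r_{i-1}^{2}\right) p\big(L(\bar r_i)\big),
\qquad \bar r_i = \tfrac{1}{2}\left(r_{i-1} + r_{i}\right)
```

Each term of the sum is the number of animals in one ring multiplied by the probability of response at the ring's midpoint.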
Taking the Miller et al. (2014) example, in addition to showing a plot of the dose-response function, the authors provide (in their Table 4) a set of quantiles for probability of response over a range of doses. The current authors fitted a simple smooth curve to these values (a spline-based interpolation; R code given in Supporting Information) and used this to predict the probability of response at the mid-points of a set of 10,000 distance bins from 0 to 240 km (this latter distance being the range at which the received level drops below L p,rms = 60 dB re 1 μPa and so the probability of response is assumed to be zero), each 24 m wide. For example, the midpoint of the first bin is at 12 m, and the predicted received level at this range is L p,rms = 210 − 21.59 = 188.41 dB re 1 μPa. From the interpolated dose-response function, the probability of response at this received level is 0.95. The area of this bin is π × 0.024² = 0.0018 km², hence, at the example density of 1 animal/km², the expected number of animals is 0.0018 (fractional animals will be the norm given such small bin widths, but rounding must not be done at this stage). Hence the expected number of animals responding in this bin is 0.0018 × 0.95 = 0.0017. Similarly, the midpoint of the second bin is at 36 m, the corresponding received level is L p,rms = 178.87 dB re 1 μPa and the probability of response is 0.90. The area of this bin, which is a ring with inside radius 24 m and outside radius 48 m, is π × 0.048² − π × 0.024² = 0.0054 km². Hence, the expected number of animals responding in this bin is 0.0054 × 0.90 = 0.0049. Note that this is more animals than the previous bin because, although the probability of response is lower, the area of the bin is greater.
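The arithmetic for the first two bins can be reproduced with a short Python sketch. The probabilities 0.95 and 0.90 are the values quoted in the text. The helper names and the zero-based bin index are conventions assumed for this illustration, and a density of 1 animal/km² is assumed as in the example.

```python
import math

BIN_WIDTH_KM = 0.024   # 240 km divided into 10,000 bins
DENSITY = 1.0          # animals per km^2, as assumed in the worked example


def ring_area_km2(bin_index: int) -> float:
    """Area of the ring-shaped range bin with the given zero-based index."""
    inner = bin_index * BIN_WIDTH_KM
    outer = inner + BIN_WIDTH_KM
    return math.pi * (outer**2 - inner**2)


def expected_responders(bin_index: int, p_response: float) -> float:
    """Expected number of animals responding in one bin (no rounding)."""
    return DENSITY * ring_area_km2(bin_index) * p_response


first_bin = expected_responders(0, 0.95)    # midpoint 12 m
second_bin = expected_responders(1, 0.90)   # midpoint 36 m
print(f"{first_bin:.4f} {second_bin:.4f}")
```

Summing `expected_responders` over all 10,000 bins, with the probability for each bin read from the interpolated dose-response function, gives the total expected number of animals responding.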
Repeating this exercise for all of the bins gives the pattern shown in [...]

These calculations require that researchers provide enough information to enable the probability of response to be calculated for any given acoustic dose. Miller et al. (2014) provided a table of quantiles that we used for this purpose. Malme et al. (1984) similarly tabulated the received levels and ranges at which different proportions of grey whales would be expected to avoid airguns. As a useful alternative, Moretti et al. (2014) provide a parametric equation that closely approximates the dose-response function they fitted for cessation of feeding dives in Blainville's beaked whales as a function of received sonar level. This enables probability of response to be calculated at any desired level of dose using, for example, a simple spreadsheet.
2.3 | Calculating a single threshold value that yields the same effect as the dose-response function: the effective received level

The dose-response function provides the basis for estimates of the number of animals affected by an anthropogenic sound source, but if regulators in some jurisdictions prefer an effective radius or an acoustic criterion that is just one single number, then it is possible to combine information from the dose-response function, sound source level and models of acoustic propagation and animal distribution to calculate these values for each specific case. One way to conceptualize this effective radius is to start with the estimates derived in the previous section of the number of animals expected to respond in each distance band, and to calculate the range at which as many animals respond beyond this range as fail to respond within it (Figure 3). Then, by definition, the number of animals (responding or not) within this range is exactly equal to the total number of animals responding. We term this range the effective response range (ERR), after a similar concept used in point transect surveys of wildlife populations (Buckland et al., 2001). This is readily translated, via the propagation model, into an estimate of the corresponding received level of sound, the effective received level (ERL).

FIGURE 3 The number of animals expected to respond to sonar as a function of distance from the sound source. The solid black line shows the number of animals expected to respond to sonar in each of 10,000 equally spaced range bins from 0 to 240 km. This is calculated by multiplying the number of animals expected to be in each range bin, shown by the dashed black line, by the probability that each animal at that range will respond (derived from the dose-response function shown in Figure 1 and the received level to range conversion shown in Figure 2). Also shown, as a vertical green line, is the effective response radius (ERR), i.e. the range at which as many animals are expected to respond beyond that distance from the source (denoted by the magenta polygon) as do not respond within that distance (orange polygon, of equal area to the magenta polygon).
This concept is further illustrated graphically in Figure 4, using simulated animal positions. The left panel of Figure 4 shows a simulated distribution of animals, with those responding indicated in red and those not responding in black. The right panel shows the distribution if each red point outside of the effective radius is moved to replace a black point inside the radius. The ERR is the radius that encompasses an area including the total number of animals estimated to be impacted.
In our case, the total number of 6437 animals corresponds to an area of 6437 km² at a density of 1 animal/km². The ERR for this area is 45.3 km, which corresponds to an ERL of L p,rms = 109 dB re 1 μPa.
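The step from a total number of responding animals to an ERR and ERL can be sketched as follows. The ERR is the radius of a circle that contains that many animals at the assumed uniform density. The conversion to an ERL below uses spherical spreading plus an absorption coefficient of 0.177 dB/km, which is an assumption made for this illustration rather than the propagation model used in the paper, and the function names are hypothetical.

```python
import math


def effective_response_range_km(n_responding: float, density_per_km2: float) -> float:
    """Radius of the circle holding n_responding animals at uniform density."""
    return math.sqrt(n_responding / (math.pi * density_per_km2))


def received_level(range_km: float,
                   source_level_db: float = 210.0,
                   absorption_db_per_km: float = 0.177) -> float:
    """Spherical spreading plus ASSUMED linear absorption (dB re 1 uPa)."""
    spreading_loss = 20.0 * math.log10(range_km * 1000.0)
    return source_level_db - spreading_loss - absorption_db_per_km * range_km


err_km = effective_response_range_km(6437, 1.0)
erl_db = received_level(err_km)
print(f"ERR = {err_km:.1f} km, ERL = {erl_db:.0f} dB re 1 uPa")

# For comparison: animals inside the 2.7 km radius of an RLp50 step function
step_function_count = math.pi * 2.7**2 * 1.0
print(f"RLp50 step function: {step_function_count:.1f} animals")
```

With a uniform distribution, the number responding scales in proportion to density, so `effective_response_range_km(64370, 10.0)` returns the same ERR. The final lines show why a step function at the RLp50 underestimates impact: at 1 animal/km², a circle of radius 2.7 km contains only about 23 animals, compared with the 6437 obtained from the full dose-response function.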
Note that, in general, an assumption about absolute animal density is not required to calculate the ERR or ERL. The ERR and ERL given will be identical for assumed densities of 1, 10 or 100 animals/km² (or any other value). We do require an assumption about the spatial distribution of animals around the source, and in general (in the absence of other information) the assumption is made that animals are uniformly distributed around the source. An estimate of density is, however, required to estimate the absolute number of animals impacted: this number is simply π × ERR² × density. In cases where information is available to estimate non-uniform distribution around the source, then the information about animal distribution is required to calculate ERR and ERL.

The primary obstacle to analysing uncertainty with respect to estimating the number of animals impacted by sound that is produced by a human activity has stemmed from the dose-response function. Most activities that generate sound in the ocean are able to specify variation in features of the sound that is produced. Similarly, models and measurements of sound propagation in the ocean can be used to quantify uncertainty in the level received by an animal some distance from the sound source. Biologists who estimate the sizes of wildlife populations are usually very disciplined in calculating uncertainty in their estimates. The same agencies that ignore uncertainty in estimating takes recognize the critical importance of incorporating uncertainty in other management models. For example, the protocol used by NMFS to calculate an allowable mortality of marine mammals caused by humans uses a minimum population estimate defined as the lower 20th percentile of the estimated abundance distribution (Wade, 1998). Taylor, Wade, De Master, and Barlow (2000) used simulations to show that using the best estimate of population size resulted in many populations being unacceptably depleted, while use of the 20th percentile of the population estimate prevented most unacceptable outcomes.

FIGURE 4 Conceptual illustration of the process for calculating an effective response range or effective received level to predict the number of animals impacted by a sound source in an environment with known properties of sound propagation. The left panel shows a simulated distribution with animals that respond indicated in red and animals that do not respond indicated in black. The intensity of the blue background scales to the probability of response at that distance from the sound source. In the right panel, each red point outside of the effective radius (the green circle) is moved to replace a black point inside the radius. The effective response range is the radius that encompasses an area including the total number of animals estimated to be impacted. In our example, the total number of 6437 animals corresponds to an area of 6437 km² at a density of 1 animal/km². The radius of a circle with this area, the effective response range (ERR), is 45.3 km, which corresponds to an effective received level (ERL) of 109 dB re 1 μPa.

2.4 | Quantifying uncertainty
The methods developed to derive probabilistic dose-response functions (e.g. Miller et al., 2014) make it possible to quantify uncertainty about dose-response. There are simple ways to calculate the effect of uncertainty in the dose-response function alone. For example, the dotted lines in Figure 1 indicate the 95% credible interval (the Bayesian analogue to a confidence interval) for the function relating killer whale avoidance to received levels of sonar sound. By repeating the calculation described in Section 2.2, using the 2.5 and 97.5% quantiles from Table 4 of Miller et al. (2014), rather than the mean estimate for probability of response, one can calculate a 95% interval on the expected number of animals impacted, which is 548 to 20541.
In fact, this is a slight over-estimate of the uncertainty arising from the dose-response function in this example, for a technical reason: the dotted lines in Figure 1 are pointwise credible intervals, i.e. they show uncertainty in probability of response for a given dose. What is actually required is a credible interval on the whole function, which will be narrower. This was not given by Miller et al. (2014), but here we calculated 1000 replicate dose-response functions sampled from the posterior distribution on their model parameters, and this was used to calculate a 95% interval on the expected number of animals impacted (code and data provided in Supporting Information) of 733 to 20111. In general, 95% uncertainty intervals should be provided by researchers analysing the ERL; this is readily converted into a 95% interval on numbers impacted, given a model of sound propagation and animal density (in the current example, this interval is L p,rms = 97.3-123.5 dB re 1 μPa).
These intervals account only for uncertainty in the dose-response function, where in reality there are other sources of uncertainty, probably the most important of which is animal density in the impact zone.
In general, where the uncertainties have been quantified, multiple sources of uncertainty can be readily combined by researchers to estimate resulting uncertainty in the numbers affected using a Monte Carlo simulation approach. A random sample is drawn from the distribution of dose-response functions, animal density, etc., and the resulting estimated number impacted is computed. This process is repeated many times, to give a distribution on the estimated number affected.
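A Monte Carlo calculation of this kind can be sketched in Python as follows. Everything numerical in this block is hypothetical and serves only to show the structure of the simulation: the logistic curve is a stand-in for a real dose-response function, the distributions for its midpoint, its slope and the animal density are invented, the absorption coefficient is assumed, and a coarser grid than the 10,000 bins in the text is used for speed. The outputs therefore do not reproduce the intervals reported above.

```python
import math
import random

SOURCE_LEVEL_DB = 210.0
ABSORPTION_DB_PER_KM = 0.177   # ASSUMED; not stated in the text
MAX_RANGE_KM = 240.0
N_BINS = 480                   # coarser than the text's 10,000 bins, for speed


def received_level(range_km):
    """Spherical spreading plus assumed linear absorption (dB re 1 uPa)."""
    return (SOURCE_LEVEL_DB - 20.0 * math.log10(range_km * 1000.0)
            - ABSORPTION_DB_PER_KM * range_km)


def p_response(level_db, midpoint_db, slope_db):
    """HYPOTHETICAL logistic dose-response curve, zero below 60 dB."""
    if level_db < 60.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-(level_db - midpoint_db) / slope_db))


def expected_number_impacted(density, midpoint_db, slope_db):
    """Sum expected responders over ring-shaped range bins."""
    width = MAX_RANGE_KM / N_BINS
    total = 0.0
    for i in range(N_BINS):
        inner, outer = i * width, (i + 1) * width
        area = math.pi * (outer**2 - inner**2)
        level = received_level(inner + width / 2.0)
        total += density * area * p_response(level, midpoint_db, slope_db)
    return total


def simulate(n_draws=500, seed=1):
    """Return an approximate 95% interval on the number impacted."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        midpoint = rng.gauss(141.0, 8.0)         # hypothetical uncertainty
        slope = rng.uniform(10.0, 25.0)          # hypothetical uncertainty
        density = rng.lognormvariate(0.0, 0.3)   # hypothetical, median 1 per km^2
        results.append(expected_number_impacted(density, midpoint, slope))
    results.sort()
    return results[int(0.025 * n_draws)], results[int(0.975 * n_draws) - 1]


low, high = simulate()
print(f"95% interval on number impacted: {low:.0f} to {high:.0f}")
```

In a real assessment each draw would come from an estimated distribution, such as posterior samples of the dose-response function and the sampling distribution of a density estimate, and uncertainty in source level or propagation could be added in the same loop.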

3 | DISCUSSION
This paper describes how use of a step function to define the relationship between exposure and response of wildlife to a stressor can lead to errors in estimating the impact of the stressor if variability in responsiveness within the population is not taken into account. Newly developed methods to quantify probabilistic functions that relate acoustic dosage to behavioural response (e.g. Harris et al., 2015; Miller et al., 2014) show how prior information coupled with relatively low sample sizes of controlled experiments can be used to define probabilistic dose-response functions. These functions can be combined with site-specific information about sound propagation and animal distribution to estimate the number of animals likely to be affected by a human activity that introduces sound into the ocean.
Much of the variability observed in the threshold at which different individuals respond to a stressor is not measurement error but is an inherent characteristic of populations that needs to be taken into account to predict the effects of stressors. Every population of organisms will be expected to show variation in sensitivity to any stressor.
We know that disruption of behaviour by sound depends on the characteristics of the sound and the hearing sensitivity of each animal, and the likelihood of disruption often depends upon the age/sex class of the animal, its experience with similar sounds, and the behavioural context in which it hears the sound (Ellison, Southall, Clark, & Frankel, 2012). All of these factors lead us to expect considerable variability in responsiveness across a wildlife population, which in fact has been observed by most studies on this topic. Ellison et al. (2012) review evidence that the context in which an animal is exposed to a sound can strongly affect the probability or the severity of a behavioural response. This leads them to argue that dose-response functions should only be used to predict the probability of response at high sound levels, with multivariate contextual variables being used at low sound levels either to replace acoustic exposure as a predictor for probability of response (Ellison et al., 2012, Figure 2) or in a weighted combination with acoustic exposure. It is not obvious how a management approach that ignores the dosage of sound, especially at low exposure levels, can predict the number of animals likely to be impacted. There may be some circumstances where regulators may choose to prohibit a sound source or activity within detection range of a wildlife population engaged in a specific activity (such as breeding), either because the population is particularly sensitive at that time and place or because disruption of behaviour would be likely to lead to unacceptable population impacts. This approach would be particularly difficult for intense low-frequency sound sources that can routinely be detected hundreds of kilometres away (e.g. Nieukirk et al., 2012).

| Dose-response
In settings where it is not possible to prevent overlap of a stressor and the affected population, it is essential to use dose-response functions coupled with estimates of intensity of exposure for individuals to estimate the number of animals impacted by the stressor. The practicality of using full dose-response functions to estimate takes is demonstrated by the use for over a decade of a sonar risk continuum function in environmental impact statements that evaluate the effect of naval sonar on marine mammals (Navy, 2002, 2016). A significant benefit of functions that relate the probability of response to acoustic exposure is that they enable the selection of a probability of response that is appropriate for each specific policy context. In contrast, the use of a single threshold, such as the RLp50, hinders this calibration of risk in terms of the proportion of the population that is impacted. Malme et al. (1984) selected the RLp50 avoidance value 'rather than the customary 0.95 level since the 0.95 level is not adequately defined by the available data'. This may be reasonable from a scientific perspective, but limiting the focus to RLp50 to estimate the number of takes not only prevents the correct calculation of impact, but also narrows the criterion to a value that may be inappropriate for many regulatory functions. The acceptable percentage of animals impacted depends upon the policy context.
For example, Norwegian support for the Miller et al. (2014) study was motivated by concerns expressed by a whale watch industry that Norwegian naval exercises caused killer whales to vacate the whale watch area, harming whale watch companies (Kuningas, Kvadsheim, Lam, & Miller, 2013). In this case, maintaining half of the whales available for whale watching might meet the needs of the industry. In contrast, the southern resident population of killer whales in Puget Sound is listed as endangered under the US Endangered Species Act, in part because of the risk of behavioural disruption by anthropogenic noise (Krahn et al., 2004; NMFS, 2011). Here it is unlikely that regulators would select an RLp50 threshold of impact that allowed half of the animals exposed above the threshold level to be adversely impacted.
Similarly, acoustic criteria are used by many regulators to establish shut-down zones: areas around a sound source within which the source must be shut down if animals are sighted, to prevent them being harmed. If such a shut-down zone were established using an RLp50 based upon hearing damage, then the shut-down would only protect the least sensitive half of the population. There are few jurisdictions that would accept protective criteria that allow half of the population to be harmed even when exposure is limited to below the threshold level.
Use of the Miller et al. (2014) dose-response function to estimate how many animals are likely to be affected by sound at various distances from the source emphasizes that large numbers of animals are likely to be affected by exposure at long ranges from the source.
At ranges close enough for the probability of response to be high, the area may be small enough that few animals are likely to be affected. At long ranges where the probability of response is low, if the area affected is large enough, then large numbers of animals may be affected because the small probability is multiplied by the large area. Many behavioural response studies have emphasized providing exposures with received levels high enough for high probability of response, but our analysis here emphasizes the importance of quantifying probability of response at low levels of exposure far from the source where the probability of response is relatively low. Such studies will require larger sample sizes to quantify low probabilities of response. Achieving the necessary sample sizes may be facilitated by tagging a large number of animals at varying ranges from the source and/or passive acoustic monitoring of vocal responses of many animals over large areas. The availability of tags that can measure exposure and response over long periods of time would facilitate monitoring responses to operational use of loud sources if animals can be tagged far enough in advance of sound transmission to quantify pre-exposure behaviour, and then can log exposure and potential responses.
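The trade-off between a falling probability of response and a growing area can be illustrated with a short calculation. The sketch below is only a minimal illustration, not the analysis used in this paper: it assumes a uniform animal density, the simple propagation model from the supporting information (source level of 210 dB re 1 μPa m, spherical spreading plus 0.000185 dB/m absorption), and a hypothetical logistic dose-response function whose midpoint (`rl_p50`) and slope (`slope_db`) are invented for illustration and are not the Miller et al. (2014) estimates. The density value and annulus edges are likewise arbitrary.

```python
import math

SOURCE_LEVEL_DB = 210.0          # dB re 1 uPa m, value from the supporting information
ABSORPTION_DB_PER_M = 0.000185   # absorption at 3 kHz, from the supporting information


def received_level(r_m):
    """Received level at range r_m (metres): spherical spreading plus absorption."""
    return SOURCE_LEVEL_DB - 20.0 * math.log10(r_m) - ABSORPTION_DB_PER_M * r_m


def p_response(rl_db, rl_p50=150.0, slope_db=8.0):
    """Hypothetical logistic dose-response; both parameters are illustrative only."""
    return 1.0 / (1.0 + math.exp(-(rl_db - rl_p50) / slope_db))


def expected_responders_by_annulus(edges_m, density_per_km2=0.1):
    """Expected responders per annulus, assuming uniform density (an assumption)."""
    rows = []
    for inner, outer in zip(edges_m[:-1], edges_m[1:]):
        area_km2 = math.pi * (outer ** 2 - inner ** 2) / 1e6
        # Probability of response is evaluated at the annulus midpoint (a coarse approximation).
        p = p_response(received_level(0.5 * (inner + outer)))
        rows.append((inner, outer, p, density_per_km2 * area_km2 * p))
    return rows


edges = [1, 1000, 5000, 10000, 20000, 40000]   # annulus edges in metres (arbitrary)
rows = expected_responders_by_annulus(edges)
for inner, outer, p, n in rows:
    print(f"{inner:>6}-{outer:<6} m  p(response)={p:.3f}  expected responders={n:.2f}")
```

Under these assumed parameters, the probability of response falls with range while the expected number of responders per annulus rises, which is the pattern described above. Different dose-response parameters or propagation conditions could change the balance.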

| Selection of appropriate exposure and response measures
An important aspect of studying dose-response functions is selecting appropriate exposure and response measures (Ellison et al., 2012; Madsen, 2005). As in toxicology, the selection of response measures depends upon a combination of science, policy and regulations. The key for estimating takes by level B harassment under the MMPA is to define responses that cross the threshold of evoking prohibited disturbance. Responses where a subject turns away to avoid exposure to a sound are often treated as a disturbance reaction (e.g. Malme et al., 1984). Avoidance responses are also relevant in other jurisdictions, especially if they involve shifts in distribution over large scales of time and space. For example, a study in Norwegian waters focused on avoidance responses after whale watching companies complained that naval sonar exercises caused a decline in killer whale sightings, harming the industry (Kuningas et al., 2013). Responses treated as disturbance also include cases where exposure to sound causes a subject to switch from one behavioural state such as foraging to another behaviour such as travelling (e.g. Goldbogen et al., 2013; Isojunno et al., 2016), and NMFS even defines specific behavioural events, such as breaching, tail lobbing, underwater exhalation or an animal leaving its group, as strong adverse reactions to human activities (NMFS, 2007). Recent efforts to estimate the population consequences of acoustic disturbance (Pirotta et al., 2018) provide models to help decide which changes in behaviour may reach a threshold appropriate to trigger regulations that are driven by effects on populations. Important parameters to measure in these cases involve the energetic cost of response and the time required for a return to pre-exposure baseline conditions.
The appropriate exposure measure depends on the response being studied. For example, extensive studies on the sound exposures required to reduce hearing sensitivity (temporary threshold shift or TTS) suggest that, to a first approximation, the best predictor is either a very high peak pressure level or the cumulative dose of sound energy (Southall et al., 2007). Most studies on what sounds might disturb a marine mammal have tended to measure the received level of individual sounds, expressed as a root-mean-square or RMS sound pressure level $L_{p,\mathrm{rms}}$, as this can be measured directly (Southall et al., 2007; see Madsen, 2005 for issues concerning RMS measures for transient signals). However, as Ellison et al. (2012) point out, it is often useful to include additional acoustic measures for predicting probability of response. The annoyance value of a loud sound may relate to how far its level exceeds the subject's hearing threshold. This difference, called the sensation level, is also helpful for estimating how faint a signal a subject can detect in quiet conditions. The sensation level is also used for predicting onset of a specific response called the acoustic startle response that is shared among mammals. This aversive response is triggered in mammals by intense sounds with a sensation level >90 dB that have a rise time of 15 ms (Yeomans, Li, Scott, & Frankland, 2002; in marine mammals Götz & Janik, 2011).
In cases where the hearing sensitivity of subject species at the frequencies of an anthropogenic noise is known, audiograms can be used to calculate sensation level, which can be incorporated into dose-response studies. Ellison et al. (2012) argue that measurements of behavioural responses 'invalidate the use of an absolute, dose-response RL approach'. However, selection of exposure measures, such as sensation level, that require audiometric measurements is problematic for species such as baleen whales with no measurements of hearing sensitivity. For species with some measurements of hearing sensitivity, the use of the sensation level will add new sources of uncertainty if information about variability in hearing sensitivity within a population is incomplete, especially if the subjects whose hearing has been measured might come from a biased sample with abnormal hearing, for example owing to injuries related to stranding. Information about hearing can be incorporated into Bayesian analyses in other ways. For example, Miller et al. (2014) assume a zero probability of response for received levels lower than the whales could hear, enabling this hearing threshold to be included in their Bayesian analyses, which used $L_{p,\mathrm{rms}}$ as a response parameter.
For a marine mammal to detect an anthropogenic sound, the animal's hearing must be sensitive enough at the frequency of the sound and the sound must have enough energy above the ambient noise at that frequency. The hearing of marine mammals is very acute, but if a noise source lies outside the frequencies of best hearing, a marine mammal might not be able to hear it. For example, the noise generated by offshore windmills is far enough below the frequency of best hearing for bottlenose dolphins (45 kHz, Popov et al., 2007) that they would not detect windmill noise below 1 kHz recorded in a variety of shallow water habitats (Madsen, Wahlberg, Tougaard, Lucke, & Tyack, 2006). Southall et al. (2007) address these issues by pooling marine mammals into species groups defined by hearing capabilities, and they develop weighting functions to discount sound energy at frequencies the animals are estimated not to hear well. Weighting the levels of the sound stimulus by these functions makes it possible to estimate the sound energy that an animal is likely to hear, even for species without audiometric data.
For the many marine mammal species whose hearing sensitivity has not been measured, most analyses would have to assume that ambient noise limits their ability to detect acoustic signals. For analyses of noise-limited detection ranges, measurement of the noise level at the frequencies of the anthropogenic sound of concern is essential for estimating signal levels below which a subject is unlikely to respond. The signal-to-noise ratio is a critical parameter for this estimate, which requires estimates of the frequency bands over which the subjects' ears integrate acoustic energy. Most mammalian auditory systems integrate sound energy over about a third of an octave, so this is commonly assumed. It is important to note that the bandwidth over which noise should be integrated is a critical parameter for estimating range of effect.
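As a rough illustration of why the integration bandwidth matters, the sketch below computes the noise level in a one-third-octave band from a noise spectral density level and then the resulting signal-to-noise ratio. It assumes that the noise spectrum is flat across the band and uses the standard base-2 definition of a one-third-octave band (edges at the centre frequency times 2^(±1/6)). The function names and the numerical inputs (a 100 dB re 1 μPa signal, a 60 dB re 1 μPa²/Hz noise spectral density, a 3 kHz centre frequency) are hypothetical values chosen for illustration only.

```python
import math


def third_octave_bandwidth_hz(centre_hz):
    """Width of a base-2 one-third-octave band centred on centre_hz."""
    return centre_hz * (2.0 ** (1.0 / 6.0) - 2.0 ** (-1.0 / 6.0))


def band_noise_level_db(noise_spectral_density_db, centre_hz):
    """Noise level in the band, assuming a flat spectrum across it (an assumption).

    noise_spectral_density_db is in dB re 1 uPa^2/Hz.
    """
    return noise_spectral_density_db + 10.0 * math.log10(third_octave_bandwidth_hz(centre_hz))


def snr_db(received_level_db, noise_spectral_density_db, centre_hz):
    """Signal-to-noise ratio of a narrowband signal within its one-third-octave band."""
    return received_level_db - band_noise_level_db(noise_spectral_density_db, centre_hz)


bandwidth = third_octave_bandwidth_hz(3000.0)
example_snr = snr_db(100.0, 60.0, 3000.0)   # hypothetical inputs
print(f"bandwidth = {bandwidth:.1f} Hz, SNR = {example_snr:.1f} dB")
```

Doubling the assumed integration bandwidth raises the band noise level by about 3 dB and lowers the signal-to-noise ratio by the same amount, which in turn shortens the estimated range of detection.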
In addition to the frequency range of signals, their duration is also important for defining acoustic parameters of different stimulus types.
The time window over which the auditory system integrates sound energy is important for estimating the perceived loudness of signals of different durations. Analysis of this integration time for marine mammal ears suggests use of a 125-200 ms window for estimating $L_{p,\mathrm{rms}}$ values, even for signals with longer durations, along with longer time windows for cumulative sound exposure measures such as $L_E$ (Madsen, 2005; Tougaard, Wright, & Madsen, 2015).

| Uncertainty
In addition to dealing with inherent variability within populations, there is considerable uncertainty about many of the estimates used to predict impact of human activities. Many jurisdictions adhere to precautionary regulations, which require regulators to be more conservative the less they know (Foster, Vecchia, & Repacholi, 2000).
Methods to quantify uncertainty help regulators to meet the legal demands of underlying legislation that calls for such precaution. We advocate the use of Monte Carlo simulations to estimate the expected distribution of number of animals impacted based upon distributions of all of the factors that affect this.
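A minimal sketch of such a Monte Carlo simulation is given below. It is an illustration under strong assumptions rather than a recommended implementation: animal density, the dose-response midpoint and the dose-response slope are drawn from invented distributions (lognormal, normal and uniform, respectively), the dose-response function is a hypothetical logistic curve, animals are uniformly distributed, and propagation follows the simple model in the supporting information. Each draw yields an expected number of animals impacted; a fuller analysis would also simulate individual responses and uncertainty in propagation.

```python
import math
import random
import statistics


def received_level(r_m, source_level_db=210.0, absorption_db_per_m=0.000185):
    """Received level from spherical spreading plus absorption (supporting information)."""
    return source_level_db - 20.0 * math.log10(r_m) - absorption_db_per_m * r_m


def p_response(rl_db, rl_p50, slope_db):
    """Hypothetical logistic dose-response function (illustrative, not a fitted model)."""
    return 1.0 / (1.0 + math.exp(-(rl_db - rl_p50) / slope_db))


def expected_impacted(density_per_km2, rl_p50, slope_db,
                      max_range_m=50_000.0, step_m=500.0):
    """Sum density * area * p(response) over annuli out to max_range_m."""
    total, inner = 0.0, 1.0
    while inner < max_range_m:
        outer = min(inner + step_m, max_range_m)
        area_km2 = math.pi * (outer ** 2 - inner ** 2) / 1e6
        p = p_response(received_level(0.5 * (inner + outer)), rl_p50, slope_db)
        total += density_per_km2 * area_km2 * p
        inner = outer
    return total


def simulate(n_draws=2000, seed=1):
    """Draw uncertain inputs from assumed distributions; return sorted impact estimates."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        density = rng.lognormvariate(math.log(0.1), 0.5)   # animals per km^2 (assumed)
        rl_p50 = rng.gauss(150.0, 5.0)                      # dB (assumed)
        slope = rng.uniform(5.0, 12.0)                      # dB (assumed)
        draws.append(expected_impacted(density, rl_p50, slope))
    return sorted(draws)


draws = simulate()
median = statistics.median(draws)
p95 = draws[int(0.95 * len(draws))]   # approximate 95th percentile of the sorted draws
print(f"median = {median:.1f}, 95th percentile = {p95:.1f} animals")
```

Reporting an upper percentile alongside the median gives regulators a way to express precaution quantitatively: the less that is known about the inputs, the wider the assumed distributions and the larger the gap between the two.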
An important source of uncertainty derives from the necessity of extrapolating dose-response functions from species that have been studied to those that have not been studied. Southall et al. (2007) addressed this problem by pooling marine mammals into groups thought to have similar hearing. However, evidence of heightened sensitivity of beaked whales has led NMFS to suggest a different acoustic criterion for harassment of beaked whales compared with the other members of the Southall et al. (2007) mid-frequency hearing group for cetaceans. This suggests that selecting appropriate dose-response functions for poorly studied species can be problematic (Gomez et al., 2016).
Outside of the dose-response function, a major source of uncertainty in estimated impact is often uncertainty in the animal density within the impact zone. One potential method to make more accurate predictions of animal density is habitat modelling of survey data (e.g. Roberts et al., 2016); this can be thought of as the animal density equivalent of context modelling of dose-response functions or location-specific acoustic propagation models: all of these seek to make more accurate predictions by better understanding the factors that cause variation, where these factors can be known in the time and place for which impact is to be estimated. However, as noted by Rob-
Faced with all of this uncertainty, the reader may be tempted to go back to the use of simple thresholds, such as the RLp50 range. We argue to the contrary that coupling simple models for animal density and acoustic propagation with a dose-response function will yield a much more realistic answer than using a single RLp50 threshold. As has been shown, the RLp50 can lead to greater than two orders of magnitude underestimation of effect, much more error than expected from simple models of animal density and acoustic propagation. The use of simplified models of uniform animal distribution and uniform sound propagation is a well-established first-order approximation that yields reasonable estimates if more precise information is not available. The RLp50 calculation, on the other hand, is biased and will yield incorrect estimates for the propagation model and dose-response functions selected as reasonable examples here.

| CONCLUSIONS
The dose-response functions discussed in this paper are more complicated to describe and to apply than the single-value step functions that are common in today's regulations. This complexity is necessary to avoid errors in estimating the number of animals impacted, but some readers may still question whether the complexity is essential for correct implementation of policy. Once the necessary information is available, a new step function, the effective received level, is defined here to better estimate the number of animals impacted. It is important to emphasize that nearly all of the other parameters essential for evaluating impact, including (a) the acoustic properties of sources, (b) how sound propagates and (c) the distribution and abundance of affected animals, require quantitative analytical procedures that are at least as complex as those described here for dose-response functions. Therefore, the primary complication introduced by this approach is to force explicit quantitative judgments about risk and uncertainty about the proportion of a population that is impacted. These kinds of judgments are routine in acoustic source specifications, sound propagation modelling and population estimation. Surely the protection of species at risk deserves the same level of attention. This paper highlights the importance for conservation of not just accounting for high probabilities of impact on a few animals very near a sound source.
Given the shape of the dose-response function and how efficiently sound propagates in the ocean, the number of animals whose exposure level predicts low probability of response may be the dominant impact of the sound source.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of the article.
The intensity of a sound source is called the source level ($L_S$); it is evaluated with respect to a reference range of 1 m and is expressed as dB re 1 μPa m. When a sound spreads equally in all directions, following an inverse-square $1/r^2$ function, the spherical spreading loss in sound energy as a sound passes from 1 to $r$ metres is equal to $20 \log_{10}(r)$. Some sound energy is also absorbed as it passes through the ocean. This absorption loss depends on the frequency of the sound. Here we will assume a sonar sound operating at a frequency of 3 kHz, which has an absorption loss of 0.000185 dB/m in normal sea conditions (Ainslie & McColm, 1998). The overall loss of sound energy as a sound passes from 1 to $r$ metres, called the propagation loss or PL, is the sum of the spreading loss, $20 \log_{10}(r)$, and the absorption loss, $0.000185\,r$.
The passive sonar equation is used to estimate the level of a sound received at range $r$ from a sound source with a source level of $L_S$ (Urick, 2013). This equation simply states that the level received at range $r$ equals the source level measured at 1 m minus the loss in energy as the sound travels from 1 to $r$ metres: the received level $L_{p,\mathrm{rms}} = L_S - \mathrm{PL}$. So, for a sound source of $L_S$ = 210 dB re 1 μPa m transmitting in an environment with the PL described above, the received level translates to $L_{p,\mathrm{rms}} = 210 - 20 \log_{10}(r) - 0.000185\,r$.
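The received-level calculation above can be written as a short function, together with its inverse: the range at which a given received level is reached. The sketch below uses only the values stated in this section (a source level of 210 dB re 1 μPa m, spherical spreading and 0.000185 dB/m absorption). The function names and the use of bisection to invert the equation are choices made for this illustration, not part of the original analysis.

```python
import math


def propagation_loss(r_m, absorption_db_per_m=0.000185):
    """Propagation loss in dB from 1 m to r_m metres: spreading plus absorption."""
    return 20.0 * math.log10(r_m) + absorption_db_per_m * r_m


def received_level(r_m, source_level_db=210.0):
    """Received level at range r_m: source level minus propagation loss."""
    return source_level_db - propagation_loss(r_m)


def range_for_level(target_db, source_level_db=210.0, lo_m=1.0, hi_m=1.0e6):
    """Range in metres at which the received level falls to target_db.

    Received level decreases monotonically with range, so bisection is enough.
    Assumes the target lies between the levels at lo_m and hi_m.
    """
    for _ in range(100):
        mid = 0.5 * (lo_m + hi_m)
        if received_level(mid, source_level_db) > target_db:
            lo_m = mid
        else:
            hi_m = mid
    return 0.5 * (lo_m + hi_m)


for r in (10.0, 1000.0, 10000.0, 100000.0):
    print(f"range {r:>8.0f} m: received level {received_level(r):6.1f} dB re 1 uPa")
print(f"range at which the received level is 150 dB: {range_for_level(150.0):.0f} m")
```

At short ranges the spreading term dominates, whereas at ranges of tens of kilometres the absorption term contributes several decibels, so both terms matter when estimating the area exposed to low received levels.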