Double‐observer line transect surveys with Markov‐modulated Poisson process models for animal availability

We develop maximum likelihood methods for line transect surveys in which animals go undetected at distance zero, either because they are stochastically unavailable while within view or because they are missed when they are available. These incorporate a Markov‐modulated Poisson process model for animal availability, allowing more clustered availability events than is possible with Poisson availability models. They include a mark‐recapture component arising from the independent‐observer survey, leading to more accurate estimation of detection probability given availability. We develop models for situations in which (a) multiple detections of the same individual are possible and (b) some or all of the availability process parameters are estimated from the line transect survey itself, rather than from independent data. We investigate estimator performance by simulation, and compare the multiple‐detection estimators with estimators that use only initial detections of individuals, and with a single‐observer estimator. Simultaneous estimation of detection function parameters and availability model parameters is shown to be feasible from the line transect survey alone with multiple detections and double‐observer data but not with single‐observer data. Recording multiple detections of individuals improves estimator precision substantially when estimating the availability model parameters from survey data, and we recommend that these data be gathered. We apply the methods to estimate detection probability from a double‐observer survey of North Atlantic minke whales, and find that double‐observer data greatly improve estimator precision here too.


Introduction
Line transect (LT) surveys are one of the most widely used methods of estimating animal abundance (see Buckland et al., 2001Buckland et al., , 2014, for an overview of LT and related methods). A persistent problem with LT surveys of marine mammals is failure of one of the key assumptions of standard LT theory, namely that animals at perpendicular distance zero are always detected (Buckland et al., 2001). If this probability is g 0 and true density is D, and if all other LT assumptions are met, then the expected value of a density estimate obtained using standard LT methods is g 0 D (Buckland et al., 2014).
Mark-recapture distance-sampling (MRDS) methods are widely used for dealing with the problem of g 0 < 1: see Burt et al. (2015) and references therein for an overview. They involve independent observers acting as "capture occasions" and rely on "recaptures" (or "duplicates") to estimate g 0 . They are susceptible to bias when there is correlation in detections by the two observers, due to latent variable(s) that are not in the model ("unmodelled heterogeneity"). Correlation can be induced by individual-level or environmental variables that affect the detection probabilities of both observers, and by animals being stochastically available for detection when animal availability to one observer is correlated with its availability to the other. We focus on the issue of availability in this article but include a brief discussion of unmodelled heterogeneity due to other latent variables in Section 6.
Stochastic availability can be dealt with by incorporating a model for the availability process. We refer to such models as "availability process line transect" (APLT) models. APLT models were initially developed assuming a Poisson availability process Okamura, 2003;Okamura et al., 2003), but more recent work has shown that an assumption of Poisson availability can result in substantially biased estimates of detection probability if it is violated (Okamura et al., 2012;Borchers et al., 2013;Langrock et al., 2013). To accommodate more clustered availability series than is possible with a Poisson model, Hiby and Lovell (1998) developed a continuous-time Markov process model, Borchers et al. (2013) developed a discrete-time hidden Markov model (HMM) and Langrock et al. (2013) developed a Markovmodulated Poisson process (MMPP) model for availability.
APLT models consider detection probability as a function of radial distance, and Langrock et al. (2013) show that when using a single observer it is sometimes, but not always, possible to estimate probability of detection at radial distance zero. A more reliable way of estimating this probability is to incorporate a mark-recapture (MR) component in APLT models. This is done by using two independent observers, as with MRDS methods. Some such methods do exist, but all either assume Poisson availability (Schweder et al., 1996;Skaug and Schweder, 1999) or incorporate the availability process non-parametrically by resampling from availability time series data Okamura et al., 2012).
We extend the MMPP model of Langrock et al. (2013) to accommodate double-observer surveys by introducing an MR component. We illustrate using the minke whale survey considered by Skaug and Schweder (1999), re-analyzing their data using a model that allows availability events to be more clustered than under a Poisson process. The latter is a nested special case of our model. Our model also accommodates multiple detections of the same animal at different times; these are informative about the availability process. We investigate estimator performance by simulation when the availability process is assumed known, when the rate parameters of the MMPP availability model are estimated and parameters of the Markov component are known, and when all the parameters of the MMPP are estimated.

Animal Availability Model
We detail the model for the general case in which animals can be in any of N states, although in the application below only the two-state case is considered. Animals switch between N "behavioral" states as determined by an N-state Markov chain in continuous time, with infinitesimal generator matrix Q = q ij i,j=1,...,N . The duration of a stay in state k (k = 1, . . . , N) is exponentially distributed with parameter −q kk . The initial state (which below will be the state the animal is in when entering the detectable range) is determined by the initial state distribution vector π, whose kth element is the probability of initially being in state k. Given that the survey start time is random with respect to animal state, it is reasonable to assume π to be the stationary distribution of the process. This distribution gives the probabilities of the process being in each state at a randomly chosen time. In the case N = 2, the stationary probability of the process being in state 1 is π 1 = q 21 /(q 12 + q 21 ).
The availability model of Hiby and Lovell (1998) is the above process with two states. To complete the MMPP, a Poisson process is superimposed on this model, such that availability events occur according to a Poisson process with state-dependent rate λ k when the animal is in state k. For example, in a two-state model, with state 1 corresponding to the animal resting/breathing at the surface, and state 2 corresponding to foraging dives, λ 1 will typically be much larger than λ 2 ; in fact, the latter will often be zero (which would correspond to animals not surfacing at all while in foraging state).

Embedding the MMPP in a Distance-Sampling
Survey As is the case with most LT models, we assume that animals do not change their perpendicular distance from the line while within detectable range. Our model also assumes that if they move in the forward (along-transect) direction, they all move at a constant speed (which may be zero). To establish our notation, consider a whale at perpendicular distance x that enters the observers' field of view at forward distance y max (x) and is detected at forward distance y, by which time it has been in view for a distance l = y max (x) − y (see Figure 1).
Conditional on an animal at x being available at a distance l from y max (x), we assume that it is detected with probability h (l, x). Although it is a probability, we refer to h as the detection hazard function in order to distinguish it from conventional detection probability functions on line transect Observers move in the y-direction, animals remain at fixed perpendicular distances (x), and observers can see no farther ahead than the curve y max (x). When an animal at x is detected at forward distance y, it has been in view for a distance l = y max (x) − y.
surveys, which depend only on perpendicular distance and not on forward distance. We thus assume that the observed detections are realizations of MMPPs that operate on the forward distance scale after being thinned according to the detection hazard function h(l, x). The MMPP described above is assumed to generate the actual availability events, and for an event at (l, x) a Bernoulli distribution with success probability h(l, x) determines whether that event is detected. We assume that conditional on availability at (l, x), detections are independent between observers so that in the case of two observers, with probabilities of detecting a surfacing (availability event) given by h 1 (l, x) and h 2 (l, x) respectively for the two observers, the probability that any observer detects the surfacing is h(l, (We discuss how violation of this independence assumption might be dealt with in Section 6.) It follows that the distances (y or l) at detections are the events of a Markov-modulated nonhomogeneous Poisson process with state-and distance-dependent rate parameters λ k h(l, x) (k = 1, . . . , N), and with parameters of the underlying Markov chain being {q ij }, i, j = 1, . . . , N (cf. Langrock et al., 2013, for more details).

Density and Cumulative Distribution of Sightings
As will become clear below, in order to estimate the model parameters, we need, among other things, the density function for the forward distances (or distances from y max (x)) of detections and the cumulative distribution for the distance at first detection. Although there are closed-form expressions for these in the case of homogeneous MMPPs, there are in general no closed-form expressions available for these in the case of non-homogeneous MMPPs. However, Langrock et al. (2013) showed that these can be approximated arbitrarily accurately using an h(l, x) that with respect to l is piecewise constant on the intervals [i 0 , i 1 ), [i 1 , i 2 ), . . . , [i p−1 , i p ] (with i 0 = 0 and i p = y max (x)). Let h (r) (x) denote the constant value of h(l, x) on the rth interval, [i r−1 , i r ). Let (x) (r) = diag(λ 1 · h (r) (x), . . . , λ N · h (r) (x)). Langrock et al. (2013) showed that (i) the joint density of detections of an animal (or of a group of animals) occurring at distances l = (l 1 < l 2 < . . . < l d ), with i R j ≤ l j ≤ i R j +1 for j = 1, . . . , d, is given by where l 0 = 0, 1 ∈ IR N is a row vector of ones and We can therefore approximate f l (l|x) and F l (l|x) arbitrarily accurately using a function h(l, x) that is a step function with respect to l by using enough steps.
In addition to generating a vector of observed locations l for a detected animal at perpendicular distance x, a doubleobserver survey also generates a "capture history" for each detection, i.e., a realization of a random variable ω j for detection j, such that ω j = 1 if seen by observer 1 only, ω j = 2 if seen by observer 2 only, and ω j = 3 if seen by both. Conditional on detection at l j , the probability mass function for ω j is trinomial, with index 1 and dependence on x as follows: Then, if we let ω = (ω 1 , ω 2 , . . . , ω d ) denote the capture histories associated with the detections of an animal (or of a group of animals) made at distances l = (l 1 < l 2 < . . . < l d ), the joint density of l and ω, given x, is where both h 1 (l, x) and h 2 (l, x) are piecewise constant with respect to l on the intervals The proof is straightforward, since conditional on l (and x), and assuming independent detections both between observers and of the same animal on different occasions, the probability for ω is f ω|l (ω|l, x) = d j=1 p(ω j |l j , x) and Equation (2) is just the product of this conditional probability mass function and the density function f l (l|x), given in Equation (1). (We deal with dependence of detections of the same animal on different occasions by allowing the hazard to change after first detection-see Sections 4 and 6 below.) Since with this model, each animal generates binary trials (with "success" being detection of a surfacing event) for each observer, irrespective of whether the other observer has detected the animal, duplicates need not involve both observers detecting the same surfacing.

Likelihood
To obtain the full likelihood, for observed capture histories and forward and perpendicular distances at which animals were detected, we condition on the fact that the detection was made within maximum perpendicular distance and within [0, y max (x)] in the forward distance dimension. Let π(x) denote the pdf of perpendicular distances x of all animals within some maximum distance W from the transect line. For detected animals i, i = 1, . . . , n, let l i = (l i1 , . . . , l id i ) and ω i = (ω i1 , . . . , ω id i ) denote detection distances and capture histories, respectively, and let x i be the perpendicular detection distance. Furthermore, let φ be the vector comprising the parameters of the MMPP for availability, let θ be the vector of parameters of the detection hazards h 1 (l, x) and h 2 (l, x), and let β = (φ, θ). Then, making the dependence of f l,ω ( ) and F l ( ) on β explicit, the likelihood for β, conditional on detection within perpendicular distance W and forward distance y max (x i ), is given by This follows from the definition of conditional probability (cf. Langrock et al., 2013). The denominator can be computed via numerical integration, and it simplifies if transect lines are placed randomly, in which case π(x) = 1/W and π(x) cancels.

Parameter Estimation
Data that allow in situ estimation of availability process parameters are very rarely gathered on line transect surveys. Auxiliary data containing information on the surfacing pattern are most commonly used to model the availability process (cf. Skaug and Schweder, 1999). A typical source for such data is GPS tagging. (See Section 3.4 for details of how multiple sets of tag data can be incorporated in our model.) We recommend that data that allows estimation of the availability process be gathered on the survey itself if this is possible. This gives the analyst more options: to use these data alone for availability process estimation, to use them with some availability parameters estimated from auxiliary data, or to use only availability parameters estimated from auxiliary data (when these are available). (See Sections 4 and 5 for more details, and Section 6 for further discussion.) If we take some or all of the availability process parameters φ = ({q jk }, {λ j }), j, k = 1, . . . , N states, to be known, then Equation (3) is a likelihood for the detection hazard parameter vector θ and whichever (if any) parameters of the availability process are not known. We can then obtain maximum likelihood estimates of θ and any unknown availability process parameters by numerically maximizing Equation (3). Confidence intervals for the parameters can be obtained either based on the Hessian of the log-likelihood or by bootstrapping. If the hazard function parameters are obtained conditionally on φ (obtained from auxiliary data), then a two-stage bootstrap should be used rather than the Hessian, with the first stage involving obtaining bootstrap resamples ofφ and the second stage involving bootstrapping the sightings survey data and re-estimation of θ (nonparametrically, with transects as sampling units, for example), conditional on randomly selected resamples ofφ.
Two detection hazard forms suggested by Skaug and Schweder (1999) (given in terms of forward distance y = y max (x) − l) are the exponential power and inverse power models: In the case of automated detectors (acoustic detectors, for example), the assumption of repeated independent detections of the same animal by the same observer is plausible, but this within-observer, within-animal independence is unlikely to hold for human observers. Initial detection is likely to increase the probability of subsequent detections by humans. This can be dealt with by assuming that for each observer the hazard function changes after the initial detection, with a set of parameters θ ini determining the hazard function until the initial sighting, and another set of parameters θ sub determining the hazard function subsequent to the initial sighting. The formulae given above change accordingly. In order to obtain an abundance estimate from a fitted nonhomogeneous MMPP using a conditional likelihood approach (see Borchers and Burnham, 2004, for a discussion of conditional vs full likelihood), we require the effective strip half-width, μ = W 0 F l (y max (z), z; θ)dz. The probability of detecting an animal that is within distance W of the transect line is μ/W (Buckland et al., 2001). For given model parameters, the value of μ can easily be computed using numerical integration. The uncertainty in μ can be quantified through bootstrapping.

Marginalizing Over Individual-Level Availability
It is common to have separate estimates of availability process parameters for a number of different animals. In this case, individual-level heterogeneity in animal availability patterns can readily be accommodated by evaluating the likelihood Equation (3) separately for each set of availability process parameters and summing the likelihoods. This amounts to marginalizing the likelihood over the empirical distribution of individual-level availability model parameters. See Borchers et al. (2013) and Langrock et al. (2013) for details. If some availability model parameters (φ a , say) are estimated directly from availability time series and others (φ b , say) are estimated by maximizing the likelihood given in Equation (3) conditional on these, then the above approach to dealing with individual-level variation in availability parameters would need to be modified. One could still marginalize non-parametrically over φ b (as above) but one would need to model the conditional distribution of φ a given φ b parametrically and marginalize the likelihood over φ a for each individual φ b .

Simulation Design
We conduct two sets of simulations, one with a setup that results in the density of distances in view, l, at initial detection of animals on the trackline, i.e., f l (l 1 |x = 0) = d dl F l (l|x = 0), having mode roughly in the middle of the interval [0, y max (x)] (Scenario I) and the other for which f l (l 1 |x = 0) is monotonically increasing as l increases, up to a mode that is close to y = 0 (Scenario II). Whether or not f l (l 1 |x = 0) has a mode close to y = 0 depends on the detection process and availability process and is not under the control of the surveyor or analyst. Examples of surveys with modes at y = 0 and y > 0 can be found in Figure 3 of Borchers et al. (2013) and Figure  2 of Rekdal et al. (2014). The densities for Scenarios I and II are shown in Figure 2. Note that in Figure 2, we have not conditioned on animals being detected within detectable range, hence these are not pdfs (because they do not integrate to unity-their integral is the probability of being detected by forward distance zero). Estimation of parameters turns out to be more difficult in Scenario II, and we included both these scenarios in this study to emphasize some practical aspects of surveys that need to be considered when using the suggested approach (see discussion at the end of this section).
For each of the two scenarios, we consider single-observer and double-observer estimators, and within each of these we consider two kinds of data, namely data from the initial sighting of an animal only or data from multiple sightings ("IS" vs. "MS"), and when multiple sightings data are used, we either have observers detecting all availability events after their initial sightings with certainty ("h(·) = 1") and specify this in the estimator, or we have h(l, x) increasing after initial detection (but to values less than 1), and estimate two sets of hazard function parameters for each observer; one before first detection, the other after ("h(·) < 1"). Finally, for each of the above scenarios and cases, three kinds of estimators were considered, corresponding to different amounts of information on the availability process being available: one in which the MMPP availability process is fully known to the estimator ("fully known availability": "FKA"), one in which the Poisson rates of the MMPP availability model are estimated and parameters of the state process are known ("partially known availability": "PKA"), and one in which all the parameters of the MMPP are estimated ("not known availability": "NKA"). The two single-observer estimators with FKA and MS (those for h(·) = 1 and h(·) < 1) are not investigated as in these cases the detection of further availability events subsequent to the initial detection does not add any information about the effective strip half-width, compared to the IS case: the performance of the estimator is identical to that of the single-observer estimator in the FKA/IS case.
We therefore have sixteen cases for each of the two scenarios (cf. Tables 1 and 2). The simulations were constructed for situations similar to those from the minke whale survey considered by Skaug and Schweder (1999). For each case, 100 simulations were conducted, where in each simulation run animals passed within perpendicular distance W of observers until 300 animals were detected. The inverse power form (5) of the detection hazard h(y, x) was used, and the shapes of the functions used in the two different scenarios are shown in Figure 3. (Only the functions associated with the detection probability at the initial sighting are shown; in scenarios with multiple sightings and h(·) < 1, the hazard functions have similar shapes but are slightly higher for surfacings subsequent to the initial sighting.) We fixed W = 3000 and y max (x) = 3000 for all x. The MMPP parameters were specified as q 12 = 0.002, q 21 = 0.001, λ 1 = 0.01, λ 2 = 0 in Scenario I and as q 12 = 0.002, q 21 = 0.001, λ 1 = 0.002, λ 2 = 0 in Scenario II. In both scenarios, there are hence no surfacings in state 2, the "diving" state. The expected dive cycle duration in both scenarios is 1500 units. Furthermore, in both scenarios 2/3 of the time is spent in state 2, the state in which animals cannot be detected because they do not come to the surface. In state 1, surfacings occur on average every 100 units in Scenario I and every 500 units in Scenario II. Animals on the trackline (x = 0) are detected with probability approximately 0.79 and Table 1 Simulation scenario I: percentage biases and coefficients of variation (in parentheses) of ESHW estimates for observer 1 (true ESHW: 1618) obtained in different scenarios (100 simulation runs in each scenario); FKA, PKA, and NKA stand for full/partial/no knowledge of the availability process, and IS and MS stand for "initial sighting(s) only" and "multiple sightings", respectively. In case of multiple sightings, h s (·) indicates whether or not detections of an individual's surfacings following its initial detection were considered to be certain (h s (·) = 1) or probabilistic but with changed parameter values of the detection hazard function (h s (·) < 1). stand for full/partial/no knowledge of the availability process, and IS and MS stand for "initial sighting(s) only" and "multiple sightings", respectively. In case of multiple sightings, h s (·) indicates whether or not detections of an individual's surfacings following its initial detection were considered to be certain (h s (·) = 1) or probabilistic but with changed parameter values of the detection hazard function (h s (·) < 1). In some scenarios in which not all parameters are identifiable, it is still possible to estimate effective strip half-width. For example, in the PKA/IS single-observer case, it is possible to estimate λ 1 θ 1 , the product of the surfacing rate and the intercept of the detection hazard function, even though λ 1 and θ 1 are not individually identifiable. In these cases, we reparameterized the model in terms of these estimatable compound parameters.

Simulation Results
Simulation results are shown in 1 and 2. In all cases, we anticipate better estimator performance in Scenario I than Scenario II, because the mode of f l (l 1 |x = 0) is well ahead of the observer in Scenario I but not Scenario II. Estimation of the probability of detection at x = 0 amounts to estimation of F l (y max (0)|x = 0), i.e., estimation of the proportion of the area under f l (l|x = 0) from 0 to ∞ that occurs before l = y max (0). To estimate this proportion, one has to effectively estimate the area under f l (l|x = 0) in that part of the support of f l (l|x = 0) from which one cannot get data (i.e., for l > y max ). If there is a clear decline in f l (l|x = 0) before l = y max (0) (the region from which one does get data), then given that f l (l|x = 0) is unimodal and smooth, the data are somewhat informative about the area under f l (l|x = 0) beyond l = y max , and so one is better able to estimate the proportion than if there is no decline. By contrast, if f l (l|x = 0) increases monotonically with l for l < y max (0), there is very little information in the data about where the mode of f l (l|x = 0) occurs, how high it is, and how long it takes for f l (l|x = 0) to decrease to zero. This makes Scenario II more difficult than Scenario I.
We start by considering the results for the IS (initial sighting only) simulations. The double-observer estimator is less biased (much less in the case of Scenario II) and in all cases much more precise than the single-observer estimator. Indeed, the performance of the single-observer estimator is too poor in Scenario II to be practically useful (with high bias and coefficients of variation greater than 100% in all cases). The double-observer estimator is very much better than the single-observer estimator in the IS case. However, the high bias (−32%) and coefficient of variation (83%) of the doubleobserver estimator with no knowledge of the availability process (NKA) in Scenario II means it may be of questionable utility with initial sightings only in scenarios like this when the availability process is unknown.
The picture is similar in the case of MS (h s (·) < 1). The performance of the single-observer estimator improves to the extent that it is now practically useful in Scenario I (although not Scenario II), but it remains substantially worse than that of the double-observer estimator. In addition, for all of the practically useful estimators, performance is better in the case of MS (h s (·) < 1) than IS, and when the estimator has no knowledge of the MMPP (rows NKA in Table 1), it is quite substantially better. Moreover, the double-observer estimator performs well in the NKA case for both Scenarios. Improved performance is not surprising: when one is ignorant of the MMPP parameters, getting repeat detections of individuals'

Scenario II, observer 2
Figure 3. Detection hazard function, i.e., the probability of a surfacing at forward distance l and perpendicular distance x leading to a sighting, in Scenarios I and II and in each case for the two observers at the initial sighting.
availability events improves one's ability to draw inferences about these parameters. The performance of the estimators in the MS (h s (·) = 1) simulations is broadly similar to that with MS (h s (·) < 1), but estimators are for the most part slightly less biased and slightly less variable. With biases of around 20% and coefficients of variation of about 0.9, the single-observer estimator remains practically unusable in Scenario II.

An Analysis of a Double-Observer Survey of Minke Whales
To estimate minke whale abundance in the Northeastern Atlantic, line transect surveys with two independent observer platforms were conducted from 1996 to 2001 (see Skaug et al., 2004). Transects were located using a systematic design with random start point, and the randomization makes it reasonable to assume that animals are uniformly distributed with respect to perpendicular distance from the lines (i.e., p(x) = 1/W; here W = 2000). After its initial sighting, each animal was continually observed, with the encounter history giving all the animal's sighting events until it had either been seen by observers on both platforms or it had left the detectable range (case MS in the simulations above, with the slight difference that here observation of an animal stopped when it was seen from both platforms). In this survey, it can reasonably assumed (Skaug, pers. commun.) that all availability events of an animal after its initial detection will be detected by the platform making the initial detection (h s (·) = 1 in the simulations above). A total of 870 minke whales were detected in the survey, with between 1 and 8 sighting events recorded per animal (1.7 on average).
Separate data on minke whale surfacing times were gathered by VHF tagging of eight different minke whales . Using these auxiliary data, we first implemented the FKA scenario, fitting a two-state MMPP to the auxiliary data (with one state being diving, in which no surfacing events occur), and then embedding this MMPP in the double-observer shipboard survey analysis. Beaufort  Skaug et al. (2004), the sea state was assumed to influence the scale parameter of the hazard function (Equations (4) or (5) The likelihood was maximized numerically using nlm in R. The exponential power hazard model (Equation (4)) yielded log(L(β)) = −17446.45, while the inverse power hazard model (Equation (5)) gave log(L(β)) = −17432.99; in the following, we focus on the models that use the inverse power model. The likelihood ratio test for the hypothesis that the sea state has no influence on the scale parameter of the hazard function, i.e., that β 1,1 = β 1,2 = β 2,1 = β 2,2 = 0, yields the test statistic 125.5 and hence a p-value of 0, and the hypothesis is rejected.
We also implemented the scenarios PKA and NKA. With PKA, the model was fitted to the survey data conditional on the mean dive cycle length of 391.5 seconds. This value was chosen based on Christiansen et al. (2011), who fitted HMMs to inter-surfacing times and reported mean surfacing intervals of 43 seconds in the surfacing state and 155 seconds in the deep dive state, with a dive cycle comprising "five to six regular dives [in the surfacing state] followed by one deep dive." Table 3 gives the estimates and associated confidence intervals of the MMPP availability process parameters and of the effective strip half-widths, for the three different scenarios considered (FKA, PKA, and NKA). To obtain interval estimates, we conducted, in each of the three scenarios, a bootstrap with 1000 iterations, in each iteration drawing a non-parametric bootstrap sample from the distances associated with sightings (i.e., sampling with replacement from the observed triplets of forward distances, perpendicular distances, and associated covariate value), and, in case of the FKA scenario, an additional parametric bootstrap sample from the models for surfacings (i.e., generating observations from the fitted MMPP availability model and then refitting the model). In case of the PKA scenario, the uncertainty in the expected dive cycle length was accounted for using the uncertainty quantification given in Christiansen et al. (2011).
The differences between the FKA estimates on the one hand and the PKA and NKA estimates on the other are striking. ESHW estimates for PKA and NKA are about 40% of those from FKA and would result in density estimates being about 2.5 times larger for PKA and NKA than FKA. The difference between FKA and PKA estimates is a consequence of the fact that mean dive cycle length obtained by fitting to the minke VHF tag data is 153 seconds, while that of Christiansen et al. (2011) is 2.55 times longer. Estimates of mean dive cycle length and other parameters for NKA are broadly consistent with those from PKA. The differences in dive cycle length and related parameters may be a consequence of the diving behavior at the time and location of the survey being different from those of VHF-tagged animals (which were obtained at different times and locations)-in which case the PKA or NKA estimates are more appropriate. They might also be a consequence of failure of the assumption that all surfacings after initial detection are seen-in which case something closer to the FKA estimates may be appropriate.

Discussion
We have extended the approach developed by Langrock et al. (2013) for single-observer LT surveys with stochastic animal availability, to allow for MR data generated by double observers. Viewed from an MR perspective, the model is a continuous-time variant of the closed-population conditional MR likelihoods of Huggins (1989) and Alho (1990), with detection probability that changes with time, with an MMPP temporary immigration/emigration structure (rather than the completely random or Markovian immigration/emigration structures of Kendall et al., 1997, for example), and with an observed random effect (namely x) in detection probability. As such, it can be classed as a continuous-time MR model of type M th (Otis et al., 1978) with temporary immigration and emigration. If we allow the parameters of the hazards to change after first detection, then it is of type M tbh .
The new approach was shown to improve estimation precision and allows more reliable estimation of ESHW and abundance. It also allows estimation using limited or no independent information on the availability process in some circumstances, particularly when data include multiple sightings of animals by the same observer. Including multiple sightings roughly halved estimator coefficient of variation in our (simulation) Scenario I and reduced it by a factor of between 2 and 4 in our Scenario II.
The estimates of ESHW from the minke survey were sensitive to assumptions about the availability process. As it is often the case that little is known about how availability patterns vary in time and space, using availability process estimates external to the survey may involve strong assumptions, which if violated, can result in biased ESHW and abundance estimates. There is therefore merit in being able to estimate availability process parameters simultaneously with detection hazard parameters from the LT survey data. The cost of incorporating resightings to do this is the need to model how detection probability changes for subsequent sightings, except in cases where detection probability does not change after initial detection. We considered only the case in which all hazard parameters change after first sighting and then stay unchanged. It is possible that only some parameters change, and/or that change depends on how many times an animal has been detected by that point and/or how much time has passed. In principle one can construct hazards accordingly and choose between models using model selection criteria. Sample size may of course limit the ability to distinguish between models. Nevertheless, using resightings data seems to us something worth doing as these data contain information on availability and one may avoid substantial bias by estimating availability in situ.
In the light of our analyses, we recommend: (i) that doubleobserver survey methods be used when this is feasible and (ii) that consideration be given to instructing observers to record all detections of animals, not just the initial detection. If it is infeasible to record multiple detections in areas of high animal density, we recommend that this be attempted only in lower density areas. Detection probability estimates from such data will rely on the assumption that availability processes are similar in high-and low-density areas, and surveyors should consider whether this is a plausible assumption in the context of their surveys.
A final issue, to which we alluded in Section 1, is bias due to unmodelled heterogeneity. We can investigate how much of the bias from unmodelled stochastic availability has been accounted for by modeling the availability process, by comparing detection probability from our model (p = μ/W) with detection probability from the so-called "full independence" (FI) model (see Borchers et al., 2006;Burt et al., 2015), which assumes independence conditional on x: p FI = W 0 p FI (x)π(x)dx, where p FI (x) = 1 − {1 − p 1 (x)}{1 − p 2 (x)}, p i (x) = F l(i) (y max (z), z; θ), and F l(i) () is the same as F l ( ), but evaluated using h i ( ) in place of h( ) (i = 1, 2). For Scenarios I and II, p is 5 and 10% smaller than p FI , respectively, suggesting that p FI is biased by 5% to 10% due to neglecting stochastic availability. (Bias can be much larger; we constructed other scenarios with bias of more than 30%.) Neglect of other variables may also cause bias. We incorporated sea state in our analysis of minke sightings but there may be unrecorded variables that affect detectability of animals. MRDS estimators deal with this using "point independence" (PI) models (see Borchers et al., 2006;Burt et al., 2015, for details). These model the shape of the detection function using a model, p shape (x), with intercept 1, for the distribution of detections by any observer in the x dimension (without using recapture data or assuming independence between observers' detections), and a separate FI model, p(x), that uses the recapture data and assumes independent detections, given x. They are combined thus to estimate detection probability: p(0)p shape (x). One could do the same at the level of the detection hazard, conditional on (l, x). For this one needs a model h shape (l, x), with intercept 1, for the distribution of detections by any observer in the (l, x) plane. A PI hazard function might then be h(y max (0), 0)h shape (l, x), where h(y max (0), 0) is h(l, x) = 1 − {1 − h 1 (l, x)}{1 − h 2 (l, x)} evaluated at (l = y max (0), x = 0).
There are, however, difficulties that arise when using PI at the hazard level rather than the perpendicular distance detection function level. For example, having a single h shape (l, x) is in general inadequate because animals that are undetected by l should have a different h shape (l, x) from animals seen by observer 1 only by l, and these may have a different h shape (l, x) from animals seen by 2 only by l. In general, one would require three related h shape (l, x) functions (and need to estimate up to three sets of parameters), and it is not obvious how they should be related, without making the kind of independence assumption that one is trying to avoid by constructing a PI model. This is a topic for future research.

Supplementary Materials
The R code used to fit the models and generate the results of this article is available at the Biometrics website on Wiley Online Library.