Analysis of photo-id data allowing for missed matches and individuals identified from opposite sides



  1. In many species, photo-identification could be used as an alternative to artificial marking to provide data on demographic parameters. However, unless the population is very small or fragmented, software may be required to pre-screen and reject most image pairs as potential matches.

  2. Depending on the species and method used to obtain images, currently available software may falsely reject some matches. We estimate the false rejection rate (FRR) of the ExtractCompare (EC) program when used to pre-screen images of female grey seals. Filtering images manually to reduce the FRR involves subjective assessment of image quality, reduces the amount of data available and may bias the results in favour of relatively well-marked individuals.

  3. The data may contain individuals identified only from the left side or the right side, as well as individuals identified from both sides.

  4. Missed matches resulting from false rejections by pre-screening software and/or inclusion of individuals identified only from opposite sides cause some individuals to generate multiple encounter histories.

  5. We describe an open population model for data of this type which, given a measured risk of missing a match between a randomly selected pair of images of the same individual, provides maximum likelihood (ML) estimates of initial population size, survival/emigration and immigration/recruitment by calculating the expected frequency of any encounter history that could be generated.

  6. As a case study for the method, we used EC to pre-screen photographs of female grey seals on a breeding colony and generate encounter histories over five successive seasons. Allowing for the measured FRR, we calculated ML estimates for comparison with estimates from previous studies.

  7. We also used the model with encounter histories simulated using the same FRR to give the same mixture of left side, right side and both sides histories and derived ML estimates for comparison with the values used to drive the simulation.

  8. With FRR set at up to 33%, the method gave estimates of the abundance and survival parameters used in the simulation model that were biased by at most 4·7% up and 3% down, respectively. The results of the grey seal case study were consistent with previous estimates of apparent survival and trends in abundance.


The use of photographs of individually distinct animal markings to provide data on population dynamics is widespread and likely to increase with improvements in camera and camera trap technology (Sarmento et al. 2010; Sollmann et al. 2011). As the size of the resulting photograph catalogues increases, interest in automated systems for maintaining those catalogues is also increasing (Arzoumanian, Holmberg & Norman 2005; Van Tienhoven et al. 2007). Almost invariably data derived from photo-id studies are analysed as capture–mark–recapture (CMR) data, exploiting the availability of software such as MARK (White & Burnham 1999) to provide the estimates (Carroll et al. 2011; Graham et al. 2011). Data are entered as a capture history matrix with rows of ones and zeroes representing the capture of an individual (in this case on a photograph) or the failure to capture it during successive sampling occasions; in a multi-state model, the ones may be replaced by codes representing the capture of individuals in different states, for example, at different locations (e.g. Nichols & Kendall 1995). However, whereas the history matrix provides sufficient statistics for parameter estimation from CMR data, the same does not hold, in general, for photo-id data where false rejections by the pre-screening software and individuals identified from opposite sides may generate multiple histories from a single individual.

It may be possible to avoid multiple encounter histories by restricting the photographs to those of sufficient quality to eliminate the risk of missing matches and that either all show the same side or are in pairs showing both sides, but only at the cost of reducing the amount of data available (e.g. Arzoumanian, Holmberg & Norman 2005). Furthermore, establishing the criteria by which to select photographs on the basis of quality is complex and any correlation between photograph quality and pattern distinctiveness may lead to selection in favour of a subset of individuals and bias in estimation of population size (Arnbom 1987; Friday et al. 2000, 2008).

Stevick et al. (2001) developed a correction for the Petersen two-sample abundance estimator to account for a measured false rejection rate (FRR), and da Silva (2009) extended the correction for closed populations sampled on more than two occasions by using Bayesian models in an iterative process to suggest likely values for the actual but unknown numbers of different individuals, sample sizes and resightings. Link et al. (2010) also relate observed encounter history frequencies to the underlying ‘latent’ histories in a closed population model, but the relationship is based on the assumption that all false-negative errors lead to single-encounter ‘ghost’ histories (Yoshizaki 2007), which does not hold in general for photo-id.

Morrison et al. (2011) estimate survival in a Cormack–Jolly–Seber model by censoring initial photographs to avoid multiple histories resulting from missed matches. However, their analysis is again based on the assumption that all missed matches are due to low-quality images that cannot be matched to any other image.

In this study, we investigate an alternative solution to dealing with multiple encounter histories resulting from both missed matches and individuals identified from different sides. We use test sets of images of known animals to measure the risk that a randomly selected pair of images showing the same side of the same individual will be falsely rejected by the automated pre-screening process. We then use that ‘pairwise FRR’ in a Jolly–Seber open population model to calculate the expected frequency of each encounter history recognising that, as a result of false rejections, an individual can give rise to more than one history. Furthermore, if the history matrix, although not multi-state in the usual sense, specifies the state of each encounter as being from the left, right or both sides, individuals identified only from different sides can be included by calculating the expected frequency of each aspect-specific history. Corkrey et al. (2008) used the same approach to construct ‘bilateral histories’ from photographs of bottlenose dolphins in north-east Scotland; however, they were able to neglect any risk of false rejection because automated pre-screening was not required with the relatively small population of marked dolphins and ‘only high-quality pictures were used to minimise errors in misidentification’.

We tested our method using photographs of female grey seals taken at a breeding colony on the island of North Rona from 2004 to 2008. They were pre-screened using the ExtractCompare (EC) program originally developed for photographs of grey seals swimming off a haulout (Hiby & Lovell 1990) and used since to maintain photo-id catalogues for a number of other species (e.g. Kelly 2001; Hastings, Hiby & Small 2008; Hiby et al. 2009). Previous studies of this colony have provided a large sample of photographs already known, by visual comparison and in a few cases by brands or flipper tags, to show the same animals. We used pairs of photographs taken on different dates to measure the FRR of the EC program on the type of photographs available from the breeding site. Those studies also provided data on longitudinal behavioural ecology and population dynamics at this site (Smout, King & Pomeroy 2011), which has been monitored annually using aerial survey, thus providing trends in abundance as well as previous estimates of population parameters to compare with the results of our analysis.

Materials and methods

To avoid disturbance, photographs were taken from a hide overlooking a photographable area (PA) of the breeding colony at North Rona, U.K. (59°06′N, 05°50′W) (Fig. 1). Only adult females were photographed because adult males do not have pelage patterns suitable for the comparison algorithms. A daily map of the position of individuals over the PA was maintained and the photographer attempted to obtain an ID photograph of the left and right side of as many individuals as possible.

Figure 1.

Location of the North Rona breeding colony relative to the United Kingdom mainland with dark shading delineating the photographable area (PA). The location of the hide from where photographs were taken is also indicated. Around 40% of the island's pup production comes from the PA, and most of the remainder are born at the northernmost tip (Pomeroy et al. 1994).

The EC software calculates a similarity score between patterns scanned from photographs showing the same side of the body. To minimise the FRR, we use ‘multibiometric identification’ (Jain 2007), combining the scores returned by two algorithms on patterns scanned from the neck region and, when visible, from the flank region. Potential matches are those pairs for which EC returns a combined score exceeding a threshold value. These are then checked visually to establish which seals were photographed within which season and whether a seal was recorded only from the left (coded as ‘L’), only from the right (‘R’) or from both sides (‘B’) within a season. Visual confirmation is conservative: if there is any doubt that two photographs show the same seal, they are not matched.

To estimate the pairwise FRR, we accumulated photographs taken anywhere on the island of 156 adult female grey seals known visually or via flipper tags or brands. The photographs provided 155 pairs taken of the same seal on different dates and showing the same side of the neck. The known seals were those selected randomly for previous physiological studies and are thus representative of the adult female grey seal population in terms of their pelage patterns. We then calculated the percentage of pairs for which the combined score failed to exceed a threshold value of 0·95.
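The calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the example scores are hypothetical and are not data from the study; the only element taken from the text is the rule that a known matching pair counts as a false rejection when its combined score fails to exceed the threshold.

```python
def pairwise_frr(scores, threshold=0.95):
    """Fraction of known same-animal pairs whose combined score
    fails to exceed the threshold (i.e. would be falsely rejected)."""
    if not scores:
        raise ValueError("need at least one known matching pair")
    missed = sum(1 for s in scores if s <= threshold)
    return missed / len(scores)


# Hypothetical combined scores for six known matching pairs
# (illustrative values, not results from the test set).
example_scores = [0.99, 0.97, 0.40, 0.96, 0.80, 0.999]

frr_at_095 = pairwise_frr(example_scores)                   # two of six pairs missed
frr_at_075 = pairwise_frr(example_scores, threshold=0.75)   # one of six pairs missed
```

Lowering the threshold reduces the proportion of missed pairs in this toy example, at the cost (not modelled here) of more non-matching pairs passing the pre-screen.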

Data Analysis

Data generated by any photo-id project in which the risk of missing a match is not negligible will include multiple encounter histories from some of the animals. Over five sampling occasions, an individual (in our case study, a female grey seal) may, for example, generate histories 01001 and 00100 if the image taken on the third sampling occasion is not matched to either of those taken on the second or fifth. That might be because it was identified only from the left in samples two and five and only from the right in sample three, or because, although all images were from the same side, the standardised scores between the third image and the other two failed to exceed the threshold used.

The risk of multiple encounter histories precludes the use of the usual multinomial model for presence/absence capture history frequencies. Inference in our approach is based on maximising the likelihood of encounter history frequencies where each history specifies the sides from which the seal was photographed during each season. For example, 0L00L represents a seal that we know was photographed from the left at least once during each of seasons 2 and 5, whereas 0L00B represents a seal that we know was photographed at least once from the left during season 2 and from both sides during season 5. At each stage of the maximisation, we calculate the expected frequency of each possible encounter history from the current demographic parameter values, allowing for the measured pairwise FRR. Any history consisting only of L encounters or only of R encounters is possible, as is any mix of L and R provided there is at least one B, for example, L0R0B. A seal identified from both sides in the same breeding season only as a result of matches to a B encounter in a different season generates two histories (for example, L0R0B and 00L0B, if there were separate right and left side encounters with the same seal during season three).
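The rule for which aspect-specific histories can occur may be easier to see in code. The sketch below is an illustration under our reading of the paragraph above; the function name is hypothetical, and histories are assumed to be strings over the symbols 0, L, R and B, one character per season.

```python
def is_possible_history(history):
    """Return True if a single linked ID could produce this history.

    Rule taken from the text: L-only and R-only histories are possible,
    and L and R may be mixed only if at least one B encounter links them.
    """
    symbols = set(history)
    if not symbols <= set("0LRB"):
        raise ValueError("history may contain only 0, L, R and B")
    if symbols <= {"0"}:
        return False  # no encounter at all, so nothing is observed
    if "L" in symbols and "R" in symbols:
        return "B" in symbols
    return True


checks = {h: is_possible_history(h) for h in ["0L00L", "0L00B", "L0R0B", "L0R00"]}
```

Here `L0R00` is rejected because nothing links the left-side and right-side encounters; in the data such a seal would appear as the two separate histories `L0000` and `00R00`.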

To calculate the expected frequency of a given history, we use the following notation, where subscript i refers to the season number from season 1 to n.

FS and LS, the first and last encounter in the history, for example, 2 and 5 in 0L00L.

nB, the number of B encounters in the history, for example, one such in L0R0B.

nL and nR, the number of seasons in which the seal was identified from the left and the number in which it was identified from the right, for example, two and two in L0R0B, and three and one in L0L0B.

math formula, the number of seals joining the local population as a result of immigration or recruitment between seasons i − 1 and i. math formula represents the size of the local population at season 1, that is, the number of females that had their pup within the PA in that year plus recruited females that failed to pup but would have used the PA.

φi, annual apparent survival (thus including emigration) from season i − 1 to i. For convenience, φn+1 is set to 0.

Plocal, the probability that a seal in the local population is available to be photographed during a season, assumed to be independent from season to season and the same in each season.

PL,i, the probability an available seal is photographed from the left at least once during season i, varying from season to season as a result of variation in sampling effort.

PR,i, the probability an available seal is photographed from the right at least once during season i, varying from season to season as a result of variation in sampling effort.

τi, the probability that the seal turns when it is encountered so that both sides can be photographed and linked to the same ID at that time.

Pmiss, the estimated pairwise FRR.

math formula, φi, Plocal, PL,i, PR,i and τi are all free parameters used to maximise the likelihood; we regard Plocal, PL,i, PR,i and τi as nuisance parameters and math formula and φi as the parameters of interest because they estimate initial population size, apparent survival and recruitment/immigration. In the likelihood function, Pmiss is assumed to be known without error.
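The history-level quantities FS, LS, nB, nL and nR defined in the notation above can be read directly from a history string. The following is a minimal sketch; the function name and the dictionary layout are our own choices for illustration, and the counting rule (a B encounter counts towards both nL and nR) follows the worked examples given in the definitions.

```python
def summarise_history(history):
    """Compute FS, LS, nB, nL and nR for an aspect-specific history.

    Seasons are numbered from 1. A 'B' encounter counts as an
    identification from the left and from the right.
    """
    seen = [i for i, c in enumerate(history, start=1) if c != "0"]
    if not seen:
        raise ValueError("history contains no encounters")
    return {
        "FS": seen[0],                            # first season with an encounter
        "LS": seen[-1],                           # last season with an encounter
        "nB": history.count("B"),
        "nL": sum(c in ("L", "B") for c in history),
        "nR": sum(c in ("R", "B") for c in history),
    }


summary_1 = summarise_history("0L00L")
summary_2 = summarise_history("L0R0B")
summary_3 = summarise_history("L0L0B")
```

The three calls reproduce the examples in the definitions: first and last seasons 2 and 5 for 0L00L, two and two for L0R0B, and three and one for L0L0B.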

To calculate the probabilities of events L, R and B as functions of the nuisance parameters, we assume left and right encounters with an available individual occur as Poisson processes at rates varying with side and from season to season. Then, the probabilities of events L, R and B equal Plocal math formula, Plocal math formula and Plocal math formula, where ∼PL,i and ∼PR,i represent 1 − PL,i and 1 − PR,i (see Appendix S1).

Multiplying those probabilities for the encounters in a given history gives the probability that those encounters would be recorded for a local seal alive at least over that period (i.e. from FS to LS). However, to generate the given history, all those encounters must also be recognised as being with the same seal. The algorithm used to calculate the probability of that happening is described in Appendix S1. Finally, we need to multiply by the probability that it is not recorded in each season for which it was in the local population but for which the encounter type in the given history was 0, allowing for the risk that it could be photographed in that season but not recognised. Assume the risk of failing to recognise an individual reduces exponentially with the number of images available. Then, the probability of not recording the seal at a time i when it was part of the local population equals

display math
display math
display math

T1 is the probability that the seal is photographed from the left only and recognised, T2 that it is photographed from the right only and recognised, and T3 that it is photographed from both sides and recognised, whether or not both sides are linked to the same ID at that time. Here, we assume that only a single photograph representing the left or right side of an individual is retained in a given season; if each individual is represented by a number of photographs within a season to aid identification between seasons, then the definition of Pmiss needs to be modified to the risk of failing to match between randomly selected sets of photographs of that size rather than between single photographs.

To generate a given history, a seal must arrive in the local population by season FS and survive and remain in the population until at least season LS. Let Pr(hist h | a, d) represent the probability of observed history h, calculated as described above, for a seal that arrives in the local population in season a and stays until season d; a and d determine the number of seasons preceding FS and following LS in which the seal was not seen or not recognised. So, to calculate the expected frequency, we sum over the possible arrival and departure times, multiplying Pr(hist h | a, d) by the number of seals arriving at those arrival times and their probability of survival over exactly that period:

display math

where math formula denotes expectation, and fh represents the observed frequency of history h.
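The double summation described above can be sketched in code. This is our reading of the verbal description, not a transcription of the published equation: the names `arrivals`, `phi` and `prob_history` are hypothetical, "survival over exactly that period" is taken to mean surviving each interval up to season d and then not surviving to season d + 1, and the convention that survival beyond the final season is zero follows the definition of φ given earlier.

```python
def expected_frequency(first_seen, last_seen, arrivals, phi, prob_history):
    """Expected frequency of one history, summed over arrival season a
    (1..FS) and departure season d (LS..n).

    arrivals[a]        number of seals joining in season a (arrivals[1] is
                       the initial local population size)
    phi[i]             apparent survival from season i - 1 to i, for i = 2..n
    prob_history(a, d) probability of the history given arrival a, departure d
    """
    n = max(arrivals)  # number of seasons
    total = 0.0
    for a in range(1, first_seen + 1):
        for d in range(last_seen, n + 1):
            survive = 1.0
            for i in range(a + 1, d + 1):
                survive *= phi[i]
            # Survival beyond the last season is set to zero by convention.
            next_phi = phi[d + 1] if d < n else 0.0
            total += arrivals[a] * survive * (1.0 - next_phi) * prob_history(a, d)
    return total


# Toy check with two seasons and a history probability of 1 for every (a, d):
toy_arrivals = {1: 100.0, 2: 50.0}
toy_phi = {2: 0.8}
all_from_season_1 = expected_frequency(1, 1, toy_arrivals, toy_phi, lambda a, d: 1.0)
alive_in_season_2 = expected_frequency(2, 2, toy_arrivals, toy_phi, lambda a, d: 1.0)
```

In the toy check the first call recovers the 100 seals present in season 1 (20 leaving after season 1 plus 80 staying), and the second gives the 80 survivors plus 50 new arrivals present in season 2.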

To derive the likelihood, we assume the observed frequencies are independently distributed as Poisson variates:

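$$ f_h \sim \mathrm{Poisson}\bigl(\mathrm{E}(f_h)\bigr) $$

$$ \log L = \sum_{h} f_h \log \mathrm{E}(f_h) - \sum_{h} \mathrm{E}(f_h) + \text{constant} $$

where $\mathrm{E}(f_h)$ is the expected frequency of history $h$, the first summation is over all observed histories and the second is over all possible histories.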
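A minimal sketch of a log-likelihood of this form is given below, assuming independent Poisson counts as stated above. The function name and the dictionary inputs are hypothetical; in practice the expected frequencies would come from the model for every possible history, not only the observed ones.

```python
import math


def poisson_log_likelihood(observed, expected):
    """Log-likelihood of observed history counts under independent
    Poisson distributions.

    observed  dict mapping each observed history to its count
    expected  dict mapping every possible history to its expected frequency
    """
    loglik = 0.0
    # First sum: only observed histories contribute a count term.
    for history, count in observed.items():
        loglik += count * math.log(expected[history]) - math.lgamma(count + 1)
    # Second sum: every possible history contributes its expected frequency.
    loglik -= sum(expected.values())
    return loglik


# Toy example with two possible histories, one of them never observed.
toy_loglik = poisson_log_likelihood({"L0": 2}, {"L0": 2.0, "0L": 1.0})
```

Histories that were possible but never observed still lower the likelihood through the second sum, which is why it must run over all possible histories.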

Simulation Tests

In addition to applying the analysis to observed grey seal data from North Rona, we applied it to simulated photo-id data to check for bias in the estimates of apparent survival and local population size that might result from failure of the model assumptions. The estimator was tested against simulated data affected by pairwise FRR of up to 33%. The simulation model is described in Appendix S2.

Results


The Risk of Missing a Match

Using all photographs that showed the neck region and including the scores between flank regions when those were visible, the percentage of image pairs for which the combined score failed to exceed the 0·95 threshold was 33%. The percentage reduced to 14% when the test pairs were restricted to those for which image quality in both neck and flank region had been subjectively classified as ‘good’. The percentage was further reduced to 10% by reducing the threshold to 0·75 but at the expense of increasing the false acceptance rate (FAR) of the EC screening software, that is, the percentage of non-matching images that need to be inspected, from 0·24% to 0·52%. Further reduction in the threshold value results in a rapid increase in the percentage of non-matching images that need to be inspected for only a limited reduction in the risk of missing a match.

The FRR is rapidly reduced by the availability of additional images as illustrated in Fig. 2, based on test results using the flank region only to give a deliberately high pairwise FRR of 46%.

Figure 2.

This figure uses groups of from one to eight images taken at North Rona that show both the neck and flank regions and are known to be of the same seal via comparison of the neck region patterns. In some groups the latest image could also be identified via the flank region pattern because the standardised score between its flank region pattern and the flank region pattern in at least one of the previous images exceeded the 0·95 threshold. The percentage of latest images in which that was the case is plotted, as the black columns, against the number of previous images. The grey columns show the expected percentage identifiable via the flank region if the risk of failing to achieve a standardised score of 0·95 with each previous image was independent at the measured pairwise risk of 46%.
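The expected percentages under independence (the grey columns) follow from a one-line calculation, sketched below. The function name is our own; the only inputs taken from the text are the pairwise risk of 46% and the range of one to eight previous images.

```python
def expected_identified(n_previous, pairwise_frr=0.46):
    """Probability that the latest image exceeds the threshold with at
    least one of n_previous images, assuming each comparison fails
    independently with probability pairwise_frr."""
    return 1.0 - pairwise_frr ** n_previous


# Expected percentage identifiable for one to eight previous images.
expected_pct = {k: round(100 * expected_identified(k), 1) for k in range(1, 9)}
```

With one previous image the expected percentage is 54%, and it rises to about 79% with two, which is the rapid reduction in the risk of a missed match referred to in the text.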

Data Analysis and Simulation

The point estimates in rows A to C of Table 1 were derived by maximising the likelihood of observed encounter history frequencies based on all photographs showing the neck region, including the scores between flank regions when those were visible, and, therefore, subject to a pairwise FRR of 33%; whereas in row D, the images were filtered to retain only those from which it was possible to scan ‘good’ pattern extracts from both the neck and flank regions and, therefore, subject to a pairwise FRR of 14%. Confidence intervals were calculated using the asymptotic chi-squared distribution of the likelihood ratio.

Table 1. Models A to C were derived by maximising the likelihood of observed encounter history frequencies based on all photographs showing at least the neck region. Pmiss represents the pairwise false rejection rate (FRR). Estimated parameters were apparent survival, local initial population size and recruitment/immigration. Apparent survival and recruitment/immigration (i > 1) are held constant across seasons in all models except C. In model D, images were restricted to those from which it was possible to scan ‘good’ pattern extracts from both the neck and flank regions

| Model | Extract type | Pmiss (%) | Apparent survival | Local initial population size | Recruitment/immigration (i > 1) | ΔAIC |
|---|---|---|---|---|---|---|
| A | Neck | 33 | 0·84 (0·794–0·889) | 373 (303–452) | 30 (16–43) | 0 |
| B | Neck | 0 | 0·79 (0·754–0·835) | 440 (364–526) | 56 (41–71) | 62·39 |
| C | Neck | 33 | 0·86 | 409 (229–410) | 38 | 2·73 |
| D | Filtered neck & flank | 14 | 0·79 (0·660–0·954) | 314 (190–454) | 0 (0–22) | n/a |

Models A to C differ in the size of the correction for FRR applied and whether or not apparent survival and recruitment/immigration were time-dependent, that is, allowed to vary with inter-season interval. In all models A to D, the estimated probabilities of being photographed from left, right or both sides were free to vary from season to season because the number of images obtained in each season was very variable.

In model A, apparent survival and recruitment/immigration were not time-dependent and the estimated pairwise FRR of 33% was used as the Pmiss correction. The 0·84 estimate for apparent annual survival over the period 2004 to 2008 (95% confidence limits = 0·79–0·89) is consistent with the geometric mean of 0·89 given in Smout, King & Pomeroy (2011) for the period 1978–2004, with lower estimates over the second half of that period (their Fig. 7). From 2004 to 2008, the estimates of initial population size, recruitment/immigration and apparent survival give an average of 321 seals using the PA. There is a significant reduction (likelihood ratio test, P < 0·05) from 373 in 2004 to 277 in 2008, as shown in Fig. 3 for comparison with average and peak counts over the PA and aerial survey estimates of pup production over the whole island.
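The population trajectory implied by the model A point estimates can be approximated with a short recursion. This is a sketch under an assumption of ours: each season's local population is the previous season's survivors plus the new arrivals. The function name is hypothetical and the inputs are the rounded values from Table 1, so the output is only expected to be close to the figures quoted above.

```python
def local_population_trajectory(initial_size, survival, recruits, n_seasons):
    """Local population size in each season, assuming size in season i equals
    survival * size in season i - 1 plus the number of new arrivals."""
    sizes = [float(initial_size)]
    for _ in range(n_seasons - 1):
        sizes.append(sizes[-1] * survival + recruits)
    return sizes


# Rounded model A point estimates from Table 1: 373 seals, 0.84, 30 per year.
trajectory = local_population_trajectory(373, 0.84, 30, 5)
mean_size = sum(trajectory) / len(trajectory)
```

With these rounded inputs the recursion gives roughly 280 seals in the fifth season and a five-season mean of roughly 322, close to the 277 and 321 reported; the small differences presumably reflect rounding of the published estimates.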

Figure 3.

White triangles show the trajectory of local population size estimates from the capture–mark–recapture (CMR) model corresponding to the estimates of initial population size, apparent survival and recruitment/immigration for seals in the North Rona photographable area (PA). The black triangles, black circles and white circles show the annual pup production estimates at North Rona and the peak and average counts over the PA taken from the hide, respectively.

To check for bias, the simulation model was run 1000 times with pattern quality set at a constant 1·4 to give a 33% pairwise FRR (see Appendix S2) and parameter values set equal to the model A estimates. The resulting estimates of apparent survival and average population size were on average 0·6% lower and 3·5% higher than the values used in the simulation. As illustrated in Appendix S2, variation in pattern quality between individuals will cause the risk of failing to identify an individual to reduce with an increasing number of available photographs more slowly than assumed in the estimation model. To investigate this effect on the estimates, the simulations were repeated with pattern quality uniformly distributed from 0 to 3·8. The resulting estimates of apparent survival and average population size were on average 3% lower and 4·7% higher than the values used in the simulation.

In model B, the correction for FRR was omitted resulting in a reduction in estimated apparent survival and increase in estimated local population size, in line with our simulation results. Averaging over 1000 runs of the simulation with pairwise FRR set at 33%, apparent survival was biased down by 6% and average local population size up by 33% when the simulated history frequencies were analysed with Pmiss set to zero.

In model C, we allowed apparent survival and recruitment/immigration to vary between inter-season intervals, so the values given in Table 1 are averages. However, the increase in the log likelihood is insufficient to support variation in those parameters over this period.

Finally, in model D, we investigated the option of reducing the FRR by filtering the images according to quality assessment, restricting the images used to those from which it was possible to scan ‘good’ pattern extracts from both the neck and flank regions. The sample size was reduced from 709 to 227 potentially different seals and the pairwise FRR reduced to 14%. The confidence interval on the survival estimate was much wider; furthermore, the population size estimate dropped below the almost 250 seals counted within the PA in some years (Fig. 3), suggesting that in addition to being reduced in size, the sample may be biased towards a subset of seals that pup relatively close to the hide from which the photographs are taken. There is also a risk that restricting images to those of high quality may cause a bias in favour of well-marked animals because pattern quality may affect the assessment of image quality. The results thus suggest that it may be more effective to correct for the risk of missing matches than to try to reduce that risk by filtering the images on the basis of a subjective quality measure.

Discussion


There are good reasons for analysing photo-id as conventional CMR data. Software for analysing CMR data is not only readily available but provides access to a very wide range of models supported by an extensive literature (White & Burnham 1999; Choquet et al. 2004; Choquet, Rouan & Pradel 2009). However, the loss of information required to derive CMR data from the photographs may be considerable, particularly where there is little control over camera angle and animal posture. As a very simple example, consider estimating the minimum size of a remnant population from camera trap photographs. To provide conventional CMR data, each new photograph would be recognised as showing an existing individual, accepted as showing a new individual or rejected as not showing enough to place it reliably in either category. However, by recording which of the existing individuals are definitely not shown in a photograph which would otherwise be rejected, it is often possible to place a lower bound on the population size that exceeds the number of different animals identified. The point is not purely academic: in a remnant population, a lower bound may be more valuable than a population estimate.

The alternative to censoring to leave data suitable for CMR analysis is to incorporate a correction for the false-negative errors (we assume the risk of false-positive errors is negligible). Our approach differs from previous work (e.g. da Silva 2009; Link et al. 2010) in being based on an open population model, incorporating identifications from only one or both sides and rejecting the assumption that the ‘ghost’ individual created by a false-negative error cannot be matched to another photograph. For the special case of a closed population with all individuals photographed from the same side in only two samples, our abundance estimator is the same as the modified Petersen estimator in Stevick et al. (2001) except that we do not allow for false-negative errors within the samples. In this study, we examined a situation in which many of the study animals were identified from opposite sides so that restricting the analysis to animals recognisable from the same side (as in, for example, Mackey et al. 2007; Holmberg, Norman & Arzoumanian 2009; Durban et al. 2010; Carroll et al. 2011) would have meant discarding a large amount of data. Pre-screening to reduce the number of images that needed to be compared visually must also have resulted in a large number of false rejections so that even if analysis had been restricted to individuals recognisable from the same side, the data would still have contained multiple encounter histories for the same individuals. We believe these are commonly occurring problems. Pre-screening software will continue to improve, but currently there is none that would eliminate any risk of false rejection given the type of photographs available in the case study or, for example, most of those produced by camera trapping (Hiby et al. 2009; Sollmann et al. 2011).

Filtering by image quality is often proposed as a way of reducing the FRR but is problematic. Subjective assessment did reduce the FRR in this study but at the expense of discarding most of the photographs. Failure of the software to return a sufficiently high score results not simply from one or both images being of low quality but from differences in lighting, camera angle and posture for which the pattern sample registration and processing fails to compensate adequately. Filtering may also result in bias, particularly in estimates of abundance, because it may select in favour of well-marked individuals and, in this study, of animals relatively close to the hide.

Censoring initial photographs is proposed by Morrison et al. (2011) as a way of eliminating multiple capture histories because every missed match is seen as generating two histories, one of which is a ‘ghost history’. Such a history consists of a single photograph on the assumption that, as stated for example in Link et al. (2010), ‘ghosts are not resighted’. However, those authors also recognise that, although reasonable for genetic tags, ‘this assumption might be questionable for photo identification data’. For photo-identification data, it would require that all false rejections are the result of low-quality images that cannot be matched to any other images. That would lead in turn to the risk of failing to identify an individual being independent of the number of other images in which the individual is identified. That is certainly not true for the case study data (see Fig. 2) and just censoring initial photographs was not sufficient to avoid serious bias in estimates of both survival and abundance from the simulated data.

The case study thus provided an example of the type of photographs for which the risk of false rejection by available pre-screening software is large and cannot be eliminated by filtering on image quality or by censoring single-encounter histories from the data. Nevertheless, automated pre-screening is essential when a large number of such photographs are to be compared (Kelly 2001; Arzoumanian, Holmberg & Norman 2005; Foster, Krijger & Bangay 2006; Van Tienhoven et al. 2007). The EC software is used, for example, to maintain a catalogue that currently contains over 25 000 grey seal photographs taken at breeding and summer haulout sites along the United Kingdom east coast by rejecting 99·5% of the photograph pairs as potential matches. Under such circumstances, we suggest that it is more effective to incorporate a measured pairwise FRR in a model of the encounter history frequencies rather than trying to eliminate missed matches. The pre-screening software, which is responsible for the missed matches, also makes it easy to measure the pairwise FRR. The multinomial model of observed encounter history frequencies used for conventional CMR data is no longer appropriate and is replaced by the expected encounter frequencies that result when, as a result of false rejections, an individual can generate more than one history. It is then straightforward to also incorporate multiple histories that result from photographs not being matched because they show different sides and no longer necessary to restrict the data to animals that are all identified from the same side.

The current model does not allow for a risk of matches being missed between photographs taken of the same side of the same individual within a sample period. A correction for that risk is problematic because it depends on the frequency distribution for the number of times an individual is unknowingly photographed within the same sample. The correction in Stevick et al. (2001) for within-sample false rejection errors is based on an FRR that is assumed to depend on photograph quality but not on the number of photographs of the individual, whereas in our data, the FRR depends critically on that number. We suspect the risk was small for the case study data because, by working across the mapped distribution of seals within the PA, the photographer was unlikely to take repeat photographs of the same side of the same seal and any repeats were likely to be linked to the same ID in the field. In general, an alternative to attempting a correction for within-season missed matches is to subdivide the samples to allow all matches within a subdivision to be identified by eye (e.g. Hiby et al. 2007).

For the case study, we used photographs from a grey seal breeding site that had already been intensively studied and thus provided test sets of known animals and parameter estimates with which to compare our results. The photographs are challenging for automated matching and the FRR was high. Nevertheless, our estimates of annual survival are consistent with those derived previously (Smout, King & Pomeroy 2011), while our local population estimates reflect the declining trend observed in counts conducted in the field. Imaging technology and software for pre-screening the images will continue to improve to give large photo-id data sets that can be analysed as CMR data (e.g. Sherley et al. 2010). In the meantime, our results suggest that in many cases, alternative ways of analysing photo-id data are required to avoid bias in estimates of the demographic parameters.


We wish to thank all those who assisted in the field seasons referred to in this work and who contributed to photo-id studies. This work was funded by NERC through core funding to SMRU and grants NER/A/S/2000/00368 and NE/G008930/1, and by the Esmée Fairbairn Foundation. The Northern Lighthouse Board, Scottish Natural Heritage and the Coastguard Agency provided assistance in each field season. We also thank Sophie Smout for her valuable comments during the preparation of this manuscript.