A computer-aided program for pattern-matching of natural marks on the spotted raggedtooth shark Carcharias taurus


A. M. van Tienhoven, School of Biological and Conservation Sciences, University of KwaZulu Natal, Private Bag X54001, Durban 4000, South Africa. E-mail amvantienhoven@yahoo.com


  • 1. The ability to identify individual animals is a critical aid in wildlife and conservation studies requiring information on behaviour, distribution, habitat use, population and life-history parameters. We present a computer-aided photo-identification technique that relies on natural marks to identify individuals of Carcharias taurus, a shark species that is critically endangered off the eastern Australian coast and considered globally vulnerable. The technique could potentially be applied to a range of species of similar form and bearing natural marks.
  • 2. The use of natural marks for photo-identification is a non-invasive technique for identifying individual animals. As photo-identification databases grow larger, and their implementation spans several years, the historically used visual-matching processes lose accuracy and speed. A computerized pattern-matching system that requires initial user interaction to select the key features aids researchers by considerably reducing the time needed for identification of individuals.
  • 3. Our method uses a two-dimensional affine transformation to compare two individuals in a commonly defined reference space. The methodology was developed using a database of 221 individually identifiable sharks that were photographically marked and rephotographed over 9 years, demonstrating both the efficacy of the technique and that the natural pigment marks of C. taurus are a reliable means of tracking individuals over several years.
  • 4. Synthesis and applications. The identification of individual animals that are naturally marked with spots or similar patterns is achieved with an interactive pattern-matching system that uses an affine transformation to compare selected points in a single-user computer-aided interface. Our technique has been used successfully on C. taurus and we believe the methodology can be applied to other species of a similar form that have natural marks or patterns. The identification of individuals allows accurate tracking of their movements and distribution, and contributes to better population estimates for improved wildlife management and conservation planning.


The identification of individuals of a particular species may be used to track animal movements and develop population estimates (Hammond 1986). Natural body markings have been used successfully to identify individual animals in both terrestrial and aquatic environments for a range of animals, from greylag geese Anser anser (Lorenz 1937) to nurse sharks Ginglymostoma cirratum (Castro & Rosa 2005). Natural marks in marine animals include features such as callosities and fluke patterns (Whitehead, Christal & Tyack 2000) as well as tears, marks and notches in fins and tail flukes (Würsig & Jefferson 1990; Dufalt & Whitehead 1995). Myrberg & Gruber (1974) used spot patterns, scars and fin tears on bonnethead sharks Sphyrna tiburo to identify 10 individuals held in a shallow, semi-natural pool. Natural markings have been used to identify individual sharks in the wild, including great white sharks Carcharodon carcharias (Anderson & Goldman 1996; Klimley & Anderson 1996), nurse sharks and whale sharks Rhincodon typus (Arzoumanian, Holmberg & Norman 2005).

Several other shark species lend themselves to such individual identification techniques, including the raggedtooth shark Carcharias taurus (Rafinesque 1810), also known as the grey nurse or sand tiger shark, which is a large brown shark with distinct spots or pigment marks on the flanks (Compagno 2001). Ireland (1984) first documented use of the pigment spots on the flanks of C. taurus to distinguish between two large males. Subsequently, Peddemors & Thurman (1996) and Allen & Peddemors (2001) successfully demonstrated that wild individual C. taurus could be distinguished using natural marks such as tears and notches in fins, fin spots and flank spots. The use of natural marks is also preferred over conventional tagging approaches because it is stress-free for the animal, inexpensive and reliable over a much longer period, features that are particularly suited to species of conservation interest. World-wide, C. taurus is currently listed by the IUCN as vulnerable (Pollard & Smith 2000) and off the east coast of Australia as critically endangered (Pollard et al. 2003), while on the South African coast it is currently listed as not threatened.

In this paper we present a system of pattern-matching using the pigment marks or spots on C. taurus. Although a computer-aided identification system has already been reported for whale sharks (Arzoumanian, Holmberg & Norman 2005), our system allows simpler and more rapid data entry and matching by letting the user select the key pattern features at the outset. This approach improves accuracy considerably, reduces processing time and allows analyses of larger databases than those that have been used to date.

Materials and methods

species and study site

Carcharias taurus is strongly migratory in parts of its range (Compagno 2001). Off the South African coast, the animals move between the nursery areas in the warm, temperate waters of the Eastern Cape to the more tropical waters further north, where courtship and mating occur (Compagno 2001; Smale 2002). During migration, sharks congregate at certain sites, such as the Aliwal Shoal, a submerged sandstone reef approximately 40 km south of the city of Durban on the KwaZulu–Natal coast of South Africa (Cliff 1989; Ramsay 1998). The reef is easily accessed by boat and the depth ranges (c. 6 to c. 26 m) make it an important and popular scuba-diving destination. The placid nature of the C. taurus sharks at this reef has permitted their close observation and photography (Peddemors 1995) as well as attracting recreational divers seeking a safe shark encounter. Carcharias taurus has consequently proved ideal for testing whether their natural markings can be used for photo-identification purposes.

photo-identification catalogue

Inevitably, pigmentation marks differ on each side of an individual. For this study, the left side was chosen for photography. Photographs were taken from distances of between 2 and 6 m from the shark, preferably at closer distances because water turbidity may reduce photographic image clarity. To ensure replicability and consistent image quality, photographs were taken perpendicular to the flank of the shark at a time when the shark exhibited minimal body flexing. The area of interest for pattern-matching in C. taurus is the origin of the first dorsal fin to the caudal peduncle. Digital and transparency photographs of sharks taken between 1995 and 2003 were used in this study. Slide transparency images were scanned into a digital format for further processing. The database subsequently used for this study included 739 images.

pattern-matching methodology

Defining the reference system

A common reference system is necessary to enable comparison of the markings on two animals. Our technique maps the relevant information, in this case the pigment markings or spots of each shark image, to a common space. Once the marks on the two animals have been mapped, they are compared and a score is calculated to indicate the quality of the match between the two sharks. The mapping uses a two-dimensional affine transformation, which requires that each shark is strictly regarded as two-dimensional and which can incorporate scaling, rotation, translation and shear, the last providing an approximate correction of perspective. The affine transformation M of a coordinate (x, y) to a coordinate (x′, y′) is expressed as:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} m_{11} & m_{21} \\ m_{12} & m_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_1 \\ t_2 \end{pmatrix} \qquad \text{(eqn 1)}$$

The calculation of the transformation M involves six unknown variables: m11, m21, m12, m22, t1 and t2. These six unknowns are calculated from a set of three corresponding point pairs, where each pair consists of a source point and a destination point. Each point comprises an x and a y image coordinate. The transformation matrix M maps the source points onto the destination points.

In this case the three point pairs consist of three points in shark image number 1 and three corresponding points in shark image number 2. To enable reliable comparison between two shark images, it is essential that the same three anatomical reference points are used for each shark. For C. taurus, the three reference points on each shark are the origins of the two dorsal fins and the origin of the pelvic fin (the triangles in Fig. 1).

Figure 1.

Transformation of reference points (shown as black triangles) from one shark image onto another.

Figure 1 illustrates how the two shark images are mapped onto each other. Using the three known reference points, a set of six linear equations is derived, as shown below. For illustration purposes, the transformation of the point pair (100, 90) to (90, 80) yields:

m11 × 100 + m21 × 90 + t1 = 90
m12 × 100 + m22 × 90 + t2 = 80

Similarly, the other two point pairs will yield the remaining four equations as follows:

m11 × 140 + m21 × 70 + t1 = 140
m12 × 140 + m22 × 70 + t2 = 90
m11 × 120 + m21 × 60 + t1 = 130
m12 × 120 + m22 × 60 + t2 = 50

Thus three reference point pairs yield six equations with six unknowns. M will result from solving this set of six linear equations. We used the Gauss–Jordan elimination algorithm described together with the source code in Press et al. (1992) to derive the six unknown variables. Solving the above set of equations for the example in Fig. 1 results in:

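$$m_{11} = 0.875,\quad m_{21} = -0.75,\quad t_1 = 70,\quad m_{12} = 1.125,\quad m_{22} = 1.75,\quad t_2 = -190 \qquad \text{(eqn 2)}$$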

As an example, the transformation of the first coordinate of image number 1 into the corresponding coordinate of image number 2 is shown below:

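$$x' = 0.875 \times 100 - 0.75 \times 90 + 70 = 90, \qquad y' = 1.125 \times 100 + 1.75 \times 90 - 190 = 80 \qquad \text{(eqn 3)}$$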

Conversion of the other two points of image number 1 will result in the corresponding points of image number 2. Thus, matrix M is the transformation matrix used to map all the markings of shark image number 1 onto those of shark image number 2, which then allows automatic comparison of the markings.
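The worked example above can be reproduced with a short script. The following is a minimal sketch, not the I3S source code: it assumes a plain Gauss–Jordan elimination with partial pivoting rather than the Press et al. (1992) routine, and the function names `gauss_jordan`, `solve_affine` and `apply_affine` are hypothetical. Because the x′ equations involve only m11, m21 and t1, and the y′ equations only m12, m22 and t2, the six equations are solved here as two systems of three that share the same coefficient matrix.

```python
def gauss_jordan(a, b):
    """Solve a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] as floats.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Choose the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("reference points are collinear or repeated")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Normalise the pivot row, then clear the column in every other row.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def solve_affine(src, dst):
    """Return (m11, m21, t1, m12, m22, t2) mapping three src points onto dst."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = gauss_jordan(a, [x for x, _ in dst])  # m11, m21, t1
    row_y = gauss_jordan(a, [y for _, y in dst])  # m12, m22, t2
    return tuple(row_x + row_y)


def apply_affine(m, point):
    """Map one (x, y) coordinate with the six affine parameters."""
    m11, m21, t1, m12, m22, t2 = m
    x, y = point
    return (m11 * x + m21 * y + t1, m12 * x + m22 * y + t2)


# Reference points taken from the six equations in the text.
src = [(100, 90), (140, 70), (120, 60)]
dst = [(90, 80), (140, 90), (130, 50)]
M = solve_affine(src, dst)
print(M)
print(apply_affine(M, src[0]))
```

Once `M` is known, the same `apply_affine` call would map every selected spot of image number 1 into the coordinate frame of image number 2.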

Comparison of natural spot marks

The spot marks of two shark images are transformed onto a common reference space (Fig. 2). The closed squares denote the spot marks from the first shark image, while the open circles represent those of the second shark image. The lines indicate matching spot pairs. A spot pair is accepted as a match only if the nearest alternative spot is at least twice as far away as the paired spot. From the matching spot pairs, a distance metric is calculated that is used to rank each shark image in the database with respect to the candidate image.

Figure 2.

Comparison of the spot marks in two different images of the same shark. The closed squares denote the spot marks from one shark image while the open circles represent those of the second shark image. The lines indicate matching spot pairs. 

For n spot pairs, the Euclidean distance between the two spots of pair i is denoted by dist(i). The distance metric used to score the match between two shark images is then defined by:

$$\text{score} = \frac{\sum_{i=1}^{n} \mathrm{dist}(i)}{n^{2}} \qquad \text{(eqn 4)}$$

Where images share many spot pairs, there is a greater likelihood of a match between the shark images. Thus the number of spot pairs in the denominator is squared to favour high numbers of spot pairs over low numbers. A low score in the distance metric indicates a better match than a high score.
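The pairing rule and the distance metric can be sketched as follows. This is an illustration under stated assumptions rather than the I3S implementation: spots are assumed to be already mapped into the common reference space, pairing is done one way only (from the first image to the second, so two spots could in principle claim the same partner), the numerator of the metric is taken to be the sum of the pair distances, and the names `match_spots` and `match_score` are hypothetical.

```python
import math


def match_spots(spots_a, spots_b, ratio=2.0):
    """Pair each spot in spots_a with its nearest spot in spots_b.

    A pair is kept only if the second-nearest spot is at least `ratio`
    times farther away than the nearest one.
    """
    pairs = []
    for a in spots_a:
        dists = sorted((math.dist(a, b), b) for b in spots_b)
        if not dists:
            continue
        nearest_d, nearest = dists[0]
        if len(dists) == 1 or dists[1][0] >= ratio * nearest_d:
            pairs.append((a, nearest, nearest_d))
    return pairs


def match_score(pairs):
    """Sum of pair distances divided by the squared number of pairs."""
    n = len(pairs)
    if n == 0:
        return math.inf  # no pairs at all: treat as the worst possible score
    return sum(d for _, _, d in pairs) / (n * n)


# Made-up coordinates, already in the common reference space.
spots_a = [(0, 0), (10, 0), (0, 10)]
spots_b = [(1, 0), (10, 1), (0, 12), (50, 50)]
pairs = match_spots(spots_a, spots_b)
score = match_score(pairs)
print(len(pairs), score)
```

Dividing by the squared pair count means that, for the same mean pair distance, an image sharing twice as many spot pairs receives half the score, which is what lets well-supported matches rise to the top of the ranking.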

Exhaustive search

The three reference points of the dorsal and pelvic fins only provide a first estimate for the transformation, termed the ‘quick search’. In an optional second step, termed the ‘exhaustive search’, a large number of affine transformations is calculated from the spot pairs to get the best possible match between two images. Given the spot pairs resulting from the first step using only the three reference points, all possible combinations of three pairs are selected as input to calculate a new affine transformation. The best score from all these possible transformations is then taken as the final matching score between two images.

For example, if the first step results in four spot pairs, p1 to p4, any three of these pairs can take the place of the fin reference points as input to the transformation. In this example, the following sets of three pairs are possible: (p1, p2, p3), (p1, p2, p4), (p1, p3, p4) and (p2, p3, p4). If one of the transformations derived from these triples yields a better score than the initial transformation, that score is kept as the final score.

This second transformation step, the exhaustive search, can be computationally expensive. The n spot pairs resulting from the first step yield n · (n − 1) · (n − 2)/6 transformations to evaluate. For example, two shark images with 16 spot pairs in common will require 560 transformations to be evaluated. Although the exhaustive step is about 100–150 times slower than the first step, it ranks the correct shark first considerably more often (Table 1). On a modern desktop computer, comparing an image against a database of hundreds of images using the exhaustive search option requires a few seconds at most.
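The enumeration of triples can be sketched with the standard library. The count n · (n − 1) · (n − 2)/6 is the binomial coefficient C(n, 3). In this sketch `exhaustive_best` and its callback `score_for_triple` are hypothetical names: the callback stands in for "derive a transformation from these three pairs, re-match the spots and return the score", which is not reproduced here.

```python
from itertools import combinations
from math import comb


def exhaustive_best(pairs, score_for_triple, initial_score):
    """Return the lowest score found over all triples of spot pairs.

    `score_for_triple` is a caller-supplied function (an assumption of this
    sketch) that scores the match obtained from one triple of spot pairs.
    """
    best = initial_score
    for triple in combinations(pairs, 3):
        best = min(best, score_for_triple(triple))
    return best


# The four-pair example from the text yields four candidate triples.
triples = list(combinations(["p1", "p2", "p3", "p4"], 3))
print(triples)

# Sixteen shared spot pairs yield C(16, 3) candidate transformations.
print(comb(16, 3))
```

Because the number of triples grows with the cube of n, capping the number of spots a user selects keeps the exhaustive step affordable.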

The software interface

The pattern-matching program was developed using Java 1.4.2 and C++, and requires the Java Runtime Environment to run on personal computers (available from http://www.java.com, accessed 2 Jan 2007). The entire code for the pattern-matching system, known as I3S (Interactive Individual Identification System), is available at http://www.reijns.com/i3s (accessed 2 Jan 2007).

An image is opened in the application. Using a computer mouse, the user manually selects the three reference points to define the common reference area. The user then points out the most distinct pigment spots on the left flank of each shark. Between 12 and 40 spots are selected within the reference area bounded by the two dorsal fins and the pelvic fin. The centre of each spot is marked and, where spots overlap or join, the apparent centres of such spots are marked. The size of the spots is currently not considered important, whereas their relative position to one another is.

A key feature of the system is that human judgement is used to distinguish between pigment marks and artefacts such as reflections, shadows and particles in the water. Once all comparisons have been made, the user is presented with a list of possible matches. The most likely match, based on the scores calculated from the distance metric, is listed first. The user can then compare the image of the unknown shark with up to 50 possible matches provided in the list.
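The ranked list presented to the user can be sketched as below. This is an assumption-laden illustration, not the I3S code: it supposes that each catalogued shark is represented by the best (lowest) score among its reference images, and the name `rank_candidates` and the input layout are hypothetical. Only the cap of 50 candidates comes from the text.

```python
def rank_candidates(scores, limit=50):
    """Rank sharks by their best score, lowest (most likely match) first.

    `scores` is an iterable of (shark_id, score) tuples, one per reference
    image compared with the unknown shark.
    """
    best = {}
    for shark_id, score in scores:
        if shark_id not in best or score < best[shark_id]:
            best[shark_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1])
    return ranked[:limit]


# Made-up scores for three reference images of two sharks.
ranking = rank_candidates([("A", 0.5), ("B", 0.2), ("A", 0.1)])
print(ranking)
```

The final decision still rests with the user, who inspects the listed images and confirms or rejects each suggested match.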

database trial and shark residency characteristics

The I3S software was subsequently trialled using a database of 739 images taken on the Aliwal Shoal over 9 years. The effectiveness of the pattern-matching system was quickly demonstrated as it helped to identify four sharks in the database that had been erroneously identified as new individuals in previous studies. Two hundred and twenty-one C. taurus were identified as distinct individuals that visited the Aliwal Shoal during the 9-year study period.

We found that photographs of moderate quality, for example those slightly out of focus or blurred because of movement or high levels of suspended particles, could still be used as long as the arrangement of the spots, and their relative position to one another, could be discerned. A subsample of 10 sharks recognized over several years showed the consistency of characteristic spot patterns that were recognized by the I3S software. No changes in the relative positions of the spots were discerned over the years (Fig. 3).

Figure 3.

Photographs of the same male shark showing how flank markings are retained and can be traced from year to year. The first dorsal fin is also an unusual shape and serves as an additional feature for identification.

The performance of the software was tested rigorously by randomly selecting one, two and three reference images per shark from the data set and using all the remaining shark images as a test set. We tested sharks that were positively identified using a combination of features besides spot patterns, such as fin notches, tears and spots and, in some cases, scars that could be tracked over several months. Both the quick and exhaustive search options were used (Table 1). We measured the number of times the correct shark was ranked in the top 1, 3, 5 and 10 best matches as a percentage. To correct for random effects, the experiment was repeated 100 times and the average calculated over all results. Results of the photo-identification data were compared interannually and the 10 most regularly recorded sharks investigated in more detail.
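The test protocol described above can be sketched as follows. The sketch makes several assumptions that the text does not specify: sharks with too few images to leave any for testing are skipped, the rank of the correct shark for each test image is supplied by some matching routine not shown here, and the names `split_reference_test`, `top_k_rates` and `mean_rates` are hypothetical.

```python
import random


def split_reference_test(images_by_shark, n_ref, rng):
    """Randomly keep n_ref reference images per shark; the rest become tests."""
    reference, test = {}, []
    for shark_id, images in images_by_shark.items():
        if len(images) <= n_ref:
            continue  # assumption: such sharks are left out of the trial
        chosen = set(rng.sample(range(len(images)), n_ref))
        reference[shark_id] = [im for i, im in enumerate(images) if i in chosen]
        test.extend((shark_id, im) for i, im in enumerate(images) if i not in chosen)
    return reference, test


def top_k_rates(ranks, ks=(1, 3, 5, 10)):
    """Percentage of test images whose correct shark is ranked within each k."""
    if not ranks:
        raise ValueError("no test images were ranked")
    return {k: 100.0 * sum(r <= k for r in ranks) / len(ranks) for k in ks}


def mean_rates(per_repeat):
    """Average the top-k percentages over repeated random splits."""
    ks = per_repeat[0].keys()
    return {k: sum(run[k] for run in per_repeat) / len(per_repeat) for k in ks}


rng = random.Random(0)
catalogue = {"A": ["a1", "a2", "a3"], "B": ["b1"]}  # made-up image names
reference, test = split_reference_test(catalogue, 1, rng)
rates = top_k_rates([1, 1, 2, 4, 7, 12])  # made-up ranks of the correct shark
print(reference, test, rates)
```

Repeating the split 100 times and passing the per-repeat results to `mean_rates` would give averaged percentages of the kind reported in Table 1.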

Table 1.  The likelihood of obtaining a correct match between shark images, using the two search options and varying the number of reference images per shark
Percentage of images in which the correct shark was ranked within the top 1, 3, 5 or 10 matches:

| Search option | Top 1 | Top 3 | Top 5 | Top 10 |
| --- | --- | --- | --- | --- |
| Quick search | | | | |
| 1 reference image per shark | 35·7 | 43·5 | 47·3 | 53·1 |
| 2 reference images per shark | 52·8 | 62·1 | 66·0 | 71·5 |
| 3 reference images per shark | 62·8 | 71·4 | 75·0 | 79·7 |
| Exhaustive search | | | | |
| 1 reference image per shark | 71·8 | 76·1 | 78·1 | 80·7 |
| 2 reference images per shark | 87·4 | 90·2 | 91·0 | 92·4 |
| 3 reference images per shark | 91·7 | 93·9 | 94·5 | 95·4 |


software evaluation

The rigorous test of the database using the I3S software highlighted that previously identified individual sharks could be recognized successfully using both the quick and exhaustive search options (Table 1).

The quick search provided a greater than 50% likelihood of ranking a previously identified individual first when there was more than one reference image. With only one reference image to test against and using the quick search option, 36% of the images in the test database were correctly ranked as the correct choice (top 1). This rose to 63% when three reference images were available. As seen in Table 1, the efficacy of correctly identifying individuals rapidly increased with an increasing number of reference images, implying that researchers should maintain several images of the same individual in their database, particularly if they require the quick search option of the software.

The exhaustive search algorithm was more accurate than the quick search option. Even if only one reference image was available for the comparison, it yielded a 72% chance of correctly identifying the individual in the catalogue. With a single reference image, the chance of the correct individual being in the top 10 ranking rose to almost 81%. Again, the efficacy of correctly identifying individuals increased rapidly with an increasing number of reference images, yielding an almost 92% chance that the correct shark ranked in the top 1 position if there were three reference images in the database (Table 1). If the user was prepared to confirm from a selection of 10 likely matches, then the chance of a correct match was 95%.

In this study, even though 40% of the sharks in the database were represented by only one reference image, the exhaustive search option still provided a 72% likelihood of a positive match (Table 1). This substantially reduced search effort by researchers with large databases.

shark occurrences

Inter-annual variation in the number of photo-identified individuals and their resightings is shown in Table 2 for each year since 1995. Prior to 1999, data had been collected opportunistically during recreational scuba dives. The peak in the number of identified individuals in 1999 corresponded with the initiation of a new photo-identification research project. The total number of identified animals was 221 for the period 1995–2003. A high proportion of previously recorded individuals was resighted every year; for example, almost 63% of animals recorded in 2003 had been seen at least once in earlier years.

Table 2.  The annual sightings and resightings of individual C. taurus sharks on Aliwal Shoal
Total identified sharks per year
Number of sharks previously sighted at least once before (% of total)
 45493627 373 8 1 

More detailed analysis of 10 individually recognized sharks that had been regularly photographed at the Aliwal Shoal indicated that most animals probably return to the Aliwal Shoal annually. The only exception was the shark known as UDparachute, which appeared to visit biennially (Table 3).

Table 3.  A summary of the annual occurrence of 10 sharks photo-identified on Aliwal Shoal from 1995 to 2003
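| Shark | Sex | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Tick | F | | + | | | + | | + | + | |
| Nick | M | + | + | + | + | + | | | + | |
| Halfpec | F | | + | + | | + | | + | + | |
| Leigh | F | + | + | + | | + | | | | |
| Walter | M | + | + | + | + | + | + | | + | |
| Mary | F | | + | + | + | + | | + | | |
| Dice 5 | M | | | | + | + | | | + | + |
| UDparachute | M | + | | + | | + | | + | | |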


The natural pigment marks on the flanks and their relative arrangement form the basis from which identifications are made using the I3S software. The identifications were tested using 221 individually recognizable C. taurus. The ability to identify individuals with less than perfect representative images is of enormous benefit to field workers studying free-ranging animals in challenging marine environments, such as those off the South African east coast.

The identification algorithm is based on a two-dimensional affine transformation that assumes that a shark is a linear, rigid, two-dimensional object. Ideally, photographs should be taken at right angles to the shark's side, but even imperfect photographs taken at an oblique angle provided a correct match, thereby reducing the potential heterogeneity bias often associated with photo-identification studies (Hammond 1990). Correct identification would still be jeopardized if the body of the shark was flexing or turning. Nevertheless, even with imperfect images, the system has proved to be a useful identification aid, provided that the user is extremely critical of the suggested matches and verifies matches using other features. In studies with dolphins, well-marked individuals are often recognized by more than one feature, which may include a combination of attributes such as marked fins, shape of fin, shading patterns, scrapes, scratches and wound marks, as well as pigment patterns (Würsig & Jefferson 1990; Karczmarski & Cockcroft 1998). The investigator can then use other features, such as fin shape and notches, tears, scars and spots in the fin and, in adult male sharks, clasper size and form, to identify individual sharks. The pigment spots on C. taurus are largely unchanging over successive years, and can be traced from year to year, as illustrated by the example shown in Fig. 3. In this example, the tip of the first dorsal fin is unusually squared-off and serves as a double mark.

The key feature of the software is that it is not fully automated. The user must point out the reference points, which in the case of C. taurus are the fin origins, and the most distinctive marks. Finally, the user must select the best match from a ranked list of possible known shark images. As the user manually points out the natural marks, image artefacts such as particle reflection in the water, backscatter from incorrect flash position and flash overexposure of the flanks, can be ignored. Only those natural marks that can be clearly discerned by the human eye are selected, thereby ensuring the best possible choice. Additionally, the use of this software is beneficial as a clear image focus is not as stringent a requirement for spot patterns as it would be for other natural marks, such as notches and tears in fins. We believe that this is more beneficial to correct identification of individuals and represents a preferable option over the system reported by Arzoumanian, Holmberg & Norman (2005).

We found that pattern-matching performance improved with a greater number of reference images against which comparisons could be made. Thus the image catalogue should include at least three good-quality images of each individual for efficient identification. In many historical studies of photo-identification, only the best quality images are retained as the reference material because of limited storage capacity for photographs. The use of I3S therefore may require a larger catalogue of images to be maintained, but as it is all digital and the software searches the entire catalogue, irrespective of the order of image storage, we believe that a typical modern computer hard-drive would provide ample storage space.

Ideally, photographs of both sides of the animal should be obtained (Würsig & Jefferson 1990) but this is not always possible. We elected to consistently photograph the left side to avoid any possible confusion. Although the current application was specifically tailored to use only the left side of the animal, it is a simple step to include both sides of the animal in future versions. The recognition performance could be further enhanced by making a distinction between males, females and animals of unknown sex. Sex is difficult to determine in small individuals, when the claspers of the male may not yet be clearly distinguishable, or if the angle of the photograph does not allow the claspers to be seen.

An added benefit of our system is that only one computer-based package is required and the entire process, from image download through spot selection to matching, requires less than 5 min if the shark already exists in the database. If a new shark is recorded that has not been previously identified, then a more rigorous visual inspection of the database is necessary, but this will still be completed in a substantially shorter time than reported for other identification software packages (Arzoumanian, Holmberg & Norman 2005).

The I3S software therefore would be a useful tool for long-term studies that inevitably include large databases. The non-invasive nature of photo-identification mark–recapture studies makes them ideal for assisting in population estimates of critically endangered species such as C. taurus on the east coast of Australia (Otway, Bradshaw & Harcourt 2004). In this study at Aliwal Shoal, use of the I3S software substantially eased the process of correctly identifying individuals and allowed an estimation of the numbers of animals visiting the reef each year.

Additionally, this software can assist in obtaining more detailed data on the movements and residency status of individual animals. This study highlights the apparent philopatric nature of C. taurus to a particular reef system, with individuals apparently returning on a near-annual basis to this reef. Whether the Aliwal Shoal serves as a critical habitat for individuals of C. taurus, and thus may influence the entire population, requires confirmation of the nature and regularity of visits to the reef with more substantial data capture effort. Nevertheless, this work highlights the importance that protection of individual reefs may have for conservation of the species. The I3S software may then serve as an important tool in assuring accurate data analysis in international efforts to protect particular species (such as C. taurus) through proclamation of marine protected areas.

In its present form, the system has only been rigorously used and tested with C. taurus. The possibility that natural markings on the sharks may fade or change over time does not appear to be a concern. Even with intervals of several years between photographs, the sharks can generally be readily distinguished. Potentially the system could be adapted for other similarly shaped animals with spots (or other consistent features). Indeed, trials with tiger sharks and the shorttail stingray Dasyatis brevicaudata are being considered, while current tests with whale sharks are extremely promising (Speed, Meekan & Bradshaw 2007).


We thank Graham Thurman and Bryce Allen for the earlier photographs. We greatly appreciate the support from the CSIR, Natal Sharks Board, Aliwal Shoal dive charter operators and the scuba divers who provided photographs or dive assistance. WWF-SA Nedbank Green Trust and the South African National Research Foundation (NRF) provided funding for the project. A. T. Forbes provided insightful comments on an earlier draft of the manuscript, as did M. Smale and an anonymous referee. This technique was developed during the first author's doctoral studies.