Detection dogs in nature conservation: A database on their world‐wide deployment with a review on breeds used and their performance compared to other methods

Over the last century, dogs have been increasingly used to detect rare and elusive species or traces of them. The use of wildlife detection dogs (WDD) is particularly well‐established in North America, Europe and Oceania, and projects deploying them have increased world‐wide. However, if they are to make a significant contribution to conservation and management, their strengths, abilities and limitations should be fully identified. We reviewed the use of WDD with particular focus on the breeds used in different countries and for various targets, as well as their overall performance compared to other methods, by developing and analysing a database of 1,220 publications, including 916 scientific ones, covering 2,464 individual cases, most of them (1,840) scientific. With the world‐wide increase in the use of WDD, associated tasks have changed and become much more diverse. Since 1930, reports exist for 62 countries and 408 animal, 42 plant, 26 fungi and six bacteria species. Altogether, 108 FCI‐classified and 20 non‐FCI‐classified breeds have worked as WDD. While certain breeds have been preferred on different continents and for specific tasks and targets, they were not generally better suited for detection tasks than others. Overall, WDD usually worked more effectively than other monitoring methods. For each species group, regardless of breed, detection dogs were better than other methods in 88.71% of all cases and only worse in 0.98%. It was only for arthropods that Pinschers and Schnauzers performed worse than other breeds. For mono‐ and dicotyledons, detection dogs outperformed other methods less often. Although every breed can be trained as a WDD, choosing the most suitable dog for the task and target may speed up training and increase the chance of success.
Although selection of the most appropriate WDD is important, excellent training, knowledge about the target density and suitability, and a proper study design all appeared to have the highest impact on performance. Moreover, an appropriate area, habitat and weather are crucial for detection dog work. When these factors are taken into consideration, WDD can be an outstanding monitoring method.


| INTRODUCTION
With ongoing biodiversity loss and the rising number of threatened and extinct species (Butchart et al., 2010; Díaz et al., 2019), the need for science to inform nature conservation and wildlife management is becoming increasingly important. Biodiversity loss has become one of the core issues that has already exceeded the high-risk boundary for destabilising the earth's system (Steffen et al., 2015). Thus, nature and species conservation actions are of global and existential importance to humankind.
Conservation actions are largely determined by species monitoring (Niemelä, 2000), which is often challenging, particularly for elusive, rare, nocturnal or highly mobile species. Furthermore, there is often limited access to the areas in question and in spite of the world-wide knowledge of various monitoring methods (Hill et al., 2005), some state-of-the-art methods cannot be applied due to high costs or poor infrastructure (Christie et al., 2016).
Moreover, reliable monitoring and species identification can often only be carried out by experts with many years of experience in their field (Grimm-Seyfarth et al., 2019). Insufficient species monitoring data could in turn contribute to misinterpretation and mismanagement, resulting in further biodiversity loss (Ferreira et al., 2016).
Together with advances in other more recent technologies, such as GPS and DNA extraction from small traces of a species, wildlife detection dogs (WDD) are one method for monitoring species of all kingdoms that could otherwise hardly be studied, or not at all (Bennett et al., 2019; Dahlgren et al., 2012; MacKay et al., 2008).
Compared to the six million olfactory receptor cells that humans have, sheepdog noses have more than 200 million, and beagle noses over 300 million (Horowitz, 2009). Dogs also have more different kinds of olfactory cells, enabling them to detect many more different odours (Horowitz, 2009) and to recognise specific substances at concentrations as low as 500 parts per trillion (Johnston, 1999). Together with their trainability and willingness to work with humans, these traits make dogs an ideal detection technique (DeMatteo et al., 2019).
Detection dogs have been used as a monitoring technique for decades (MacKay et al., 2008), but it is only recently that they have garnered serious attention by ecologists from all over the world. In New Zealand, where conservation detection dogs have the longest tradition world-wide (Appendix S1), they are divided into 'protected species dogs' trained to detect rare species, and 'predator detection dogs' trained to detect invasive alien predators for eradication programs (Cheyne, 2011). Today, detection dogs have an even wider field of application, which includes the detection of pests (e.g. invasive plants, arthropods, fungi), traces (e.g. scat, hair), carcasses (e.g. in wind parks, under power lines or poison monitoring) and animal quarters (e.g. dens, roosts, nests). However, if WDD are to contribute significantly to conservation and management, their strengths, abilities and limitations should be fully identified. Zwickel (1969) has provided a first small review of conservation dogs, a synonym frequently used for WDD. He suggested the following tasks for WDD: (a) locating and (b) collecting wildlife, (c) studying wildlife behaviour, (d) protecting property from wildlife and (e) facilitating the proper harvest of species (Zwickel, 1969).
This was extended by one category in the updated version: (f) live capturing of wildlife (Zwickel, 1980). With growing attention over the last decade, several books and reviews have been published.
Some publications give a short overview (Dahlgren et al., 2012; Hurt & Smith, 2009; Woollett (Smith) et al., 2014), some are dedicated to specific targets, for example, the detection of scats (MacKay et al., 2008), insects (Lehnert & Weeks, 2016), carcasses (Barrientos et al., 2018) or rare species (Bennett et al., 2019). Other publications are dedicated to dogs, for example, pointing dogs as game detection dogs (Watson, 2013) or the selection of an appropriate WDD (Beebe et al., 2016; Jamieson et al., 2017). However, so far there has not been a comprehensive review of both the historical and the most recent use of WDD. Therefore, this literature review compiles the use of detection dogs in nature conservation, wildlife research and management from past to present, demonstrating the potential of this method. We provide an overview of target species and types in different countries and investigate which dog breeds have been preferentially used per target and location, thereby summarising trends uncovered in the work of others. We also compiled all studies that compared WDD with other monitoring methods and summarised how these studies determined that WDD performed relative to other methods. Lastly, where WDD did not outperform other monitoring methods, we compiled the limitations of using WDD for species monitoring.

| Literature review
We focused mainly on scientific literature, including scientific papers, dissertations and project reports. However, WDD were frequently used for conservation or management purposes without a scientific research project behind them. For a more comprehensive overview of their deployment and performance, we included popular science or newspaper articles when no scientific publication about the project was found. In addition, we used social media platforms to obtain many articles from different countries (Appendix S1.1).
To avoid citing the same study several times when it had been published in different sources, we compared each new entry with the entries already in the database and preferentially included scientific publications, followed by books, popular science and newspaper articles (see Appendix S1.1 for a detailed description).
Dogs used to detect contraband or poached, trafficked or otherwise illegally taken plants, animals or animal components are frequently called wildlife dogs, but are not commonly considered to be conservation dogs (Hurt & Smith, 2009) and were therefore not considered in this review. Likewise, we did not consider truffle, virus and medical detection dogs. However, dogs detecting bacteria for conservation and pest management were included.

| Database structure
We compiled data from the literature in a relational database (Microsoft Access 2013) consisting of five basic tables: literature, dog breeds, target species, target types and countries (Appendix S2). We classified dog breeds into the 10 FCI classification groups and breeds not listed as 'not classified'. We assigned mixed breeds to a main or first-mentioned breed or to the category 'Mix' when they could not be assigned to a specific breed. We classified target species according to their Latin and English names, genus, family, order, class, phylum and kingdom, adding subspecies names if provided. If the dog detected species groups without further specification (e.g. bat or bird carcasses, rodents, weed), we retained this group only. Taxonomic changes due to splitting of taxa into several species were only made if the allocation to the new species was obvious from the geographic information provided or had already been done by other authors. We divided potential target types into: living or dead individuals; nests, dens, clutches, coveys, roosts; scat, urine, saliva, glandular secretion; spores, eggs; larvae; hair, feathers, pellets, shed skin; and different combinations thereof. Lastly, we classified countries according to the (sub-)continent into North, Central and South America, Europe, Asia, Africa and Oceania, assigning Russia and Turkey to 'Eurasia'. Furthermore, we assigned Australia, New Zealand and all oceanic islands (including Subantarctic islands) to 'Oceania' and did not treat Zealandia as a separate unit.
In a main table, we then assigned each breed-target species-country association per reference as a single 'case'. We marked pure-breed dogs and added a second breed for mixed breeds (if provided), as well as the number of dogs per breed and reference (if the number was not stated, we entered '1' where the text mentioned 'dog' and '2' where it mentioned 'dogs'). We also added specifications to the country (e.g. islands). If available, we extracted results of the WDD performance compared to other monitoring methods. We classified the performance into four categories: dogs were (a) better; (b) equal; or (c) worse than other methods tested; or (d) mixed results. The factor in comparison was study-specific and could include speed per area or transect, area size, sample size, quality, detectability, specificity, sensitivity, accuracy or precision. We relied on those conservative measures since different monitoring methods can hardly be compared otherwise.
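The table layout described above can be sketched as a small relational schema. The following is only an illustration in Python with SQLite, not the Access database used in the review: all table names, column names and the inserted row are invented for the example, and the real database (Appendix S2) may be organised differently.

```python
import sqlite3

# Hypothetical schema: five basic tables plus a main 'cases' table.
# All names are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE literature (
    id INTEGER PRIMARY KEY, citation TEXT NOT NULL, is_scientific INTEGER NOT NULL);
CREATE TABLE dog_breeds (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, fci_group TEXT NOT NULL);
CREATE TABLE target_species (
    id INTEGER PRIMARY KEY, latin_name TEXT, english_name TEXT, taxon_class TEXT);
CREATE TABLE target_types (
    id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE countries (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, continent TEXT NOT NULL);
CREATE TABLE cases (
    id INTEGER PRIMARY KEY,
    literature_id INTEGER NOT NULL REFERENCES literature(id),
    breed_id INTEGER NOT NULL REFERENCES dog_breeds(id),
    species_id INTEGER NOT NULL REFERENCES target_species(id),
    target_type_id INTEGER NOT NULL REFERENCES target_types(id),
    country_id INTEGER NOT NULL REFERENCES countries(id),
    n_dogs INTEGER NOT NULL,
    -- NULL means the reference reported no comparison with other methods
    performance TEXT CHECK (performance IN ('better', 'equal', 'worse', 'mixed'))
);
"""


def build_demo_database():
    """Create an in-memory database holding one invented case."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO literature VALUES (1, 'Example reference (invented)', 1)")
    conn.execute("INSERT INTO dog_breeds VALUES (1, 'Example breed', 'Terriers')")
    conn.execute("INSERT INTO target_species VALUES (1, 'Genus species', 'example animal', 'Mammalia')")
    conn.execute("INSERT INTO target_types VALUES (1, 'scat')")
    conn.execute("INSERT INTO countries VALUES (1, 'Example country', 'Europe')")
    # One breed-target species-country association from one reference = one case.
    conn.execute("INSERT INTO cases VALUES (1, 1, 1, 1, 1, 1, 2, 'better')")
    conn.commit()
    return conn


def cases_per_breed_group(conn):
    """Count cases per FCI group by joining the main table to the breed table."""
    query = """
        SELECT b.fci_group, COUNT(*)
        FROM cases AS c JOIN dog_breeds AS b ON b.id = c.breed_id
        GROUP BY b.fci_group
    """
    return dict(conn.execute(query).fetchall())
```

Keeping the performance category as a nullable, constrained column is one way to separate cases without any comparison from the four categories (a) to (d).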
The category 'mixed results' was given when the dogs were better at some factors but worse at others, or when the performance depended on season, year, site or dog. Since we designed the database as a relational database, IDs among the five basic tables and the main

| Differences in the use of different breeds
We particularly focussed on the use of different dog breeds per continent, target species and target type, using Fisher's exact rank test. If significant, we used pairwise chi-squared tests and the Bonferroni-Holm correction for p-values as a post hoc test (Holm, 1979). To homogenise sample sizes among taxonomic groups, target species were grouped for these analyses as follows: Actinopterygii, Amphibia,
To ensure scientific comparability, we performed every analysis four times: with all publications together and with scientific publications only, and based on either cases or references. For example, if one publication described one breed detecting two species, analyses of the breed would 'double-count' this dog when based on cases, but count it only once when based on references. Therefore, we specify the dataset used as all_cases (all references based on cases), scientific_cases (scientific references based on cases), all_references (all references based on references, i.e. dropping 'double-counts' where possible) and scientific_references (scientific references based on references). We report the results of the most restrictive approach using the dataset scientific_references in the main text, and the results of the less constrained datasets (all_cases, scientific_cases and all_references), which revealed similar results with slight additional differences, in Appendix S3.
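The difference between the four datasets can be illustrated with a few invented records. The sketch below assumes that "based on references" means counting each reference-breed group pair once; this is one reading of "dropping 'double-counts' where possible" and not necessarily the exact rule applied in the review. The records and the function name are hypothetical.

```python
from collections import Counter

# Invented cases: (reference, is_scientific, breed_group, target_species)
CASES = [
    ("ref_A", True, "Terriers", "species_1"),
    ("ref_A", True, "Terriers", "species_2"),   # same reference and breed, second species
    ("ref_B", False, "Scent hounds", "species_1"),
]


def count_breed_groups(cases, scientific_only, by_reference):
    """Count breed groups per case, or once per reference-breed pair."""
    rows = [c for c in cases if c[1] or not scientific_only]
    if by_reference:
        # Collapse repeated (reference, breed group) pairs to drop double-counts.
        unique_pairs = {(ref, breed) for ref, _, breed, _ in rows}
        return Counter(breed for _, breed in unique_pairs)
    return Counter(breed for _, _, breed, _ in rows)


datasets = {
    "all_cases": count_breed_groups(CASES, scientific_only=False, by_reference=False),
    "scientific_cases": count_breed_groups(CASES, scientific_only=True, by_reference=False),
    "all_references": count_breed_groups(CASES, scientific_only=False, by_reference=True),
    "scientific_references": count_breed_groups(CASES, scientific_only=True, by_reference=True),
}
```

With these records, Terriers are counted twice in all_cases and scientific_cases but once in all_references and scientific_references, which shows why the reference-based datasets are the more restrictive ones.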

| Performance of wildlife detection dogs
To analyse differences in the performance of WDD, we removed all cases without comparisons and, due to the low numbers, combined those cases where dogs did not perform better than other methods (i.e. equal, worse or mixed results). We tested whether the performance was different among breeds, and whether WDD (regardless of the breed) performed better than other methods for different target species groups or target types. We used Fisher's exact rank test and, if significant, pairwise chi-squared tests with Bonferroni-Holm correction for p-values as a post hoc test. In cases where detection dogs did not perform better than any other method, we separately assessed reasons for this.
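Two steps of this procedure lend themselves to a short sketch: pooling the 'equal', 'worse' and 'mixed' categories into a single 'not better' class, and the Bonferroni-Holm step-down adjustment of the post hoc p-values. The Python code below is a minimal illustration with hypothetical function names; it does not reproduce the software or the exact test settings used in the review, and the example p-values are invented.

```python
def collapse_performance(category):
    """Pool 'equal', 'worse' and 'mixed' into one 'not better' class."""
    if category not in {"better", "equal", "worse", "mixed"}:
        raise ValueError(f"unknown performance category: {category!r}")
    return "better" if category == "better" else "not better"


def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the ranking.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Invented p-values from three pairwise comparisons
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)  # approximately [0.03, 0.06, 0.06]
```

In this example only the first comparison remains below 0.05 after adjustment, although all three raw p-values were below that threshold.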

| Overview of the literature compiled
In total, we included 1,220 publications (Appendix S2: Table S5). This biased geographic distribution most likely reflects the historic deployment of WDD (Appendix S1.2). High numbers of scientific publications were found in countries with a long history of WDD employment, for example, the United States, the United Kingdom and New Zealand (Figure 3). In other countries, the use of WDD is more recent.
Despite many cases reported for Germany, only 37% of the publications were scientific, whereas the median proportion of scientific publications across all countries was 80%. While the search for scientific literature was likely not skewed by country, non-scientific literature, usually written in the national language, was more challenging to find.
We checked publications in the following languages: English, German, Polish, Russian, Norwegian, Swedish, French, Spanish (from Spain and South American countries), Portuguese, Dutch, Danish, Czech, Italian and Japanese. However, the proportion of non-scientific publications was highest for Africa (Table 1).

In 64% of all reported cases, WDD were trained on living individuals, occasionally combined with nests, eggs, scats or dead individuals. Twenty-five per cent of all cases describe deployment for scat, urine, saliva or glandular secretions only; such dogs are commonly referred to as scat detection dogs. Another 4.9% were trained on dead individuals only, 4.6% on nests, dens, clutches, coveys or roosts, 0.8% on hairs, feathers, pellets or shed skin, and the remaining 0.7% on spores, eggs or larvae (Figure 6).

| Differences in the use of different breeds
The distribution of FCI-classified breed groups was skewed among continents (Fisher test, p_scientific_references = 0.0005, Figure 4, Appendix S3: Table S1), target species (Fisher test, p_scientific_references = 0.0005, Figure 5, Appendix S3: Table S2) and target types (Fisher test, p_scientific_references = 0.0005, Figure 6, Appendix S3: Table S3). Sheepdogs and Cattledogs were equally deployed across continents but significantly more often for dicotyledons than other breeds (p = 0.01), and significantly less often for birds (p ≪ 0.001) and thus also for the target type typical of birds, the combination of living individuals and nests (p = 0.04). However, they were significantly more often deployed as scat detection dogs (p ≪ 0.001) and slightly more often for pest detection from spores and eggs (p = 0.09). Pinschers and Schnauzers were equally deployed across continents and target species, but significantly more often for detecting dead individuals (p ≪ 0.001) as they were often used for fatality searches (Grimm-Seyfarth et al., 2021).

FIGURE 3 Annual number of publications where wildlife detection dogs (WDD) have been used for the three continents with highest deployment numbers, separated by the most searched species groups. Transparent lines refer to the loess regression with a 75% smoothing span.

FIGURE 4 Number of scientific references per continent separated by FCI dog breed classes. 'not classified': a breed mentioned that is not listed in the FCI classification; 'Mix': the publication only mentions a mixed breed but no assignment to a specific breed could be made.
Terriers were significantly more often deployed in Oceania (p = 0.02), where many eradication programs were conducted (Grimm-Seyfarth et al., 2021). Their use was significantly higher for reptiles than for any other species group (p = 0.02) and significantly lower for birds (p = 0.006). In line with their hunting history, Terriers were significantly more often used to detect living individuals than other target types (p = 0.005).
The use of Dachshunds was significantly higher in Africa (p = 0.002) and for detecting shed skin and fish (both p ≪ 0.001). Spitz and primitive types were significantly more often used in Eurasia (p ≪ 0.001) and for mammals (p = 0.02), as they were frequently used as scat detection dogs in Russia (e.g. Krutova, 1993). The deployment of Scent hounds was not geographically biased. Their use was significantly higher for Arthropods (p ≪ 0.001) and living individuals (p = 0.0005) than in other targets, but significantly lower for birds (p = 0.05). Most Scent hounds were used for pest detection (Grimm-Seyfarth et al., 2021).

TABLE 2 Summary of the kingdom, phylum and classes and the number of orders, families and species (excluding subspecies) per class for which the use of WDD has been reported, as well as the number of reported cases and references. Numbers in brackets refer to numbers from scientific publications. A '0' in the number of orders, families or species means that no publication specified the exact order, family or species respectively. Note that some references report the use for more than one class and are thus counted multiple times. See Appendix S2:
a This refers to a project report where the weed has not been further specified.
Pointers and Setters were significantly more often used in Europe (p < 0.001), particularly in the United Kingdom and Scandinavia (Appendix S1.2), for detecting birds and living individuals and nests (both p ≪ 0.001) or nests and coveys (p = 0.002), due to their intensive history of detecting ground-breeding birds-a task very similar to the purpose that they were bred for (Watson, 2013). Their use was significantly lower in Oceania and for mammal or scat detection (all p ≪ 0.001). Retrievers, Flushing Dogs and Water dogs were equally deployed across continents but more often for Bacteria than any other breed (p = 0.003). They were significantly less often used for bird detection (p ≪ 0.001) or living individuals and nests (p = 0.04).
Companion and Toy dogs were also equally deployed across continents but significantly more often used for the detection of Bacteria (p = 0.04) and eggs/larvae and their adults (p ≪ 0.001). However, the number of dogs from this group was very small (4; Table 3). Non-FCI-classified breeds were slightly more often used in North America (p = 0.06) and for reptile detection (p ≪ 0.001) and living individuals (p = 0.07). Mixed breeds (without indication of a main breed) were equally deployed across continents and target types, but more often for Bacteria detection (p = 0.003). The breed was reported significantly more often in studies from Europe (p = 0.006) and those on arthropods (p < 0.001), but significantly less often in studies from Oceania (p = 0.04) and those on mammals (p ≪ 0.001).

FIGURE 5 Number of scientific references per target species group separated by FCI dog breed classes. 'not classified': a breed mentioned that is not listed in the FCI classification; 'Mix': the publication only mentions a mixed breed but no assignment to a specific breed could be made.

FIGURE 6 Number of scientific references per target type separated by FCI dog breed classes. 'not classified': a breed mentioned that is not listed in the FCI classification; 'Mix': the publication only mentions a mixed breed but no assignment to a specific breed could be made.

| DISCUSSION
Our findings statistically support previous suggestions that specific breeds have been preferentially used for specific tasks and targets (Dahlgren et al., 2012), but contradict the assumption that specific breeds are generally better suited for detection tasks. We found several lines of evidence that terriers outperformed other monitoring methods less often, particularly for mammal detection. However, terriers were mostly used in eradication programs, which need a broad combination of monitoring methods dedicated to different tasks (Clout & Russell, 2006). Therefore, the performance of dogs was often evaluated as 'mixed results'. This also explains why WDD outperformed other methods less often for the target types living individuals or living individuals and scat.
Apart from eradications, it was only for arthropods that Pinschers and Schnauzers performed worse than other breeds, which was shown in a study where a Rottweiler performed slightly worse than two Golden Retrievers when detecting red palm weevil Rhynchophorus ferrugineus (Soroker et al., 2013). Irrespective of the breed, WDD did not outperform other methods for mono- and dicotyledons. While dogs detected much smaller plants than humans, they could not differentiate among many plants in high densities (e.g. Sargisson et al., 2010). However, WDD were advantageous for detecting underground plants (NSW, 2020).
WDD performed better in almost 90% of all cases that compared them to other monitoring methods. For example, WDD detected between 3.5 and 4.7 times more black bears Ursus americanus, fishers Martes pennanti and bobcats Lynx rufus than camera traps and seven times more black bears than hair snares, which did not detect any other species (Long et al., 2007). In another study, WDD detected 10 times more bobcats than cameras, hair snares and scent stations combined (Harrison, 2006). Likewise, WDD found four times more scats of kit fox Vulpes macrotis (Smith et al., 2001) and Eurasian otter Lutra lutra (Grimm-Seyfarth et al., 2019) than experienced human searchers. Moreover, they required much less time to determine species presence than camera traps (Clare et al., 2015) or hair snares (Tom, 2012). Another advantage of WDD is a lower sampling and spatial bias than most other methods (Grimm-Seyfarth et al., 2019;Long & MacKay, 2012). Finally, WDD showed a substantially higher species specificity compared to human searchers (e.g. Grimm-Seyfarth et al., 2019;Smith et al., 2001).
Nevertheless, in 11.3% of all cases, WDD did not outperform other monitoring methods. In two of the six cases where WDD performed worse, dogs either dispersed (Invasive Animals CRC, 2013) or even captured the target species (Goodrum, 1940). This strengthens the argument that WDD implementation requires more training than scent detection alone. Generally, we found that most cases where WDD did not perform better indicated problems in training (37 cases), an issue also highlighted for bird dogs (Gutzwiller, 1990) and  (Mowbray, 2002) or those with a low smell (Karp, 2020).
The last is also an example that a proper study design adapted to the species and habitat is necessary for the success of WDD. Another important issue is the difference among individual dogs, which has regularly been mentioned before (e.g. Gutzwiller, 1990;MacKay et al., 2008) and already been suggested since the first publications (Wight, 1931). This was evident in mixed results when other monitoring methods were better than some, but not all dogs (12 cases).
Although selection of the most appropriate WDD is important, training, target density and suitability, and study design appeared to have a greater impact. Moreover, in addition to the dogs' age and experience, their biological, psychological and social characteristics (Beebe et al., 2016) as well as their handling and housing are likely to play a role in their performance (Byosiere et al., 2019). Finally, other reasons why WDD did not outperform other methods were excessive costs (nine cases) and target verification issues (three cases; Table 4). Importantly, the effect of some issues can be limited with proper adjustments (Leigh & Dominick, 2015) and training adapted specifically to the dog and the given field conditions (Woollett (Smith) et al., 2014).

| CONCLUSIONS
WDD in conservation, wildlife research and management have been employed for a long time, but gained particular attention over recent decades. With the world-wide increase in the use of WDD, their work tasks have changed and become much more diverse. While specific breeds have been preferred on specific continents and for specific tasks and targets, they are generally not better suited for detection tasks than others. Nevertheless, choosing a dog most suited for the task and target may speed up training and increase the chance of success. Overall, WDD worked more effectively than other monitoring methods in almost 90% of the studies. Although selection of the most appropriate WDD is important, excellent training, knowledge about the target density and suitability, and a proper study design appeared to have the highest impact on performance. If these parameters are correctly addressed, WDD can be an outstanding monitoring method.

ACKNOWLEDGEMENTS
The authors thank all the people who helped in preparing data for the database and translating articles, particularly Jean-Baptiste Mihoub, Catarina Ferreira and Aleksandra Zarzycka. They also thank all of those people who forwarded literature and contributed to the huge collection for this review.

PEER REVIEW
The peer review history for this article is available at https://publo ns.