Vertebrate diversity revealed by metabarcoding of bulk arthropod samples from tropical forests

Background: Thousands of bulk arthropod samples are collected globally every year for monitoring programs, conservation efforts, and ecosystem assessments. The taxonomic contents of these samples can be assessed either morphologically or molecularly using DNA metabarcoding coupled with high-throughput sequencing, the latter of which has gained popularity in recent years. In a related field, only vertebrate-ingesting invertebrates, such as carrion flies and blood-feeding leeches, are targeted for collection, and metabarcoding is carried out on the vertebrate DNA in their gut contents to provide information on vertebrate diversity (invertebrate-derived DNA, iDNA).
Aims: Here, we show that the two approaches can be combined, that is, that vertebrate DNA can be detected through metabarcoding of bulk arthropod samples.
Materials and Methods: Two metabarcoding primer sets were used to PCR amplify mammal and vertebrate DNA in DNA extracted from bulk arthropod samples collected with pitfall and Malaise traps in tropical forests in Brazil and Tanzania.
Results: In total, 32 vertebrate taxa were detected representing mammals, amphibians, and birds. Detected taxa were within, or close to, their known geographical distributions.
Conclusion: This study demonstrates that with a relatively small additional investment, information on vertebrate diversity can be obtained from bulk arthropod samples. This is of particular interest in projects where bulk arthropod samples are collected and extracted with the aim to use metabarcoding to assess arthropod taxa. In such studies, the additional information on vertebrates can further inform ecological assessments and monitoring programs and function as a supplement to traditional survey methods of vertebrates.


| INTRODUCTION
Fauna monitoring can be used to assess ecosystem health; detect invasive, rare, and indicator species; define areas for conservation priority settings; and inform biodiversity and ecosystem management decisions (Hajibabaei et al., 2011; Hilty & Merenlender, 2000; Ji et al., 2013; Liu, Guo, Zhong, & Shi, 2018). For this, arthropods are a suitable taxonomic group as they occur in habitats all over the world and are important in ecosystem functioning, for example, by decomposition of organic matter, pollination, and serving as a food source for many aquatic and terrestrial animals (Rosenberg, Danks, & Lehmkuhl, 1986; Siddig, Ellison, Ochs, Villar-Leeman, & Lau, 2016).
On the other hand, vertebrate diversity information may improve the achievement of management goals and public acceptance of management decisions. This is because the presence of popular charismatic vertebrate species such as flagship, keystone, or umbrella birds and mammals (Heywood & Watson, 1995; Simberloff, 1998) is perceived more positively than the occurrence of invertebrate taxa (Hunter et al., 2016).
Surveys of vertebrates and invertebrates present different challenges. Taxonomic identification of vertebrate species is generally not challenging, but low abundances and shy behavior of the often-crepuscular animals can make direct surveys time- and labor-intensive, especially in remote areas and in areas with dense vegetation. Therefore, indirect methods are often applied to monitor vertebrate fauna, such as the collection of road kills (Teixeira, Coelho, Esperandio, & Kindel, 2013) and identification of signs such as tracks, nests, and scats (Hoffmann et al., 2010). In contrast, while arthropods occur in high abundance and are easily sampled (Rosenberg et al., 1986), morphology-based taxonomic identification of arthropod community samples requires not only taxonomic expertise for multiple taxonomic groups but also a significant time investment to identify all the taxonomic constituents (Basset et al., 2012). Given the difficulty of their identification, molecular analyses are increasingly applied to identify the taxonomic contents of bulk arthropod samples (Ashfaq et al., 2018; Elbrecht et al., 2017; Gibson et al., 2014; Kocher, Gantler et al., 2017; Liu et al., 2018; Morinière et al., 2016; Oliverio, Gan, Wickings, & Fierer, 2018; Shokralla et al., 2015; Yu et al., 2012). For this, DNA metabarcoding approaches are the most frequently applied. Metabarcoding principally relies on PCR amplification of DNA extracts using primers that are universal for a selected taxonomic group targeted by a taxonomically informative "barcode marker." Unique identifiers are added to sample amplicons before they are sequenced in parallel on a high-throughput sequencing platform. Following sequencing, the identifiers are used to trace the marker sequences back to the samples they originated from (Binladen et al., 2007; Taberlet, Coissac, Pompanon, Brochmann, & Willerslev, 2012). After additional computational processing, the taxa present in the samples can be identified by comparing the "barcode" sequences obtained to a DNA reference database (Ratnasingham & Hebert, 2007). The targeted nature of metabarcoding means that it is a cost-effective and efficient method for identifying the taxonomic contents of hundreds to thousands of samples in parallel (Galan et al., 2018; Taberlet et al., 2012). Metabarcoding of bulk insect samples has, for example, been used to characterize the diversity of insects in montane landscapes in tropical southern China (Zhang et al., 2016), explore the insect diversity in a Saharo-Arabian region with otherwise sparse fauna information (Ashfaq et al., 2018), and monitor temporal changes in arthropod communities in different forest types (Brandon-Mong et al., 2018).
A large number of invertebrate species feed on vertebrates and thereby sample their DNA. Recently, this so-called iDNA, short for invertebrate-derived DNA, has been used to monitor vertebrates. In these iDNA studies, metabarcoding is typically used to target taxonomically informative vertebrate DNA markers in DNA extracted from individual or pooled samples of invertebrates known to feed on flesh, blood, feces, and/or dead or decaying organic matter (reviewed in Calvignac-Spencer et al., 2013; Schnell, Sollmann, et al., 2015).
However, given the aforementioned targeted nature of metabarcoding, studies assessing arthropod taxa in bulk arthropod samples using metabarcoding have so far only identified arthropod taxa and therefore the two fields, metabarcoding of bulk arthropod samples and of iDNA, have until now been quite separated.
In this study, we evaluate whether it is possible to obtain information on vertebrate taxa through metabarcoding of bulk arthropod samples (Figure 1). To investigate this, we used vertebrate and mammal metabarcoding primers on DNA extracted from bulk arthropod samples collected with Malaise and pitfall traps in the Carajás National Forest in Brazil and the Udzungwa Mountains in Tanzania.

| Study sites and sample collection
Bulk arthropod samples were collected with Malaise and pitfall traps as part of ongoing diversity studies in Brazil and Tanzania. In Brazil, bulk arthropod samples were collected in September 2017 (dry season) and April 2018 (wet season) in an iron mine area (06°03′31″S 50°10′37″W) in the Carajás National Forest, Pará State (Figure 2). Collection sites included pristine moist Amazonian equatorial forest, savanna-like Canga ecosystems (Mitre et al., 2019), and sites at different stages of environmental rehabilitation following mining. Malaise traps were left for 5 days, and pitfall traps were left for 24 hr before collection.
Arthropods were collected in 70% ethanol in pitfall traps and in propylene glycol in Malaise traps. A total of 50 Malaise and 50 pitfall samples were collected. Samples were stored at room temperature for a maximum of a week before DNA extraction.
In Tanzania, bulk arthropod samples were collected in a mountainous rainforest about 1,000 m above sea level in the Udzungwa Mountains (07°41′07″S 36°55′49″E) (Figure 2). Samples were collected in September and October 2014 (the end of the dry season).
Traps were placed in three locations, about 500 m apart, emptied every day for 7 days, and then emptied every other day for three collection events and finally every week for three collection events.
Propylene glycol was used as collection fluid, and samples were transferred to 70% ethanol upon collection. A total of 78 Malaise and 78 pitfall samples were collected. After collection, samples were kept at ambient temperature for a maximum of 2 weeks, after which they were stored at −20°C until DNA extraction.

| DNA extraction
For samples stored in ethanol, ethanol was carefully poured off the samples and arthropods were transferred to falcon tubes. Falcon tubes were placed in an oven at 55°C without lids to evaporate the remaining ethanol. For samples stored in propylene glycol, propylene glycol was carefully poured off with no further evaporation before DNA extraction. Samples with a volume larger than 30 ml were split into two before DNA extraction. This resulted in a total of 103 extracts from Brazil and 162 from Tanzania, giving a total of 265 extracts. A negative extraction control was included for every 11-25 samples. Samples were extracted following a nondestructive protocol modified from Gilbert et al. (2007). In this protocol, a digest buffer is added to unsorted bulk arthropod samples that have not been homogenized, thereby preserving the exoskeletons (Nielsen, Gilbert, Pape, & Bohmann, 2019). Following extraction, 200 μl digest from samples and negative extraction controls were purified using the QiaQuick PCR Purification Kit (Qiagen) following the manufacturer's protocol with minor modifications. Specifically, after addition of elution buffer, samples were incubated for 15 min at 37°C, after which they were eluted in 50 μl EB buffer and stored in Eppendorf LoBind tubes at −18°C.
FIGURE 1 The use of bulk arthropod samples to assess arthropod and vertebrate diversity starting from setting traps in the field (top), collection of bulk arthropod samples (A), extracting DNA (B), carrying out DNA metabarcoding (C1, C2) to assess diversity (D1, D2) and combining the obtained arthropod and vertebrate diversity data (E). Grey colour represents the general workflow when assessing arthropod diversity using DNA metabarcoding and green colour represents the current study. Many images within the illustrative figure courtesy of the Integration and Application Network, University of Maryland Center for Environmental Science (ian.umces.edu/symbols/).
DNA extractions of Tanzanian samples were carried out in a pre-PCR laboratory to minimize contamination risk. The Brazilian samples were extracted in a general use laboratory.
The two metabarcoding primer sets will be referred to as 16S mammal and 12S vertebrate primers, respectively. Nucleotide tags were added to the 5′ ends of both forward and reverse primers to allow parallel sequencing (Binladen et al., 2007). Specifically, tags consisted of a total of 7-8 nucleotides, of which 6 nucleotides were the tags and 1-2 were nucleotides added to increase complexity on the flow cell during sequencing (De Barba et al., 2014). For each of the two primer sets, tagged PCRs were carried out with three PCR replicates for each of the 265 extracts and negative extraction controls. Furthermore, four to five positive controls were included, namely Canis lupus (wolf), Ursus maritimus (polar bear), Zalophus californianus (California sea lion), and Ursus arctos (brown bear), and additionally for the vertebrate 12S primer set, Giraffa camelopardalis (giraffe). Negative controls were included in PCR amplifications with both primer sets. PCR amplifications were performed with nonmatching nucleotide tags (e.g., forward primer tag 1-reverse primer tag 2, forward primer tag 1-reverse primer tag 3, and forward primer tag 1-reverse primer tag 4) to allow for more amplicons to be pooled together and reduce laboratory costs (Schnell, Bohmann, & Gilbert, 2015). Moreover, every PCR replicate for each sample was made with a different tag combination.
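The tagging scheme described above can be illustrated with a minimal Python sketch. It is not the authors' laboratory plan: the tag names (F1, R1, and so on), the sample names, and the function `assign_tag_combinations` are hypothetical and serve only to show how nonmatching forward and reverse tags can give every PCR replicate of every sample its own combination.

```python
from itertools import product


def assign_tag_combinations(samples, n_replicates, forward_tags, reverse_tags):
    """Give every (sample, replicate) PCR a unique forward/reverse tag pair."""
    # All forward x reverse pairings are allowed, including nonmatching
    # ones such as F1-R2, which multiplies the number of usable combinations.
    combos = product(forward_tags, reverse_tags)
    needed = len(samples) * n_replicates
    if needed > len(forward_tags) * len(reverse_tags):
        raise ValueError("not enough tag combinations for this pool")
    assignment = {}
    for replicate in range(1, n_replicates + 1):
        for sample in samples:
            # Each PCR consumes the next unused combination, so no two
            # replicates of a sample can share a tag pair.
            assignment[(sample, replicate)] = next(combos)
    return assignment


forward = [f"F{i}" for i in range(1, 5)]  # hypothetical tag names
reverse = [f"R{i}" for i in range(1, 5)]
plan = assign_tag_combinations(["S1", "S2", "S3"], 3, forward, reverse)
```

With four forward and four reverse tags, 16 combinations are available, which is enough for the nine PCRs in this toy pool; matching tags alone would only allow four.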
For the 16S mammal primers, the 25 μl reactions consisted of 1 μl DNA template, 1 U AmpliTaq Gold, 1× Gold PCR Buffer, and 2.5 mM MgCl2 (all from Applied Biosystems); 0.6 μM each of 5′ nucleotide-tagged forward and reverse primer; 0.2 mM dNTP mix (Invitrogen); 0.5 mg/ml bovine serum albumin (BSA); and 5 μM human blocker (5′-3′ GCGACCTCGGAGCAGAACCC-spacerC3) (Vestheim & Jarman, 2008). The thermal cycling profile was 95°C for 10 min, followed by 40 cycles of 94°C for 12 s, 59°C for 30 s, and 70°C for 25 s, with a final extension at 72°C for 7 min. To evaluate the effect of using a human blocker, the extracts from Tanzania were also PCR-amplified under the same conditions but omitting the human blocker (see Appendix S1). For the 12S vertebrate primers, reactions were identical to those for the 16S mammal primers except for using 0.75 U AmpliTaq Gold, 20 μl reaction volumes, and the human blocker (5′-3′ TACCCCACTATGCTTAGCCCTAAACCTCAACAGTTAAATC-spacerC3) (Calvignac-Spencer et al., 2013). The thermal cycling profile was 95°C for 10 min, followed by 40 cycles of 94°C for 30 s, 59°C for 45 s, and 72°C for 60 s, with a final extension at 72°C for 7 min.
Amplified PCR products were visualized on 2% agarose gels with GelRed against a 50 bp ladder. All negative controls appeared negative. Products with visible bands on the agarose gels, carrying different nucleotide tag combinations, were pooled as follows: For the 16S mammal primer, for Brazilian samples, any successfully amplified PCR product was pooled, while, for the Tanzanian

| Data processing and analyses
Sequence data were processed for each primer set separately. Using AdapterRemoval v2.2.2, sequence reads were trimmed to remove adaptors and low-quality bases and paired reads were merged (Schubert, Lindgreen, & Orlando, 2016). Sequences were sorted according to primers and tags using a modified version of DAMe (Bohmann et al., 2018; Zepeda-Mendoza, Bohmann, Carmona Baez, & Gilbert, 2016, https://github.com/shyamsg/DAMe). Thresholds for filtering sequences across the PCR replicates from each sample were guided by the sequenced negative and positive controls, and sequences were retained if having at least 15 and 37 sequence copies for the 16S mammal and 12S vertebrate primer sets, respectively (Alberdi et al., 2018). Further, sequences present in any of a sample's PCR replicates were kept. The filtered sequences were clustered using SUMACLUST with a similarity score of 97% (Mercier, Boyer, Bonin, & Coissac, 2013). Postclustering curation of the operational taxonomic unit (OTU) tables was carried out using the LULU algorithm with default settings (Frøslev et al., 2017). The LULU algorithm is independent of DNA reference databases and is composed of a core mechanism that retains rare and factual OTUs while discarding artifactual OTUs. It does so by identifying and merging the artifactual OTUs with factual abundant OTUs that are similar in sequence and that consistently co-occur.
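The copy-number filter described above can be sketched in a few lines of Python. This is an illustration only and not the DAMe implementation: the function `filter_sample` and the toy sequences are hypothetical, and the sketch assumes that the minimum copy number is applied within each PCR replicate, which the text does not state explicitly.

```python
# Minimum sequence copies per primer set, as given in the text.
MIN_COPIES = {"16S_mammal": 15, "12S_vertebrate": 37}


def filter_sample(replicate_counts, primer_set):
    """Keep sequences that reach the copy threshold in any PCR replicate.

    replicate_counts: list of dicts, one per PCR replicate, mapping a
    sequence to its read count in that replicate.
    Returns a dict mapping each retained sequence to the summed counts of
    the replicates in which it passed the threshold.
    """
    threshold = MIN_COPIES[primer_set]
    kept = {}
    for counts in replicate_counts:
        for seq, n in counts.items():
            # Assumption: the threshold is checked per replicate.
            if n >= threshold:
                kept[seq] = kept.get(seq, 0) + n
    return kept


# Toy data: three PCR replicates of one sample (made-up sequences and counts).
replicates = [{"ACGT": 40, "TTGA": 3}, {"ACGT": 12}, {"GGCC": 20}]
kept_16s = filter_sample(replicates, "16S_mammal")
kept_12s = filter_sample(replicates, "12S_vertebrate")
```

Because a sequence only needs to pass in one replicate, this additive rule favors sensitivity over stringency, which is why the postclustering curation step matters.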
The OTU sequences were compared against the NCBI Genbank database (www.ncbi.nlm.nih.gov/) using BLASTn, and the output was imported into MEGAN Community Edition version 6.12.7 (Huson et al., 2016). Taxonomy of all OTUs was further manually checked to validate assignments. A strict species assignment approach was applied so that species-level assignment was only performed when an OTU sequence had an identity of 100% to an NCBI reference sequence. However, one OTU sequence with 98% identity to Nandinia binotata (African palm civet) was assigned to species level as the taxonomic family that it belongs to consists of only this one species. All detected taxa were evaluated according to their known geographical distribution (https://www.iucnredlist.org).
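The strict species-assignment rule can be written as a small decision function. The sketch below is hypothetical: the function name and arguments are illustrative, and it assumes that an OTU failing the 100% identity rule is reported at genus level, which the text does not specify.

```python
def assignment_rank(identity_pct, proposed_rank, monotypic_family=False):
    """Return the taxonomic rank at which an OTU is reported.

    identity_pct: percent identity of the best reference match.
    proposed_rank: rank suggested upstream ('species', 'genus', 'family').
    monotypic_family: True if the family contains a single species, as for
    the Nandinia binotata exception described in the text.
    """
    if proposed_rank == "species":
        if identity_pct >= 100.0 or monotypic_family:
            return "species"
        # Assumption: fall back one rank when the strict rule is not met.
        return "genus"
    return proposed_rank


exact_match = assignment_rank(100.0, "species")
near_match = assignment_rank(98.0, "species")
civet_case = assignment_rank(98.0, "species", monotypic_family=True)
```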
One OTU assigned to Tapirus sp. had 100% identity to two tapir species but was assigned to Tapirus terrestris based on the two species' known geographical distributions. A Krona chart (Ondov, Bergman, & Phillippy, 2011) was created for a visual representation of the taxonomic distribution of detected vertebrates. To test for differences in detection rates between trap types, primers, season (Brazilian dataset only), and countries, we built individual generalized linear models for the entire dataset, as well as for the Tanzanian and the Brazilian dataset, fitted using binomial logit links. We adjusted p-values for multiple comparisons as suggested by Bonferroni (Amstrong, 2014).
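For readers unfamiliar with the Bonferroni adjustment, a minimal Python sketch follows. The raw p-values are made up for illustration and are not results from this study; the function name is likewise hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


raw = [0.01, 0.04, 0.5]  # made-up p-values from three comparisons
adjusted = bonferroni(raw)
```

With three comparisons, a raw p-value of 0.04 no longer falls below 0.05 after adjustment, which shows how conservative the correction becomes as the number of tests grows.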

| RESULTS
Of the 32 vertebrate taxa detected, 14 were identified to species level, 10 to genus level, and the remaining 8 to family level (Figure 3). Eight OTUs could not be identified to a lower taxonomic level than order and were discarded. Six samples contained only such OTUs and were therefore also discarded.

| Detected vertebrate taxa
The 32 detected vertebrate taxa encompassed 21 mammalian taxa spanning 15 families in 6 orders (Artiodactyla, Carnivora, Chiroptera, Perissodactyla, Primates, and Rodentia), 6 bird taxa spanning 6 families in 3 orders (Galliformes, Passeriformes, and Piciformes), and 5 amphibian taxa spanning 3 families in 1 order (Anura) (Figure 3, Table 1). Fourteen vertebrate taxa could be assigned to species level. Of these, 12 were known to occur in the study sites (Table 1). Although the two remaining species, Alouatta guariba (brown howler monkey) in Brazil and Baeopogon indicator (honeyguide greenbul) in Tanzania, are not known to occur within the study sites, their distributions fall close to these (www.iucnredlist.org). Furthermore, five of the detected species were confirmed through visual observations during sample collection (Figure 4, Table 1). primates (e.g., A. guariba) (Figure 4, Table 1). The most detected vertebrate in Tanzania was A. xenodactyloides (Chirinda screeching frog), which was detected in 20 extracts. In Brazil, A. guariba (brown howler monkey) was the most detected taxon, with detection in three extracts (Table 1).

| Vertebrate detections per trap type, the time before collection and season
When combining the two primer sets, no differences were found in detection rates of vertebrates in samples collected in Tanzania compared to Brazil. Further, differences between trap types were found, as the majority (13 out of 14) of samples in which vertebrates were detected in Brazil were collected with Malaise traps (t-value = 2.490, adjusted p-value = .037), whereas in Tanzania more vertebrates were detected in pitfall trap samples (t-value = 2.767, adjusted p-value = .0107). Adjusted p-values indicate no significant differences in detection rates between primer sets in either dataset. Regarding the number of days that traps were left before collection in Tanzania, it was possible to detect vertebrate DNA in extracts originating from traps that were left for a longer time (2-7 days). Although not statistically significant due to small sample sizes, the success rate of vertebrate detection in these samples did not seem to decrease (Table S1).
In spite of samples collected during the wet season in Brazil having a higher detection rate (23.5%), compared to the samples collected during dry season (0%), detection rate did not differ statistically between sampling season, most likely caused by the low number of samples.

| Primer performance
The 16S mammal primers amplified DNA in 44 (16.6%) of the 265 extracts, while the 12S vertebrate primers amplified DNA in 32 (12.1%) extracts. Of these sequenced extracts, vertebrate DNA, which could be taxonomically identified to family level or lower, was detected in 25 (56.9%) of the extracts amplified using the 16S mammal primer and in 29 (90.6%) of those amplified with the 12S vertebrate primer set. In five extracts, vertebrate taxa were identified with both primer sets. Of the 32 detected vertebrate taxa, 17 were detected with the 16S mammal primer set and 15 with the 12S vertebrate primer set.
Thus, almost twice as many taxa were detected when using both primer sets as opposed to when only using one of them (Table 1). In one sample, the same vertebrate taxon was detected using the two different primer sets.
While studies using invertebrate-derived DNA, iDNA, have so far focused on the targeted collection of invertebrates ingesting vertebrates and their genetic material, in this study we show that vertebrate DNA can also be detected in bulk arthropod samples, that is, without targeting a specific invertebrate taxon during collections and therefore with no prior knowledge of the collected arthropod taxa.

| Vertebrate detection
Through metabarcoding using 16S mammal and 12S vertebrate primers, vertebrate taxa were detected in 19.2% of all the analyzed bulk arthropod sample DNA extracts. This detection rate is below those reported by iDNA studies targeting specific invertebrate taxa, where some studies reported detection rates of 21%-100% (reviewed in Calvignac-Spencer et al., 2013). Our relatively low vertebrate detection rate might simply be because the untargeted nature of the collection meant that not all samples contained invertebrates that had ingested vertebrate DNA. An additional explanation could lie in the complex mixture of DNA found in bulk arthropod samples.
Specifically, as many invertebrates are pooled together, including both species that do and do not feed on vertebrate-derived samples, overall the invertebrate DNA will dominate any traces of vertebrate DNA. Some iDNA studies have found that the number of detected vertebrates increases when invertebrates are extracted individually in contrast to pooling the invertebrates (Rodgers et al., 2017; Schnell et al., 2012). As such, the pooled nature of the bulk arthropod samples in the present study could explain the low proportion of samples with vertebrate detections. However, sequencing pooled blowflies resulted in the detection of four additional vertebrate taxa compared with individual sequencing (Calvignac-Spencer et al., 2013).
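The detection rates quoted here follow directly from the counts reported in the Results (265 analyzed extracts, 76 sequenced extracts, and 51 extracts with nonhuman vertebrate DNA). The short Python sketch below reproduces the arithmetic; the helper name `pct` is arbitrary.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


analyzed, sequenced, with_vertebrates = 265, 76, 51  # counts from the Results

rate_of_analyzed = pct(with_vertebrates, analyzed)    # share of all extracts
rate_of_sequenced = pct(with_vertebrates, sequenced)  # share of sequenced extracts
share_sequenced = pct(sequenced, analyzed)            # extracts that were sequenced
```

The gap between the two detection rates reflects that only extracts with visible PCR products were sequenced.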
It should be noted that we only sequenced some DNA extracts, more specifically those with successful PCR amplification as assessed by gel electrophoresis, and that the vertebrate detection rate was relatively high in the sequenced sample extracts (90.6% and 56.9% for 12S vertebrate and 16S mammal primer set, respectively). Thirty-two vertebrate taxa were detected in the study sites in Brazil and Tanzania. Although study areas and design differ, this is comparable to the number of vertebrates detected in iDNA studies targeting specific invertebrate taxa such as carrion flies in Côte d'Ivoire and Madagascar (Calvignac-Spencer et al., 2013), carrion flies in Panama (Rodgers et al., 2017), leeches in Borneo (Schnell et al., 2018), blowflies in Malaysia (Lee et al., 2016), leeches in Vietnam (Schnell et al., 2012), and even higher than when targeting ticks in Canada (Gariepy et al., 2012).
While the traditional survey methods are generally limited to vertebrate species from a single forest stratum (generally near ground level), we show that bulk arthropod samples can cover different forest strata, including the detection of canopy-occupying birds and primates (Figure 3 and Table 1). The wide range of vertebrates detected in the present study indicates that bulk arthropod samples offer great potential for supplementing traditional methods for vertebrate surveying such as camera trapping, spoor tracking, and other visual surveys, as already shown for iDNA studies using leeches (Abrams et al., 2019).
Not all identified vertebrate taxa might originate from vertebrates that were ingested by invertebrates. For instance, 19 of the 20 frog detections in Tanzania originated from pitfall traps, which corresponds with the occasional observation of frogs in pitfall traps.
These were, however, discarded from samples upon collection.Frog detections in the bulk samples might therefore be caused by invertebrates ingesting frog DNA but may also be caused by DNA traces of frogs in the collecting fluid.
While 12 of the 14 vertebrate taxa identified to species level had a geographical distribution within the study sites, two species did not (Table 1). Nevertheless, the known geographical distributions of these two species, A. guariba (brown howler monkey) in Brazil and B. indicator (honeyguide greenbul) in Tanzania, are close to the study sites, and therefore, it is not implausible that they might be found in the study areas.
Another explanation is that detection of vertebrate taxa in bulk arthropod samples provides evidence of the presence in a larger area. Because bulk arthropod samples likely consist of arthropod taxa occupying different habitats, having different feeding strategies and dispersal potentials, care should be taken when making inferences about the geographical location and temporal proximity of the detected vertebrates (Lee et al., 2016; Schnell, Sollmann, et al., 2015). A third explanation might be a relatively incomplete DNA reference database. The primate Alouatta belzebul (red-handed howler), belonging to the same genus as the detected A. guariba (brown howler monkey), is known to occur within the Brazilian sample site, but a DNA reference for the 16S marker used in this study has yet to be included in public databases such as Genbank. It is also possible that A. belzebul and A. guariba are identical over the 16S DNA barcode marker used in this study and that we have detected A. belzebul in our samples.
This highlights the need for further development of DNA reference databases.

| Vertebrate detections per trap type, the time before collection, and season
In the present study, it was possible to detect vertebrate taxa in bulk arthropod samples from both Malaise and pitfall traps (Table 1). We therefore believe this can also be achieved from samples collected with other kinds of traps, such as light traps, pan traps, hand netting, and other mass collecting traps. More vertebrates were detected in samples collected by Malaise traps using the 16S mammal primer set, whereas the 12S vertebrate primers detected the most vertebrates in pitfall samples (Table 1). Although this might be location-specific, or as in our case influenced by frogs falling into pitfall traps in Tanzania, increasing the types of traps will naturally expand the invertebrate taxa collected and therefore likely also increase the diversity of the vertebrates detected. When collecting bulk arthropod samples, the degradation rate of vertebrate DNA inside the arthropods has to be considered. It has been found that amplifiable vertebrate DNA in Chrysomya megacephala (blowflies) decreased markedly after only a few days postfeeding (Lee et al., 2015), but on the contrary, goat DNA has been found to persist in blood-feeding leeches for several months (Schnell et al., 2012). Additionally, in the present study, vertebrate DNA was detected in traps that had been left for several days (up to 7 days) before collection (Table S1).
Nevertheless, it has been shown that arthropod DNA undergoes slight degradation when in a Malaise trap (Krehenwinkel et al., 2018), and this most likely also applies to iDNA.

| Primer performance
When combining results from the 16S mammal (Taylor, 1996) and the 12S vertebrate (Riaz et al., 2011) primer sets, almost twice as many vertebrate taxa were detected as compared to when either primer set was used alone (Table 1). This confirms the findings of Rodgers et al. (2017), who used the same primer sets on DNA extracted from pools of up to 16 carrion flies and found that more species were detected using both primers than using either marker alone. In agreement with our results, they also more often detected primates using the 16S mammal primer as opposed to the 12S vertebrate primer. Interestingly, we only obtained one overlapping vertebrate detection for the two primer sets and only in one sample. One explanation for this could be the incomplete reference database, as both primer sets detected vertebrate OTUs which could not be taxonomically identified beyond order level and were therefore excluded from further analyses. Additionally, in some samples the 12S vertebrate primer only detected amphibian taxa, whereas the 16S primer detected mammal taxa in the same samples. Although not formally tested in this study, it seems that the 12S vertebrate primer set has an affinity toward amphibian DNA, which could also cause the discrepancy between the two primer sets. This shows the complementarity of the two primer sets and highlights the need to use both primer sets, and potentially additional primer sets, to increase vertebrate detections in future studies. For example, no taxa of the class Reptilia were detected in the present study, which might be caused by the inability of the primers to amplify reptilian DNA. Therefore, we highly encourage research to optimize primer choice to enhance vertebrate detections.

| Technical considerations
When attempting to PCR amplify low amounts of template DNA, it is important to consider PCR stochasticity as it can lead to false negatives and thus the failure of amplification of certain taxa in some PCR replicates (Kebschull & Zador, 2015; Murray, Coghlan, & Bunce, 2015). Incorporation of additional PCR replicates can increase the probability of amplifying target DNA in low quantity (Alberdi et al., 2018). Although our study design did not allow a detailed assessment of how the number of PCR replicates influences vertebrate detection rates in bulk arthropod samples, some observations can be made.
For the Brazilian samples, four vertebrate taxa would not have been detected if samples were only pooled when all three PCR replicates showed successful amplification when visualized on gel electrophoresis (data not shown). Similarly, for the 12S vertebrate primer, three vertebrate taxa would not have been detected in samples from Tanzania (data not shown). As for the 16S primer in the Tanzanian samples, it is possible that vertebrate taxa were missed, as the pooling strategy was stricter. This argues for careful consideration when deciding, based on the results of the gel electrophoresis, which and how many PCR replicates from each sample to pool and sequence when the aim is to detect vertebrate taxa in bulk arthropod samples.
For example, even when vertebrate DNA is only successfully amplified in a single PCR replicate, our results highlight that this single PCR replicate should still be included in the following pooling. However, this requires careful consideration when filtering the DNA sequences during data analysis, where it is necessary to keep sequences appearing in any of the sequenced PCR replicates, which may introduce false positives (Alberdi et al., 2018). A postclustering curation could therefore be applied to identify and delete some of the false positives (Alberdi et al., 2018; Frøslev et al., 2017). To optimize the potential of detecting vertebrate DNA and limit the risk of false positives, another but more costly approach can be to include additional PCR replicates (e.g., five), which then would permit a stricter filtering (e.g., only keeping sequences occurring in at least 3 of 5 PCR replicates). Finally, after deciding the number of PCR replicates to use and how to analyze the data, it is important to minimize type II errors (false negatives).
Therefore, undetected species should not be treated as absent as the vertebrate species can be present in the study site but not have been fed upon by the sampled arthropods (Alberdi et al., 2019).
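The trade-off between the additive rule (keep sequences found in any PCR replicate) and a stricter rule (e.g., at least 3 of 5 replicates) can be illustrated with a short Python sketch. The function name and the toy replicate contents are hypothetical; this is a sketch of the filtering idea and not the pipeline used in this study.

```python
from collections import Counter


def consensus_filter(replicate_seqs, min_replicates):
    """Keep sequences that occur in at least `min_replicates` PCR replicates.

    replicate_seqs: list of sets, one per PCR replicate, each holding the
    sequences (or OTU names) detected in that replicate.
    """
    seen = Counter()
    for seqs in replicate_seqs:
        # Count each sequence at most once per replicate.
        seen.update(set(seqs))
    return {seq for seq, n in seen.items() if n >= min_replicates}


# Five made-up PCR replicates of one sample.
reps = [{"A", "B"}, {"A"}, {"A", "C"}, {"B"}, {"C", "A"}]
strict = consensus_filter(reps, 3)    # at least 3 of 5 replicates
additive = consensus_filter(reps, 1)  # present in any replicate
```

In this toy example the strict rule keeps only the sequence seen in four replicates, whereas the additive rule keeps all three, including any that might be low-template true detections or false positives.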

| Perspectives
We demonstrate that bulk arthropod samples should no longer be considered only to provide information about the arthropod communities but also as a source of vertebrate fauna information. It can require many field days to collect bulk arthropod samples, and once the samples have been brought into the laboratory, the hardship continues as sample preparation and DNA extraction of the sometimes hundreds of samples can require many man-hours.
Therefore, once having collected and extracted bulk arthropod samples, researchers should obtain as much biodiversity information as possible from the samples. Using bulk arthropod samples to detect vertebrate diversity can be of particular interest in larger projects (e.g., the Global Malaise Trap Program, http://biodiversitygenomics.net/projects/gmp/) where samples have already been collected and DNA extracted. Moreover, this method could serve as a supplement to vertebrate monitoring such as camera trapping, visual surveys, and iDNA studies. By using the proposed approach, researchers can increase the value of bulk arthropod samples for ecological assessment and monitoring programs.

For each of the two primer sets, tagged PCRs were carried out with three PCR replicates for each of the 265 extracts and negative extraction controls. Furthermore, four to five positive controls were included, namely Canis lupus (wolf), Ursus maritimus (polar bear), Zalophus californianus (California sea lion), and Ursus arctos (brown bear), and additionally for the vertebrate 12S primer set,

FIGURE 2 Study sites in Brazil and Tanzania where bulk arthropod samples were collected. Maps created with QGIS (version 3.6.2)

Of the analyzed 265 bulk arthropod sample DNA extracts, 76 (28.7%) were sequenced. Curation of the 16S mammal and 12S vertebrate OTU tables with the postclustering algorithm LULU (Frøslev et al., 2017) removed 31 OTUs (49.2%) from the 12S vertebrate dataset and 0 OTUs (0%) from the 16S mammal dataset. Clustering with a similarity score of 99% did not affect the outcome, except that more OTUs were removed by the LULU algorithm. PROTAX confirmed the taxonomic identifications obtained using BLASTn and MEGAN, and enabled taxonomic identification of three additional OTUs. Specifically, two OTUs belonging to the family Hylidae (PROTAX family probability of .86) were detected in three Malaise samples from Brazil, and one OTU belonging to the family Cercopithecidae, genus Procolobus (PROTAX family probability of .99 and genus probability of .80), was detected in two Malaise and two pitfall samples from Tanzania.

3.1 | Detection rates of vertebrate DNA in arthropod bulk samples
Combining the results from both primer sets, nonhuman vertebrate DNA was detected in 51 bulk arthropod sample DNA extracts (19.2% of analyzed extracts, 67.1% of sequenced extracts). The remaining sequenced extracts only contained OTUs assigned to Hominidae, OTUs that could not be assigned to a lower level than order, or OTUs potentially arising from cross-contamination from the positive controls. Thirty-two vertebrate taxa were detected in the sequenced extracts, with a range of one to three taxa detected per extract with either of the primer sets (mean of 1.28, SD 0.53).

FIGURE 3 Vertebrates identified with DNA metabarcoding of bulk arthropod samples collected in Brazil and Tanzania. Flags indicate if the taxon was detected in samples collected in Brazil or Tanzania. N/A indicates that taxonomic identification to species or genus level was not possible. The Krona chart is produced with credit to (Ondov et al., 2011)

TABLE 1 Vertebrate taxa detected in bulk arthropod samples from Brazil and Tanzania using DNA metabarcoding. N/A indicates that taxonomic identification to species or genus level was not possible. Information regarding trap type (Malaise or pitfall) and number of vertebrate detections, metabarcoding primer set (v: 12S vertebrate, Riaz et al., 2011; or m: 16S mammal, Taylor, 1996), whether the taxa were observed during sample collection, if its known geographical distribution falls within the collection site, and finally the IUCN status (https://www.iucnredlist.org/), least concern (LC) or vulnerable (VU), of the taxa are shown
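The percentages reported in the Results follow directly from the extract counts (265 analyzed, 76 sequenced, 51 with nonhuman vertebrate DNA). As a small worked check, and only as an illustrative sketch with hypothetical variable names, they can be recomputed as follows:

```python
# Counts taken from the text; variable and function names are illustrative only.
analyzed_extracts = 265
sequenced_extracts = 76
extracts_with_vertebrates = 51


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


sequenced_of_analyzed = percent(sequenced_extracts, analyzed_extracts)
detected_of_analyzed = percent(extracts_with_vertebrates, analyzed_extracts)
detected_of_sequenced = percent(extracts_with_vertebrates, sequenced_extracts)

print(sequenced_of_analyzed, detected_of_analyzed, detected_of_sequenced)
# prints: 28.7 19.2 67.1
```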

FIGURE 4 Visual observations during bulk arthropod sample collection confirmed the presence of some of the detected vertebrate taxa. Left: tapir footprint in one of the Brazilian study sites. Right: Paragalago zanzibaricus (Bushbaby) in the Udzungwa Mountains in the Tanzanian study site

samples, only PCR products from samples where all three PCR replicates had successfully amplified were pooled. For the 12S vertebrate primer, PCR replicates that successfully amplified were pooled if at least two out of the three PCR replicates for a sample had successfully amplified.
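The two pooling rules can be expressed as a single threshold on the number of successfully amplified PCR replicates. The Python sketch below is illustrative only. The function and dictionary names are hypothetical, and because the sentence describing the stricter rule is truncated in the text, assigning the all-three-replicates rule to the 16S mammal primer set is an assumption.

```python
# Minimum number of successful replicates (out of three) required before pooling.
# Assumption: the "all three" rule belongs to the 16S mammal primer set.
MIN_SUCCESSFUL = {"16S_mammal": 3, "12S_vertebrate": 2}


def replicates_to_pool(amplified, min_successful):
    """Return the 1-based indices of the replicates to pool for one sample.

    `amplified` is a list of booleans, one per PCR replicate. If fewer than
    `min_successful` replicates amplified, nothing is pooled for that sample.
    """
    successful = [i for i, ok in enumerate(amplified, start=1) if ok]
    return successful if len(successful) >= min_successful else []


# Hypothetical sample in which replicates 1 and 3 amplified but replicate 2 failed.
sample = [True, False, True]
pooled_12s = replicates_to_pool(sample, MIN_SUCCESSFUL["12S_vertebrate"])
pooled_16s = replicates_to_pool(sample, MIN_SUCCESSFUL["16S_mammal"])
```

In this example the sample contributes replicates 1 and 3 to the 12S vertebrate pool and nothing under the stricter rule.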
controls were included in each amplicon pool. Amplicon pools were purified with SPRI beads (Rohland & Reich, 2012) with a 1.6× bead-to-amplicon pool ratio and eluted in 35 μl EB buffer. Purified amplicon pools were built into sequence libraries with an in-house protocol in which Illumina sequencing adapters and dual indices were ligated onto amplicons. The protocol omits the two steps that have been shown to cause tag jumps, that is, T4 DNA polymerase blunt ending and postligation PCR (Schnell, Bohmann, & Gilbert, 2015; van Orsouw et al., 2007). Libraries were purified with a 0.8× bead-to-amplicon pool ratio, eluted in 30 μl EB buffer, and qPCR-quantified using the NEBNext Library Quant Kit for Illumina (New England Biolabs Inc.). Amplicon libraries were pooled and sequenced at the National High-throughput DNA Sequencing Centre, University of Copenhagen. Sequence libraries were sequenced 250 bp PE on an Illumina MiSeq sequencing platform using v2 chemistry, aiming for 25,000 paired reads per PCR replicate.