In silico and empirical evaluation of twelve metabarcoding primer sets for insectivorous diet analyses

Abstract During the most recent decade, environmental DNA metabarcoding approaches have been both developed and improved to minimize the biological and technical biases in these protocols. However, challenges remain, notably those relating to primer design. In the current study, we comprehensively assessed the performance of ten COI and two 16S primer pairs for eDNA metabarcoding, including novel and previously published primers. We used a combined approach of in silico, in vivo‐mock community (33 arthropod taxa from 16 orders), and guano‐based analyses to identify primer sets that would maximize arthropod detection and taxonomic identification, successfully identify the predator (bat) species, and minimize the time and financial costs of the experiment. We focused on two insectivorous bat species that live together in mixed colonies: the greater horseshoe bat (Rhinolophus ferrumequinum) and Geoffroy's bat (Myotis emarginatus). We found that primer degeneracy is the main factor that influences arthropod detection in silico and mock community analyses, while amplicon length is critical for the detection of arthropods from degraded DNA samples. Our guano‐based results highlight the importance of detecting and identifying both predator and prey, as guano samples can be contaminated by other insectivorous species. Moreover, we demonstrate that amplifying bat DNA does not reduce the primers' capacity to detect arthropods. We therefore recommend the simultaneous identification of predator and prey. Finally, our results suggest that up to one‐third of prey occurrences may be unreliable and are probably not of primary interest in diet studies, which may decrease the relevance of combining several primer sets instead of using a single efficient one. In conclusion, this study provides a pragmatic framework for eDNA primer selection with respect to scientific and methodological constraints.


| INTRODUC TI ON
The genetic analysis of environmental samples such as soil, water, or feces, known as environmental DNA (eDNA) metabarcoding, is a rapid and cost-effective tool for the study of species that are difficult to detect or monitor (Bohmann et al., 2014). This approach allows the simultaneous identification of multiple taxa in environmental samples, bypassing the need to isolate organisms prior to identification (Clare, 2014;Taberlet, Coissac, Hajibabaei, & Rieseberg, 2012). eDNA metabarcoding is of particular interest in dietary analysis of rare or elusive species, and this approach has now been applied to a large spectrum of organisms (Clare, Fraser, Braid, Fenton, & Hebert, 2009;Corse et al., 2017;Kartzinel & Pringle, 2015;Rytkönen et al., 2019;Shehzad et al., 2012). Compared to the traditional microscopic study of undigested fragments in fecal remains, eDNA metabarcoding has three key advantages for dietary analysis: (a) a finer resolution (potentially to the species level), (b) the simultaneous processing and sequencing of several hundred samples, and (c) the detection of species that could not be detected using previous techniques such as visual recognition of morphological features (e.g., soft-bodied species; Galan et al., 2018;Nielsen, Clare, Hayden, Brett, & Kratina, 2018).
However, eDNA metabarcoding is subject to many biological and technical biases in each step of the process, including fieldwork, laboratory analysis, and bioinformatics. Inappropriate sampling and storage conditions, contaminations, PCR inhibitors, PCR stochasticity, and chimera formation are common biases influencing the reliability of results (for a review, see Lindahl et al., 2013). Many methodological improvements have been made to limit some of these biases and to introduce best-practice guidelines for metabarcoding protocols, such as the systematic inclusion of technical replicates and negative controls (Alberdi, Aizpurua, Gilbert, & Bohmann, 2018;Corse et al., 2017;Galan et al., 2016;Mata et al., 2018).
Another major issue in eDNA metabarcoding protocols that still needs to be solved is the selection of appropriate primer set(s), although this issue has received increased attention in recent years Op De Beeck et al., 2014;Piñol, Senar, & Symondson, 2019). This is a crucial choice, as the primers must be suitable for all of the taxa actually present in the environmental sample in order to avoid missing unexpected items (Taberlet, Coissac, Pompanon, Brochmann, & Willerslev, 2012). The DNA fragment amplified by the primers must be variable enough to discriminate close species, but also needs to be abundantly referenced in public sequence databases for successful taxonomic identification based on the sequences generated . Among the various mitochondrial genes used in animal metabarcoding, such as the 12S rRNA, 16S rRNA, and cytochrome b genes (Hänfling et al., 2016;Riaz et al., 2011;Santas, Persaud, Wolfe, & Bauman, 2013), the cytochrome c oxidase I (COI) gene is the most widely used as it fulfills the above criteria the best (Andújar, Arribas, Yu, Vogler, & Emerson, 2018;Hebert, Cywinska, Ball, & deWaard, 2003). Formerly, DNA barcoding approaches targeted the "Folmer region" of COI which is 658 base pairs (bp) long. This is too long for efficient sequencing using the second generation of high-throughput sequencing (HTS) platforms (Lear et al., 2018). Moreover, eDNA is often highly degraded, due to the digestion process (in the case of fecal samples) or to environmental exposure of the samples (i.e., rain, sunlight; Oehm, Juen, Nagiller, Neuhauser, & Traugott, 2011). Such issues render the use of the entire "Folmer region" COI sequence impracticable (Deagle, Eveson, & Jarman, 2006). "Mini COI barcodes" (i.e., with a targeted region range < 200 bp; Hajibabaei et al., 2006;Pompanon et al., 2012) are therefore more commonly used in eDNA metabarcoding analyses. However, the paucity of conserved regions in the COI sequence can complicate universal primer design (Deagle, Jarman, Coissac, Pompanon, & Taberlet, 2014). Thus, some authors have argued for the joint use of several COI primer sets (e.g., Corse et al., 2019;Esnaola, Arrizabalaga-Escudero, González-Esteban, Elosegi, & Aihartza, 2018) or for the combination of mitochondrial 16S rRNA and COI primer sets (e.g., Alberdi et al., 2018;Bohmann et al., 2018). This combined approach should improve the taxonomic coverage of prey items. For example, Esnaola et al. (2018) have shown that the combination of the Zeale, Butlin, Barker, Lees, and Jones (2011) and Gillet et al. (2015) primer sets enabled the detection of 37.2% more species than when using Gillet's primer set alone. However, the combination of several primer sets greatly increases both the financial cost and the duration of laboratory and bioinformatics analyses of metabarcoding studies.
In this context, the use of a single highly degenerate primer set would be preferred, and some studies have already highlighted the efficiency of this approach Galan et al., 2018;Vamos, Elbrecht, & Leese, 2017). Degenerate bases are used in primer sequences to avoid mismatches at variable positions between the different targeted taxa. In theory, this process should enable the amplification and identification of many different taxa using a unique primer set in a single PCR. However, degenerate primers have to be carefully designed because high levels of degeneracy can lead to high rates of nontarget amplification (Innis, Gelfand, Sninsky, & White, 2012). No consensus on primer choice has yet been reached in eDNA metabarcoding, and at present, a multitude of primer sets exist for the COI and 16S genes that differ in their target length, degeneracy levels, and position within the genes.
Designing robust metabarcoding protocols for the study of insectivorous bat diets is still an important issue. Such molecular analyses are critical because the direct observation of bat feeding is virtually impossible. The morphological identification of prey remains in guano rarely provides resolution at the genus or species level and can be highly time-consuming. Finally, many bat species are endangered and highly protected such that invasive methods cannot be used to carry out diet surveys. Metabarcoding analysis of the DNA contained in bat guano has thus been developed (a) to deepen our understanding of bat ecology (Arrizabalaga-Escudero et al., 2015;Clare, Symondson, Broders, et al., 2014), (b) to highlight the potential ecosystem services provided by bats as pest suppressors Maslo et al., 2017;Wray et al., 2018), and ultimately, (c) to set up effective conservation strategies (Arrizabalaga-Escudero et al., 2015).
However, most of these studies have used specific arthropod primer sets (but see Galan  or specific bat primer sets (Walker, Williamson, Sanchez, Sobek, & Chambers, 2016). While working with guano collected from roost sites, a potential problem arises from the fact that several bat species may roost in the same sites. Because guano is not easily distinguishable between bat species (Ware, Garrod, Macdonald, & Allaby, 2019), it is critical to identify bat species to avoid misassigning prey to the wrong bat species and also to discard guano samples that could be contaminated with excreta from other bat species. To this end, Galan et al. (2018) previously optimized a metabarcoding approach to simultaneously identify bat species and their prey, instead of using different primer sets and methodologies to identify bat species on the one hand and arthropods on the other (e.g., Bohmann et al., 2011;Van den Bussche et al., 2016). As the amplification of bat DNA may reduce the sensitivity of prey DNA detection , it is important to find a trade-off between the success of an exhaustive prey amplification and bat species identification.
In this study, we followed a multicriteria assessment approach to compare the performance of several sets of primers for use in insectivorous bat dietary analyses. We compared statistically different primer sets to identify the characteristics (primer degeneracy, amplicon length, DNA degradation) that would maximize the accuracy of arthropod detection and identification, while minimizing the time and financial cost of laboratory work. We also compared the capacity of primer sets to provide identification of bat species without over amplifying bat DNA. We focused on the greater horseshoe bat (Rhinolophus ferrumequinum) and Geoffroy's bat (Myotis emarginatus) as they often share maternity roosts during summer but have contrasting diets (Goiti et al., 2011;Jones, 1990

| In silico evaluation
First, we selected primers that are commonly used in arthropod metabarcoding studies (Table 1, Figure 1). Six of the selected primer sets amplified fragments within the Folmer region of the COI gene, and one set amplified a region in the 16S gene. We favored primer sets which amplified short fragments (between 133 and 218 bp excluding primers), but we also looked at longer fragments (between 313 and 322 bp excluding primers) to evaluate the potential effect of amplicon length on prey detection and identification in bat guano. Then, we designed five new combinations of primers (four for COI and one for 16S) based on the seven primer sets described above. We increased the base degeneracy level of the COI reverse primer described in Jusino et al. (2018) and of the 16S primers from Epp et al. (2012) in target region length and degeneration level ( Figure 1).
We used the mean penalty score per arthropod order provided in PrimerMiner. This score is calculated as a mismatch scoring that includes the adjacency, position, and type of mismatch between primers and template sequences. Primer evaluation was conducted only for arthropod orders represented by at least 100 OTUs, as recommended by . This threshold enabled us to capture a large portion of the variability potentially existing at primer binding sites and to limit potential biases resulting from the presence of only few sequences when evaluating the match/mismatch. We compared primer sets by combining the penalty scores from both the forward and reverse primers. We then used these scores to determine whether the arthropod OTUs would theoretically be successfully amplified using the default value (success: penalty score < 120). We used the same DNA extractions of bats and arthropods to build reference sequences for the COI and 16S genes. Normalized DNA was amplified and sequenced for each individual to provide reference sequences for each gene. As the ten COI primers span almost all of the 658-bp COI Folmer region, we sequenced this region using Sanger technology as described in Sow et al. (2018). We sequenced the 106-bp region targeted by the two 16S minibarcodes using the Epp-degen primer set and MiSeq sequencing technology for each DNA extraction independently, following the same protocol used for the analysis of the mock and guano samples, as described below.

| Guano samples: collection and preparation
We collected 22 fecal pellets from each of five mixed maternity colonies of the greater horseshoe bat and Geoffroy's bat in Western France in June 2018. Each fecal pellet was retrieved from paper plates which had been left on the ground beneath the colony for 10 days. Single-use forceps were used to collect pellets to avoid contaminations between samples. Paper plates were renewed on each sampling date to avoid contaminations. Samples were stored at −20°C until DNA extraction.
Briefly, guano samples were frozen at −80°C and then beadbeaten for 2 × 30 s at 30 Hz in a TissueLyser (Qiagen) using a 5-mm stainless steel bead. DNA was extracted using the NucleoSpin 8 F I G U R E 1 Visual representation of DNA barcode lengths and primer positions on the COI gene. Primer sets are represented by a number (see Table 1) and colored arrows, with each color representing a unique primer set. For each primer set, the number on the gray line corresponds to the amplicon length excluding primers. This information is collated at the bottom of the figure

| PCR and library construction
The PCR conditions can greatly influence the performance of primer sets . Thus, we applied the same PCR program for all primer sets and used a low annealing temperature (45°C) as recommended in recent studies Jusino et al., 2018). We validated this program by checking the quality of PCR amplifications on the mock communities for the 12 primer sets.
We made four versions of each primer by adding a 5′ heterogeneity spacer of 0-3 bp to the primer sequence. This increased the diversity at each sequencing cycle, improved the detection of the sequencing clusters at the flowcell surface, and thus increased the quality of the reads.
The four versions of each primer were mixed together before PCR.
We used the two-step PCR protocol described in Galan et al. (2018) with slight modifications. Firstly, we increased the extension time (2 min instead of 45 s) in PCR 1 and PCR 2 to reduce chimera formation (Qiu et al., 2001). Secondly, for each PCR replicate multiplexed in the same run, we used a double indexing strategy as recommended in Kircher, Sawyer, and Meyer (2012) to significantly reduce the rate of read misassignment: Each 9-bp i5 and i7 dual index was used only for one PCR sample, eliminating the problem of "leak" due to false index-pairing (Martin, 2019). PCR 1 was performed in 10 µl reaction volumes using 5 µl of 2× Qiagen Multiplex Kit Master Mix (Qiagen), 2.5 µl of ultrapure water, 0.5 µl of each mix of forward and reverse primers (10 µM), and 1.5 µl of DNA extract.
Thermocycler conditions for PCR 1 consisted of an initial denaturation step at 95°C for 15 min, followed by 40 cycles of denaturation at 94°C for 30 s, annealing at 45°C for 45 s, and extension at 72°C for 2 min, followed by a final extension step at 72°C for 10 min.
Conditions for PCR 2 consisted of a limited-cycle amplification step to add multiplexing indexes i5 and i7 (nine bases each) and Illumina sequencing adapters P5 and P7 at both ends of each DNA fragment from PCR 1 . PCR 2 was carried out in a 10 µl reaction volume using 5 µl of Qiagen Multiplex Kit Master Mix (Qiagen) and 2 µl of each indexed primer i5 and i7 (0.7 µM). Then, 2 µl of PCR 1 product was added to each well. The PCR 2 started by an initial denaturation step of 95°C for 15 min, followed by eight cycles of denaturation at 94°C for 40 s, annealing at 55°C for 45 s, and extension at 72°C for 2 min, followed by a final extension step at 72°C for 10 min.
We included a negative control for extraction (NC ext ), a negative control for PCR (NC PCR ), and a negative control for indexing (NC index ) in each 96-well microplate. We performed three PCR technical replicates per sample on each DNA extract. For the guano samples, we We checked the homogeneity of amplifications between primer sets and for nonspecific amplification by electrophoresis using 3 µl of each PCR 2 product on a 1.5% agarose gel. Then, PCR 2 products were pooled separately for each of the 12 primer sets and put on a low-melting agarose gel (1.25%) for excision. After electrophoresis, the excision step was used to eliminate primer dimers and nonspecific amplifications. We used the PCR Clean-up Gel Extraction kit (Macherey-Nagel) to purify the excised bands. The 12 pools were quantified using the KAPA library quantification kit (KAPA Biosystems) taking into account the different fragment lengths, normalized at 4nM, and pooled in equimolar concentrations before loading 14 pM and 5% of PhiX control on a MiSeq flowcell with a 500-cycle Reagent Kit v2 (Illumina).
The mock communities and the guano samples were sequenced independently on two different Illumina runs.

| Bioinformatics and taxonomic assignments
First, we used a R preprocessing script (Sow et al., 2019) to merge paired-end sequences into contigs with FLASH v.1.2.11 (Magoč & Salzberg, 2011) and to trim primers with CUTADAPT v.1.9.1 (Martin, 2011). We then used the FROGS pipeline ("Find Rapidly OTU with Galaxy Solution"; Escudié et al., 2018) to create an abundance Taxonomic assignments were carried out for each primer set, following different procedures. We used the Sanger reference sequences produced to analyze mock community results (see above). This enabled us to identify the 33 genuine arthropod sequences in the two mock communities and the genuine bat sequences. The other sequences were excluded from subsequent analyses. With regard to guano samples, we analyzed the 16S OTUs using BLASTN (Altschul, Gish, Miller, Myers, & Lipman, 1990) and the NCBI Nucleotide database (Benson, Karsch-Mizrachi, Lipman, Ostell, & Wheeler, 2008). Taxonomic assignments of COI OTUs were made using the NCBI BLAST+ automatic F I G U R E 2 In silico evaluation of arthropod orders represented by at least 100 OTUs using PrimerMiner. (a) Primer set performance is shown for each order using pie charts, with green and red colors representing success and failure, respectively, of in silico amplification. Success of amplification corresponded to PrimerMiner mean penalty score < 120 and amplification failure to a mean penalty score ≥ 120. (b) Boxplots of the median of PrimerMiner mean penalty scores over all arthropod orders and for each primer set, with mean values represented by a circle within boxplots. (c) Percentage of degeneracy level of each primer set TA B L E 2 Effect of the primer set characteristics on the in silico amplification success and mean penalty score, the detection of mock community taxa, and the detection of arthropod occurrences in guano samples

| In silico data
We tested the effect of the level of primer degeneracy on mean penalty score and on theoretical amplification success. We used a Poisson and a binomial generalized linear mixed model (GLMM), respectively, with the primer set and the order as random effects.

| Mock community data
We tested the effect of amplicon length, level of primer degeneracy, and percentage of bat reads on the percentage of arthropod taxa detected using a binomial generalized linear model (GLM).

| Guano sample data
We tested the effect of amplicon length, level of primer degeneracy, total number of reads, and percentage of bat reads on the number of arthropod occurrences using a Poisson GLM. We did not work on the number of OTUs but rather on taxa occurrence. This was preferred as it enabled us to take into account the frequency of detection of a particular taxon for each primer set, and hence to minimize the importance of rare taxa (i.e., those detected in a single sample). As such, we minimized artificial inflations of OTUs diversity.
Note that we did not include the 16S primer sets in the in vivo statistical analyses (mock communities and guano samples) because of confounding factors (e.g., smaller size of the 16S reference database compared to the COI database).

| Defining a strategy to determine the best primer set(s) for the study
We developed a multicriteria table that included criteria for each step of primer set evaluation (in silico, mock community, and guano sample analyses). We provided a score for each criterion with regard to the objectives and constraints of our future metabarcoding studies.

| In silico evaluation
The in silico evaluation was performed on 21 arthropod orders ( Figure 2). We recovered almost fifty times more sequences for the COI (4,259,845 sequences) than for the 16S gene (83,651 sequences) using the BOLD and NCBI databases. Two orders, Dermaptera and Julida, and four orders, Archaeognatha, Dermaptera, Julida, and Mecoptera, were excluded from the COI and 16S analyses, respectively, using PrimerMiner due to their insufficient number of OTUs (<100).
The mean penalty scores and the theoretical amplification success varied strongly between primer sets and between arthropod orders ( Figure 2). We found a significant negative effect of the level of primer degeneracy on the mean penalty score (GLMM, p = 1.01e−12) and a positive influence on the theoretical amplification success (GLMM, p = 2.88e−09) ( Table 2). The highest mean penalty scores (>165) and lowest amplification success rates (mean < 50%) were observed for the primer sets exhibiting a lack of degeneracy (e.g., mlHCO, Lep1, Zeale, and Epp). For example, the mean penalty score of Epp primers was about 12 times higher than the score of its degenerate version Epp-degen.

| Detection of arthropod taxa
The percentage of arthropod taxa detected in MCs varied from 67% to 100% (see Figure 4 and Figure S1). In the mock commu- Table 2) but no effect of the amplicon length (GLM, p = .681; Table 2). The influence of the degeneracy level is also illustrated by the two 16S primer sets: Epp misamplified or did not amplify five (MC arthr ) and six taxa (MC arthr+bat ), while its degenerated version Epp-degen misamplified or did not amplify only two taxa of the mock communities.
In the mock community containing bat DNA (MC arthr+bat ), four primer sets, MG2, MG2ANML-degen, fwh2, and fwhFol, amplified all taxa in triplicate ( Figure S1). The primer set fwh1 performed with reduced efficiency as it did not amplify Hemiptera-1 at all. The two primer sets Zeale and Lep1 were again the least efficient with eight and 13 taxa that were not amplified or misamplified, respectively. Our results revealed a positive effect of the level of COI primer degeneracy on the percentage of taxa detected in MC arthr+bat (GLM, p = .001; Table 2). We found no effect of amplicon length (GLM, p = .781; Table 2) or of the percentage of bat reads (GLM, p = .668; Table 2).
In both MCs, the primer sets that detected the smallest number of taxa (Zeale,Lep1,mlHCO,and Epp) also exhibited the largest variation of read numbers between arthropod taxa and between the observed and expected number of reads ( Figure S1).

| Taxonomic identification of arthropod taxa
All COI primer sets led to identical identifications of arthropod taxa and exhibited only slight differences in taxonomic resolution. In con-

| PCR verification and sequencing results
DNA amplification was relatively homogeneous between primer sets and pellet samples, except for Hex and Lep1. These both had

| Detection of bat taxa
Two bat species were identified: Rhinolophus ferrumequinum and ( Figure 5). However, the percentage of bat reads in R. ferrumequinum pellets was always lower than in M. emarginatus pellets, whichever primer set was considered ( Figure 5). Finally, bat reads outnumbered arthropod reads for eight primer sets out of 12 in pellet samples from M. emarginatus, and for the 16S Epp-degen primer set (Eppdegen) in R. ferrumequinum pellet samples.

| Detection of arthropod taxa
We found a negative effect of the amplicon length (p = .030) and a positive effect of the number of reads (p = .032) on the number of occurrences detected (Table 2). However, the number of reads was not significant (p = .397) when the primer set with the lowest number of reads (Hex) was excluded. The degeneracy level of the primers and the percentage of bat reads had no effect on the number of occurrences detected (p = .140 and p = .111, respectively).
Taken together, our results showed that the 12 primer sets allowed  fwhFol, and mlHCO (Table 3). Considering guano results, Zeale and fwh1 had the best scores. However, Zeale did not amplify bat DNA, which is critical when studying environmental DNA from feces that can potentially originate from different predator species. Overall, the best primer set for our study was therefore fwh1 (Table 4).

| D ISCUSS I ON
During the most recent decade, eDNA metabarcoding has proven to be a promising approach for characterizing biodiversity in a broad array of contexts (Bohmann et al., 2014).

| The need for primers that identify predator species and discard contaminated samples
The In the particular case of insectivorous bats, the identification of bat species from guano is critical when guano is collected in roosts, especially when bat colonies are known to be mixed. It is potentially less important when guano is retrieved from trapped bats, or from monospecific bat colonies. However, even colonies supposed to be monospecific can be shared by cryptic bat species (Filippi-Codaccioni et al., 2018) or other insectivorous species (e.g., birds).
Furthermore, we have shown that some of the guano sampled in the roosts of mixed colonies (R. ferrumequinum/M. emarginatus) was contaminated with excreta belonging to other bat species, including M. myotis, for which a few individuals were also known to be present in the studied colony. Hence, future diet analyses should allow for the simultaneous identification of bat species and their prey, to reveal the presence of unexpected species in the roosts studied, and to discard guano that would be contaminated by the DNA of multiple predators. It is thus particularly important to ensure that primer sets are able to identify all bat species potentially present in the roosts.
Here, we have shown that some primer sets did not amplify bat species at all and that others could provide bat identification only for some of the bat species of interest (one out of three). We performed a posteriori in silico analyses to test the efficiency of our primer sets on four bat families (Molossidae, Mormoopidae, Rhinolophidae, and Vespertilionidae). Using this larger spectrum of Chiropteran families and species, our results tended to confirm the contrasting pattern of bat amplification observed between our primer sets, albeit with low power due to a low numbers of OTUs (see Appendix S1).
The simultaneous identification of predator and prey is often avoided in diet analyses as an elevated predator amplification is F I G U R E 6 Occurrences of arthropod prey taxa detected in guano samples for each primer set and for four combinations of primer sets. Dashed lines separate the primer sets from the combinations of primer sets. N is the number of guano samples analyzed for each bat species.
(a) Occurrences of arthropod orders. Black lines represent 50% of the occurrences (55 occurrences for Myotis emarginatus samples and 48 for Rhinolophus ferrumequinum samples). (b) Comparison of the number of occurrences of arthropod orders considering the reliability of the technical PCR replicates (dark green = occurrences validated in three PCRs out of three; light green = occurrences validated in two PCRs out of three) for each primer set on the one hand and for two to four primer sets in combination on the other hand. The latter include the two 16S primer sets and the COI primer sets that provided the best results in terms of occurrence of arthropod orders. Dashed lines separate the primer sets from the combinations of primer sets. Numbers correspond to the number of reads gathered for each class of occurrence validation (three PCRs out of three vs two PCRs out of three)

F I G U R E 7
Reliability of the occurrences of prey taxa in Myotis emarginatus samples (N = 11) and Rhinolophus ferrumequinum (N = 10). The occurrences are grouped by bat species and ordered by level of reliability according to the repeatability between primer sets and PCR replicates: (a) not shared between the 12 primer sets (i.e., observed with only 1 of the 12 primer sets), (b) shared by at least one COI primer set and one 16S primer set, or (c) shared by at least two COI primer sets. Number of positive PCR replicates: "2" (light green) = occurrence in two PCRs out of three; and "3" (green) = occurrence in three PCRs out of three Note: Scores range from the less efficient ("1" represented in red) to the most efficient ("3" represented in green).
expected to dampen that of prey Vestheim & Jarman, 2008). Other approaches used to be applied to ensure that the samples belong to the relevant species, including a separate diagnostic PCR that is specific to the expected species, and/or Sanger sequencing that only reveals the major DNA sequence of the sample (Bohmann et al., 2011;Forin-Wiart et al., 2018). These alternatives add a supplementary step to the metabarcoding process, thereby increasing the time and financial cost of the analyses. Most importantly, they do not reveal the presence of nontargeted predator species, or the rate of between-species contamination of samples. As such, they do not facilitate the rejection of contaminated samples. In this study, we have shown that the simultaneous predator amplification by metabarcoding does not necessarily lead to a drop in prey detection. Our results revealed that there is no effect

| Choosing between the "16S + COI" and "COIalone" strategies
In this study, we compared primers designed from two mitochondrial genes that are frequently used in eDNA metabarcoding studies of animals (Deagle et al., 2014): the COI and 16S genes. Our results showed that the 16S primer sets were always less efficient and had lower levels of taxonomic resolution than most of the COI primer sets tested. This was surprising, as, for example, the 16S Epp-degen was identified as one of the best primer sets from the in silico analysis. Moreover we expected a high amplification success for this primer set due to its short amplicon length and high level of degeneration. In consequence, it is likely that the poor performance of the 16S primers revealed in the guano analyses is due to the paucity of reference sequences in the 16S database; an issue which has been emphasized in previous studies (Clarke et al., 2014;. This difference in diversity between COI and 16S reference databases could explain the differences in the number of taxa which could be assigned to genera or species. Lack of reference sequences could also bias re-

| How many COI primer sets should be used?
To counter the negative effects of primer biases, two main strate- which are based on a low number of reads when combining several primer sets (see light green bar in Figure 6). It would thus be interesting to experimentally test the hypothesis of the effect of the number of PCR replicates on prey detection. Alternatively, our results could advocate for the use of multiple primers. Deciphering between the "one-locus" versus "multilocus" strategies should hence rely on the trade-off between the completeness of the results gathered on the one hand and the costs of combining several primers on the other.
However, a mean of 23.8% (R. ferrumequinum) and 31.6% (M. emarginatus) of these occurrences were characterized by the amplification of only two replicates out of three and a small number of reads (except Zeale > 4,000 reads in M. emarginatus samples). These particular occurrences, which were mostly amplified by a single primer set out of twelve, might be the reason why the plateau of taxa occurrence could not be reached with the combination of a reasonable number of primer sets. These less repeatable occurrences may not concern specific taxonomic groups with particular primer sets but rather could be due to the weak biomass of prey (low DNA quantity), traces of old meals (very degraded DNA, as observed with the Hemiptera-1 of our mock community), traces of secondary predation (e.g., meals of Araneae, the more frequent order in the M. emarginatus diet), or environmental contaminations potentially from other bat or insectivorous species. 79.3% (M. emarginatus) and 86.7% (R. ferrumequinum) of the occurrences that were revealed by a single primer set, and that could therefore be interesting to recover by combining several primer sets, were less repeatable (i.e., revealed by a single primer set + only two positive PCRs out of three for these taxa + small number of reads). These results bring new insights into the real benefit of attempting to recover all taxonomic occurrences, in the case where one-third of them were less repeatable with regard to PCR replication, and not replicable between primer sets.
Therefore, the use of a single primer set following the characteristics described below may well be sufficient.

| What is the best COI primer set for characterizing insectivorous diets?
Previous studies have shown that in silico and in vivo tests were complementary and critical for assessing the performance of primers Corse et al., 2019;.
Here, the combination of in silico, mock community, and guano analyses enabled us to reveal the strengths and weaknesses of 12 primer sets for a large spectrum of taxa. DNA quality did not influence in silico and mock community analyses, so that amplicon length had no significant effect on the number of arthropod taxa detected, while the degeneracy level of the primers had a major positive effect on it as it minimized mismatches between primers and sequences from diverse taxonomic assemblages .
However, amplicon length was the most important factor influencing the success of arthropod detection when analyzing DNA from guano samples. Amplicons that were too long (>313 bp), such as Hex and fwhFol, were less efficient. This result could be explained by the low proportion of large size DNA fragments in degraded samples. Therefore, our results strongly support the use of short-length amplicons and degenerate primers to maximize biodiversity coverage in metabarcoding analyses Galan et al., 2018;Vamos et al., 2017).
In previous studies, the Zeale primer set seemed well suited for detecting Lepidoptera and Diptera because of its high taxonomic coverage for these orders (Clarke et al., 2014;Zeale et al., 2011).
As the diet of European bats is mainly dominated by Lepidoptera and Diptera (Alberdi, Razgour, et al., 2019), the Zeale primer set remains one of the most used in insectivorous bat diet studies (e.g., Aizpurua et al., 2018;Andriollo, Gillet, Michaux, & Ruedi, 2019;Clare, Symondson, & Fenton, 2014;Vesterinen, Puisto, Blomberg, & Lilley, 2018). In our study, the Zeale primer set  showed important amplification failures during in silico and mock community analyses, but not in guano analyses. Low taxon recovery of the Zeale primer set had previously been observed for terrestrial arthropods (Brandon-Mong et al., 2015;Clarke et al., 2014;Esnaola et al., 2018), especially when facing a large spectrum of arthropods, as shown for the insectivorous Pyrenean Desman (Esnaola et al., 2018) or for some insectivorous bat species  Here, we demonstrated that fwh1 was the best primer set identified to simultaneously characterize the diet of R. ferrumequinum and M. emarginatus from guano samples collected under mixed roosts (Vamos et al., 2017) (Table 4). However, we have also shown that the efficiency of the primer sets, in terms of arthropod detection, may vary between the two bat species studied here. We therefore recommend that a few primer sets should be tested on a representative subset of eDNA samples before undergoing large-scale metabarcoding studies of diet for the first time. Such preliminary analyses should help to determine the most suitable primer set with regard to the sampling scheme and the biological model targeted.

| CON CLUS ION
Our study confirms the importance of combining in silico, mock community, and field sample analyses to determine the benefits and the limits of potential primer sets before conducting research based on metabarcoding. Here, this three-step assessment of primer performance confirmed that primer success was determined by amplicon length, base degeneracy level, and how complete reference databases are. Our work also emphasized that the identification of the best primer sets for insectivorous diet studies was not only highly dependent on the objective and financial resources of the study, but it also varied depending on the sampling protocols and constraints that might impact DNA quality and make the identification of predator species necessary. Finally, our results emphasized the presence of potential unreliable occurrences of taxa (detected by a single primer set out of twelve + amplification of only 2 PCRs out of three + small number of reads). Instead of combining numerous primer sets to recover these taxa, we suggest instead to increase the number of PCR replicates. In conclusion, we advocate for the use of multicriteria assessments that summarize all the information required to evaluate any primer sets' performance. This analytical framework can easily be adapted to other metabarcoding studies of predator diet.

ACK N OWLED G M ENTS
We

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.      Table D9. Objectives and impact of the pre-process and FROGS pipelines on the number of reads. 'Complete run ' indicates that the run included samples other than those of the two mock communities. ¥ = genuine sequences, † = identity >97% and coverage >90%, ‡ identity <97% and/or coverage <90%. Table D10. Molecular identification of the taxa in the mock communities obtained using the minibarcode sequences of each specimen, respectively for each primer set and the Folmer region. '<97%' indicated a percentage of identity lower than 97%. Figure D1. Consensus sequence alignment of 21 arthropod orders using the PrimerMiner R package. Figure D2. PCR products of the mock communities (left column) and guano samples (right column) on agarose gel.