Novel universal primers for metabarcoding environmental DNA surveys of marine mammals and other marine vertebrates

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2020 The Authors. Environmental DNA published by John Wiley & Sons Ltd 1Department of Environmental and Earth Sciences, University of Milano-Bicocca, Milan, Italy 2MaRHE Center, Magoodhoo Island, Faafu Atoll, Republic of Maldives 3Fondazione Edmund Mach, S. Michele all'Adige (TN), Italy 4School of Biology, University of Leeds, Leeds, UK 5Leeds Institute of Molecular Medicine, St James’s University Hospital, Leeds, UK 6Acquario di Genova, Costa Edutainment SPA, Genoa, Italy 7Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy

The few marine eDNA studies focussing on marine mammals used targeted species-specific assays (Baker, Steel, Nieukirk, & Klinck, 2018;Foote et al., 2012) or used universal fish-specific primers as a proxy to assess the total vertebrate biodiversity (e.g., Andruszkiewicz et al., 2017). All these approaches present at least one drawback when aiming to detect the presence of cetacean and pinniped species within an eDNA sample. Primer sets designed for one group, that is, "fish-specific" primers, might amplify eDNA from other taxa, but unquantified primer mismatch risk reduced detection rates and the introduction of biases in HTS results (e.g., Elbrecht, Hebert, & Steinke, 2018). Where primers are designed for the species of interest, amplicon target size may also be a consideration. The instability and short life span of eDNA molecules (e.g., Thomsen et al., 2012) favors the use of short and informative (hypervariable) DNA regions, such as the mitochondrial 12S, 16S, and cytochrome oxidase I (COI) genes, with amplicons of typically around 100-200 bp although larger mtDNA fragments have been shown to be successfully amplified from eDNA samples (Deiner et al., 2017).
There is therefore a need to develop marine mammal-specific primers suitable for metabarcoding analysis from marine eDNA samples. Mitochondrial 12S and 16S regions provide suitable targets since their sequence variation provides good taxonomic resolution for macro-eukaryotes, while also maintaining conserved sites across regions for siting primers (Deagle, Jarman, Coissac, Pompanon, & Taberlet, 2014). The design criteria for such primers should be a) primer sites which are conserved among marine mammal groups, while amplifying hypervariable DNA fragments for taxonomic resolution; b) where possible, identify marine mammal-specific priming sites in order to reduce cross-amplification with human DNA, thus lessening contamination risks from investigators, swimmers, or other biological residues left by humans; and c) for each primer set, evaluate predicted binding efficiency and amplicon sequence diversity for other marine vertebrates such as fish and sea turtles (when necessary allowing for a single degenerate base per primer). This final point would give a more accurate understanding of primer specificity and suitability for use with other vertebrate groups. Primer sets have been proved to have reliable affinity across multiple vertebrate orders and could support more efficient and cost-effective eDNA biodiversity surveys.
In this paper, we report on the development of novel "universal" marine vertebrate eDNA primers meeting the above criteria, their "in silico" validation against a large marine vertebrate NCBI-GenBank dataset, and their successful initial application and validation with a HTS analysis of environmental water samples collected from a public marine aquarium. Finally, we test for inter-and intraspecific variation over large mitogenomic datasets available for marine mammal species, presenting ready-to-use guidelines for the accurate selection of primer set of choice in specific marine mammal studies.

| Initial design of primer sets
Seventy-one complete mitochondrial genome sequences, representative of most marine vertebrate groups (fish, sea turtles, birds, and marine mammals), were retrieved from GenBank and used for initial primer development (Appendix S1). The selected sequences represented 30 marine vertebrate families, including most marine mammal families (all three Pinniped families, both Sirenian and 13 Cetacean families). The selection comprised all Cetacean species occurring in the Mediterranean Sea. In addition, four human mitochondrial genomes representative of the four main human haplogroups (i.e., haplotypes 16, 31, 33, and 52 in Ingman, Kaessmann, Pääbo, & Gyllensten, 2000) were included (Appendix S1) in order to design primers with reduced amplification efficiency for human DNA. All sequences (n = 75) were aligned with the online tool Clustal Omega (https ://www.ebi.ac.uk/Tools/ msa/clust alo/) with default parameters, and the complete ribosomal 12S and 16S genes were isolated.
Potential sites for metabarcoding primers were identified by manually searching for suitable locations within alignments. Gene regions were considered suitable for designing metabarcoding primers if they encompassed a short (80-230 bp) highly variable fragment, required for species discrimination, and were flanked by highly conserved sites for situating primers. Where possible (i.e., when enough intramammal variation was found in proximity of the priming sites), we also tried to design, for each candidate locus, alternative primers which minimized the probability of amplifying human targets, by ensuring mismatches between the primers and human templates. Such variants could be preferentially used in studies specifically targeting marine mammals.

| Primer evaluation and validation
Three approaches were used to assess primer performance. Firstly, primers were evaluated in silico in two steps: (a) predicted primer binding and amplicon sequence diversity were assessed using the ecopcr scripts within the obitools software package (Boyer et al., 2016;Ficetola et al., 2010); and (b) 680 complete marine mammal mitogenome sequences deposited in GenBank were used to quantify sequence diversity for the primer target regions within marine mammal Families, Genera, and species, to provide recommendations on taxonomic resolution utility of primer sets for specific taxa.
Secondly, the performance of the primers was evaluated in vitro using tissue-derived DNA extracts with varying levels of degradation. Finally, eDNA samples, obtained from tanks of known species composition at the Aquarium of Genoa (Italy) as a proxy for "realworld" environmental samples, were used to assess the metabarcoding performance of the primers.

| In silico primer evaluation
An in silico approach was used to assess the universality of the newly designed primers against all standard nucleotide sequences in the NCBI-GenBank data repository (accessed April 2019) for three taxonomic groups: (a) vertebrates (excluding cetaceans), (b) cetaceans only, and (c) invertebrates. As a performance benchmark, the newly designed primers were compared against 12S-V5 (Riaz et al., 2011), one of the most commonly used metabarcoding primers targeting vertebrates.
The ecopcr script was used to simulate an in silico PCR and extract the amplifiable barcoding regions for each primer pair while allowing for a maximum of three base-pair (bp) mismatches between the primers and template DNA. Barcode regions shorter than 50 bp and longer than 400 bp were not considered. Subsequently, the obigrep command was used to extract sequences that were reliably assigned to a species-level taxonomy. Ambiguous species-level identifications (i.e., sequences with "sp." and "aff.," in the definition of the sequence) and nuclear pseudogenes were excluded from the final sequence database. The obiuniq command was then used to remove duplicate records for each species. Sequences were classified according to their higher taxonomy (i.e., vertebrates [excluding Cetacean species], cetaceans, and invertebrates) before summarizing the data into a tabular format for further analyses using R version 3.5.2 (R Development Core Team 2010). Finally, the taxonomic resolution of the different primers was assessed by splitting the data into their higher taxonomy and running the ecotaxspecificity script with three different thresholds for barcode similarity (i.e., sequences were considered different if they have 1, 3, or 5 bp differences).
The data obtained from the in silico analyses were imported into R, and the packages tidyverse (Wickham, 2016) and gridExtra (Auguie, 2016) were used to construct summary figures to evaluate the taxonomic coverage, the specificity, and taxonomic resolution for each primer pair.
Finally, we downloaded 680 complete marine mammal mitogenomes from GenBank and evaluated levels of polymorphism within Families, Genera, and species at the two proposed loci. Complete 12S and 16S genes were extracted from the retrieved sequences and aligned for each type of taxonomic comparison, and the number of variable sites recorded within the two loci amplicons was reported.

| In vitro primer evaluation
Tissue-derived DNA extracts were used to assess the performance of the newly designed primer pairs in vitro and optimize amplification conditions. DNA extracts of diverse marine vertebrate groups (cetaceans, pinnipeds, sea turtles, and fish) were used as templates for PCR amplification (see Table 4). The 13 DNA templates were purposely selected to have different levels of degradation, being extracted with different techniques and spanning 1-31 years since extraction, in order to evaluate the ability of the primer to amplify low-quality DNA. High-quality DNA extracts were obtained from fresh samples (i.e., muscle, skin, or blood) using the Qiagen DNeasy Blood & Tissue extraction kit following the manufacturer's protocols.
Low-quality DNA extracts consisted of phenol-chloroform-extracted DNA which was over 25 years old or DNA extracts obtained from boiling tissue samples in a buffer solution (Valsecchi, 1998). For each primer pair and each of the DNA extracts a (single, duplicate, triplicate), PCR was performed in reaction mixes consisting of 0.025 u/μl of GoTaq G2 Flexi DNA polymerase (Promega); 1X Green GoTaq Flexi Reaction Buffer (Promega); 1.25-2 mM MgCl 2 (Promega); 0.2 mM dNTPs (Promega); and 0.25-0.75 μM of each primer and dH 2 O to reach a final volume of 20 μl. Thermal cycling conditions followed a touch-down PCR protocol with annealing temperatures depending on primer pairs: 10/10/18 cycles at 54/55/56°C for MarVer1 and 8/10/10/10 cycles at 54/55/56/57°C for MarVer3. After an initial denaturation step of 4 min at 94°C, each of the 38 cycles consisted of a 30 s at 95°C, 30 s at the primer specific annealing temperatures as described above, followed by 40 s at 72°C. The final extension consisted of 5 min at 72°C. To confirm the amplification of the desired amplicon, PCR products were visually assessed using gel electrophoresis and Sanger sequenced (GENEWIZ, UK; data not shown).

| Evaluation of primer performance with environmental samples
Environmental DNA (eDNA) samples derived from water collected from six tanks of the Aquarium of Genoa, Italy, in June 2018, were also employed to further validate the performance of the two primer sets.
The tanks contained from 1 to 14 known vertebrate species and were named after their main hosted species or typology: (a) "Manatee," (b) "Dolphin," (c) "Shark," (d) "Seal," (e) "Penguin," and (f) "Rocky shore"-a multispecies tank hosting fish and invertebrates typical of Mediterranean rocky shores. Two tanks (Dolphin and Seal) were single species, the Penguin included two penguin species (Spheniscus demersus and Pygoscelis papua), and the Manatee included two teleost species beside the manatees, while the Shark and the Rocky shore tanks included a combination of cartilaginous and bony fish.
For each tank, a total of 13.5 L of water was collected from the water surface using a sterile graduated 2,000 ml glass cylinder, while wearing sterile gloves, and stored within a 15-L sterile container in order to homogenize the water sample and avoid stochastic variability due to sampling of small volumes. For each tank, 3 × 1.5 L and 3 × 3 L replicates (total six replicates per tank) were then aliquoted from the larger sample. To capture eDNA, immediately after aliquoting, each of the six replicates was filtered using individual 0.45 μm pore size nitrocellulose filters, using a BioSart ® 100 filtration system (Sartorius). After filtering, membranes were placed on ice for transport to the University of Milano-Bicocca and subsequently stored at −20°C. Two weeks later, eDNA was extracted from the filter membranes using a DNeasy PowerSoil Kit ® (Qiagen), following the manufacturer's protocol.
For each of the two novel primer sets, PCR performance with the eDNA extractions was initially evaluated using the same PCR conditions as the "in vitro" validation. After confirming amplification of amplicons in the expected size range with eDNA templates, indexed forward and reverse sequencing primers were created for each primer set, comprising (5′-3′): an 8 bp Illumina barcode tag-4 random nucleotides-amplification primer sequence, and sourced from Sigma, UK. Eight forward primer indexes were combined with 12 reverse primer indexes, to allow pooling of amplicons from up to 96 uniquely identifiable samples within a single sequencing library (Taberlet, Bonin, Zinger, & Coissac, 2018). Trial amplifications with the Illumina barcode-tagged primers suggested their yield, and specificity was unchanged, and so the previously optimized PCR conditions were used to generate amplicons for MiSeq sequencing.

| Bioinformatics for environmental HTS data
Paired reads were first screened for the presence of the expected primer and index sequence combinations to exclude off-target amplicons. The reads were then combined to generate the insert sequence, and the sequence of the random nucleotide region was noted, such that only one instance of an insert per sample with the sample random nucleotide fingerprint was saved to a sample-specific file (i.e., to avoid PCR duplicates and chimeric sequences). The insert data were aggregated to create a count matrix containing the occurrence of each unique sequence in each sample. The taxonomic origin of each insert was determined by blasting their sequence against a local instance of the GenBank NT database (Nucleotide [Internet]).
The level of homology of insert to the hit sequence was noted, as was the species name of the hit sequence. The taxonomic hierarchy for each unique insert was generated by searching a local instance of the ITIS database (ITIS [Internet]) with the annotated GenBank species name. The count matrix and taxonomic hierarchy for all annotated unique sequences were then aggregated into values for equivalent molecular operational taxonomic units (MOTUs), by combining all inserts with a set homology (≥98%) to the GenBank hit at a specified taxonomic level (i.e., "order," "family," "genus," or "species"), using bespoke software (available on request). Summaries and visualizations of read counts for different taxonomic levels were generated using the R package "Phyloseq" (McMurdie & Holmes, 2013).

| Description of primer sets
From the initial evaluation of aligned marine vertebrate mitochondrial sequences, three hypervariable regions flanked by conserved motifs were identified, two within the 12S gene, which we term MarVer1 and MarVer2, and one in the 16S gene, which we term MarVer3. PCR primer pairs were designed for all three MarVer loci.
Here we focus on MarVer1 and MarVer3 (Table 1)
It partially overlaps with loci Tele02 (Taberlet et al., 2018), Tele03 (as named by Taberlet et al., 2018, corresponding to MiFish-U in Miya et al., 2015, and Elas01 (as named by Taberlet et al., 2018, corresponding to MiFish-E in Miya et al., 2015, targeting bony and cartilaginous fishes (Figure 1). The forward primer MarVer1F differs from the forward primers of previously described loci, in that by shifting 5-12 bp at the 5′ end, it skips variable sites distinguishing bony from cartilaginous fishes, while gaining, at the 3′ end, bases which are conserved across all surveyed marine vertebrates.

| MarVer3
MarVer3 (abbr. MV3) amplifies a variable region of the 16S gene, producing amplicons ranging in size from 232 to 274 bp in the 71 marine vertebrate taxa tested (Table 2). MarVer3 is partially covered by locus Mamm02 (Taberlet et al., 2018, see Figure 1), but targets a fragment twice as long: for example, in Odontocetes, MarVer3 amplifies a 233 bp fragment versus a 115 bp amplicon for Mamm02.
The reverse primer (MarVer3R) was the only one of the presented oligonucleotides to be truly "universal," as it was found to be fully conserved across all tested marine vertebrates ( Table 2)

| In silico primer evaluation
For MarVer1 and MarVer3 primer pairs and higher taxonomic groups, the total number of unique taxa for which the in silico amplification recovered target sequences is given in Figure 2. The results show that the MarVer3 primer pair amplified DNA from the most vertebrate and cetacean taxa. However, allowing for up to three mismatches per priming site, this primer pair also successfully amplifies the DNA of invertebrate species thus indicating that it has a low overall specificity to the intended taxonomic targets (Figure 2). The MarVer1 primer pair amplified DNA from slightly fewer target taxa than the commonly used 12S-V5 primers.
The proportion of sequences amplified for each higher taxonomic group is a function of the bp mismatches between the primers, and template DNA is shown in   (Table 3).

Species level
Genetic variability was investigated also below the nominal species level. This could be tested only on the few species for which large enough sample sizes were available from GenBank to evaluate intraspecific variation. We assessed 14 marine mammal species for which mitogenomic data were available for a number of individuals ranging from 2 (Megaptera novaeangliae and Dugong TA B L E 2 Characteristics of the amplicons produced by each of the two MarVer primer sets for the 71 marine vertebrate species used for primers' design. The overall amplicon size is the mean value of the amplicon sizes recorded in the eight marine vertebrate groups analyzed

F I G U R E 2
The total number of unique taxa (y-axis) recovered by the different primer pairs (x-axis). The results are shown for all three higher taxonomic groups (i.e., vertebrates-excluding cetacean species, cetaceans, and invertebrates) considered during the analyses

F I G U R E 3
The proportion of sequences amplified for each higher taxonomic group as a function of the bp mismatches between the template DNA and the forward (y-axis) and reverse (x-axis) primers. The size of the pie charts is proportional (on a log scale) to the total number of sequences recovered for a given number of bp mismatches between the forward and reverse primer F I G U R E 4 The percentage of sequences correctly identified (y-axis) to the family-, genus-and species-level taxonomy (x-axis) for the different primer pairs. Results are shown for all sequences belonging to the cetaceans and vertebrates (horizontal panels) and using different threshold values for barcode similarity (i.e., sequences were considered different if they have 1, 3, or 5 bp differences) (vertical panels)

| In vitro primer evaluation
PCR amplicons of the expected size were generated by the two primer sets in all the 13 DNA extracts (Table 4). Sanger sequencing test performed on PCR products confirmed the amplification of both the correct 12S/16S fragments targeted by MarVer1 and MarVer3 loci and the correct species from which the tissue DNA samples originated.

| Application to environmental samples
Amplicons of the expected size were obtained also from all 36 water samples collected at the Genoa Aquarium with the two primer sets  Table 5, Figure 5, Appendix S5).
Amplicons for the aquarium's four frequently used vertebrate feed species (Culpea harengus, Mallotous villosus, Merluccius productus, and Scomber scombrus) were detected in Tanks 2 to 6 (in the Manatee tank, feed consists of lettuce) using both loci, with the exception of MarVer1 failing to detect Merluccius productus. Squid (unspecified species) is also supplied to the Dolphin Tank, but no cephalopod amplicons was recovered.
Amplicons for resident species were also detected in tanks other than their host tanks at low levels (e.g., with MarVer3, dolphin and seal amplicons were detected in the manatee and shark, and penguin and rocky shore tanks, respectively), suggesting possible transfer of eDNA between tanks in the aquarium, for example, via the equipment used by staff members. Similarly, human amplicons were detected in all tanks, consistent with the practice of aquarium staff entering the water for maintenance.
Both MarVer1 and MarVer3 identified amplicons (partially overlapping between loci and tanks) that were not directly attributable to resident species or food sources (category B in Appendix S5). These comprised six recurrent species, two of which were previously (but no longer) used as feed (Sardina pilchardus, Engraulis encrasicolus), and four species present in the Ligurian Sea (e.g., Auxis rochei, Auxis thazard, Belone belone, and Coris julis) from which the water used to fill the tanks is drawn, after being filtered and UV irradiated. All of these unexpected species detections were at low abundance, with read counts greater than 100 to a maximum of 947 in at least 1 tank (range 0.3% to 3.7% of total reads with MarVer3 and MarVer1, respectively). Very low abundance amplicons (<100 reads per tanks) for at least 20 other Mediterranean resident teleost fish species were also observed, but were not considered further as definitive detections.
Amplicons from invertebrate species (category C in Appendix S5) were detected in two tanks by the MarVer3 locus at low abundances (read count < 100), which would normally be discounted as a detection. These cases refer to an Anthozoa species, Seriatopora hystrix, in the manatee tank, and a Sipunculid worm, of the family Phascolosomatidae, in the seal tank (Appendix S5). Neither taxa were in the tank in which their traces were found. No invertebrate amplicons were recovered in the "rocky shore" tank which contains some Anthozoa (e.g., Anemonia viridis), some unidentified sponges growing spontaneously and hydrozoans, or from the shark tank where Aiptasia spp grows spontaneously. The other tanks (with the exception of dolphin and manatee) may also contain other spontaneously growing small invertebrates, such as copepods, amphipods, and hydrozoans, but none were detected.

| Comparison with previously described barcode primer sets
This study was conceived to identify cetacean specific barcode loci complementing the many primer sets already available for The 12S-V5 primer set (Riaz et al., 2011, renamed Vert01 by Taberlet et al., 2018) is the only one previously described as being specific for vertebrates. It is located adjacent to MarVer1 site (forward 12S-V5 primer partially overlaps with reverse MarVer1 sequence, see Figure 1). Within our alignments, the 12S-V5 site was not as variable as any of the three loci candidates identified in this paper.
TA B L E 3 Levels of polymorphism found at the two proposed loci in a series of marine mammal Families, Genera and species for which complete mitogenomic data were available from GenBank (accession numbers listed in Appendix S4, with the exception of 22 Pusa capsica entries for which data are unpublished). Numbers in the two columns on the right indicate the number of variable sites detected in the sequence targeted by the two primers sets in each given comparison:  For example, looking at one of the most variable Families considered in this study, the Ziphiidae (Odontocetes), our two loci MarVer1 and MarVer3 recovered 37 and 43 variable sites, respectively (Table 3), while 12S-V5 highlighted only 13 within the same pool of sequences.
This was not attributable to differences in amplicon size since the MarVer2 candidate region includes more than twice (30) the number of variable sites found in the 12S-V5 target sequence (13) within the Zophiidae family, while having a similar size (respectively, 96 bp and 106 bp). Such performance differences were also highlighted in the in silico simulation (Figure 4). Moreover, within both forward and reverse 12S-V5 primer sites, polymorphisms were present within the Ziphiidae family, suggesting that the 12S-V5 primers may not be conserved across all vertebrate classes. However, 12S-V5 performed slightly better than MarVer1 based on the total number of unique taxa for which DNA was amplified in the in silico simulation ( Figure 2). While the lower predicted number of taxa recovered by these primers may indicate a failure to amplify DNA from some species, the completeness of the reference database used will also influence the results given that all sequence records were considered and not only the full mitochondrial genomes. Within the partial 12S sequences deposited on GenBank, the fragment including locus 12S-V5 might be over-represented, as most previous studies relied on this marker.
Besides being highly conserved among vertebrates, the two MarVer primer sets were shown to be potentially nonexclusive to vertebrates when 2-4 bp primer/template mismatches were allowed

| Performance of MarVer1 and MarVer3 primer sets with environmental samples
We evaluated the performance of MarVer1 and MarVer3 primer sets with water samples collected at the Genoa aquarium, from tanks with known community compositions. Amplicons annotated to species level were recovered for 81.5% and 37% of the 27 resident taxa for MarVer3 and MarVer1, respectively. For MarVer3, the five "undetected" species included Diplodus cervinus (Zebra seabream), Pterygoplichthys gibbiceps (Armored catfish), two ray species (Dasiatis americana and Teaniura grabata), and Pristis zjisron (Longcomb sawfish). In the case of D. cervinus, amplicons assigned to other nonresident Diplodus species were observed, so eDNA from the species may have been present, but annotated as a congeneric. For the four other "undetected" species, there were no other incompletely (above species level) assigned reads at genus, family, or order level which could be attributed to these taxa (for Pristis zjisron, no matching reference sequence for the MarVer3 region was available in GenBank). These four cases therefore appear to be genuine nondetections. These are all bottom-dwelling species, whereas our water samples were collected at the surface, and therefore, potentially we did not capture eDNA from these species.
For MarVer1, 9 out of 13 resident species with GenBank reference sequences covering the MarVer1 region were detected successfully. Amplicons correctly assigned to the species level were not observed for the two penguin species, Blackspot seabream (Pagellus TA B L E 4 List of tissue DNA extracts used for wet lab primer tests. "PC" indicates phenol-chloroform extracts, and "CK" indicates commercial kit extracts, while "BE" refers to boil extracts as described in Valsecchi (1998 TA B L E 5 Species composition of the six tanks of the Aquarium of Genoa from which water samples were collected for extracting eDNA. The last four columns on the right show NGS outcomes for primer sets MarVer1 and MarVer3: "GB" columns indicate whether reference sequences are present in GenBank for that specific species and locus; "■" indicate successful species detection (half-full and empty square indicates a number of reads 0.001> nr> 0.005 or nr < 0.001, respectively); △ denotes that a congenereric was identified in place of the resident vertebrate species; and "?" indicates possible detection of the resident species at a higher taxonomic group (this case together with those instances for which reference sequences were available but the species remained undetected are discussed in the text)

| Optimizing locus choice for different eDNA and taxon detection applications
The two loci described in this paper provide investigators with flexible options to target different barcode markers depending on priorities for their study objectives, tailored to requirements for taxonomic breadth, variation and resolution at different taxonomic levels and amplicon size where eDNA degradation is a concern (e.g., Speller et al., 2016).
The 12S-based MarVer1 offers the advantage of smaller amplicon sizes (approximately 202 bp), which may be a consideration for applications requiring work with more degraded eDNA templates (Nichols et al., 2018;Wei, Nakajima, & Tobino, 2018 Table 3. Overall, this pattern is consistent with lower rates of evolution in 16S compared to 12S genes and suggests 12S genes may be more informative for resolving intraspecific differences (see below).
We provide guidelines in Table 3 for the choice of the most suitable marker to be employed where specific marine mammal taxa are of interest, while for metabarcoding studies aiming at the detection of all marine vertebrates, we would recommend using the combination of the two loci to maximize taxonomic coverage and to ameliorate potential gaps in reference sequence databases.
Our initial search for hypervariable regions flanked by con- The preliminary investigation of sequence variation in other large marine vertebrate groups (tuna and sea turtle species) suggested our loci also have potential to be informative for species identification in those taxa. While not assessed directly in this study, the MarVer loci may also prove to be useful as barcode markers for terrestrial vertebrates given taxonomic conservation of the priming sites.
Finally, the high levels of diagnostic variation seen within MarVer loci amplicons offer potential for designing additional species-specific nested internal primers (Stoeckle, Das, & Charlop-Powers, 2018).
These might have utility for species-focused, non-sequencing-based F I G U R E 5 Taxon "abundance" (read counts) for amplicons generated with the MarVer3 primer set for environmental DNA extracted from water samples collected from 6 tanks at the Genoa Aquarium. Read counts are combined across the 6 replicate samples assayed for each tank, and sequence demultiplexing and amplicon annotation against Genbank references were carried out using a threshold of 98% sequence similarity. Taxa presented in the figure were filtered to exclude those with read counts less than 0.005*median read count across tanks. detection applications, such as quantitative PCR (qPCR) or digital droplet PCR (ddPCR), or simple agarose gel-based amplicon visualization when there is limited access to laboratory facilities or funding.

| CON CLUS IONS
This paper presents four novel primer sets targeting 12S and 16S vertebrate mitogenome regions, with a particular focus on marine mammals. Using a combination of "in silico" validation, and application to eDNA samples from aquarium communities with known species composition, we show the loci to have high potential for metabarcoding and eDNA studies targeting marine vertebrates.
These primer sets have broader taxonomic coverage and resolution compared to previously developed 12S and 16S primer sets, potentially allowing surveys of complete marine vertebrate communities (including fish, sea turtles, bird, and mammals) in single HTS runs, simplifying workflows, reducing costs, and increasing accessibility to a wider range of investigators. They may be applied in any context focusing on resolving vertebrate taxonomic identity, from biodiversity surveys and forensics (e.g., CITES surveillance or surveys of commercially targeted fish species), through to behavioral ecology studies and supporting conservation of rare or endangered marine vertebrate species.

ACK N OWLED G M ENTS
We thank Fulvio Maffucci for providing marine turtles' DNA samples, Giudo Gnone of the Aquarium of Genoa for allowing and supporting collection of controlled environmental eDNA samples, Antonia Bruno for advises on filtering procedures of environmental samples, and Anna Sandionigi for bioinformatic advice.

CO N FLI C T O F I NTE R E S T
The authors declare that they have no conflict of interest.

E TH I C A L A PPROVA L
All applicable international, national, and/or institutional guidelines for the care and use of animals were followed by the authors.

SA M PLI N G A N D FI E LD S TU D I E S
All necessary permits for sampling and observational field studies have been obtained by the authors from the competent authorities and are mentioned in the acknowledgements, if applicable.

DATA AVA I L A B I L I T Y S TAT E M E N T
The datasets generated during and/or analyzed during the current study are included in this published article (and its Supporting Information files) and are available from the corresponding author on reasonable request.