Optimization and validation of a cost‐effective protocol for biosurveillance of invasive alien species

Abstract Environmental DNA (eDNA) metabarcoding has revolutionized biodiversity monitoring and invasive pest biosurveillance programs. The introduction of insect pests considered invasive alien species (IAS) into a non‐native range poses a threat to native plant health. The early detection of IAS can allow for prompt actions by regulating authorities, thereby mitigating their impacts. In the present study, we optimized and validated a fast and cost‐effective eDNA metabarcoding protocol for biosurveillance of IAS and characterization of insect and microorganism diversity. Forty‐eight traps were placed, following the CFIA's annual forest insect trapping survey, at four locations in southern Ontario that are high risk for forest IAS. We collected insects and eDNA samples using Lindgren funnel traps that contained a saturated salt (NaCl) solution in the collection jar. Using cytochrome c oxidase I (COI) as a molecular marker, a modified Illumina protocol effectively identified 2,535 Barcode Index Numbers (BINs). BINs were distributed among 57 Orders and 304 Families, with the vast majority being arthropods. Two IAS (Agrilus planipennis and Lymantria dispar) are regulated by the Canadian Food Inspection Agency (CFIA) as plant health pests, are known to occur in the study area, and were identified through eDNA in collected traps. Similarly, using 16S ribosomal RNA and nuclear ribosomal internal transcribed spacer (ITS), five bacterial and three fungal genera, which contain species of regulatory concern across several Canadian jurisdictions, were recovered from all sampling locations. Our study results reaffirm the effectiveness and importance of integrating eDNA metabarcoding as part of identification protocols in biosurveillance programs.


| INTRODUCTION
Metabarcoding has become an effective method for molecular-based biomonitoring programs and biosurveillance for invasive alien species (IAS; Makiola et al., 2020). This molecular technique involves high-throughput sequencing (HTS), which facilitates the identification of multiple species using environmental DNA (eDNA) extracted from complex ecological samples (Creer et al., 2016; Taberlet et al., 2012). Incorporating HTS into biosurveillance programs is advantageous as it can provide results faster than conventional identification methods, thereby allowing for early detection of species of concern, including IAS (Ruppert et al., 2019). The ability to obtain results quickly, the existence of curated DNA reference libraries (Nilsson et al., 2019; Quast et al., 2013; Ratnasingham & Hebert, 2007), and the continued drop in the cost of HTS platforms have contributed to metabarcoding's rise in popularity in environmental monitoring (Cristescu, 2014; de Kerdrel et al., 2020). Furthermore, as DNA reference libraries with associated morphological species identifications continue to grow, the time, effort, and resources that would otherwise be put toward morphological identification of specimens can be saved through the use of molecular identification methods. This is significant since the number of taxonomists available to conduct species identifications is declining. Reference libraries, therefore, provide a permanent repository of traditional taxonomic expertise that can be used with the appropriate molecular identification tools as needed (Cristescu, 2014).
The mitochondrial DNA region coding for the cytochrome c oxidase I (COI) enzyme has been recognized as the primary marker for metabarcoding in the animal kingdom (Hebert et al., 2003). Similarly, the nuclear ribosomal internal transcribed spacer (ITS) and the 16S ribosomal RNA gene have been adopted as the fungal and bacterial markers, respectively (Klindworth et al., 2013; Seifert, 2009).
Multiple reference databases and workbenches for data collection and analysis are currently available for these markers, along with a growing number of sequence records (e.g., as of November 20, 2020, BOLD had 8,099,249 records for COI and UNITE had 2,480,043 for ITS; Nilsson et al., 2019; Quast et al., 2013; Ratnasingham & Hebert, 2007). However, despite the plethora of protocols available for eDNA metabarcoding and the drop in the cost of HTS over the years, there is limited information available on cost-effectiveness (Ji et al., 2013; de Kerdrel et al., 2020). Biodiversity monitoring and biosurveillance programs would benefit from the inclusion of HTS protocols to accelerate the detection of species of concern and reduce costs associated with processing large numbers of individual specimens (Giovani et al., 2020; Piper et al., 2019). On a broader scale, cost-effectiveness is essential given that only a fraction of the global biodiversity (including IAS and other pest species) has been described to date, and several billions of dollars may be needed to complete this enormous endeavor (Carbayo & Marques, 2011).
Pest insects have a negative impact on Canada's forests and are second only to wildfires in their effect ("The State of Canada's Forests. Annual Report 2018.", n.d.). Pest insects that are IAS are considered a byproduct of anthropogenic activities and can provoke significant economic and biodiversity losses (Westphal et al., 2008). For instance, IAS are capable of destroying about 400,000 ha of forest every year in Canada (Government of Canada, 2013). Nonmanufactured wood packaging and loose wood dunnage are high-risk pathways for the introduction of IAS, particularly wood-boring beetles (e.g., Cerambycidae and Buprestidae families), like the Asian long-horned beetle (Anoplophora glabripennis) and the emerald ash borer beetle (Agrilus planipennis) (CFIA, 2017). Current pathway-based biosurveillance programs led by the Canadian Food Inspection Agency (CFIA) include placing traps at sites that are at a high risk for IAS, such as industrial zones receiving international commodities associated with nonmanufactured wood packaging and dunnage. The trapped insects remain in the fluid of collection jars until specimens are decanted and referred to the CFIA Entomology laboratory for morphological identification. As organisms interact with their environment, whether it be the fluid of a collection jar or a plant, they shed DNA into it (Adams et al., 2019; Taberlet et al., 2018). The eDNA extracted from the collection fluid can then be used to identify insects considered IAS, native pests, and accompanying microorganisms using molecular methods such as eDNA metabarcoding (Pawlowski et al., 2020; Taberlet et al., 2018).
Traditionally, plant health trapping survey protocols for insect pest detection have used alcohol-based collection fluids (e.g., ethanol or propylene glycol) to preserve biological material for morphological identification (CFIA, 2017). Recent protocols have replaced alcohol-based fluids with saturated salt solutions because of their advantages over the previous chemistries, including lower cost, lower storage space requirements, low toxicity to humans, fewer regulatory constraints for laboratories and inspectors, nonflammability, and lower evaporation rate (Young et al., 2020). Salt solutions have also proven to be satisfactory for preserving morphological structures of captured specimens (Young et al., 2020) and for the preservation of water samples for eDNA analysis (Williams et al., 2016). Therefore, the standardization and validation of a cost-effective protocol for eDNA metabarcoding analysis using salt trap solutions are of crucial interest. The current study focusses on the optimization and validation of a protocol for eDNA metabarcoding of organisms captured during regulatory plant health monitoring surveys in southern Ontario, Canada. Special attention is given to the wood-boring beetles as well as regulated pathogenic bacteria and fungi inadvertently collected in the traps.

| Collection locations
Lindgren funnel traps were placed at four locations in southern Ontario, Canada, with six sample sites (traps) at each location (Figure 1). The traps represented a subset of those deployed following the CFIA's annual regulatory survey program (CFIA's Plant Health business line), which is aimed at detecting insects introduced through high-risk pathways. Locations were selected based on susceptibility to IAS, accessibility, sufficient area to accommodate six traps spaced approximately 25-30 meters apart, and limited public access to avoid vandalism. Six traps were then distributed at each location near species of trees known to be hosts to the target IAS and that were showing evidence of stress, decline, or damage, indicating the possible presence of IAS. One location was in Halton Hills (GT) and had traps situated near a municipal landfill and railroad track. A second location was in a 241-hectare park of Carolinian forest that is situated along Lake Erie in Chatham-Kent (WP). A third location, Barrie (BA), was within a wooded area close to railway tracks that experience a high volume of import traffic. A fourth and final location was in Woodstock (TO), a city that includes manufacturing facilities that import commodities packaged using wood materials. All four locations are less than 215 km from the US-Canada border and could presumably be exposed to traffic of imported wood materials carrying IAS and other pests.

| Sample filtration
Salt solutions were filtered through a nitrocellulose mixed ester membrane filter (pore size 1 µm, diameter 47 mm, Sterlitech). The filter was mounted onto a magnetic filtration cup (Pall) and secured to a 3-piece manifold connected to a GAST vacuum pump (GAST Manufactured, Inc). All supplies were sterilized with 50% bleach or ELIMINase (Decon Labs) before filtration.
The amount of debris within the salt solutions varied greatly across samples and would sometimes clog the membrane pores. Therefore, most samples required multiple membranes to filter the entire volume, resulting in a total of 110 membranes for the 48 samples. As a negative control, one sample of saturated table salt solution prepared in the laboratory was also filtered. All membranes were stored in new Ziploc bags at −80°C until processing for eDNA extraction.

| eDNA extraction
We used a modified CTAB buffer (Coyne et al., 2005; Dempster et al., 1999) (2% w/v cetyltrimethylammonium bromide, 2% w/v polyvinylpyrrolidone, 1.4 M NaCl, 100 mM Tris-HCl, 20 mM EDTA). This buffer was used due to reported success in retrieving good-quality eDNA yields (Dougherty et al., 2016; Renshaw et al., 2015; Turner et al., 2014). The eDNA extraction steps described below were performed on the 110 membranes and the negative control. To extract the eDNA, we used the Dougherty et al. (2016) protocol with some minor modifications. Each filter paper was allowed to thaw and then cut into quarters using new razor blades. Each quarter was placed into an individual 2-ml microcentrifuge tube containing ~250 mg of 1-mm-diameter glass beads and 500 μl of CTAB buffer prewarmed to 65°C in a heat block. The filter paper quarters were pulped using the TissueLyser II (Qiagen) at a frequency of 30 Hz for one minute and then incubated at 65°C for one hour in a heat block. Following incubation, each tube received 500 μl of 24:1 chloroform-isoamyl alcohol and was briefly vortexed. The aqueous phase containing the eDNA was separated from the chloroform phase by centrifuging the tubes at 13,000 g for 15 min. After two passes with 24:1 chloroform-isoamyl alcohol, the aqueous phase (approximately 500 μl) was transferred to new 2.0-ml Eppendorf tubes and mixed with 500 μl of isopropanol and 200 μl of a 5 M NaCl solution. Tubes were briefly vortexed and stored at −20°C overnight to facilitate precipitation of eDNA. Tubes were then centrifuged at 13,000 g for another 15 min to pellet the eDNA. The supernatant was removed by pipetting, and 200 μl of 70% ethanol was added to wash the pellet. After centrifugation at 13,000 g for 15 min, the ethanol was removed by pipetting and replaced with fresh 70% ethanol. The contents were again briefly vortexed and centrifuged at 13,000 g for 15 min.
The ethanol was removed for a final time, and the tubes were then placed in a fume hood to allow any remaining ethanol to evaporate at room temperature (~1 hr). Once the eDNA pellet was dry, it was resuspended in 25 μl of 1X TE buffer (prewarmed to 70°C) to favor DNA dissolution. eDNA extracts from each quarter belonging to the same trap were pooled. A 5 μl subsample was used to visually assess the presence and quality of eDNA by 1% agarose gel electrophoresis. The concentration of eDNA (ng/μl) was determined by fluorometry (Qubit). All eDNA extracts with concentrations over 5 ng/μl were diluted to 5 ng/μl with 10 mM Tris (pH 8.5). This was done to normalize the amount of starting material per sample and to reduce the effect of potential PCR inhibitors in the samples. Eight samples exhibited signs of inhibition, showing no amplification during the first PCR of library preparation (see below). These samples were treated to remove inhibitors using a 1x NGS magnetic beads (Macherey-Nagel) ratio, following the manufacturer's protocols.
Following quality control steps, the remaining volume of solution containing eDNA extract was placed in 1.5-ml LoBind Eppendorf tubes and stored at −20°C.
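The normalization step above (diluting extracts above 5 ng/μl down to 5 ng/μl) follows the usual C1·V1 = C2·V2 relationship. The sketch below is only an illustration of that arithmetic; the function name and the final volume are hypothetical and do not come from the protocol.

```python
from __future__ import annotations

TARGET_NG_PER_UL = 5.0  # normalization target stated in the text


def dilution_volumes(stock_ng_per_ul: float, final_volume_ul: float) -> tuple[float, float]:
    """Return (extract_ul, diluent_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. Extracts at or below the target are left
    undiluted, matching the rule that only extracts over 5 ng/ul are diluted.
    The final volume is an illustrative assumption, not a protocol value.
    """
    if stock_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("concentration and volume must be positive")
    if stock_ng_per_ul <= TARGET_NG_PER_UL:
        return final_volume_ul, 0.0
    extract_ul = TARGET_NG_PER_UL * final_volume_ul / stock_ng_per_ul
    return extract_ul, final_volume_ul - extract_ul


if __name__ == "__main__":
    # Hypothetical Qubit readings (ng/ul) for three extracts.
    for reading in (20.0, 8.0, 3.2):
        extract, diluent = dilution_volumes(reading, final_volume_ul=20.0)
        print(f"{reading:5.1f} ng/ul: {extract:.2f} ul extract + {diluent:.2f} ul 10 mM Tris")
```

In this sketch the diluent would be the 10 mM Tris (pH 8.5) mentioned above.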

| Library preparation
For COI and ITS library preparation, we adapted the "16S [...] (Beeck et al., 2014). A negative control was included for every batch of samples amplified. All PCR products were visualized on 1% agarose gel to check for proper amplification and fragment size of the amplicons. PCR products were purified using a 0.8x NGS magnetic beads (Macherey-Nagel) ratio following the manufacturer's protocols.

| Second PCR
The purified products resulting from the first PCR were used as the template for the second PCR. As per standard methods for eDNA library preparation, the second PCR was conducted in a separate room and on different PCR workstations than the first PCR. The sequences for the index primers were equivalent to those of the Nextera XT Index Kit (Illumina), but were synthesized de novo by Integrated DNA Technologies (IDT) and prepared at the Advanced Analysis Centre (AAC) at the University of Guelph. This PCR was conducted in 50 μl reaction volumes on an Eppendorf Mastercycler using a unique index primer combination for each sample. Each reaction contained 5 μl of previously cleaned PCR product, 5 μl of each index primer (10 μM [...] agarose gel first and purified using a 0.6x NGS magnetic beads ratio (Macherey-Nagel), following manufacturer's protocols.

| High-throughput sequencing
Sequencing of COI, 16S, and ITS marker regions was performed at the Genomics Facility of AAC at the University of Guelph. For quality control purposes, each sequence library was first normalized using SequalPrep Normalization Kit (Thermo Fisher Scientific), pooled, and quantified with the Qubit dsDNA High Sensitivity assay kit (Thermo Fisher Scientific), and finally checked for fragment size in a Bioanalyzer High Sensitivity DNA Chip (Agilent). After passing quality control, libraries were sequenced on an Illumina MiSeq System using a MiSeq reagent kit, version 3 (600 cycles). Each sample was analyzed based on retaining 1% of the total capacity of the run. Sequencing reads were demultiplexed, and the adapters trimmed with the MiSeq Reporter software generating two paired-end FASTQ raw data files.

| Cost analysis
We conducted a simple cost analysis that compared the cost of processing one sample (from eDNA extraction to MiSeq sequencing) based on the modified protocol described here versus the original "16S Metagenomic Sequencing Library Preparation Protocol" (Illumina) (Table 2).
[...] to BINs in mBRAVE. Essentially, a BIN (Barcode Index Number) is an alphanumeric code that corresponds to a tight cluster of closely related species haplotypes. BINs are a good proxy for actual biological species (Ratnasingham & Hebert, 2013).
[...] ordered by E-value (in ascending order) and then filtered by sequence length (≥200 bp) and percentage of pairwise identity (≥98%).
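The ordering and filtering criteria stated above (ascending E-value, sequence length ≥200 bp, pairwise identity ≥98%) can be illustrated with a minimal sketch. The `Hit` record, its field names, and the example values are hypothetical. The search tool and output format used in the study are not recoverable from the text, so this is not a reconstruction of the authors' pipeline.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 200       # threshold stated in the text
MIN_IDENTITY_PCT = 98.0   # threshold stated in the text


@dataclass(frozen=True)
class Hit:
    """One sequence-similarity hit (hypothetical record layout)."""
    query_id: str
    subject: str
    evalue: float
    length_bp: int
    identity_pct: float


def filter_hits(hits):
    """Order hits by ascending E-value, then keep those passing both thresholds."""
    ordered = sorted(hits, key=lambda h: h.evalue)
    return [
        h for h in ordered
        if h.length_bp >= MIN_LENGTH_BP and h.identity_pct >= MIN_IDENTITY_PCT
    ]


if __name__ == "__main__":
    # Placeholder hits, invented purely for illustration.
    example = [
        Hit("q1", "taxon_A", 1e-50, 250, 99.1),
        Hit("q2", "taxon_B", 1e-80, 310, 98.0),
        Hit("q3", "taxon_C", 1e-90, 180, 100.0),  # too short
        Hit("q4", "taxon_D", 1e-60, 400, 97.9),   # identity too low
    ]
    for hit in filter_hits(example):
        print(hit.query_id, hit.subject, hit.evalue)
```

Sorting before filtering keeps the best-supported hits at the top of the retained list, which is convenient when the list is later inspected by hand.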

| Data analysis
The filtered list was queried against the list of fungal species regulated by Canada. Graphs of fungal diversity were generated for the most species-rich genera (≥10 species) using R (R Core Team, 2019). [...] (Figure 2).
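The two operations described above, checking detected species against a regulated list and selecting genera with ten or more species, amount to a set intersection and a grouped count. The authors worked in R; the Python sketch below is only an illustration, and the function names and placeholder taxa are hypothetical.

```python
from collections import defaultdict

MIN_SPECIES_PER_GENUS = 10  # "species-rich" cutoff stated in the text


def regulated_matches(detected_species, regulated_species):
    """Return the detected species that also appear on the regulated list."""
    return sorted(set(detected_species) & set(regulated_species))


def species_rich_genera(detected_species, min_species=MIN_SPECIES_PER_GENUS):
    """Count distinct species per genus and keep genera at or above the cutoff.

    Assumes binomial names whose first word is the genus.
    """
    by_genus = defaultdict(set)
    for name in detected_species:
        parts = name.split()
        if not parts:
            continue  # skip blank entries
        by_genus[parts[0]].add(name)
    return {g: len(s) for g, s in by_genus.items() if len(s) >= min_species}


if __name__ == "__main__":
    # Placeholder names, not real taxa or study results.
    detected = [f"Alphus sp{i}" for i in range(10)] + ["Betus one", "Gammus regulata"]
    regulated = ["Gammus regulata", "Deltus absentus"]
    print(regulated_matches(detected, regulated))
    print(species_rich_genera(detected))
```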

| COI
Overall, 2,535 BINs were identified across the 48 traps, representing 57 Orders and 304 Families, the vast majority of them being arthropods (Appendix S1). A total of 5,997,851 sequences were associated with a BIN, with an average quality value (QV) score above 37 out of a possible 40.

| Traps collected in August 2018
A total of 701 BINs were identified across traps placed at the Halton Hills (GT) location, with 12 of them shared among traps. Overall,

| Regulated insects
Species-specific sequence identification was completed with over 97% mean similarity and a minimum COI sequence length of 392 bp (Table 4) [...] while Xylella showed the lowest value. Pseudomonas was also the most represented genus in terms of sequence count (Figure 4).

| Fungi
No regulated fungal species were identified across collection sites. A total of 36 genera consisting of ten or more species were [...]
Considering that every step of an adapted eDNA metabarcoding protocol can have a particular impact on species detectability, protocol validation is essential (Ruppert et al., 2019). The asymptotic rarefaction curves presented here may suggest that the read depths of our optimized protocol were sufficient to recover the full OTU diversity present in each trap. Nevertheless, it is important to highlight that the curves reflect OTU diversity based on the specific primer sets used in our study (Table 1). Although we selected primers to maximize OTU representation, due to issues such as "PCR dropout," primer binding and therefore species detection may have been less than 100% (Griswold, 2019).
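For readers unfamiliar with rarefaction, the sketch below shows how the expected number of OTUs at a given read depth can be computed with the classical formula for sampling without replacement, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], where N is the total read count and N_i the count for OTU i. It is an illustration only; the function name and counts are hypothetical, and it is not necessarily how the curves in this study were generated. A curve that flattens as depth approaches N is what "asymptotic" refers to above.

```python
from math import comb


def rarefied_richness(otu_counts, depth):
    """Expected number of OTUs observed in a random subsample of `depth` reads.

    Exact combinatorics are fine for a small illustration. Real read depths
    would normally call for a dedicated, numerically efficient implementation.
    """
    total = sum(otu_counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    denom = comb(total, depth)
    # comb(total - c, depth) is 0 when fewer than `depth` reads remain,
    # meaning OTU i is certain to be observed at that depth.
    return sum(1 - comb(total - c, depth) / denom for c in otu_counts if c > 0)


if __name__ == "__main__":
    counts = [500, 120, 40, 10, 3, 1, 1]  # hypothetical reads per OTU in one trap
    total = sum(counts)
    for depth in (10, 50, 100, 300, total):
        print(depth, round(rarefied_richness(counts, depth), 2))
```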
Contrary to eDNA metabarcoding of samples collected in ethanol (Zenker et al., 2020), our protocol was successful in generating high-quality library preparations of the three targeted molecular markers for all of our samples. This confirms that saturated salt solutions can act as a reservoir for eDNA in conventional trapping surveys (Young et al., 2020). If this were not the case, any logistical advantages over the alcohol-based fluids listed above would likely be irrelevant. However, additional studies specifically addressing eDNA preservation in saturated salt solutions versus alcohol-based solutions will be required, using samples of known taxonomic composition, to compare the methods' efficiency comprehensively.
Our procedure was effective in detecting IAS from high-risk sites in southern Ontario. Among the 2,535 BINs identified, two [...] has been demonstrated for alcohol-based collection fluids (Zenker et al., 2020), should be further evaluated.

FIGURE 4
Graph representing bacterial diversity per sampling location and collection date. The color gradient represents mean confidence, as indicated in the gradient bar. An interactive pie chart showing the classification of the sequences and including detailed information for each section of the pie can be viewed online.

| Applications for biomonitoring and biosurveillance
Since its introduction into North America, the emerald ash borer beetle (order Coleoptera) has been responsible for killing several millions of ash trees across the continent (Herms & McCullough, 2014). [...] Developing efficient and cost-effective protocols for the early detection of IAS is key in decreasing tree deaths and related economic losses. In the recent past, the high cost of HTS platforms prohibited their application in routine biodiversity monitoring and biosurveillance programs (Shokralla et al., 2012). Although HTS has become more economically accessible, it constitutes just one element in multistep metabarcoding protocols. In our study, we used alternative cost-effective chemistries without compromising protocol efficiency and were able to process samples at a reduced cost ([...]
It is worth mentioning that morphological identification of arthropods remains a challenge due to factors such as unidentifiable early life stages, specimens escaping from the trap, and advanced or total degradation of specimens in collection jars. In all these cases, the eDNA metabarcoding protocol described here would allow the detection of the species, whereas morphological confirmation may not be possible, leading to false negatives. However, the eDNA of an invasive species that has not been captured can be released with a specimen's feces and result in potential false-positive results for a particular location while at the same time signalling the presence of the IAS in the wider region. The eDNA from a non-captured species can also be found when the invasive species' material persists in the trapped species' gut content. These scenarios illustrate how eDNA and morphological identification might yield different species' detection results.

FIGURE 5
Graphical representation of the fungal genera with ten or more species per sampling location and date. Each genus is represented by a different color, and the scale represents the number of species found. Panels: ITS_GT_3_7_2018, ITS_WP_18_7_2018, ITS_BA_19_7_2018, ITS_TO_30_7_2018, ITS_GT_16_8_2018, ITS_WP_22_8_2018, ITS_BA_24_8_2018, ITS_TO_28_8_2018.

[...] and their shared environment to achieve better health results for all (e.g., One Health approach).

ACKNOWLEDGMENTS
We are thankful to the team from the Genomics Facility of the AAC at the University of Guelph, and particularly to Jeffrey Gross for all suggestions and recommendations during library preparation and MiSeq runs. We thank Jocelyn Kelly and Mireille Marcotte for kindly reviewing earlier versions of the manuscript. We are also grateful to Jessica Castellanos-Labarcena for helping with the analyses in R and the scripting. We acknowledge the assistance of all members of the Hanner Laboratory during data collection.