Express barcodes: racing from specimen to identification

Authors

  • NATALIA V. IVANOVA,

    1. Canadian Centre for DNA Barcoding, Biodiversity Institute of Ontario, University of Guelph, 579 Gordon Street, Guelph, ON, Canada N1G 2W1
  • ALEX V. BORISENKO,

    1. Canadian Centre for DNA Barcoding, Biodiversity Institute of Ontario, University of Guelph, 579 Gordon Street, Guelph, ON, Canada N1G 2W1
  • PAUL D. N. HEBERT

    1. Canadian Centre for DNA Barcoding, Biodiversity Institute of Ontario, University of Guelph, 579 Gordon Street, Guelph, ON, Canada N1G 2W1

Natalia V. Ivanova, N1G 2W1. Fax: (519) 824-5703; E-mail: nivanova@uoguelph.ca.

Abstract

Although devices combining microfluidic and advanced sequencing technologies promise a future where one can generate a DNA barcode in minutes, current analytical regimes typically involve workflows that extend over 2 days. Here we describe simple protocols enabling the advance from a specimen to barcode-based identification in less than 2 h. The protocols use frozen or lyophilized reagents that can be prepackaged into ‘kits’ and support barcode analysis across the animal kingdom. The analytical procedure allows 5 min for DNA extraction, 25 min for polymerase chain reaction amplification of the barcode region, 25 min for cycle-sequencing, 10 min for cleanup, 45 min for capillary sequencing and 5 min for trace file analysis to complete DNA-based identification. This study involved the comparison of varied DNA preservation and extraction methods, and evaluated Taq polymerases with high processivity and resistance to inhibitors.

Introduction

The need for rapid identification of organisms is critical in cases such as the interception of CITES-registered species or pest detection in shipments scheduled for transboundary movement. DNA barcoding often offers the best opportunity for identifications in such situations, but its utility will further increase if results are delivered in hours rather than days. Because considerable reductions in processing time can be gained by targeting small amplicons, it is worth emphasizing that a 100–200 bp fragment of the DNA barcode region usually provides species-level identifications (Hajibabaei et al. 2006b; Meusnier et al. 2008). However, 300–400 bp fragments give better resolution and still allow significant time reductions in polymerase chain reaction (PCR) and cycle sequencing protocols from those required for recovery of a full-length barcode. Finally, pre-mixed PCR and sequencing reagents can be supplied as ‘kits’ that are immediately ready for use. These factors provide the basis for express DNA barcoding with modest infrastructure and minimal experience in molecular methods. The present study describes protocols that enable the advance from a specimen to barcode-based identification in less than 2 h.

Barcode analysis, of necessity, begins with the collection of a tissue sample that will be used for DNA extraction. In this study, we examine two commonly used types of animal tissue samples: hard body parts (dry insect fragments) and vertebrate blood. Although analytical protocols for arthropod body parts are well established, the best practices for sampling vertebrates remain controversial (Kilpatrick 2002). Biopsy sampling cannot be unified across all vertebrates, because of the diverse nature of tissues sampled in different groups, e.g. feathers, hairs, or skin punches. There is also an impediment to standardizing the processing of such material, due to the differences in required DNA extraction routines for different biopsy samples (Eguchi & Eguchi 2000; Bello et al. 2001; Tomasek et al. 2008). As a consequence, we evaluated methods for preserving blood samples to identify the optimal methods for rapid DNA barcode acquisition.

Materials and methods

Samples

Dry insect fragments were used for evaluation of alkaline lysis. Although the method was evaluated on a very large number of specimens, we focused on just four samples of the lepidopteran species Acronicta hasta taken from the collection of the Biodiversity Institute of Ontario.

Blood samples from 51 specimens representing eight species of birds and 15 species of mammals were collected in the Calakmul Biosphere Reserve, Campeche, Mexico (18.59°N, 89.42°W, 270 m above sea level). Material was provided and identified by researchers from El Colegio de la Frontera Sur (Ecosur) who conducted a catch-and-release field course under a permit from the administration of the Calakmul Reserve. Blood samples were collected from the ulnar vein (birds), uropatagial vein (bats) or caudal vein (rodent) punctured with a fine syringe needle. A small drop of blood (10–30 µL) was collected on a Q-tip cotton swab and then spotted onto two types of FTA cards (Whatman International Ltd): standard FTA in CloneSaver format, and Whatman–FTA Elute. Q-tips with remaining blood were preserved in tubes with 95% ethanol.

DNA extraction

Alkaline lysis.  Insect fragments up to 3 mm in dimension were extracted using the procedure recommended by Whatman for DNA elution from FTA cards, with minor modifications. Samples were added to 1.5 mL tubes containing 35 µL of alkaline buffer (0.1 N NaOH, 0.3 mM EDTA, pH 13.0). Tubes were incubated for 2 min at 95–100 °C, after which 65 µL of neutralization solution (0.1 M Tris-HCl, pH 7.0) was added to each tube and mixed by pipetting. To test removal of contaminants from crude lysates, an aliquot of lysate was transferred to a new tube and mixed with an equal volume of Instagene (BIO-RAD Laboratories, Inc.).

Prior to alkaline lysis, 2–5 mm pieces of blood-stained cotton swabs were placed in a 96-well skirted microplate and incubated at 56 °C to evaporate residual ethanol. After the addition of alkaline buffer, the plate was incubated for 5 min in a thermal cycler at 96 °C, followed by the neutralization procedure described above.
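The volumes above imply a 100 µL lysate, and the nominal concentration of each buffer component after neutralization follows from simple dilution. The sketch below is illustrative only: the function and variable names are hypothetical, the concentrations are read as 0.3 mM EDTA and 0.1 M Tris-HCl, and the acid-base reaction between NaOH and Tris-HCl is ignored, so the values are nominal rather than measured.

```python
def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Nominal concentration after dilution (same unit as `stock`)."""
    return stock * stock_volume_ul / total_volume_ul


LYSIS_BUFFER_UL = 35.0      # alkaline buffer volume per sample
NEUTRALIZATION_UL = 65.0    # neutralization solution volume per sample
total_ul = LYSIS_BUFFER_UL + NEUTRALIZATION_UL

# Stock concentrations as given in the protocol text.
naoh_n = final_concentration(0.1, LYSIS_BUFFER_UL, total_ul)     # N
edta_mm = final_concentration(0.3, LYSIS_BUFFER_UL, total_ul)    # mM
tris_m = final_concentration(0.1, NEUTRALIZATION_UL, total_ul)   # M

print(f"Final lysate volume: {total_ul:.0f} uL")
print(f"Nominal NaOH: {naoh_n:.3f} N, EDTA: {edta_mm:.3f} mM, Tris-HCl: {tris_m:.3f} M")
```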

FTA cards.  Punches (1 mm²) from standard FTA cards were extracted on a Biomek FX liquid handling station using FTA wildlife protocol reagents (Smith & Burgoyne 2004), with minor modifications (Borisenko et al. 2008) and used directly in the PCR mixture.

Three protocols for DNA extraction were tested with FTA Elute cards: (i) one punch was briefly washed with 150 µL of water and used directly in the PCR; (ii) two punches were briefly washed with 250 µL of water, and DNA was eluted from the washed punches with 20 µL of water at 95 °C for 20 min; and (iii) one punch was used directly (without wash) in the PCR with the KAPA Blood Direct Kit.

PCR

The PCR performance of seven different enzymes was evaluated on blood samples by testing their capacity to amplify a 421-bp segment of COI using an M13-tailed version of the RonM primer (Pfunder et al. 2004; Borisenko et al. 2008) and the C_VR1LRt1 M13-tailed reverse cocktail (Ivanova et al. 2007). PCR cocktails and thermocycling parameters for each enzyme are listed in Table 1.

Table 1.  PCR cocktails and cycling parameters used for crude alkaline blood lysates. The manufacturers of the enzymes are as follows: Platinum Taq (Invitrogen), TaKaRa (TaKaRa BIO Inc.), QIAGEN Fast Cycling Kit (QIAGEN, GmbH), Phusion Flash High-Fidelity PCR Master Mix (FINNZYMES OY), KAPA 2G Fast, KAPA Robust, KAPA Ab Hot Start Taq, and KAPA Blood Direct (Kapa Biosystems)
Enzyme: Platinum Taq, KAPA Ab; TaKaRa Z-Taq; KAPA 2G Fast; KAPA Robust; KAPA Blood Direct; QIAGEN Fast; Phusion Flash
Trehalose (%) final concentration: 5 (Platinum), 0 (KAPA Ab)
Reaction volume (µL): 12.5; 12.5; 12.5; 12.5; 12–24; 12; 12
10× buffer (µL): 1.25 (Platinum); 1.25
5× buffer (µL): 2.5 (KAPA Ab); 2.5; 2.5
2× ready mix (µL): 6–12; 6; 6
50 mM MgCl2 (µL): 0.625; 0.025; 0.025
dNTP (µM) final concentration: 50; 200; 200; 200; N/A; N/A; N/A
Primers (µM) final concentration: 0.1; 0.2; 0.3; 0.3; 0.2; 0.5; 0.5
Enzyme (U): 0.3; 0.3125; 0.25; 0.25; N/A; N/A; N/A
Initial denaturation (temp °C/s): 94–120; 95–120; 94–120; 94–120; 95–300; 98–10
Denaturation (temp °C/s): 94–40; 98–5; 96–5; 94–40; 94–40; 98–5; 98–1
Annealing (temp °C/s): 54–40; 54–5; 54–15; 54–40; 54–40; 54–5; 54–5
Extension (temp °C/s): 72–60; 68–10; 72–5; 72–60; 72–60; 68–15; 72–15
Final extension (temp °C/s): 72–300; 72–300; 72–300; 72–60; 72–60
Number of cycles: 40; 40; 40; 40; 40; 40; 40
Total time (min): 109; 25; 29; 109; 109; 35; 26

PCR cycling conditions for alkaline insect lysates are listed in Figs 1 and 2. Moth DNA was extracted using either alkaline lysis (Fig. 1) or the standard glass fibre automated DNA extraction protocol (Fig. 2) (Ivanova et al. 2006).

Figure 1.

Performance of TaKaRa Z-Taq on alkaline lysates obtained from different body parts of the moth: 1/3, legs; 2, head; 4, body. The fragment of 307 bp was amplified with the following parameters: 40 cycles of 98 °C for 5 s, 51 °C for 5 s, extension at 68 °C for 10 s (cycling time, 24 min).

Figure 2.

Comparative performance of fast PCR kits: A, TaKaRa Z-Taq; B, QIAGEN Fast Cycling kit; C, KAPA 2G Fast. The protocol for amplification of 307 bp consisted of 5 min denaturation at 95 °C, 40 cycles of 98 °C for 5 s, 51 °C for 5 s, 68 °C for 10 s, and final extension at 72 °C for 1 min (cycling time, 31 min); the 658 bp protocol consisted of 1 min denaturation at 95 °C, 40 cycles of 98 °C for 2 s, 51 °C for 10 s, and 72 °C for 5 s (cycling time, 24 min).
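The reported cycling times are longer than the sum of the programmed hold steps. As a rough sketch, the programmed hold time of each protocol in Figs 1 and 2 can be computed and compared with the reported cycling time; the remainder is interpreted here, as an assumption, as time spent ramping between temperatures. The `Program` class and its field names are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    name: str
    initial_s: int        # initial denaturation hold (s)
    cycles: int           # number of cycles
    step_s: tuple         # hold times within one cycle (s)
    final_s: int          # final extension hold (s)
    reported_min: float   # cycling time reported in the figure caption

    def hold_minutes(self):
        """Sum of programmed holds, excluding temperature ramping."""
        total_s = self.initial_s + self.cycles * sum(self.step_s) + self.final_s
        return total_s / 60

    def overhead_minutes(self):
        """Reported time minus programmed holds (assumed to be ramping)."""
        return self.reported_min - self.hold_minutes()


PROGRAMS = [
    Program("Fig. 1, 307 bp", 0, 40, (5, 5, 10), 0, 24),
    Program("Fig. 2, 307 bp", 300, 40, (5, 5, 10), 60, 31),
    Program("Fig. 2, 658 bp", 60, 40, (2, 10, 5), 0, 24),
]

for p in PROGRAMS:
    print(f"{p.name}: holds {p.hold_minutes():.1f} min, "
          f"reported {p.reported_min} min, overhead {p.overhead_minutes():.1f} min")
```

Under these assumptions the overhead comes to roughly 11–12 min for each 40-cycle program, which suggests that further shortening the hold steps would yield diminishing returns on a conventional thermocycler.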

Eppendorf Mastercycler ep gradient S thermocyclers were used for all PCR and sequencing reactions. One microlitre of template DNA was used in the PCRs for alkaline lysates; one 1 mm² punch was used for FTA or FTA Elute cards; 2 µL were added to the PCRs for DNA eluted from the FTA Elute cards and for insect DNA extracted using the glass-fibre protocol. PCR was performed in thin-wall tube strips for single insect samples or in skirted 96-well plates (Eppendorf) for blood samples.

For fast cycle sequencing optimization, eight PCR products were amplified with Platinum Taq using MLepF1 (Hajibabaei et al. 2006a; deWaard et al. 2008) and LepRI (Hebert et al. 2004) primers (moth DNA was extracted with standard automated DNA extraction protocol (Ivanova et al. 2006)). The same PCR conditions were used to prepare products for size-exclusion cleanup evaluation.

PCR products were visualized on a 2% agarose gel using an E-Gel96 Pre-cast Agarose Electrophoresis System (Invitrogen) as described in deWaard et al. (2008).

Cycle sequencing

Fast cycle sequencing optimization.  We evaluated 12 thermocycling regimes (Table 2) on eight PCR products generated from moth DNA (see above). Cycle sequencing reactions with 1/24 dilution of BigDye 3.1 (Applied Biosystems) were performed using the LepRI primer (Hebert et al. 2004) as described in Ivanova & Grainger (2007). To ensure consistency of reactions between different regimes, each diluted PCR product was pre-mixed with the sequencing mix and aliquoted by 10.5 µL into 12 eight-well tube strips. Each strip was subjected to a different cycling regime (Table 2).

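Table 2.  Cycle sequencing optimization parameters to obtain ~350 bp. Two short protocols which yielded good quality sequences are marked with an asterisk. CRL, contiguous read length
Program | Initial denaturation (s) | Denaturation (s) | Annealing (s) | Extension (s) | No. of cycles | Time (min) | Mean CRL | Failures
SEQ31 | 120 | 30 | 15 | 240 | 30 | 154 | 395 | –
SEQ311 | 120 | 30 | 15 | 120 | 30 | 94 | 395 | –
SEQ312 | 120 | 30 | 15 | 90 | 30 | 79 | 389 | –
SEQ313 | 120 | 30 | 15 | 60 | 30 | 63 | 370 | –
SEQ314 | 120 | 30 | 15 | 30 | 30 | 49 | 204 | –
SEQ315 | 120 | 15 | 15 | 30 | 30 | 41 | 368 | –
SEQ316 | 120 | 10 | 10 | 30 | 30 | 36 | 202 | 1
SEQ317 | 120 | 10 | 10 | 20 | 30 | 31 | 211 | –
SEQ318 | 120 | 10 | 10 | 10 | 30 | 26 | 230 | –
SEQ319* | 120 | 5 | 5 | 30 | 30 | 31 | 376 | –
SEQ320* | 120 | 5 | 5 | 20 | 30 | 26 | 335 | 1
SEQ321 | 120 | 5 | 5 | 10 | 30 | 21 | 241 | –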

Cycle sequencing of blood samples.  Unidirectional sequencing of blood samples was done with the M13R primer. The standard CCDB protocol with 1/24 BigDye dilution (Ivanova & Grainger 2007) was used for sequencing.

Cycle sequencing of insect samples for size-exclusion cleanup evaluation.  Cycle sequencing reactions for ~400 bp amplicons with 1/16 BigDye 3.1 were prepared using standard protocols (Hajibabaei et al. 2005; deWaard et al. 2008). In brief, 10 µL sequencing reactions consisted of 1.875 µL of 5× sequencing buffer (400 mM Tris-HCl pH 9.0 + 10 mM MgCl2), 0.25 µL of BigDye terminator version 3.1 (Applied Biosystems), 5 µL of 10% trehalose, 10 pm of primer and 1–2 µL of PCR product.
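The recipe above can be checked with a short dilution calculation. The sketch below derives the final buffer and trehalose concentrations in a 10 µL reaction and scales the fixed-volume components into a master mix. It reads the buffer concentrations as millimolar; the function name `master_mix` and the 10% pipetting excess are assumptions made for illustration and are not part of the published protocol.

```python
REACTION_UL = 10.0

# (stock concentration, volume per reaction in uL)
STOCKS = {
    "Tris-HCl (mM)": (400.0, 1.875),
    "MgCl2 (mM)": (10.0, 1.875),
    "trehalose (%)": (10.0, 5.0),
}

# Final concentration of each component in the 10 uL reaction.
final_conc = {name: stock * vol / REACTION_UL for name, (stock, vol) in STOCKS.items()}

# Components added at a fixed volume to every reaction (uL).
PER_REACTION_UL = {
    "5x sequencing buffer": 1.875,
    "BigDye v3.1": 0.25,
    "10% trehalose": 5.0,
}


def master_mix(n_reactions, excess=0.10):
    """Volumes (uL) for n reactions plus a pipetting excess (assumed 10%)."""
    factor = n_reactions * (1 + excess)
    return {name: vol * factor for name, vol in PER_REACTION_UL.items()}


for name, value in final_conc.items():
    print(f"{name}: {value:g}")
print(master_mix(96))
```

The 5% final trehalose concentration obtained here matches the concentration at which the reagents are later described as suitable for drying and storage.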

Cycle-sequencing clean-up

Agencourt CleanSEQ.  Cycle sequencing reactions resulting from fast sequencing optimization and blood samples were purified using Agencourt CleanSEQ (Agencourt Bioscience Corporation) following manufacturer's instructions with minor modifications as described in Ivanova & Grainger (2007) and analysed on a 3730XL DNA Analyser (Applied Biosystems).

Size-exclusion cycle sequencing clean-up.  After cycle sequencing was completed, 40 µL of 0.05 mM EDTA pH 8.0 was added to each sequencing reaction, and the entire sample was transferred to a Nanosep 3 K OMEGA device (Pall Life Sciences). Columns were centrifuged in a table-top Eppendorf centrifuge at 8000 g for 4 min. One hundred microlitres of 0.05 mM EDTA pH 8.0 was then added to the column, followed by a second 4-min centrifugation at 8000 g. Finally, 40 µL of 0.05 mM EDTA pH 8.0 was added to the Nanosep device and, after a few mixing cycles, the purified sequencing reaction was collected from the top of the membrane and transferred to the sequencing plate for analysis on a 3730XL DNA Analyser.

Data analysis

Sequence data from fast cycle sequencing optimization were evaluated using Sequence Scanner 1.1 (Applied Biosystems). Sequences for blood samples were auto assembled in SeqScape 2.1.1 (Applied Biosystems) against a mammalian reference sequence and briefly edited [short sequences (< 50 bp) and sequences with multiple heterogeneous basecalls were deleted, gaps were removed]. Remaining sequences were exported into FASTA format and used for identification. While analysing sequence data, we intentionally reduced sequence editing to a minimum, trying to simulate automated contig assembly. The accuracy of DNA-inferred identifications was analysed using the BOLD identification engine by querying against the full reference database of DNA barcodes. Identification/sequencing success was defined as 99% base-pair match of the sequence obtained to the closest available reference sequences of the corresponding species. Because all of the analysed species were present in the database, sequencing success was equal to identification success. After preliminary analyses, sequences were further edited in SeqScape 2.1.1 and submitted together with sequencer trace files to the Barcode of Life Data Systems (BOLD) at http://www.barcodinglife.org into the project titled ‘Terrestrial Vertebrate Survey in Calakmul [ABVSC]’ housing provenance data and images of the corresponding individuals.
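The success criterion described above (discard reads shorter than 50 bp, then require a 99% base-pair match to the reference) can be sketched as follows. This is not the BOLD identification engine, which performs its own alignment and database search. It is a minimal illustration that assumes the query and reference are already aligned to equal length, and it skips gap and `N` positions, which is a simplification of our own. All function names are hypothetical.

```python
MIN_LENGTH_BP = 50        # reads shorter than this were deleted
MATCH_THRESHOLD = 99.0    # percent match required for success


def percent_identity(query, reference):
    """Percent of matching bases between two pre-aligned, equal-length sequences.

    Positions with a gap ('-') or ambiguous base ('N') in either sequence
    are skipped (an assumption of this sketch).
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    compared = matches = 0
    for q, r in zip(query.upper(), reference.upper()):
        if q in "-N" or r in "-N":
            continue
        compared += 1
        matches += q == r
    return 100.0 * matches / compared if compared else 0.0


def is_identified(query, reference):
    """Apply the length filter, then the 99% match criterion."""
    if len(query.replace("-", "")) < MIN_LENGTH_BP:
        return False
    return percent_identity(query, reference) >= MATCH_THRESHOLD


# Toy example: a 200 bp reference and a query with a single mismatch.
reference = "ACGT" * 50
query = "T" + reference[1:]
print(percent_identity(query, reference), is_identified(query, reference))
```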

Results and discussion

DNA extraction

We sought to implement a simple and rapid DNA extraction procedure. Alkaline lysis is widely used as a ‘quick and dirty’ DNA extraction method (Meeker et al. 2007; Porcar et al. 2007). We employed the Whatman protocol for DNA elution from FTA cards because it does not require special equipment and can be performed in 5 min. We also evaluated the removal of contaminants from crude lysates using Instagene (BIO-RAD Laboratories). However, lysates from insects rarely inhibit PCRs when a small amount of tissue (2–4 mm³) is used per 100 µL of final lysate volume (Fig. 1).

Fast PCR enzymes

Standard PCR protocols take about 1.5–2 h. We evaluated the performance of three PCR kits designed for ‘fast’ cycling protocols. In our first trial, we compared the performance of TaKaRa Z-Taq and the QIAGEN Fast Cycling Kit in the amplification of a 307-bp amplicon from moth DNA extracted using the glass fibre protocol (Ivanova et al. 2006). The second trial involved amplification of full-length barcodes (658 bp) and incorporated a new-generation enzyme, KAPA 2G Fast (Kapa Biosystems), with an even shorter cycling protocol. Both TaKaRa Z-Taq and KAPA 2G Fast outperformed the QIAGEN Fast Cycling Kit. In fact, no products were generated with the QIAGEN kit in the second trial (Fig. 2). Although shorter products are more suitable for express protocols, some applications (e.g. very closely allied species) might require the rapid amplification of the full-length barcode region.

Cycle sequencing optimization

Although the standard cycle sequencing protocol takes about 2.5 h, it has been shown that this time can be reduced to 50 min without compromising quality (Platt et al. 2007). We evaluated a series of cycle sequencing protocols (Table 2) to see if this time could be reduced further when dealing with a 300-bp amplicon. The shortest contiguous read lengths (CRLs) were obtained with protocols SEQ316–318, involving 10 s for denaturation and annealing stages (Table 2). Surprisingly, CRLs were longer in protocols with shorter denaturation and annealing times, e.g. SEQ319–321 (5 s for denaturation and annealing) and SEQ315 vs. SEQ314 (15 s vs. 30 s denaturation). Overall, CRL was significantly decreased if the elongation time was shorter than 20 s. In the end, we were able to obtain satisfactory quality reads for ~300 bp in 26–31 min using programs SEQ319 and SEQ320.
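The choice of SEQ319 and SEQ320 can be reproduced from the Table 2 values with a simple filter. In the sketch below, the run times and mean CRLs are as read from Table 2, while the two cut-offs (a run time of at most 31 min and a mean CRL of at least 300 bp) are assumptions chosen to reflect the ~300 bp target and are not thresholds stated by the study.

```python
# (program, run time in min, mean contiguous read length in bp), from Table 2
TABLE2 = [
    ("SEQ31", 154, 395),
    ("SEQ311", 94, 395),
    ("SEQ312", 79, 389),
    ("SEQ313", 63, 370),
    ("SEQ314", 49, 204),
    ("SEQ315", 41, 368),
    ("SEQ316", 36, 202),
    ("SEQ317", 31, 211),
    ("SEQ318", 26, 230),
    ("SEQ319", 31, 376),
    ("SEQ320", 26, 335),
    ("SEQ321", 21, 241),
]

MAX_TIME_MIN = 31   # assumed upper limit on run time
MIN_MEAN_CRL = 300  # assumed minimum read length for a ~300 bp amplicon

selected = [
    name
    for name, time_min, mean_crl in TABLE2
    if time_min <= MAX_TIME_MIN and mean_crl >= MIN_MEAN_CRL
]
print(selected)
```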

Size-exclusion cleanup using the 3 K Nanosep device did not completely remove unincorporated dye-labelled terminators, leaving ‘dye blobs’. Despite this, sequencing reactions cleaned up with Nanosep produced good quality sequences (CRL = 235–285) suitable for analysis using the BOLD identification engine.

Blood samples

DNA extracted from ethanol-fixed cotton swabs with alkaline lysis was used to evaluate the resistance of various enzymes to inhibitors. All fast enzymes, except KAPA 2G Fast, were seriously inhibited by crude blood lysates (Fig. 3). Both KAPA 2G Fast (86%) and KAPA Ab (84%) outperformed Platinum Taq (80%), but the KAPA Blood Direct (75%) and KAPA Robust (35%) kits often failed, apparently due to the formation of primer dimers. The Phusion Flash High-Fidelity kit (designed for whole blood amplification) performed inconsistently.

Figure 3.

Comparative performance of different enzymes on 51 alkaline blood lysates.

We also compared the performance of Z-Taq polymerase, KAPA 2G Fast, and KAPA Robust on FTA Elute cards to the performance of Platinum Taq on standard FTA cards extracted using the wildlife FTA protocol (Smith & Burgoyne 2004; Borisenko et al. 2008). Platinum Taq polymerase produced highest success on both FTA Elute disks and eluates (Fig. 4), while KAPA 2G Fast enzyme outperformed Z-Taq polymerase on FTA Elute disks.

Figure 4.

Comparative performance of different enzymes on 51 blood samples preserved on FTA and FTA Elute cards.

Although overall PCR success with Platinum Taq was higher on standard FTA cards (100%), sequencing/identification success was higher for FTA Elute cards (88–90% vs. 78%). KAPA 2G Fast outperformed Z-Taq polymerase on FTA Elute disks, showing higher PCR (90% vs. 55%) and relatively high sequencing success (76%).

Protocols for DNA retrieval from FTA CloneSaver cards are tedious as they include multiple wash stages and special reagents. In contrast, the new FTA Elute technology requires only water for DNA retrieval. Moreover, FTA Elute disks can be used directly (without any wash) for PCR with new-generation enzymes such as the KAPA Blood Direct Kit. In our experiments, the KAPA Blood Direct Kit showed 76% PCR success from unwashed FTA Elute disks. However, similar to results obtained with the crude alkaline lysates, we noticed an increased formation of primer dimers, possibly affecting PCR efficiency. We did not evaluate an improved version of the Blood Direct Kit currently available from Kapa Biosystems.

We noticed that cotton swabs were occasionally picked up from their wells while peeling aluminium foil from the top of the plate, causing potential contamination, and we detected one likely cross-contamination event. As well, one sample on the FTA Elute card was contaminated with ascomycete fungi, which likely happened in the field. Finally, we encountered co-amplification of pseudogenes in the birds Oporornis formosus and Habia rubica. As a result, the sequences for these bird species were not uploaded to BOLD. Based on our previous experience, this problem with pseudogenes can be solved by amplification of a full-length barcode region.

In summary, KAPA 2G Fast enzyme showed the best performance among the enzymes tested in this study, both on alkaline lysates and blotting cards. FTA Elute cards were much more convenient to handle than FTA CloneSaver cards or ethanol-preserved blood on cotton swabs. Currently, Whatman is launching the production of FTA Elute cards in CloneSaver 96-well compatible format (Wingkei So, personal communication), which should make this technology compatible with high-throughput analysis.

If extremely short cycling times (e.g. 10 min) and a compact footprint are required, then specialized thermocyclers, such as Piko Thermal Cycler (FINNZYMES) and thin-walled PCR strips or non-skirted plates, are the optimal choice for high performance PCR and sequencing protocols.

Conclusions: practical applications and the future of express-barcoding

The protocols described here enable rapid identification of biological material in time-sensitive situations. Barcoding kits with reagents for alkaline lysis and with reagents for PCR and sequencing can be frozen or lyophilized for on demand usage (Hajibabaei et al. 2005; deWaard et al. 2008). Some of the enzymes tested in this study (KAPA Blood Direct, QIAGEN Fast Cycling Kits) are already available as a PCR premix. However, because it is a ‘hot-start’ enzyme, KAPA 2G Fast is best for frozen or lyophilized reactions. If premixed PCR reagents include 5% trehalose, they can be lyophilized and retain activity for up to 1 year at 4 °C and for 3 months at 37 °C (Klatser et al. 1998). We have also found that sequencing reagents dried in 5% trehalose can be stored at 20 °C for 4 months without negative impacts on sequence quality.

A recent application note (Applied Biosystems 2008) describes a resequencing workflow that enables the generation of a sequence from various sample types including blood, cultured cells and fresh or frozen tissue in just 4 h. However, this technique involves magnetic bead technology for DNA extraction and additional steps such as PCR cleanup, which are currently omitted from high-throughput DNA barcoding (Hajibabaei et al. 2005; deWaard et al. 2008). The method also employs a costly BigDye Exterminator cleanup protocol which is time-consuming and requires specialized equipment. By contrast, the protocol described in this study is faster and requires equipment that is commonly available, making it suitable for small-scale laboratories operating in ‘field’ settings.

In summary, our barcoding workflow involves quick DNA extraction using alkaline lysis or FTA Elute cards (5 min), fast PCR with KAPA 2G Fast enzyme (25 min), no PCR product check or cleanup, fast cycle sequencing with BigDye 3.1 (26 min), cleanup of the sequencing reaction using Nanosep columns (PALL) (5–10 min) and a short sequencing run (60 min). The use of pre-mixed PCR and sequencing reagents further aids speedy deployment.

Some of our protocols may aid the development of a hand-held barcoding device by helping to simplify the analytical chain. Although the portable device envisioned in the early days of DNA barcoding (Janzen 2004) does not exist, components are available. Hand-held fluorescence thermocyclers BioSeeq (Burns et al. 1998; Emanuel et al. 2003) already allow rapid pathogen detection in field settings. High throughput DNA sequencing using a microfabricated 96-lane capillary array electrophoresis bioprocessor enables a fast sequencing run of ~430 bp in 24 min (Paegel et al. 2002). Solid phase reversible immobilization technology allows sequencing cleanup to be performed in 2 min (Xu et al. 2003). Systems employing microfluidic technologies (e.g. Microchip Biotechnologies Inc.) are already allowing integrated sequencing reactions and cleanup. The challenge for the future lies in the integration of these technologies and reducing their cost. Until such breakthroughs occur, it makes sense to exploit modifications in existing Sanger sequencing methods that speed analysis. Further development of simple protocols, such as the ones described here, will aid the delivery of species identifications at low cost using basic laboratory infrastructure. Although the race from organism to identification is no sprint, it is already shorter than any marathon.

Acknowledgements

This work was supported by grants to P.D.N.H. from Genome Canada through the Ontario Genomics Institute, and NSERC. We thank Sarah Adamowicz for comments on the manuscript, and our colleagues Blanca Roldán-Clara, Enrique Escobedo Cabrera, Humberto Bahena Basave, Manuel Elías-Gutiérrez, and Martha Valdez-Moreno for providing identified specimens for blood collection and assistance in the field. Duane Mendis provided samples of Kapa enzymes, while Maryke Appel and John Foskett from Kapa Biosystems provided technical support. Wingkei So, Christina Kuhlmann, and Breck Parker supplied samples of FTA Elute cards.

Conflict of interest statement

The authors have no conflict of interest to declare and note that the funders of this research had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
