To test the sensitivity of 454 pyrosequencing for detection of rare species, we added known species to existing complex plankton samples along different dilution gradients. These artificially assembled plankton samples were subjected to pyrosequencing to determine whether the spiked rare species could be successfully recovered. In addition, we designed a universal primer pair for the hypervariable V4 region of the nuclear small subunit ribosomal DNA (V4-nSSU) for biodiversity assessment based on 454 pyrosequencing. The sensitivity and universality of this primer pair were tested using small-scale pyrosequencing of a freshwater plankton sample collected from Hamilton Harbor in Lake Ontario.
Universal primer design
For universal primer design for V4-nSSU, we retrieved from GenBank (http://www.ncbi.nlm.nih.gov/nuccore) sequences of representative species of the three major groups of interest (Crustacea, Mollusca, Tunicata), which were chosen owing to their history of invasiveness. In total, we included 142 species to cover almost all orders/suborders of these groups. All downloaded sequences were aligned using MEGA version 5 (Tamura et al. 2011) and inspected manually, and universal primers were designed in conserved regions (Fig. 2). Based on the read length (~500 bp) of the 454 GS-FLX Titanium platform, all primers were designed to amplify approximately 400–600 bp, depending on length variation among species, to obtain maximum information for species identification. The forward primer used for pyrosequencing was tagged specifically for each sample using eight nucleotides to identify pooled PCR products after pyrosequencing (Parameswaran et al. 2007). In addition, the 454 FLX adaptors (adaptor A: GCCTCCCTCGCGCCATCAG, adaptor B: GCCTTGCCAGCCCGCTCAG) were added to the 5′-end of the forward and reverse primers, respectively, to make them compatible with pyrosequencing procedures.
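The layout of the tagged fusion primers can be illustrated with a minimal sketch. The adaptor sequences are those quoted above; the template-specific primer sequences and the sample tags below are hypothetical placeholders, and the order adaptor, tag, template-specific primer (5′ to 3′) is an assumption about how the components were joined.

```python
# Adaptor sequences are quoted from the text; all other sequences are placeholders.
ADAPTOR_A = "GCCTCCCTCGCGCCATCAG"  # added to the 5' end of the forward primer
ADAPTOR_B = "GCCTTGCCAGCCCGCTCAG"  # added to the 5' end of the reverse primer
TAG_LENGTH = 8                     # eight-nucleotide sample tag on the forward primer


def build_fusion_primers(forward_primer, reverse_primer, tag):
    """Return (forward, reverse) fusion primers written 5'->3'.

    Assumed layout: adaptor + tag + template-specific primer (forward),
    adaptor + template-specific primer (reverse).
    """
    tag = tag.upper()
    if len(tag) != TAG_LENGTH or set(tag) - set("ACGT"):
        raise ValueError(f"tag must be {TAG_LENGTH} unambiguous nucleotides: {tag!r}")
    forward = ADAPTOR_A + tag + forward_primer.upper()
    reverse = ADAPTOR_B + reverse_primer.upper()
    return forward, reverse


# Hypothetical primers and tags, for illustration only.
EXAMPLE_FORWARD = "NNNNNNNNNNNNNNNNNNNN"
EXAMPLE_REVERSE = "NNNNNNNNNNNNNNNNNNNN"
EXAMPLE_TAGS = {"sample_01": "ACGAGTGC", "sample_02": "ACGCTCGA"}

fusion_primers = {
    name: build_fusion_primers(EXAMPLE_FORWARD, EXAMPLE_REVERSE, tag)
    for name, tag in EXAMPLE_TAGS.items()
}
```

Only the forward primer carries a tag here, so every sample shares one reverse fusion primer.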
We tested the amplification capacity of the universal primers designed for V4-nSSU in three steps. First, we tested the universality of the primers using several species from each taxonomic group studied: crustaceans were represented by Daphnia pulex, Cercopagis pengoi and Carcinus maenas, molluscs by Limnoperna fortunei and Dreissena polymorpha, and tunicates by Ciona intestinalis and Botrylloides violaceus. Sequences from these species had not been included in the alignment used for primer design. Second, the primers that performed well in the first step were tested on bulk DNA isolated from a plankton sample collected from Hamilton Harbor in Lake Ontario. The resulting PCR products were cloned into a vector using a TA cloning kit (Invitrogen Inc., ON, Canada). Twenty-four clones were randomly selected and sequenced using the traditional Sanger method to verify whether the selected primers could amplify multiple species present simultaneously in a plankton sample. Finally, we employed a small-scale run of 454 pyrosequencing (i.e. the equivalent of 1/48 PicoTiter plate) to assess the performance of the selected primers for biodiversity assessment using the same bulk DNA as that used for Sanger sequencing.
Biological sample preparation, pyrosequencing and data analysis
We spiked larvae/juveniles of four species, including two marine species (bay scallop Argopecten irradians and Japanese sea cucumber Apostichopus japonicus) and two freshwater species (golden mussel Limnoperna fortunei and water lice Asellus aquaticus), into plankton samples to test the sensitivity of 454 pyrosequencing for detection of rare species. To avoid possible errors and confusion derived from spiked species, we spiked marine species into freshwater plankton samples and freshwater species into marine plankton samples.
The plankton samples were collected from major ports in the Great Lakes (Hamilton, Nanticoke and Thunder Bay) and on the Atlantic coast of North America (Bayside and Hawksbury). In each port, we collected plankton samples with geo-referenced oblique tows of 80 μm plankton nets from the bottom to the water surface. Larvae of the two marine species were artificially cultured in the laboratory following Zhan et al. (2008), while the golden mussel and water lice were collected from the wild in South America and Europe, respectively. All larvae/juveniles were taxonomically confirmed and measured under a microscope. The size of larvae/juveniles of these four species varies from approximately 70 μm to 2 000 μm, and weight, which was averaged over multiple weighed individuals, ranges from 1·8 × 10⁻⁴ mg to 2 mg (Table 1). All collected samples were immediately preserved in 95% ethanol.
Table 1. Results for the sensitivity of 454 pyrosequencing for detection of rare species using known indicator species spiked into existing plankton samples. The sequencing depth is equivalent to 1/24 PicoTiter plate per artificially assembled sample. Four species, including two marine species (bay scallop Argopecten irradians, sea cucumber Apostichopus japonicus) and two freshwater species (golden mussel Limnoperna fortunei, water lice Asellus aquaticus), were spiked into plankton samples collected from major ports in Canada. For each species, three replicates and four gradients were set up to assess the recovery performance. For each replicate, the amount of plankton (in mg) is shown. For successful recovery setups, the amount of plankton sample is bolded, and the percentage of biomass of the spiked indicator species and the actual number of sequence reads recovered for the target species are shown.
| No. of larvae/juveniles (small species) | Replicate no. | Bay scallop Argopecten irradians (size: 73·4 ± 2·0 μm; weight: 1·8 × 10⁻⁴ mg) | Golden mussel Limnoperna fortunei (size: 150·0 ± 20·0 μm; weight: 4·0 × 10⁻⁴ mg) | Sea cucumber Apostichopus japonicus (size: 1175·4 ± 159·0 μm; weight: 4·0 × 10⁻² mg) | No. of larvae/juveniles (large species) | Replicate no. | Water lice Asellus aquaticus (size: 2020 × 528 μm; weight: 2 mg) |
| 0·01 | 1 | 100 mg | 100 mg | | 0·001 | 1 | 118 mg |
| 0·01 | 2 | 74 mg | 105 mg | 79 mg | 0·001 | 2 | 132 mg |
| 0·01 | 3 | 74 mg | 107 mg | | 0·001 | 3 | |
| 0·1 | 1 | 66 mg | 97 mg | | 0·01 | 1 | 112 mg |
| 0·1 | 2 | | 93 mg | | 0·01 | 2 | 137 mg |
| 0·1 | 3 | 90 mg | 96 mg | | 0·01 | 3 | |
| 1 | 1 | | | | 0·05 | 1 | |
| 1 | 2 | 95 mg | | | 0·05 | 2 | |
| 1 | 3 | | | | 0·05 | 3 | |
| 5 | 1 | | | | 0·1 | 1 | |
| 5 | 2 | | | | 0·1 | 2 | |
| 5 | 3 | | | | 0·1 | 3 | |
Tubes containing preserved plankton were centrifuged at 12 000 rpm for 3 min to remove ethanol, and then opened in a fume hood for 10–15 min to evaporate residual ethanol. Depending on the available amount of plankton from each port, 50–150 mg of plankton sample was used for DNA isolation. We ran three replicates and four gradients for each spiked species to assess the recovery performance (Fig. 1). For the three smaller species (i.e. bay scallop, Japanese sea cucumber, golden mussel), we established a gradient of 0·01, 0·1, 1 and 5 larva(e) per plankton sample, while the larger water lice was added at 0·001, 0·01, 0·05 and 0·1 individual per plankton sample. All artificial assembling procedures were performed before DNA extraction. For the gradients using ≥1 larva, we spiked larvae directly into plankton samples, while for those using <1 larva, we lysed one larva/juvenile in 200 μL of DNA lysis buffer and then added the amount of lysate corresponding to each dilution gradient to the lysed plankton sample (Fig. 1).
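The arithmetic behind the sub-individual gradients can be sketched as follows. The sketch assumes that the 200 μL lysate is homogeneous, so that a fraction of an individual corresponds to the same fraction of the lysate volume, and that the biomass percentage is expressed relative to the plankton mass alone; the function names are illustrative. In practice the smallest aliquots may have been prepared by serial dilution, which is not stated above.

```python
LYSATE_VOLUME_UL = 200.0  # one larva/juvenile lysed in 200 uL of lysis buffer


def lysate_volume_ul(fraction_of_individual):
    """Lysate volume (uL) that carries the requested fraction of one individual."""
    if not 0 < fraction_of_individual < 1:
        raise ValueError("only gradients below one individual are made from lysate")
    return fraction_of_individual * LYSATE_VOLUME_UL


def spiked_biomass_percent(n_individuals, individual_weight_mg, plankton_mg):
    """Spiked biomass as a percentage of the plankton biomass (assumed denominator)."""
    return 100.0 * n_individuals * individual_weight_mg / plankton_mg


# Gradients below one individual, as listed in the text.
small_species_volumes = {f: lysate_volume_ul(f) for f in (0.01, 0.1)}
water_lice_volumes = {f: lysate_volume_ul(f) for f in (0.001, 0.01, 0.05, 0.1)}

# Example: 0.01 bay scallop larva (1.8e-4 mg each) in a 100 mg plankton sample.
example_percent = spiked_biomass_percent(0.01, 1.8e-4, 100.0)
```

Under these assumptions, 0·01 and 0·1 of an individual correspond to about 2 μL and 20 μL of lysate, and the example spike amounts to roughly 1·8 × 10⁻⁶ % of the plankton biomass.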
Figure 1. Methodological flow chart used to design and test universal primers and use them to test the sensitivity of 454 pyrosequencing for detection of rare species. Rare known indicator species were spiked into complex plankton samples.
We extracted total genomic DNA using the DNeasy Blood and Tissue Kit (Qiagen Inc., ON, Canada). The quality and quantity of each DNA sample were measured using a NanoDrop spectrophotometer (NanoDrop Technologies, DE, USA). To avoid biased amplification, we prepared eight replicate PCR mixtures (25 μL each) for each sample. Each replicate consisted of 100 ng of genomic DNA, 1 × PCR buffer, 2 mM of Mg²⁺, 0·2 mM of dNTPs, 0·4 μM of each primer, and 2 U of Taq DNA polymerase (Genscript). PCR cycling parameters consisted of an initial denaturation step at 95 °C for 5 min, followed by 25 amplification cycles of 95 °C for 30 s, 50 °C for 30 s and 72 °C for 90 s, and a final elongation step at 72 °C for 10 min. We pooled the replicate PCR products of each sample and purified them using the Solid Phase Reversible Immobilization (SPRI) paramagnetic bead-based method (Agencourt Bioscience Corporation, MA, USA).
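For bookkeeping, the cycling programme and the template input can be written down as data. The sketch below encodes the programme exactly as stated and computes the programmed hold time (ramping between temperatures is ignored) and the template volume that delivers 100 ng at a measured concentration; the example concentration of 50 ng/μL is hypothetical.

```python
# Cycling programme as (temperature in degrees C, hold time in seconds).
PCR_PROGRAM = {
    "initial_denaturation": (95, 5 * 60),
    "cycles": 25,
    "cycle_steps": [(95, 30), (50, 30), (72, 90)],
    "final_elongation": (72, 10 * 60),
}


def programmed_hold_seconds(program):
    """Total hold time of the programme, excluding temperature ramping."""
    per_cycle = sum(seconds for _, seconds in program["cycle_steps"])
    return (
        program["initial_denaturation"][1]
        + program["cycles"] * per_cycle
        + program["final_elongation"][1]
    )


def template_volume_ul(concentration_ng_per_ul, target_ng=100.0):
    """Volume of DNA extract (uL) that contains target_ng of genomic DNA."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / concentration_ng_per_ul


total_seconds = programmed_hold_seconds(PCR_PROGRAM)  # 300 + 25 * 150 + 600
example_volume = template_volume_ul(50.0)             # hypothetical 50 ng/uL extract
```

The programmed holds add up to 4650 s, i.e. 77·5 min per run before ramping time is counted.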
For pyrosequencing, we pooled PCR products derived from 24 artificially assembled communities to form one PicoTiter plate (two plates in total for the 48 assembled samples, Fig. 1). To ensure approximately equal contributions from each sample, the PCR products were pooled in equimolar amounts. Pyrosequencing was performed using 454 FLX Adaptor A on a GS-FLX Titanium platform (454 Life Sciences, CT, USA) by Engencore at the University of South Carolina.
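Equimolar pooling amounts to converting each purified product from a mass concentration to a molar one and then taking the volume that holds a fixed number of molecules. The sketch below assumes an average mass of about 660 g/mol per double-stranded base pair, and the concentrations, amplicon lengths and target amount are hypothetical values rather than measurements from this study.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # approximate mass of one double-stranded base pair


def molarity_nm(conc_ng_per_ul, amplicon_bp):
    """Convert a dsDNA concentration in ng/uL to nmol/L for a given amplicon length."""
    if conc_ng_per_ul <= 0 or amplicon_bp <= 0:
        raise ValueError("concentration and length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * amplicon_bp)


def pooling_volumes_ul(samples, fmol_per_sample):
    """Volume of each product (uL) that contributes fmol_per_sample to the pool.

    samples maps a sample name to (concentration in ng/uL, amplicon length in bp).
    Uses the identity 1 nM == 1 fmol/uL.
    """
    return {
        name: fmol_per_sample / molarity_nm(conc, length_bp)
        for name, (conc, length_bp) in samples.items()
    }


# Hypothetical purified products.
example_samples = {"sample_01": (33.0, 500), "sample_02": (16.5, 500)}
example_volumes = pooling_volumes_ul(example_samples, fmol_per_sample=200.0)
```

With these example values the more dilute product needs twice the volume of the other, which is the behaviour equimolar pooling is meant to produce.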
After pyrosequencing, reads were sorted by sample based on the unique tag on the forward primer using the software CLOTU (Kumar et al. 2011). Raw sequence reads were denoised, trimmed and filtered prior to subsequent analyses to eliminate errors/artefacts using both the RDP pyrosequencing pipeline (http://rdp.cme.msu.edu/) and CLOTU. In general, we deleted sequence reads with Phred quality scores <20 (Q20), and then we removed sequences that: (i) did not perfectly match the tags and forward primer (10–16% of reads removed); (ii) contained any undetermined nucleotide (N's, 4–6% of reads removed); and (iii) were too short (<250 bp, 4–15% of reads removed). In addition, given that PCR-mediated recombination in amplification products (i.e. chimeras) can inflate species diversity, we identified and then deleted chimeras from each data set (23–35% of OTUs removed) using the newly developed, fast and sensitive algorithm UCHIME (Edgar et al. 2011).
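The sorting and filtering criteria can be expressed as a short sketch. This is not the CLOTU or RDP implementation: it assumes that each read starts with the sample tag followed by the forward primer, that the Q20 criterion applies to the mean Phred score of a read, and that the 250 bp minimum is checked after the tag and primer are trimmed. Denoising and chimera removal are not covered.

```python
from collections import defaultdict

MIN_MEAN_PHRED = 20   # assumed to apply to the mean quality of a read
MIN_LENGTH_BP = 250   # assumed to apply after trimming tag and primer


def assign_and_filter(reads, tags, forward_primer):
    """Sort reads by sample tag and apply the read filters.

    reads: iterable of (sequence, phred_scores) pairs.
    tags: {tag_sequence: sample_name}; all tags must share one length.
    Returns ({sample_name: [trimmed_sequence, ...]}, {rejection_reason: count}).
    """
    tag_length = len(next(iter(tags)))
    insert_start = tag_length + len(forward_primer)
    kept = defaultdict(list)
    rejected = defaultdict(int)
    for sequence, phred in reads:
        sequence = sequence.upper()
        if not phred or sum(phred) / len(phred) < MIN_MEAN_PHRED:
            rejected["low_quality"] += 1
            continue
        tag = sequence[:tag_length]
        # Require a perfect match to a known tag and to the forward primer.
        if tag not in tags or sequence[tag_length:insert_start] != forward_primer:
            rejected["tag_or_primer_mismatch"] += 1
            continue
        if "N" in sequence:
            rejected["undetermined_base"] += 1
            continue
        trimmed = sequence[insert_start:]
        if len(trimmed) < MIN_LENGTH_BP:
            rejected["too_short"] += 1
            continue
        kept[tags[tag]].append(trimmed)
    return dict(kept), dict(rejected)
```

Counting the rejections per reason makes it straightforward to report the proportion of reads removed at each step.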
Sequence reads from each sample were clustered into similarity-based OTUs at genetic divergence thresholds ranging from 1% to 10% (insertions and deletions included) using the CD-HIT method (Li & Godzik 2006) implemented in the software CLOTU. The CD-HIT method is based on a heuristic search strategy and offers the capacity for rapid clustering of large sets of similar sequences. OTUs were grouped taxonomically (by suborder or higher, such as Copepoda, Cladocera, etc.) by searching against the nucleotide database of GenBank using MEGABLAST with an E value < 10⁻⁵⁰ and a minimum query coverage >80%. Spiked rare species were also identified by MEGABLAST from each dilution gradient and replicate using available reference sequences.
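The MEGABLAST thresholds can be applied to search results with a few lines of code. The sketch below assumes a tab-separated table with four columns per hit (query ID, subject ID, E value, query coverage in percent); this column layout is a hypothetical export, not the format used in the study, and keeping only the hit with the lowest E value per query is likewise an assumption.

```python
MAX_EVALUE = 1e-50          # hits must have an E value below this threshold
MIN_QUERY_COVERAGE = 80.0   # hits must cover more than this percentage of the query


def best_hits(tabular_lines):
    """Return {query_id: (subject_id, evalue, coverage)} for hits passing both filters.

    Each line is expected to hold: query ID, subject ID, E value, query coverage (%).
    When several hits pass for one query, the one with the lowest E value is kept.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, evalue, coverage = line.split("\t")
        evalue, coverage = float(evalue), float(coverage)
        if not (evalue < MAX_EVALUE and coverage > MIN_QUERY_COVERAGE):
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue, coverage)
    return best


# Hypothetical hits: otu_2 fails the E value filter, otu_3 the coverage filter.
example_lines = [
    "otu_1\tsubject_A\t1e-120\t95",
    "otu_1\tsubject_B\t1e-60\t99",
    "otu_2\tsubject_C\t1e-40\t100",
    "otu_3\tsubject_D\t1e-80\t70",
]
example_hits = best_hits(example_lines)
```

OTUs without any passing hit are simply absent from the result, so they can be reported separately as unassigned.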