A genome comparison method was used to identify specific target sequences for the polymerase chain reaction (PCR) detection of Vibrio parahaemolyticus: the coding sequences (CDS) of this bacterium were compared with 139 other bacterial genomes. Twenty CDS of V. parahaemolyticus were found to be relatively specific according to their E values in BLAST (Basic Local Alignment Search Tool, a tool for comparing protein and nucleotide sequences), and four of them were selected for the design of PCR primers. All four primer pairs gave amplification products from nine V. parahaemolyticus strains, whereas no amplification products were obtained from nine other Vibrionaceae strains and four non-Vibrionaceae strains. An evaluation of detection sensitivities indicated that these four primer pairs can be used in a PCR assay for the detection of V. parahaemolyticus.
An automatic BLAST method was developed in this study, by which species-specific sequences can be screened out rapidly. In this way, new and specific genes of Vibrio parahaemolyticus were identified for use as target sequences for PCR detection. Four primer pairs with acceptable specificity and sensitivity were selected by screening, and they can be applied in PCR assays and other molecular methods. Methods of this kind might become commercial detection products in the near future. In addition, this method for searching specific DNA sequences can also be used for mining specific sequences in other genera and species, such as Salmonella and Staphylococcus.
Vibrio parahaemolyticus is a well-recognized human pathogen that is widely distributed in marine and estuarine environments. This organism can cause gastroenteritis through ingestion of raw or partially cooked seafood (Lee et al. 2003; Lin et al. 2004; Park et al. 2004). V. parahaemolyticus accounts for about 35% of the gastroenteritis associated with seafood in Taipei (Wong et al. 1999) and about 10% in India (Deb et al. 1975). More attention has been paid to the increasing reports of these disease outbreaks in humans, and the correct identification and rapid detection of this organism is of great importance.
The identification of V. parahaemolyticus by traditional microbiological detection methods is time-consuming: it usually requires several days and offers low sensitivity (Thompson et al. 2004). The polymerase chain reaction (PCR) method, which is rapid and highly specific, is extensively used to detect a number of pathogens. The target genes and their primers play an important role in PCR assays because they determine the specificity and sensitivity of the method (Hill 1996). The target genes reported for PCR detection of V. parahaemolyticus have mainly been the gyrB, tl, tdh, trh and toxR genes (Miller et al. 1987; Shirai et al. 1990; Tada et al. 1992; Venkateswaran et al. 1998; Bej et al. 1999; Kim et al. 1999). All of them have been used for many years, but they are not very specific in PCR detection (Luan et al. 2006).
The target sequence for PCR methods should be specific to V. parahaemolyticus and exhibit low similarity to sequences of other bacterial genera and species. Fortunately, hundreds of bacterial genomes, including one V. parahaemolyticus strain and 11 other Vibrionaceae strains, have been sequenced and published in the past decade. They provide a large pool of information from which relatively specific and new genes can be mined by sequence comparison. Therefore, the objective of this study was to use a bioinformatics method with C++ programming to automatically report specific DNA sequences after comparing 4,832 CDS of V. parahaemolyticus with other bacterial genomic DNA sequences (139 genomes of different species) by running BLAST (Basic Local Alignment Search Tool, a tool for searching against DNA and protein sequence databases).
MATERIALS AND METHODS
Screening Strategy for Specific Sequences of V. parahaemolyticus
The following four steps (Fig. 1) were established according to our screening strategy to mine relatively new and specific sequences.
Step 1: Creation of a Local BLAST Database. All of the bacterial genome sequences available in GenBank of NCBI (http://www.ncbi.nlm.nih.gov/) in November 2006 (634 in all) were downloaded, and 139 of them, one representing each species, were combined into a local BLAST database.
Step 2: Sequence Cutting. To obtain an individual file of BLAST results for each gene, all the CDS of the two chromosomes of V. parahaemolyticus (BA000031.ffn and BA000032.ffn, nucleotide FASTA format, downloaded from NCBI) were divided into thousands of single-gene segments by a C++ program that detects “>”, the character that starts each sequence header in FASTA format. Each CDS was then automatically saved into a new file.
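The cutting step can be illustrated with a short sketch. The study used a C++ program that is not reproduced in the paper; the Python version below is only an illustration of the same idea, and the function names `split_fasta` and `write_records` as well as the output file naming scheme are assumptions.

```python
from pathlib import Path


def split_fasta(fasta_text):
    """Split multi-record FASTA text into one string per record.

    A new record starts at every line that begins with ">".
    """
    records = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            records.append([line])        # header line opens a new record
        elif line.strip() and records:
            records[-1].append(line)      # sequence line of the current record
    return ["\n".join(rec) + "\n" for rec in records]


def write_records(fasta_path, out_dir):
    """Write each record of fasta_path to its own file and return the count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    records = split_fasta(Path(fasta_path).read_text())
    for i, rec in enumerate(records, start=1):
        # Hypothetical naming scheme: cds_0001.fasta, cds_0002.fasta, ...
        (out / f"cds_{i:04d}.fasta").write_text(rec)
    return len(records)
```

Calling `write_records("BA000031.ffn", "cds_files")` would, under these assumptions, produce one small FASTA file per CDS of the first chromosome.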
Step 3: Automated BLAST. A C++ application was written to generate a Windows batch program, through which each CDS file from step 2 was compared individually against the local database (step 1), and each comparison result was written to a separate new file.
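A minimal sketch of how such a batch file might be generated is given below. It is written in Python for illustration and is not the authors' C++ application. The paper does not state the exact BLAST command, so the use of the legacy NCBI `blastall` program with `blastn`, the database name `bacteria139` and the file names are all assumptions.

```python
def make_batch_lines(cds_files, database="bacteria139", program="blastn"):
    """Return the lines of a Windows batch file, one BLAST call per CDS file.

    Assumes the legacy NCBI `blastall` command (-p program, -d database,
    -i query file, -o output file); the paper does not state the exact call.
    """
    lines = ["@echo off"]
    for name in cds_files:
        stem = name.rsplit(".", 1)[0]     # file name without its extension
        lines.append(
            f"blastall -p {program} -d {database} -i {name} -o {stem}.blast.txt"
        )
    return lines


if __name__ == "__main__":
    # Hypothetical file names for the first three CDS files.
    files = [f"cds_{i:04d}.fasta" for i in range(1, 4)]
    print("\n".join(make_batch_lines(files)))
```

Writing the returned lines to a `.bat` file gives one result file per query, which is what makes the later per-gene filtering straightforward.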
Step 4: Results Filtering. Another C++ program was developed to screen for specific sequences within the large number of result files. The program automatically selected hypothetically specific genes, defined as CDS with E values greater than 2.0 and lengths longer than 300 bp.
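The two filtering criteria can be expressed compactly, as in the following Python sketch (an illustration, not the authors' C++ program). Two assumptions are made here: the "E value" of a CDS is taken to be its best (smallest) hit E value against the database of other genomes, and a CDS with no reported hit is treated as specific. The CDS identifiers are hypothetical.

```python
def is_candidate(cds_length, hit_e_values, e_cutoff=2.0, min_length=300):
    """Apply the two filtering criteria to one CDS.

    Assumption: a CDS counts as specific when its best (smallest) hit
    E value exceeds e_cutoff, or when BLAST reported no hit at all.
    """
    if cds_length <= min_length:
        return False
    if not hit_e_values:
        return True
    return min(hit_e_values) > e_cutoff


def filter_candidates(results):
    """results maps a CDS id to (length in bp, list of hit E values)."""
    return [cds for cds, (length, evalues) in results.items()
            if is_candidate(length, evalues)]


# Hypothetical parsed BLAST results for four CDS (not data from the study).
example = {
    "cdsA": (450, [3.1, 7.5]),     # weak hits only, long enough: kept
    "cdsB": (450, [1e-50, 4.0]),   # one strong hit: rejected
    "cdsC": (250, [5.0]),          # too short: rejected
    "cdsD": (900, []),             # no hits at all: kept
}
print(filter_candidates(example))
```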
Eight isolates of V. parahaemolyticus were obtained from Qingdao Disease Control and Prevention Center, Qingdao, China; one standard V. parahaemolyticus strain (ATCC17802-11, from Peking University Health Science Center) and 13 other non-V. parahaemolyticus bacterial strains were collected in our laboratory (Table 1). V. parahaemolyticus strains were grown on 2216E agar (0.5% tryptone, 0.1% yeast extract, 0.01% FePO4, 3.5% NaCl, 2% agar, pH 7.6) at 37C (Annick et al. 2002), and other bacterial strains were grown on 0.17% Luria-Bertani (LB) at 37C overnight.
Table 1. REFERENCE STRAINS USED IN THIS STUDY
Name of bacterial species/strains
Clinical isolates were obtained from Qingdao Disease Control and Prevention Center, Qingdao, China.
Clinical isolates were obtained from Peking University Health Science Center, Beijing, China.
Other bacterial strains acquired from our laboratory, Shanghai Jiao Tong University, Shanghai, China.
The total chromosomal DNA from overnight broth cultures of different strains was extracted according to Nichols et al. (2003).
The genomic DNA concentrations were determined by measuring the absorbance at 260 nm by using a DU 800 UV/Visible Spectrophotometer (Beckman, Fullerton, CA).
Design of Oligonucleotide Primers
Oligonucleotide primers were designed by using PRIMER PREMIER 5.0 and OLIGO 6.0 (Pearson 1990). The specificity of the oligonucleotide primers was tested against all DNA sequences available in the NCBI database. The primers designed in this study were synthesized by Shanghai Sangon Biotech Corporation.
PCR amplification was carried out in a 25-µL reaction mixture containing 0.8 µL of 10 mM dNTP (Tiangen Biotech Co., Ltd., Shanghai, China), 2.5 µL of 10× PCR buffer (200 mM Tris-HCl [pH 8.4]; 200 mM KCl; 100 mM [NH4]2SO4), 1.2 µL of 25 mmol/L MgCl2, 5 µL of the DNA sample, 0.4 µL of primer, 0.4 µL of Taq polymerase (Tiangen Biotech Co., Ltd.) and 14.7 µL of distilled water. Reactions were run on a Peltier Thermal Cycler PTC-200 (Bio-Rad Laboratories, Hercules, CA).
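As a quick arithmetic check, the listed component volumes can be summed and the final dNTP and MgCl2 concentrations derived with C1 × V1 / V_total. The Python sketch below uses the volumes exactly as listed; reading the MgCl2 stock as 25 mM is an assumption, and the variable names are illustrative.

```python
# Component volumes in microlitres, taken from the reaction mixture above.
VOLUMES_UL = {
    "dNTP (10 mM)": 0.8,
    "10x PCR buffer": 2.5,
    "MgCl2 (25 mM, assumed)": 1.2,
    "DNA sample": 5.0,
    "primer": 0.4,
    "Taq polymerase": 0.4,
    "distilled water": 14.7,
}


def total_volume(volumes):
    """Sum the component volumes, rounded to avoid floating-point noise."""
    return round(sum(volumes.values()), 3)


def final_concentration(stock_mM, volume_uL, total_uL):
    """Final concentration (mM) after dilution: C1 * V1 / V_total."""
    return stock_mM * volume_uL / total_uL


total = total_volume(VOLUMES_UL)
dntp = final_concentration(10, VOLUMES_UL["dNTP (10 mM)"], total)
mgcl2 = final_concentration(25, VOLUMES_UL["MgCl2 (25 mM, assumed)"], total)
print(total, round(dntp, 2), round(mgcl2, 2))
```

The volumes add up to the stated 25 µL, and under the assumptions above the final concentrations are 0.32 mM dNTP and 1.2 mM MgCl2.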
The PCR program consisted of an initial denaturation at 94C for 5 min, followed by 30 cycles of 94C for 30 s, 58C for 30 s and 72C for 30 s. A final extension was performed at 72C for 10 min, followed by a hold at 4C for 10 min.
The amplification products were separated on a 1.5% (w/v) agarose gel, stained with ethidium bromide in 1× TAE buffer and photographed by the electrophoresis imaging analysis system (Shanghai Tanon Science & Technology Co., Ltd., Shanghai, China).
Specificity and Sensitivity of PCR Amplification
The specificities of the four primer pairs were evaluated by PCR tests of 22 different bacterial strains (nine V. parahaemolyticus strains and 13 other strains).
PCR sensitivity assays were carried out by using the reference strain V. parahaemolyticus ATCC17802-11. The DNA template was prepared as a 10-fold dilution series in double-distilled water and subjected to PCR amplification. The quantities of genomic DNA ranged from 1.36 × 10⁻³ fg to 1.36 × 10⁶ fg per reaction.
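The template amounts in such a series follow directly from the starting quantity. A small Python sketch (the function name `tenfold_series` is illustrative) lists the ten levels implied by the stated range:

```python
def tenfold_series(start_fg, steps):
    """Template amounts (fg per reaction) for a 10-fold serial dilution."""
    return [start_fg / 10 ** i for i in range(steps)]


# Ten levels span the stated range, from 1.36e6 fg down to 1.36e-3 fg.
series = tenfold_series(1.36e6, 10)
for amount in series:
    print(f"{amount:.3g} fg")
```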
For sensitivity determination in terms of whole cells, a culture of V. parahaemolyticus ATCC17802-11 was grown overnight at 37C and serially diluted in distilled water to 10⁻⁹, with three replicates at each dilution. Each suspension was examined in duplicate by the DNA extraction and amplification procedures outlined above. The number of viable cells was assessed from the 10⁻⁶, 10⁻⁷ and 10⁻⁸ dilutions by the pour-plate method using Luria-Bertani (LB) agar.
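Viable counts from pour plates are conventionally back-calculated as mean colonies divided by the product of the dilution factor and the plated volume. The Python sketch below shows that calculation; the colony counts and the 1 mL plated volume are hypothetical and are not data from this study.

```python
def cfu_per_ml(colony_counts, dilution, plated_volume_ml=1.0):
    """Viable count of the undiluted culture from replicate plate counts.

    cfu/mL = mean colonies / (dilution factor * volume plated in mL)
    """
    mean_colonies = sum(colony_counts) / len(colony_counts)
    return mean_colonies / (dilution * plated_volume_ml)


# Hypothetical triplicate counts at the 10^-7 dilution (not data from the study).
estimate = cfu_per_ml([20, 22, 24], dilution=1e-7)
print(f"{estimate:.2e} cfu/mL")
```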
Identification of Specific Genes and Design of Primers
A total of 4,832 CDS of V. parahaemolyticus were compared by automatic BLAST against the local database of 139 bacterial genome sequences, and 4,832 files containing BLAST results were generated.
During the filtering step, 20 genes of V. parahaemolyticus were identified as hypothetically specific (Table 2). According to the NCBI records, the annotated functions of all these genes were not based on experiments and had not been confirmed by additional evidence. Four CDS were randomly chosen as target sequences for PCR detection in this study and named DS1, DS2, DS3 and DS4 (Table 3).
Table 2. SELECTED SIMILARITY SCREENING OF GENES BY C++ PROGRAM
Table 3. SEQUENCES SELECTED FOR DESIGNING OF PRIMERS AND PCR DETECTION
Sequence serial number
CDS position (5′–3′)
CDS length (bp)
Primers sequence (5′–3′)
PCR product size (bp)
PCR, polymerase chain reaction.
Four pairs of primers were designed based on four target DNA segments, respectively, and their sequences as well as amplification parameters are shown in Table 3.
Specificity of PCR Primers
Nine strains of V. parahaemolyticus were examined for the presence of the four selected target DNA segments by PCR under the reaction conditions described above. All primer pairs yielded a band from each of these nine V. parahaemolyticus strains (Fig. 2), which indicates that the DNA segments DS1, DS2, DS3 and DS4 are probably present in all of these strains. The thirteen other bacterial strains, comprising eight Vibrionaceae strains and five non-Vibrionaceae strains, were also tested, and none of them produced any band on the gel (Fig. 2).
Sensitivity of PCR Primers
As shown in Fig. 3, the detection limits for DNA template extracted from V. parahaemolyticus were 136 fg per reaction for DS1 and DS3 and 13.6 fg per reaction for DS2 and DS4.
The sensitivities of PCR detection in terms of whole cells of V. parahaemolyticus were 2.17 × 10⁵ cfu/g for DS1 and DS3 and 2.17 × 10⁴ cfu/g for DS2 and DS4 (Fig. 4), assuming that the DNA extraction process had completely released all the bacterial DNA from the cells.
BLAST analysis has become a common method of interrogating new sequence data, but it has limitations as a discriminating tool for mining specific sequences (David et al. 2005). Comparing thousands of sequences directly against a large database by BLAST generates a huge output file containing millions of comparison results, and it is very laborious to collect useful information and find specific sequences manually in such a large amount of data. Therefore, a new method that enhances BLAST with C++ programs was established for mining V. parahaemolyticus specific sequences. The advantage of the C++ programs is that the BLAST runs and the screening of their results are automated, so the results can be surveyed as they are generated, which shortens the time needed to screen for target sequences. To our knowledge, no such automated method for mining specific sequences for PCR detection has been reported. Furthermore, this strategy can also be used to mine specific sequences from other pathogens.
A sequence segment in a bacterium is considered specific if it yields an E value > 1.0 (Andreas et al. 1998; David 2002) when BLAST analysis is conducted against the genome of every other tested bacterium. A more stringent E value cutoff of 2.0 was applied in the filtering step, so the selected sequences should be specific enough to serve as PCR target genes. Only sequences longer than 300 bp were chosen, for convenience in designing primers.
The primers designed in this study showed acceptable performance: they were more sensitive than primers targeting gyrB and toxR, and comparable with those targeting vpm (Karunasagar et al. 1996; Luan et al. 2006). Therefore, these primers can be used for the PCR detection of V. parahaemolyticus. In addition, the strategy of this study can be used to evaluate the 16 other hypothetically specific genes in V. parahaemolyticus.
NCBI, National Center for Biotechnology Information.
This research was jointly supported by the Grant No. 30771792 from National Natural Science Foundation of China, the Grant No. 2006BAK02A14 from Ministry of Science & Technology of China and the Grants No. 06dz22208, No. 071422011 and No. 07dz19508 from Science & Technology Commission of Shanghai Municipality. The authors thank the Peking University Health Science Center and Qingdao Disease Control and Prevention Center for their kindness in providing clinical isolates of V. parahaemolyticus.