Specificity of randomly generated genomic DNA fragment probes on a DNA array


Correspondence: Tomohiro Tobino, Department of Urban Engineering, Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan. Tel.: +81 3 5841 6243; fax: +81 3 5841 6244; e-mail: t_tobino@esc.u-tokyo.ac.jp


The use of randomly generated DNA fragment sequences as probes on DNA arrays offers a unique potential for exploring unsequenced microorganisms. In this study, the detection specificity was evaluated with respect to probe-target sequence similarity using genomic DNAs of four Pseudomonas strains. Genome fragments averaging 2000 bp were found to be specific enough to discriminate targets sharing 85–90% sequence similarity from perfect matches under highly stringent hybridization conditions. Such stringent conditions compromised signal intensities; however, specific signals remained detectable at the highest stringency (hybridization at 75 °C) with negligible false negatives. These results suggest that, without any probe design or selection, genomic fragments can provide a reasonable specificity for microbial diagnostics or species delineation by DNA–DNA similarities.


Microarray-based technology, with its advantage of highly parallel detection, has been applied both to population profiling and to functional studies of complex microbial communities in the environment (Loy et al., 2002; Palmer et al., 2006; He et al., 2007; Iwai et al., 2008). Recent studies have used synthesized oligonucleotides as probes because of their flexibility in design and preparation, with intensive specificity evaluation applied to the probe design criteria (He et al., 2005). In addition, several studies have reported the use of PCR-amplified genomic fragment sequences as probes. Such microarrays have been used for the detection of specific bacteria (Kim et al., 2004, 2005), species determination (Cho & Tiedje, 2001), and screening of environmental sequences related to a certain function within a community (Yokoi et al., 2007; Tobino et al., 2011). As the probes in these studies are randomly prepared by shotgun cloning of genomic DNAs, this kind of microarray is independent of sequence information found in public databases and hence offers unique potential for exploring unsequenced microorganisms. However, the specificity of genomic fragment probes has not been evaluated in detail. In this study, we prepared genomic fragment probes from pure cultures whose whole genome sequences are available and then evaluated hybridization specificity in terms of sequence similarity between probe and target.

Materials and methods

Probes and membrane array preparation

Genomic fragment probes were prepared from genomic DNAs of three Pseudomonas strains (Pseudomonas aeruginosa PAO1, Pseudomonas fluorescens Pf-5, and Pseudomonas putida KT2440) by shotgun cloning as described previously (Tobino et al., 2011). The probe set consisted of 167 fragment probes (55, 56, and 56 probes from PAO1, Pf-5, and KT2440, respectively) of ~ 2000 bp in length (Supporting Information, Table S1) and four control probes (see the figure legend of Fig. S1 for the preparation of control probes). Each fragment probe was spotted onto nylon membranes (5 ng per spot) in duplicate, and the spotted membranes were alkaline denatured, baked, and stored in a plastic bag until use (see Fig. S1 for probe layout).

Hybridization and signal detection

Genomic DNAs of pure cultures and a control DNA (yeast gene ACT, included in the probe set as a positive control) were individually labeled with digoxigenin (DIG) by random priming according to the manufacturer's instructions (DIG High Prime; Roche, Basel, Switzerland). Labeled products were sonicated to an average length of 400 bp. A predefined amount of each labeled genome was mixed with the labeled control and hybridized with membrane arrays in 5 mL of hybridization buffer (DIG Easy Hyb; Roche) at five different hybridization temperatures (55–75 °C), in duplicate. Hybridization and washing procedures were carried out as described previously (Tobino et al., 2011). Chemiluminescent detection was performed using an antidigoxigenin antibody conjugated with alkaline phosphatase and CSPD (both Roche) according to the instruction manual (DIG Application Manual for Filter Hybridization, Roche), and the signal was recorded with a LAS-4000 mini (Fujifilm, Tokyo, Japan) using a 10 min exposure. Signals were background corrected and considered positive when the signal-to-noise ratio was > 3 in all replicate spots.
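The positive-call rule described above (signal-to-noise ratio > 3 in every replicate spot) can be illustrated with a minimal Python sketch. The text does not state how the signal-to-noise ratio was computed, so the definition used here (background-subtracted spot intensity divided by the standard deviation of the background) is an assumption, and the function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def call_positive(replicate_intensities, background_intensities, snr_cutoff=3.0):
    """Return True only if every replicate spot exceeds the S/N cutoff.

    ASSUMPTION: S/N = (spot intensity - mean background) / SD of background.
    The source only states that signals were background corrected and that
    S/N > 3 was required in all replicate spots.
    """
    bg_mean = mean(background_intensities)
    bg_sd = stdev(background_intensities)  # needs at least two background values
    if bg_sd == 0:
        raise ValueError("background has no variation; S/N is undefined")
    snrs = [(x - bg_mean) / bg_sd for x in replicate_intensities]
    return all(snr > snr_cutoff for snr in snrs)
```

Requiring all replicates to pass, rather than their average, means that a single weak spot is enough to call a probe negative, which matches the conservative wording of the criterion.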

Probe–target similarity analysis

Partial sequences (60–700 bp) from both ends of each probe were read using SP6 and T7 primers as described previously (Tobino et al., 2011). The two end sequences were located in the genome using the blastn tool from the National Center for Biotechnology Information (NCBI), and the full probe sequence was defined as the genomic segment spanning both end sequences, inclusive. The full probe sequences were then searched against the target genome sequences using NCBI blastn under the default settings. The match with the lowest E-value was selected as the representative similarity pair between the probe and the target genome. To eliminate short alignments and anomalously high signals caused by high gene copy number, pairs with an alignment of < 500 bp or with multiple significant hits were excluded from the subsequent analysis.
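The selection and filtering of similarity pairs can be sketched as follows, assuming the blastn results for one probe against one target genome are available in the standard 12-column tabular format (-outfmt 6). The source does not define "multiple significant hits"; the rule used here (more than one hit of at least 500 bp with an E-value at or below a cutoff) and the cutoff value itself are hypothetical, as are the function names.

```python
def parse_outfmt6(text):
    """Parse BLAST tabular output (-outfmt 6, default 12 columns)."""
    hits = []
    for line in text.strip().splitlines():
        fields = line.split("\t")
        hits.append({
            "probe": fields[0],
            "subject": fields[1],
            "pident": float(fields[2]),   # percent identity
            "length": int(fields[3]),     # alignment length (bp)
            "evalue": float(fields[10]),
            "bitscore": float(fields[11]),
        })
    return hits


def representative_pair(hits, min_length=500, multi_hit_evalue=1e-50):
    """Pick the lowest E-value hit, or None if the pair should be excluded.

    ASSUMPTION: `multi_hit_evalue` is an illustrative cutoff for deciding
    that a probe has "multiple significant hits"; it is not from the source.
    """
    if not hits:
        return None
    best = min(hits, key=lambda h: h["evalue"])  # ties keep the first hit listed
    if best["length"] < min_length:
        return None  # alignment shorter than 500 bp
    strong = [h for h in hits
              if h["length"] >= min_length and h["evalue"] <= multi_hit_evalue]
    if len(strong) > 1:
        return None  # likely multi-copy region
    return best
```

The percent identity of the returned hit would then serve as the probe-target similarity for that pair.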

Results and discussion

Specific responses were observed from probes corresponding to the target genome at all hybridization temperatures tested (Fig. S2). Visible signals were also found from some probes whose origins were different from the target genome, indicating the occurrence of cross-hybridization (i.e. false positives). As shown in Table 1, the level of false positive signals was 64.7% (216 of 334) at 55 °C but decreased steadily to 22.5% (75 of 334) at 70 °C and was almost completely absent (1.5%; 5 of 334) at 75 °C. In contrast, very few probes corresponding to the target genome (0.6%; 1 of 167) gave negative signals (false negatives), and these were found only at hybridization temperatures above 70 °C. These results suggest that randomly generated genomic fragments (~ 2000 bp) can function as specific probes to discriminate species in the genus Pseudomonas under highly stringent conditions.
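As a quick arithmetic check, the percentages quoted above follow directly from the reported counts. The short Python sketch below uses only the counts and denominators given in the text; the helper name is hypothetical.

```python
def percent(count, total):
    """Percentage of `count` out of `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# False positives: signals from probes whose origin differs from the target,
# keyed by hybridization temperature (°C); counts are those given in the text.
false_positive_counts = {55: 216, 70: 75, 75: 5}
false_positive_rates = {t: percent(n, 334) for t, n in false_positive_counts.items()}

# False negatives: probes from the target genome that gave no positive signal.
false_negative_rate = percent(1, 167)
```

Here `false_positive_rates` evaluates to `{55: 64.7, 70: 22.5, 75: 1.5}` and `false_negative_rate` to `0.6`, in agreement with the reported values.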

Table 1. False signals observed at different hybridization temperatures
Hybridization temperature (°C) | False signals (%): False positive | False signals (%): False negative

Sequence similarity searches between the fragment probes and target genomes produced a total of 496 similarity pairs (Fig. S3). With the exception of probes that originated from the target genome (resulting in 100% similarity), most of the pairs had < 90% similarity, while only two pairs, which shared a partial sequence of the rrn operon, were found to have > 90% similarity over > 500 bp. After screening for effective pairs, the resultant 391 pairs were put into six groups according to their sequence similarities, and the average signal intensities of each group were plotted against hybridization temperature (Fig. 1a). As expected, the higher the similarity, the higher the signal intensity, and this trend was consistent at every temperature. Signal intensity decreased with increasing hybridization temperature, with a 10-fold decrease observed from 55 to 75 °C for the perfect match group. The different responses to hybridization temperature are highlighted in Fig. 1b. The signal intensity from mismatches was considerably lower than that of perfect matches, and the intensity relative to the perfect match decreased as the hybridization temperature increased. For example, the group with 85–90% similarity had 12%, 5%, and 1% relative signal intensity compared with the perfect match at 65, 70, and 75 °C, respectively. Previously, long oligonucleotide probes (50–70mer) with around 85–90% sequence similarity to their targets were shown to give 10% of the signal intensity of perfect matches (He et al., 2005). Thus, the specificity of random genomic fragment probes is comparable to that of long oligonucleotide probes. In addition, based on the data reported by Goris et al. (2007), most genomic fragment sequences between different species seem to share similarity lower than 90%, a finding consistent with this study. Therefore, our results indicate that the specificity of genomic fragment probes is potentially at the species level.
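The grouping and normalization behind the relative intensities discussed above can be sketched in Python as follows. Only the 85–90% group and the perfect match group are named in the text, so the remaining bin edges below are hypothetical placeholders, as are the function names and the input layout (one similarity value and one signal per pair, at a single hybridization temperature).

```python
from collections import defaultdict
from statistics import mean

# ASSUMPTION: illustrative bin edges (percent similarity); the source names
# only the 85-90% group and the perfect match (100%) group explicitly.
BINS = [(0, 70), (70, 75), (75, 80), (80, 85), (85, 90)]


def similarity_group(pident):
    """Map a percent similarity to a group label, or None if unbinned."""
    if pident == 100.0:
        return "100"
    for low, high in BINS:
        if low <= pident < high:
            return f"{low}-{high}"
    return None  # e.g. non-perfect pairs at 90% or above are left out here


def relative_intensity(pairs):
    """pairs: iterable of (percent similarity, signal) at one temperature.

    Returns {group: mean signal / mean signal of the perfect match group}.
    """
    by_group = defaultdict(list)
    for pident, signal in pairs:
        group = similarity_group(pident)
        if group is not None:
            by_group[group].append(signal)
    if not by_group["100"]:
        raise ValueError("no perfect match pairs to normalize against")
    perfect = mean(by_group["100"])
    return {group: mean(values) / perfect for group, values in by_group.items()}
```

Running this once per hybridization temperature would give one relative-intensity value per similarity group and temperature, which is the kind of quantity plotted in Fig. 1b.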

Figure 1.

Average signal intensities of probe–target similarity groups in arbitrary units (a), and relative to the perfect match group (b), at different hybridization temperatures. Bars indicate standard deviations.

In this study, long DNA fragments (around 2000 bp) were selected as probes because of their high sensitivity (Letowski et al., 2004). Long probes also contain more sequence information, which makes them advantageous for analysis of microorganisms, many of which are unknown, in the environment (Yokoi et al., 2007; Tobino et al., 2011). However, because a random 2000-bp fragment may partially cover two adjacent genes, it may hybridize with target DNA containing only one of these genes, which will result in a nonspecific signal. Here, target DNA was fragmented to 400 bp to address this concern, at least in part. When using fragmented DNA, regions flanking the sequence that binds to the probe remain in solution and hence do not contribute to the signal. This situation is similar to what has been observed for whole genome probes, which can provide strain-level specificity even though a given probe (that is, a genome) contains genes that are conserved among strains (Wu et al., 2004). However, fragmentation cannot prevent the hybridization of multiple fragments from different strains to the same probe, and hybridization signals obtained with DNA from diverse microbial communities should be interpreted with caution.

In conclusion, our results show that the degree of specificity achievable by randomly generated genomic fragment probes on DNA arrays legitimizes their use for microbial diagnostics. In addition, further experiments (using larger numbers of genomes across diverse species) may provide useful insights into the interpretation of DNA–DNA hybridization values from the viewpoint of genomic-level sequence similarity (Goris et al., 2007; Wu et al., 2008), on the assumption that a whole genome is a composite of genome fragments.


This study was supported by a Grant-in-Aid for Exploratory Research, project number 21651028 from the Ministry of Education, Culture, Sports, Science and Technology (MEXT).