The complete nucleotide sequence of the Vibrio harveyi bacteriophage VHML


* Correspondence to: H.J. Oakey, Oonoonba Veterinary Laboratory, Queensland Department of Primary Industries, Animal and Plant Health, Service, Abbott Street, Oonoonba, Queensland 4811, Australia. (e-mail:


Aims: To determine the complete nucleotide sequence of the bacteriophage VHML and establish a hypothesis for the virulence conversion caused by VHML infection of Vibrio harveyi.

Methods and Results: The complete nucleotide sequence of VHML was determined (43 193 bp) and used to identify putative genes. The translated products of these genes were compared with reported sequences to assign hypothetical functions. All anticipated structural genes and putative genes for lysogeny were identified. In addition, we found a complete N6-adenine methyltransferase (Dam) gene that appeared to have an essential site for ADP-ribosylating toxins at the C-terminal of the translated product.

Conclusions: Virulence conversion of V. harveyi by VHML may be associated with Dam transcriptional regulation. The Dam gene may also encode for a toxin component similar to ADP-ribosylating toxins.

Significance and Impact of Study: This manuscript lays the foundation for understanding the virulence of toxin-producing V. harveyi. Further research into aspects discussed here will lead to a greater comprehension regarding the invertebrate disease vibriosis and its control in the farming of these animals.


Vibrio harveyi (Baumann et al. 1984) is a naturally occurring inhabitant of warm marine environments. While most strains have no adverse effects on marine fauna, some strains have been reported to cause devastating pathological effects upon marine invertebrates. Such disease has been reported from prawn (Penaeus spp.) hatcheries in the Philippines, Thailand, Indonesia and northern Queensland (Suranyanto and Miriam 1986; Muir 1987; Lavilla-Pitogo et al. 1990; Ruangpan and Kitao 1991; Prayitno and Latchford 1995), spiny lobster (Panulirus homarus) in India (Jawahar et al. 1996) and pearl oyster (Pinctada maxima) in Western Australia (Pass et al. 1987). The disease, termed luminous bacteriosis or simply vibriosis, has a high mortality rate and causes significant financial loss to marine aquaculturists.

In our laboratory we have previously isolated and described VHML (Vibrio harveyi Myovirus-like), a myovirus-like bacteriophage found in V. harveyi strain ACMM 642 (Oakey and Owens 2000). This bacterium was isolated from P. monodon larvae by Muir (1987) and was subsequently shown to produce an exotoxin and to be lethal to penaeid larvae (Harris and Owens 1999). In addition, we have demonstrated that laboratory-based infection of a number of previously avirulent strains of V. harveyi with VHML will occur by adding VHML virions to a growing bacterial culture (Oakey and Owens 2000; Munro et al. in press). Infected V. harveyi strains presumably underwent the phenomenon of lysogenic conversion as they remained fully viable and demonstrated a number of phenotypic alterations compared to the same strains when uninfected. The colonial morphology altered, haemolysis was up-regulated, and previously absent extracellular proteins were produced. These extracellular proteins were shown to be antigenically similar to the toxin components produced by V. harveyi 642. In addition, all of the infected bacteria were demonstrated to be lethal to penaeid larvae in contrast to the same strains without the presence of the prophage (Munro et al. in press). Similar results were reported by Austin et al. (in press) who showed that VHML-infected strains of V. harveyi caused increased mortality in Atlantic salmon (Salmo salar L) and in Artemia.

It would, therefore, be reasonable to conclude that VHML may be responsible for the production of the exotoxin, and that VHML has the potential to spread the toxin-producing capability to other strains of V. harveyi in the close environment. However, the above reports drew no conclusions as to whether VHML transduced the toxin gene(s), as was the case with CTXΦ and cholera toxin (Waldor and Mekalanos 1996), or whether the phage genome includes a gene that up-regulates or alters an existing chromosomal gene(s), hence increasing virulence in some manner. In order to begin to understand the mode of action of VHML with respect to virulence of host bacteria, it is necessary to determine the potential proteins encoded by the VHML prophage and to identify possible toxin genes and/or transcriptional regulator genes that may be present.

This work describes the determination of the VHML genome nucleotide sequence, open reading frame (ORF) analysis, putative gene identification, and the probable role of the proteins synthesized by these genes. The morphology of VHML has been described previously (Oakey and Owens 2000). From this we can anticipate the identification of genes for a phage tail protein similar to other myoviruses, a tail sheath protein similar to P2-like myoviruses in particular, because the sheath appeared loose, and a capsid protein. In addition, we can anticipate the presence of common Caudovirales (tailed bacteriophage) proteins such as tail fibres and tail length determinator, and common lysogenic bacteriophage proteins such as integrase/recombination associated enzymes, repressor protein(s) and antirepressor protein(s). Remaining ORFs will be examined for similarity to reported toxin genes and/or transcriptional regulator genes, and the presence of active sites or protein motifs that may suggest a hypothetical relationship with virulence.

Materials and methods

Bacteriophage isolation and DNA extraction

VHML virions were isolated as described by Oakey and Owens (2000). Briefly, V. harveyi ACMM 642 that had been stored at −80 °C was cultured in Peptone Yeast Sea Salt (PYSS; Oakey and Owens, 2000) broth at 28 °C, 100 rev min−1 for 10–12 h. The lytic cycle of the VHML prophages was induced by incubation with 30 ng ml−1 mitomycin C (Sigma Aldrich, Castle Hill, NSW, Australia) for a further 10–12 h. Cell debris was removed by centrifugation at 5000 g for 15 min and filtering the supernatant fluid through a 0·45 µm Millipore filter (Millipore, North Ryde, NSW, Australia). VHML was concentrated by ultracentrifugation at 200 000 g for 4 h and the pellet was resuspended in a minimum volume of SM buffer (Sambrook et al. 1989).

DNA extraction was also performed according to Oakey and Owens (2000). Briefly, concentrated VHML were lysed with the addition of 20 mmol l−1 (final concentration) EDTA (Sigma Aldrich), 50 µg ml−1 proteinase K (Sigma Aldrich) and 0·5% w/v SDS (Sigma Aldrich), and incubated at 56 °C for 1 h. The nucleic acid was purified using a hot phenol, a phenol/chloroform and a chloroform extraction. The final aqueous phase was dialysed overnight against TE (10 mmol l−1 Tris, 1 mmol l−1 EDTA, pH 8·0). DNA was concentrated by ethanol precipitation and the DNA pellet was resuspended in 0·5 ml sterile distilled water. Concentration and purity were estimated using absorbance at 260 and 280 nm, and integrity was ascertained by running 5 µl DNA extract through 1% agarose gel electrophoresis. DNA was stored at −20 °C until required.
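As a rough illustration of the spectrophotometric estimate described above, the following sketch applies the common convention that an A260 of 1·0 corresponds to about 50 µg ml−1 of double-stranded DNA at a 1 cm path length, and reports the A260/A280 ratio as a purity indicator. The function name, the dilution factor and the example readings are hypothetical and are not taken from the original work.

```python
def dsdna_estimate(a260, a280, dilution_factor=1.0, path_cm=1.0):
    """Return (concentration in ug/ml, A260/A280 ratio) for a dsDNA sample.

    Assumes the conventional factor of 50 ug/ml per A260 unit for dsDNA
    at a 1 cm path length; this factor is a general convention, not a
    value reported in the paper.
    """
    if a280 <= 0 or path_cm <= 0:
        raise ValueError("a280 and path_cm must be positive")
    concentration = a260 / path_cm * 50.0 * dilution_factor
    ratio = a260 / a280
    return concentration, ratio


# Hypothetical readings for a 1:20 dilution of a DNA extract.
conc, ratio = dsdna_estimate(a260=0.25, a280=0.14, dilution_factor=20)
print(f"{conc:.0f} ug/ml, A260/A280 = {ratio:.2f}")
```

A ratio close to 1·8 is usually read as indicating little protein carry-over; lower values would suggest repeating the phenol extraction.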

Sequence determination

Shotgun approach using restriction endonucleases. A range of restriction endonucleases with 4-, 6- and 8-base recognition sites were tested for ability to digest VHML DNA. Digestion was only observed with the 4-base cutting enzymes. Consequently, 10 such enzymes were used to digest VHML DNA. Digestion reactions were prepared with 3–5 µg VHML DNA, 2·5 U enzyme µg−1 DNA, 1× supplier's recommended reaction buffer and 0·1 mg ml−1 acetylated bovine serum albumin (Promega) in a total volume of 200 µl. Digestion was carried out for 3 h at the supplier's recommended temperatures. All digestions were separated by 0·8% agarose gel electrophoresis with replicate lanes per digest.

Digestion fragments of <1500 bp were excised across the replicate lanes using clean scalpel blades. DNA fragments were removed from the agarose slices using electro-elution as described by Ausubel et al. (1999). Eluted DNA was purified from the elution buffer by phenol extraction, and concentrated using ethanol precipitation. DNA fragments were resuspended in 50 µl sterile distilled water and frozen at −20 °C until required.

Fragments were ligated into pGEM-3Z (Promega) using a blunt ended cloning technique. Fragments were converted from ‘sticky’ ended to blunt ended using T4 DNA polymerase (Promega) following the supplier's protocol. Fragments were purified using ethanol precipitation. pGEM-3Z was linearized with Sma I (Promega) to give a blunt ended cut within the cloning area of the plasmid and ends were dephosphorylated with calf intestinal alkaline phosphatase (CIAP) (Promega) according to the supplier's instructions. Linearized plasmids were purified using ethanol precipitation. DNA fragments were ligated into the linearized plasmid using 10 U T4 DNA ligase (Promega) for every 100 ng vector, with a 3 : 1 fragment to vector molar ratio. Ligation reactions were incubated at 22 °C for 18 h. The ligations were transformed into high efficiency JM109 E. coli cells (Promega) and transformants were selected using ampicillin and blue/white screening. Presumptive transformants were further cultured overnight at 37 °C in 10 ml aliquots of Luria-Bertani (LB) medium containing 100 µg ml−1 ampicillin, and the plasmids extracted from the culture using a commercial kit (QIAprep Spin Miniprep kit, QIAGEN, Clifton Hill, VIC, Australia). The presence of the inserted fragments was confirmed with a double digest excision using 5 µl extracted plasmid, Eco RI (Promega) and Hind III (Promega). Electrophoresis through a 1% agarose gel confirmed the presence of a ligated fragment of the expected size when compared to the original genomic digestion above.
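The 3 : 1 fragment to vector molar ratio and the 10 U ligase per 100 ng vector stated above can be turned into working amounts with a short calculation. The sketch below is only illustrative: the vector length of 2743 bp for pGEM-3Z, the function name and the example fragment are assumptions and do not appear in the original text.

```python
def ligation_amounts(vector_ng, insert_bp, vector_bp=2743,
                     molar_ratio=3.0, ligase_u_per_100ng=10.0):
    """Return (insert mass in ng, ligase units) for a ligation reaction.

    For double-stranded DNA the mass per mole scales with length, so the
    insert mass for a given insert:vector molar ratio is
        insert_ng = vector_ng * (insert_bp / vector_bp) * molar_ratio
    vector_bp=2743 is an assumed length for pGEM-3Z.
    """
    if vector_ng <= 0 or insert_bp <= 0 or vector_bp <= 0:
        raise ValueError("masses and lengths must be positive")
    insert_ng = vector_ng * (insert_bp / vector_bp) * molar_ratio
    ligase_u = vector_ng / 100.0 * ligase_u_per_100ng
    return insert_ng, ligase_u


# Hypothetical example: 50 ng of vector and a 1000 bp digestion fragment.
insert_ng, ligase_u = ligation_amounts(vector_ng=50, insert_bp=1000)
print(f"insert: {insert_ng:.1f} ng, T4 DNA ligase: {ligase_u:.1f} U")
```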

Plasmids with confirmed inserts were used for sequencing with M13 universal sequencing primers (Promega) and ABI-PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Melbourne, VIC, Australia) or CEQ 2000 DTCS (Beckman Coulter, Gladesville, NSW, Australia). Reactions were analysed at the Genetic Analysis Facility (GAF) at James Cook University, Townsville, QLD, Australia, using an ABI 310 Genetic Analyser or Beckman CEQ 2000 DNA Analysis System. Data was examined using Sequencher™ software (Gene Codes Corporation, Ann Arbor, MI, USA) and the vector sequence was removed. Fragment sequences were aligned and overlapped, where possible, using Sequencher™, to form a number of contiguous data sets (contigs). Contigs and unaligned sequences were subjected to a BLASTx search of the GenBank database.

Primer directed approach. Oligonucleotide primers were designed to amplify out from both ends of all the contigs and from both ends of unaligned ‘free’ fragments with high quality sequence and/or convincing BLASTx results. Primers were designed using at the University of Minnesota, and were designed to be compatible with each other with respect to similar melting temperature and minimum primer complementarity. Oligonucleotides were synthesized by Sigma Aldrich. To estimate positions and orientations of contigs and fragments along the genome, thus preventing tedious and costly random screening of every combination of primer pairs, a long-range PCR was applied to combinations of primers using whole DNA from VHML as a PCR template. AccuTaq long-range polymerase (Sigma Aldrich) was used according to the supplier's instructions. Reaction cycles consisted of 98 °C for 30 s, 15 cycles of 98 °C for 20 s, 46 °C for 30 s and 68 °C for 20 min, an additional 15 cycles with a 15-s auto extension of each of the subsequent extension times, and a final extension step of 68 °C for 5 min. Products were visualized using 1% agarose gel electrophoresis.
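The long-range cycling programme described above can be written out to estimate its total hold time. The sketch below assumes that the 15-s auto extension is cumulative, so that the first of the additional cycles extends for 20 min 15 s, the second for 20 min 30 s, and so on; that reading, and the function name, are assumptions. Ramp times between temperatures are ignored.

```python
def long_range_pcr_seconds(fixed_cycles=15, auto_cycles=15,
                           denature_s=20, anneal_s=30, extend_s=20 * 60,
                           auto_increment_s=15,
                           initial_denature_s=30, final_extend_s=5 * 60):
    """Sum the hold times (in seconds) of the long-range PCR programme."""
    total = initial_denature_s
    # First block: identical cycles with a fixed extension time.
    total += fixed_cycles * (denature_s + anneal_s + extend_s)
    # Second block: extension grows by auto_increment_s in every cycle
    # (assumed cumulative auto extension).
    for k in range(1, auto_cycles + 1):
        total += denature_s + anneal_s + extend_s + k * auto_increment_s
    total += final_extend_s
    return total


total_s = long_range_pcr_seconds()
print(f"{total_s} s, about {total_s / 3600:.1f} h excluding ramp times")
```

Under these assumptions the programme runs for roughly 11 h, which illustrates why long-range PCR was used to narrow down primer combinations before conventional PCR rather than for every pairing.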

The estimated relationships between contigs observed from the long-range PCRs were confirmed with conventional PCR using Taq polymerase (Sigma Aldrich). Reactions were subject to an initial denaturation at 95 °C for 1 min, 35 cycles of 95 °C for 1 min, 46 °C for 1 min and 72 °C for 2 min, and a final extension step of 72 °C for 5 min. Products were visualized using 1% agarose gel electrophoresis. PCRs which yielded single amplicons <1500 bp in size were purified using a commercial kit (QIAquick PCR purification kit, QIAGEN), and ligated into pGEM-T vector (Promega) using the supplier's recommended protocol. Plasmids were transformed and confirmed as for the digestion fragments above. Plasmids with confirmed inserts were used for sequencing, as described above. Fragment sequences were aligned and overlapped where possible to existing fragments or contigs, using Sequencher™.

The linking of contigs using PCR continued until a single large contig was obtained. At least two replicates of each base were determined, through either the overlapping of fragments or the replicate testing of fragments. Where ambiguities occurred (possibly through Taq polymerase base misincorporation during PCR) in duplicate sequences, at least one more reaction was undertaken to determine the most likely correct base.
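The replicate rule described above (at least two reads per base, with a further reaction where duplicates disagree) amounts to a per-position majority call. The following is a minimal sketch of that idea, assuming the replicate reads have already been aligned to equal length; the function names and the use of "N" for unresolved positions are illustrative choices, not part of the original analysis, which was carried out in Sequencher™.

```python
from collections import Counter


def consensus_base(calls):
    """Return the majority base among replicate calls, or None on a tie.

    A tie (for example one 'A' and one 'G') signals that another
    sequencing reaction is needed to resolve the position.
    """
    if len(calls) < 2:
        raise ValueError("at least two replicate calls are required")
    ranked = Counter(c.upper() for c in calls).most_common()
    if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
        return ranked[0][0]
    return None


def consensus_sequence(aligned_reads):
    """Build a consensus from equal-length aligned reads; 'N' marks ties."""
    if len({len(r) for r in aligned_reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    bases = (consensus_base(column) for column in zip(*aligned_reads))
    return "".join(b if b is not None else "N" for b in bases)


# Two replicates disagree at the fourth base; a third read resolves it.
print(consensus_sequence(["ACGTAC", "ACGAAC"]))
print(consensus_sequence(["ACGTAC", "ACGAAC", "ACGTAC"]))
```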

Open Reading Frame (ORF) analysis

Nucleotide sequence data for VHML was analysed for open reading frames (ORFs) and putative genes using both NCBI ORF Finder and Glimmer 2·0 (Dr Steven Salzberg, The Institute for Genomic Research, Rockville, MD). Results of both programs were applied to a BLASTp search of GenBank to assign a hypothetical role or function and thus form a putative gene map of the VHML genome.
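To make the idea of an ORF scan concrete, the sketch below performs a naive six-frame search for ATG-to-stop reading frames. It is not the algorithm used by NCBI ORF Finder or Glimmer 2·0 (Glimmer in particular scores candidates with a statistical model); it only illustrates the kind of coordinates such tools report. The minimum length threshold and all names are assumptions. Reverse-strand ORFs are reported with the higher coordinate first, in the manner of the "Complement" entries in Table 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, length_bp) for ATG..stop ORFs.

    Coordinates are 1-based positions on the forward strand. For the
    reverse strand, start is greater than end. min_codons counts coding
    codons and excludes the stop codon.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i          # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if length // 3 - 1 >= min_codons:
                        if strand == "+":
                            orfs.append((strand, start + 1, i + 3, length))
                        else:
                            orfs.append((strand, n - start, n - i - 2, length))
                    start = None       # look for the next ORF in frame
    return orfs


# Toy sequence (hypothetical): one short ORF on the forward strand.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))
```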


The total number of base pairs along the VHML genome was 43 193 (GenBank accession no. AY133112). Analysis of the genome sequence identified 57 putative ORFs (Table 1). The polypeptides from 37 of these ORFs had similarity to amino acid sequences within GenBank. Of these 37, 28 putative genes could be assigned a hypothetical function based upon translated sequence homology, and 9 were similar to hypothetical proteins of unknown function. Thirty-four of the 37 BLASTp similarities corresponded to bacteriophage related genes or bacteria known to contain a prophage. Details of position, size and nearest match with a BLASTp search of GenBank are shown in Table 1. In most cases there were many other matches to similar genes in other bacteriophages or bacteria; however, Table 1 lists only the match with the highest degree of homology. From Table 1 it can be seen that the baseplate and tail genes of VHML show homology with other P2-like bacteriophages. The ‘head’ genes (capsid, terminases and portal protein), however, were similar to a lambda-like phage. Other, nonstructural, genes showed similarity to a variety of bacteriophages.

Table 1. Putative open reading frames identified in the VHML genome

| ORF | GenBank accession | Position | Length (bp) | Length (a.a.) | Nearest protein sequences on GenBank (identified by BLASTx) | E-value (Identity %) |
|---|---|---|---|---|---|---|
| ORF 1 | AAN12318 | 297–1826 | 1530 | 509 | CAC88681·1: Protelomerase, bacteriophage PY54. 631 aa | 0·087 (23%) |
| ORF 2 | AAN12329 | 2422–2790 | 369 | 122 | NP657249·1: Helix-turn-helix XRE family protein, Bacillus anthracis. 67 aa | 0·16 (35%) |
| ORF 3 | AAN12339 | Complement 2996–2787 | 210 | 69 | E41858: biphenyl dioxygenase, Pseudomonas spp. 109 aa | 4·2 (33%) |
| ORF 4 | AAN12349 | Complement 5614–3041 | 2586 | 862 | AAC48876·1: primase, bacteriophage N15. 1228 aa | 0·16 (22%) |
| ORF 5 | AAN12360 | Complement 5792–5574 | 219 | 72 | None identified | |
| ORF 6 | AAN12361 | Complement 7134–6454 | 681 | 226 | AAG57241·1: repressor protein cI of prophage CP-933 V of E. coli O157:H7. 215 aa | 2e-04 (29%) |
| ORF 7 | AAN12362 | 7156–7857 | 702 | 233 | AAF83313·1: hypothetical protein xf0503, Xylella fastidiosa. 200 aa | 2e-06 (28%) |
| ORF 8 | AAN12363 | 8050–8277 | 228 | 75 | CAA21401·1: hypothetical protein of Yersinia pestis, similar to ORF 82 of P2 phage. 69 aa; AAC34182·1: ORF 80, enterobacteria phage 186. 75 aa | 2e-04 (41%); 0·037 (41%) |
| ORF 9 | AAN12364 | 8385–8855 | 471 | 156 | None identified | |
| ORF 10 | AAN12308 | 8889–9818 | 930 | 309 | AAF93883·1: recombination associated protein rdgC, Vibrio cholerae group O1. 304 aa | 4e-59 (37%) |
| ORF 11 | AAN12309 | 9855–10223 | 369 | 122 | None identified | |
| ORF 12 | AAN12310 | 10269–10640 | 372 | 123 | None identified | |
| ORF 13 | AAN12311 | 10654–10857 | 204 | 67 | None identified | |
| ORF 14 | AAN12312 | 11269–11862 | 594 | 197 | None identified | |
| ORF 15 | AAN12313 | 11868–12029 | 162 | 53 | None identified | |
| ORF 16 | AAN12314 | Complement 12193–12094 | 102 | 33 | None identified | |
| ORF 17 | AAN12315 | 12635–13714 | 1080 | 359 | AAG56134·1: adenine methyltransferase encoded by prophage CP-933O, E. coli O157:H7. 352 aa | 3e-93 (49%) |
| ORF 18 | AAN12316 | 13798–14067 | 270 | 89 | None identified | |
| ORF 19 | AAN12317 | 14064–14621 | 558 | 185 | AAL18980·1: putative protein, Salmonella typhimurium LT2. 177 aa; AAL52291·1: secretion protein, Brucella melitensis. 253 aa | 2e-33 (44%); 2e-12 (32%) |
| ORF 20 | AAN12319 | 14637–15053 | 417 | 138 | AAF95285·1: flagellar hook associated protein, Vibrio cholerae group O1. 666 aa | 0·30 |
| ORF 21 | AAN12320 | 15592–16173 | 582 | 193 | AAG56411·1: terminase small subunit (DNA packaging) of prophage CP-933R, E. coli O157:H7. 169 aa | 0·027 (28%) |
| ORF 22 | AAN12321 | 16079–17962 | 1884 | 627 | AAG56410·1: terminase large subunit (DNA packaging) of prophage CP-933R, E. coli O157:H7. 654 aa | 1e-66 (31%) |
| ORF 23 | AAN12322 | 18218–19705 | 1488 | 495 | BAA89642·1: Wolbachia sp. wKue protein similar to portal protein gpB of phage lambda. 472 aa | 2e-84 (39%) |
| ORF 24 | AAN12323 | 19698–20978 | 1281 | 426 | BAA89643·1: Wolbachia sp. wKue protein similar to capsid protein gpC of phage lambda. 350 aa | 3e-30 (26%) |
| ORF 25 | AAN12324 | 20998–21309 | 312 | 103 | CAD14561·1: bacteriophage related protein, Ralstonia solanacearum. 125 aa | 1·8 (33%) |
| ORF 26 | AAN12325 | 21433–22395 | 963 | 320 | BAA89645·1: Wolbachia sp. wKue protein similar to unknown protein of phage Felix O1. 332 aa | 3e-82 (50%) |
| ORF 27 | AAN12326 | 22406–22879 | 474 | 157 | None identified | |
| ORF 28 | AAN12327 | 23083–23271 | 189 | 62 | None identified | |
| ORF 29 | AAN12328 | 23268–23816 | 549 | 182 | BAA89648·1: Wolbachia sp. wKue protein similar to hypothetical protein of Pseudomonas aeruginosa. 158 aa | 0·006 (26%) |
| ORF 30 | AAN12330 | 23779–24396 | 618 | 205 | AAF85290·1: Xylella fastidiosa xf2492, similar to gpV baseplate assembly protein of phage P2. 195 aa | 4e-19 (35%) |
| ORF 31 | AAN12331 | 24612–24941 | 330 | 109 | BAA89650·1: Wolbachia sp. wKue protein similar to gpW of phage P2. 108 aa | 2e-25 (50%) |
| ORF 32 | AAN12332 | 24943–25878 | 936 | 311 | AAD03284·1: gpJ, baseplate assembly protein, enterobacteria phage P2. 302 aa | 3e-61 (42%) |
| ORF 33 | AAN12333 | 25871–26614 | 744 | 247 | AAD03285·1: gpI, baseplate spike protein, enterobacteria phage P2. 176 aa | 5e-17 (33%) |
| ORF 34 | AAN12334 | 26614–27096 | 482 | 160 | AAL21593·1: Salmonella typhimurium LT2 Fels-1 prophage, similar to gpH tail fibre of phage P2. 524 aa | 1e-04 (33%) |
| ORF 35 | AAN12335 | 27155–28387 | 1233 | 410 | AAF96970·1: conserved hypothetical protein, Vibrio cholerae group O1. 605 aa | 0·029 (34%) |
| ORF 37 | AAN12336 | 29054–29965 | 912 | 303 | BAB75196·1: hypothetical protein, Nostoc sp. 352 aa; AAD16434·1: maturase protein, Pseudomonas putida. 473 aa; BAB03943·1: transposase, Bacillus halodurans. 418 aa | 6e-35 (34%); 3e-07 (24%); 1e-07 (23%) |
| ORF 38 | AAN12337 | 30081–30356 | 276 | 91 | AAG04037·1: hypothetical protein, Pseudomonas aeruginosa. 76 aa | 2e-04 (44%) |
| ORF 39 | AAN12338 | 30452–31612 | 1161 | 386 | AAG04011·1: Pseudomonas aeruginosa protein, similar to FI tail sheath gene of phage P2. 386 aa | 1e-103 (51%) |
| ORF 40 | AAN12340 | 31614–32117 | 504 | 167 | AAF83538·1: Xylella fastidiosa xf0728, similar to FII tail tube protein of P2. 169 aa | 7e-34 (46%) |
| ORF 41 | AAN12341 | 32429–32713 | 285 | 94 | AAL21586·1: Salmonella typhimurium LT2 Fels-2 prophage protein, similar to GpE + E′ of phage P2. 100 aa | 2e-09 (40%) |
| ORF 43 | AAN12342 | 32885–35956 | 3072 | 1023 | BAA36253·1: Pseudomonas aeruginosa, phage phi CTX protein, similar to gpT tail protein of P2. 904 aa | 6e-65 (26%) |
| ORF 44 | AAN12343 | 35913–36434 | 522 | 173 | BAA36254·1: Pseudomonas aeruginosa, phage phi CTX protein, similar to gpU tail protein of P2. 146 aa | 3e-17 (36%) |
| ORF 45 | AAN12344 | 36424–36630 | 207 | 68 | AAL21604·1: Salmonella typhimurium LT2 Fels-2 prophage protein, similar to gpX of phage P2. 67 aa | 2e-08 (45%) |
| ORF 46 | AAN12345 | 36634–37515 | 882 | 293 | AAM42266·1: phage related tail protein, Xanthomonas campestris. 328 aa; BAA36255·1: Pseudomonas aeruginosa, phage phi CTX protein, similar to gpD tail protein of P2. 424 aa | 2e-53 (42%); 3e-41 (38%) |
| ORF 47 | AAN12346 | 37708–38436 | 729 | 242 | AAC34191·1: ORF 97 of enterobacteria phage 186. 243 aa | 3e-50 (44%) |
| ORF 48 | AAN12347 | Complement 39393–38503 | 891 | 298 | AAA82105·1: rha antirepressor protein, bacteriophage phi 80. 184 aa | 7e-21 (53%) |
| ORF 49 | AAN12348 | Complement 39681–39577 | 105 | 34 | None identified | |
| ORF 50 | AAN12350 | Complement 40004–39708 | 297 | 99 | None identified | |
| ORF 51 | AAN12351 | Complement 40362–40015 | 348 | 116 | None identified | |
| ORF 52 | AAN12352 | Complement 40684–40508 | 177 | 58 | AAC34183·1: ORF 81, enterobacteriophage 186. 194 aa | 1e-04 (52%) |
| ORF 53 | AAN12353 | Complement 41153–40875 | 279 | 92 | None identified | |
| ORF 54 | AAN12354 | Complement 41569–41336 | 234 | 77 | None identified | |
| ORF 55 | AAN12355 | Complement 41720–41574 | 147 | 48 | None identified | |
| ORF 56 | AAN12356 | Complement 42073–41816 | 258 | 85 | None identified | |
| ORF 57 | AAN12357 | Complement 42188–42066 | 123 | 40 | None identified | |
| ORF 58 | AAN12358 | Complement 42765–42133 | 633 | 210 | AAD40334·1: partition protein ParA, Pseudomonas alcaligenes. 212 aa | 2e-41 (54%) |
| ORF 59 | AAN12359 | Complement 43029–42880 | 150 | 50 | None identified | |

The remaining 20 putative genes had no similarity with previously reported amino acid sequences. Interestingly, these genes were identified by Glimmer 2·0, but not by the NCBI ORF Finder software. It is not known if these are genuine ORFs. Indeed, many of these genes that did not match any previously reported genes were short (less than 100 amino acids). However, as the areas of the genome where these appeared could not be assigned to an alternative role, these 20 putative genes were included in Table 1.

When sequencing the termini of the VHML genome, primed reactions extending from both ends abruptly stopped at the same bases in all replicates of each direction. In addition, the resulting sequences were shorter than had been obtained in the determination of all other sequences that made up the remaining genome. It was therefore assumed that these were the termini of the genome. The presence of 5′-cohesive ends (cos) would have been indicated by direct repeat nucleotide sequences at the extreme termini and these were not apparent. However, the two termini of the genome had the same sequence of 33 bases in the opposite orientation (inverted terminal repeats). This inverted repeat is shown in Fig. 1.

Figure 1.

33-base inverted repeats at the termini of the VHML genome. The terminal sequence is identical to the antisense sequence of the opposite terminal, in reverse
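The relationship described in the Fig. 1 legend can be checked computationally: a genome carries terminal inverted repeats when its first bases equal the reverse complement of its last bases. The sketch below illustrates that test on a made-up sequence; the function names and the toy sequence are hypothetical, and the real 33-base repeat is not reproduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_terminal_inverted_repeat(genome, length=33):
    """True if the first `length` bases are the reverse complement of the last."""
    genome = genome.upper()
    if length <= 0 or 2 * length > len(genome):
        raise ValueError("length must be positive and at most half the genome")
    return genome[:length] == reverse_complement(genome[-length:])


def inverted_repeat_length(genome):
    """Length of the longest inverted repeat shared by the two termini."""
    genome = genome.upper()
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    n = 0
    # Walk inwards from both ends while the bases remain complementary.
    while n < len(genome) // 2 and genome[n] == pairs.get(genome[-1 - n]):
        n += 1
    return n


# Toy genome (hypothetical): a 9-base terminus, a spacer, and the
# reverse complement of the same 9 bases at the other end.
toy = "GATTACAGG" + "TTTTT" + reverse_complement("GATTACAGG")
print(inverted_repeat_length(toy), has_terminal_inverted_repeat(toy, 9))
```

By contrast, cohesive (cos) ends would appear as direct rather than inverted repeats, which is the distinction drawn in the text above.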


Structural genes

It was anticipated from the morphology of VHML virions (Oakey and Owens 2000) that similarities with myovirus structural genes would be observed. In particular, the apparent slipping of the tail sheath was consistent with P2-like myoviruses. As expected, the putative tail genes identified within the VHML genome showed homology with a number of P2-like bacteriophage amino acid sequences reported to comprise the tail proteins (ORFs 30–34 and 39–46; see Table 1). In addition, VHML putative tail genes were found to be in a similar order to those of other P2-like phages (various GenBank entries). The baseplate genes gpV, gpW, gpJ and gpI in the enterobacteria phage P2 are located together (Accession no. NC001895), as were ORFs 30–33 in the VHML genome. It was interesting to note that the P2-like baseplate assembly genes gpW and gpJ had amino acid sequences and lengths very similar to the putative VHML genes ORF 31 and ORF 32, respectively. However, the baseplate spike genes gpV and gpI had amino acid sequence homology to ORF 30 and ORF 33 in part of the sequence only. The translated sequence homology between the P2 phage gpV and VHML ORF 30 was observed in the first 140 amino acids only (data not shown). Likewise, the homology between gpI and ORF 33 occurred only within the first 100 amino acids (data not shown). Also, the gene lengths varied between the two bacteriophages. Other gpV-like proteins have been recorded with lengths ranging from 154 amino acids (Wolbachia sp. wKue BAA89649) up to 215 amino acids (enterobacteria phage 186, AAC34160·1). This is not surprising as the baseplate spike proteins are part of the receptor recognition pathway for tailed bacteriophages. As VHML has been shown to be specific to a number of V. harveyi strains (Oakey and Owens 2000), it could be anticipated that amino acid sequence variation would occur between this bacteriophage and phages of other bacterial hosts.

In the P2 genome, the VWJI group is directly followed by the tail fibre genes, gpH and gpG (accession no. NC001895). Similarly, the equivalent genes in VHML were directly followed by ORF 34, which had amino acid homology with tail fibre genes analogous to P2 gpH. As with the baseplate spike genes, it would be expected that the tail fibre amino acid sequence of phages would vary according to the receptor recognition carried out by the tail fibre proteins. Indeed, the amino acid homology between ORF 34 and the gpH homologue occurs within the first 100 amino acids of the two compared sequences.

The other region on the VHML genome corresponding to putative tail proteins was ORF 39 – ORF 46. These translated sequences showed, in respective order, homology to the FI (tail sheath), FII (tail tube), gpE + E′, gpT (tail length determinator), gpU, gpX and gpD gene products of P2-like phages. The gene order was the same in VHML and other P2-like phages with the exception of ORF 45, which had a translated sequence homologous to the tail protein gpX. In VHML, this gene was located with the other tail protein genes, lying between ORF 44 and ORF 46 (analogous to gpU and gpD, respectively). In enterobacteria phage P2, enterobacteria phage 186 and phage phi CTX of Pseudomonas aeruginosa this gene is located more towards the left terminus just after the capsid genes, and is not associated with the other tail genes (from accession no. NC001895, NC003278, NC001317).

Despite minor differences, it was concluded from the results that the tail structure of VHML was encoded by genes that were similar to those found in the P2-like genus of myoviruses, as was anticipated. However, the putative ‘head’ proteins, which include the capsid, small and large subunit terminases (packaging proteins) and the portal protein (head/tail connector), of VHML showed no similarity to any myoviruses previously reported. The putative capsid gene (ORF 24) translation contains the conserved domain pfam01343 (Peptidase U7) which may suggest that the capsid protein is cleaved prior to capsid assembly. Probable functions of ORFs 21–24 have been determined from amino acid sequence similarity to a ‘lambda-like’ phage (phage WO) of Wolbachia sp. wKue, which is reported to have a similar Lambda-like head and P2-like tail (accession no. AB036666).

Genes associated with lysogeny

In order for lysogeny to occur, a bacteriophage must contain genes for genetic recombination and gene repression. In addition, for the lytic cycle to be triggered, a phage must have genes for antirepression and for host cell lysis. From the sequence of the VHML genome, putative genes for recombination, repression and antirepression were identified.

The presence of terminal inverted repeats suggests that integration of the VHML genome into the host chromosome may use a transposition mechanism, similar to that of phage Mu. This would require site-specific recombination enzymes which would attach specifically to each end of the phage genome by recognition of the core sequence. Most recombination systems have a recognition core sequence of approximately 30 bases (Hallet and Sherratt 1997). The enzymes would also recognize similar sequence on the host genome and instigate the cutting of the host genome, integration of the phage and the re-ligation to a complete circular molecule. This model for the integration of VHML is based upon the presence of 33-base inverted repeats at the termini of the VHML genome (hypothesized core recognition sequences), and the putative identification of a recombination protein (ORF 10) known to recognize core sequences of another species of Vibrio. In addition, a putative transposase (ORF 37) and a near terminal protelomerase (ORF 1) have been identified. Protelomerases have been associated with a variety of functions, including integrases (Rybchin and Svarchevsky 1999). No other model of recombination can be currently offered based upon the sequence analysis of the VHML genome as the R-H-R-Y tetrad signature of the Int-family of site specific recombinases (Nunes-Duby et al. 1998), used by P2-like myoviruses, was not identified in the translation of any of the putative gene sequences.

Once integrated into the host genome, a lysogenic phage must employ a phage repressor to prevent transcription and translation of lysis and other late genes. ORF 6 has amino acid sequence homology to the cI repressor from phage CP-933 V of E. coli and to other phage cI-like repressors.

In order to switch to the lytic cycle, as observed by mitomycin C induction of VHML (Oakey and Owens 2000), the phage must encode an antirepressor. ORF 48 amino acid sequence has homology to the rha antirepressor of phage phi-80.

The significance of other putative genes

ORF 1 showed some amino acid sequence homology to protelomerase from phage PY54 of Yersiniae. Protelomerase is normally associated with nonintegrative phages that form a circular plasmid-like form in the host cell by means of closed cos ends, such as phage N15 (Ravin et al. 2001). Protelomerase is used in phage replication to maintain the linear form of the genome with characteristic terminal hairpins throughout a replication process similar to concatamerization. The reason for a protelomerase gene in the VHML genome is unclear, as this phage genome does not have cos ends and evidence suggests that VHML is integrative. Since the VHML virion has a linear genome (Oakey and Owens 2000), it is possible that this enzyme maintains the linear form of the genome within the phage capsid. However, further work would be required to support this.

ORF 2 showed some translated sequence homology to the helix-turn-helix (HTH) class of proteins and contained the conserved domain pfam01381·4 (HTH 3). HTH proteins are strong DNA binding proteins and are commonly associated with transcriptional regulation, such as phage lambda Cro protein. Moreover, regulators of LuxR and related luminescence proteins in vibrios have recently been shown to be HTH proteins (Zhang et al. 2002). It has been previously hypothesized that luminescence by V. harveyi is enhanced among those strains producing an exotoxin (Manefield et al. 2000). The link between the HTH of LuxR and VHML ORF 2 remains speculative, however. Other transcriptional regulatory roles for HTH proteins have been described by Roy et al. (2002) and Huffman and Brennan (2002).

DNA primases are enzymes that essentially synthesize short RNA primers used in synthesis of antisense DNA at the replication fork. ORF 4 of VHML encodes a putative DNA primase with amino acid sequence homology to the DNA primase of phage N15. The gene product of ORF 4 has 85·5% amino acid sequence similarity with the conserved domain Smart 00493 that is present in multifunction proteins that act as helicases/topoisomerases, primases, the OLD family of nucleases and RecR recombination proteins. OLD proteins are exonucleases active upon double- and single-stranded DNA and RNA. Their role in primases is presumably one of removal of RNA primers following their extension. OLD proteins are present in P2 bacteriophages and are reportedly involved in exclusion (by nuclease activity) of linear DNA such as incoming linear phage genomes (Myung and Calendar 1995). Hence, ORF 4 of VHML may be involved with exclusion of other linear DNA molecules, such as other myovirus DNA molecules.

The VHML genome contained a putative gene for a partitioning protein (ORF 58). The most closely related protein in the database was ParA protein of Pseudomonas alcaligenes. Partitioning proteins ensure a copy of bacterial chromosome and plasmids reach both daughter cells upon cell division. ParA has also been associated with phages that remain distinct from the host genome as plasmid-like forms (Bouet and Funnell 1999). However, partition of circular DNA molecules also requires ParB and ParS (Trepnow et al. 1994), neither of which were located upon the VHML genome. Bouet and Funnell (1999) report that ParA that is not complexed with ParB, will bind ADP and repress the partition operon. The lack of parB in the VHML genome suggests that this may be the intended role of this gene. Possible reasons for repression of partition by VHML cannot be surmised by the authors of the current study. However, ORF 58 is near-terminal and it may be possible that this is a result of transduction from a previous host.

Significance of putative genes and virulence

The translated sequences of the putative genes were examined for similarity to transcriptional regulators and to bacterial toxins. A number of potential transcriptional regulators were identified and have been discussed above (ORFs 2, 6 and 48). However, the most interesting link with virulence is the identification of the putative N6-Dam (DNA adenine methyltransferase) protein encoded by ORF 17, the presence of which may explain the inability of 6- and 8-base restriction endonucleases to digest the VHML genome.

In recent years there has been much interest in the role of methyltransferases in the virulence of bacteria. In a review of the literature, Low et al. (2001) reported that their role was most likely one of transcriptional regulation by alteration of the affinity of regulatory proteins for DNA. In this manner, Dam may either activate or repress genes. In E. coli, Dam positively controlled the colonization factors of strains causing urinary tract infections and diarrheal disease (Braaten et al. 1994). In Salmonella typhimurium, Dam positively regulated the pathogenicity island (Heithoff et al. 1999; Portello et al. 1999). Virulence of Yersinia pseudotuberculosis and Vibrio cholerae was also reported to be regulated by Dam (Julio et al. 2001).

In addition to this potential role of Dam in activation of virulence genes, the translated sequence of ORF 17 contained a site similar to the reported active site of an ADP-ribosylating toxin (ADPRT). This site was located near the C terminus of the putative gene product. Comparison with other Dams in GenBank suggested that this region was not essential for Dam activity; indeed, some Dams have shorter sequences and would not have an equivalent region (see GenBank accession numbers CAC89017, AAG23175, CAA48031, Q08318; data not shown). Interestingly, no other Dam examined contained this sequence (BAB35203.1, P55893, CAC89017, AAG23175, CAA48031, Q08318). Specifically, the active site resembled that of the group 4 ADPRTs that act upon actin filaments to produce a neurotoxic effect, and also that of the group 2, G protein-acting, ADPRT of Vibrio cholerae. The active sites of VHML, of two examples of group 4 toxins, and of cholera toxin are presented in Fig. 2. The comparison was based upon the active sites reported by Barth et al. (1998), which consisted of an essential motif of S–T–S with an upstream arginine residue and two downstream glutamine residues separated by one variable residue (group 4). Cholera toxin has an additional conserved histidine residue between the S–T–S and the arginine residue. It should be noted that the group 2 and group 4 ADPRTs are subunit toxins consisting of a toxic subunit (A) and a carrier subunit (B). The group 4 ADPRTs are binary toxins, while cholera holotoxin has an AB5 structure.

ORF 19 of VHML had translated sequence similarity with a number of hypothetical proteins of prophages and of bacteria reported to contain prophages. However, this ORF also had amino acid sequence homology with secretion proteins of Brucella melitensis (GenBank accession no. AAL52291.1, E-value = 2e-12) and Zymomonas mobilis (BAA04473.1). The close proximity of ORF 17 and ORF 19 (which were separated by 300 bases), the observation that both ORFs were in the same reading frame, and the amino acid sequence similarity with secretion proteins suggest that ORF 19 is possibly involved in the secretion of the putative toxin encoded by ORF 17.
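The motif described above (an upstream arginine, the essential S–T–S, and two downstream residues separated by one variable position, with an extra histidine between the arginine and S–T–S in the cholera toxin form) can be screened for in a translated ORF with a short script. The following Python sketch is illustrative only and is not the method used in this study: the function name, the `max_gap` window of 60 residues and the toy sequence are assumptions introduced here, and the downstream residue is left as a parameter that defaults to glutamine (Q), following the description in the text.

```python
import re


def find_adprt_like_sites(protein, max_gap=60, require_his=False,
                         downstream_residue="Q"):
    """Return candidate ADPRT-like sites as 1-based residue positions.

    A candidate is an S-T-S triplet with an arginine (R) within `max_gap`
    residues upstream and an X-any-X pair (X = downstream_residue) within
    `max_gap` residues downstream. `max_gap` is an illustrative assumption.
    """
    pair = re.compile(re.escape(downstream_residue) + "." +
                      re.escape(downstream_residue))
    hits = []
    # The lookahead lets overlapping S-T-S triplets be reported separately.
    for match in re.finditer(r"(?=STS)", protein):
        s = match.start()
        lo = max(0, s - max_gap)
        # Nearest upstream arginine; optionally require a histidine
        # between that arginine and S-T-S (cholera toxin-like form).
        arg = next(
            (i for i in range(s - 1, lo - 1, -1)
             if protein[i] == "R"
             and (not require_his or "H" in protein[i + 1:s])),
            None,
        )
        if arg is None:
            continue
        down = pair.search(protein[s + 3:s + 3 + max_gap])
        if down is None:
            continue
        hits.append({"arg": arg + 1,
                     "sts": s + 1,
                     "pair": s + 3 + down.start() + 1})
    return hits


# Toy sequence (not from VHML) containing R ... H ... S-T-S ... Q-x-Q.
toy = "MKRAAHASTSGGQAQLL"
sites = find_adprt_like_sites(toy)
cholera_like = find_adprt_like_sites(toy, require_his=True)
```

A hit from such a scan only flags a region for closer inspection. Because short motifs occur by chance, any candidate would still need to be compared with the characterized toxins, as in Fig. 2.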

Figure 2.

Representation of the putative ADP-ribosylating toxin active site identified within ORF 17 and a comparison with other ADPRTs. Numbers between the amino acids represent the number of variable bridging amino acids between the essential components of the site. Numbers above the arginine (R) residue represent the position of that amino acid within the translated gene sequence. Abbreviations: ORF 17 (VHML) = ORF 17 translated sequence as described in the current study; CTX = Vibrio cholerae toxin (Barth et al. 1998); BotC2 = Clostridium botulinum C2 toxin (Barth et al. 1998); Perf = Clostridium perfringens iota toxin (Barth et al. 1998).

While site-directed mutagenesis studies are required to determine whether ORF 17 regulates and/or encodes a toxin, we believe that we provide sufficient theoretical evidence that this may be the case. With regard to the pathogenic effects of the hypothesized toxin on penaeid larvae, Harris and Owens (1999) report the symptoms of vibriosis to be weakness and intermittent swimming motion. These symptoms would be consistent with the presence of a neurotoxin such as those in the group 4 ADP-ribosylating toxins.

In conclusion, the anticipated structural genes for VHML were found and putatively identified. VHML had a P2-like tail both morphologically and genetically. Significant differences between the tail genes of VHML and P2-like viruses were apparent only in the receptor recognition genes. The ‘head’ region, however, had no similarity to other myoviruses. Previous experimental work that suggested VHML was capable of lysogeny was supported by the identification of putative genes encoding enzymes for lysogenic integration into the host cells. In addition, we are able to hypothesize that virulence associated with VHML conversion of V. harveyi may be associated with a putative adenine methyltransferase gene (ORF 17). Further work, using site-directed mutagenesis, is required to confirm whether this gene encodes an ADP-ribosylating toxin and/or regulates a chromosomal gene.


This work was carried out as part of a Post-Doctoral Fellowship provided by James Cook University, Townsville, QLD, Australia. The work was funded by a merit research grant provided by James Cook University.

The authors thank Dr Steven Salzberg of The Institute for Genomic Research (TIGR), Rockville, MD, USA for analysis of the sequence data using Glimmer 2·0; Mr Robert Scott of TESAG, James Cook University, for interpreting the UNIX files associated with Glimmer 2·0; and Drs Peter Young and Timothy Mahony of the Queensland Agricultural Biotechnology Centre, Dr Nick Moody of Oonoonba Veterinary Laboratory, and Prof Hans W. Ackermann of Laval University for their comments and proofreading of this manuscript.