We examined the genetic diversity of lytic Pseudomonas aeruginosa bacteriophage PB1 and four closely related phages (LBL3, LMA2, 14-1 and SN) isolated throughout Europe. They all encapsulate linear, non-permuted genomes between 64 427 and 66 530 bp within a solid, acid-resistant isometric capsid (diameter: 74 nm) and carry non-flexible, contractile tails of approximately 140 nm. The genomes are organized into at least seven transcriptional blocks, alternating on both strands, and encode between 88 (LBL3) and 95 (LMA2) proteins. Their virion particles are composed of at least 22 different proteins, which were identified using mass spectrometry. Post-translational modifications were suggested for two proteins, and a frameshift hotspot was identified within ORF42, encoding a structural protein. Despite large temporal and spatial separations between phage isolations, very high sequence similarity and limited horizontal gene transfer were found between the individual viruses. These PB1-like viruses constitute a new genus of environmentally very widespread phages within the Myoviridae.
The ubiquitous Gram-negative soil bacterium Pseudomonas aeruginosa is able to infect injured, burned and immunodeficient patients and causes persistent respiratory infections in individuals suffering from cystic fibrosis (Lyczak et al., 2000). Pseudomonas-specific bacteriophages have been studied for decades (for a review see Hertveldt and Lavigne, 2008) and are being applied as typing and therapeutic agents. The strictly virulent and historically important P. aeruginosa phage PB1 was first described almost half a century ago (Holloway et al., 1960). This myovirus carries characteristic conspicuous capsomers that appear as 8 nm cup-like depressions on the phage heads. The tail exhibits no transverse striations but presents a criss-cross pattern (Ackermann et al., 1988), a rare feature otherwise observed only in Salmonella phage Felix O1. Over the years, no fewer than 42 phages were reported to be PB1-like, mainly based on cross-DNA hybridization and morphological studies (Ackermann and Dubow, 1987; Krylov et al., 1993; Pleteneva et al., 2008). More recently, phages LBL3 and LMA2 were suggested to be related to PB1 by de novo mass spectrometric analysis of their main structural proteins (Ceyssens et al., 2009). It has long been known that PB1 and its relatives use the bacterial lipopolysaccharide layer as receptor (Jarrell and Kropinski, 1977). Phages belonging to this group (e.g. phage 14-1) have been and are being used in human phage therapy trials (Merabishvili et al., 2009).
Although these phages clearly form a widespread and highly efficient virulent group, no report on their genome composition is available. In this article, we describe the genome sequences and particle composition of phage PB1 and four related phages (LBL3, LMA2, 14-1 and SN), and compare them with the already sequenced but originally unannotated Pseudomonas phage F8 (Kwan et al., 2006).
Results and discussion
Phage collection and particle properties
Pseudomonas phages SN, 14-1, LBL3 and LMA2 were isolated from water samples taken from very distinct natural environments, by different researchers, over a 4-year period (Table 1). All phages displayed clear plaques (2–3 mm diameter), were easily cultivable using P. aeruginosa PAO1 as host and were stable during long-term storage at 4°C in phage buffer (150 mM NaCl, 10 mM MgSO4, 10 mM Tris·HCl pH 7). Transmission electron microscopy on uranyl acetate negatively stained particles revealed them to be morphologically indistinguishable from PB1, carrying non-flexible, contractile tails (c. 140 nm) and isometric capsids with a diameter of 74 nm (Fig. 1A for LBL3, courtesy of Hans-W. Ackermann). Based on these observations, these phages can be classified into the A1 morphological group of the Myoviridae. Phage particles were remarkably stable in acid conditions: on average, 26% and 62% of the particles retained their infectivity after 1 h incubation at pH 2 and pH 3, respectively (Fig. 1B for LBL3).
Table 1. General properties of PB1-like phages described in this study.
After DNA isolation, DNA restriction fragments were subjected to a heat treatment (80°C) followed by rapid chilling prior to electrophoresis. Since this procedure did not alter restriction patterns (data not shown), the possibility of cohesive genome ends was excluded. Restriction analysis after DNA treatment with Bal31 (an exonuclease that degrades double-stranded linear DNA from both ends simultaneously) revealed two specific fragments which shortened over time, while other fragments retained their length (Fig. S1 for SalI digest of phage 14-1). These bands represent the physical genome ends (Loessner et al., 2000), showing these phages carry a non-permuted, linear dsDNA genome.
The complete genome sequences of phages PB1, LBL3, LMA2, SN and 14-1 were determined, and showed little variation in genome lengths (between 64 427 bp for LBL3 and 66 530 bp for LMA2). Remarkably, they display 87.2–93.5% overall nucleotide similarity to Pseudomonas phage F8 (Table 1) and carry very few insertions and deletions compared with this phage, as visualized by full-genome comparisons using MAUVE (Darling et al., 2004) (Fig. S2). Although F8 has been known for a long time and was part of the Lindberg P. aeruginosa typing set (Lindberg and Latta, 1974), PB1 was chosen as reference phage, as it was the first to be isolated (Table 1). In contrast to other lytic Pseudomonas phages (e.g. gh-1 and φKMV), the average GC content (55.5%, Table 1) in PB1-like phages is significantly lower than that of their host P. aeruginosa (66.6%). This GC content is fairly constant throughout the genome, and minor local decreases correlate with intergenic, regulatory regions (Fig. 2). Despite this deviating GC content and the absence of phage-encoded tRNA genes, translational efficiency seems to be ensured because the phages share the same dominant codon for each amino acid (except for glutamate) with their host.
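A GC profile of the kind referred to above can be sketched with a simple sliding window. The snippet below is only an illustration and not the procedure used to produce Fig. 2; the function names, the window and step sizes, and the toy sequence are assumptions made for the example.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA string (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_profile(seq, window=500, step=100):
    """Return (start, GC fraction) pairs for windows sliding along seq.

    start is a 0-based offset; only full-length windows are reported.
    """
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (hypothetical): a GC-rich region, an AT-rich "intergenic"
# stretch, and another GC-rich region.
toy_genome = "GC" * 50 + "AT" * 50 + "GC" * 50
for start, gc in gc_profile(toy_genome, window=100, step=100):
    print(start, gc)
```

On a real genome, local minima in such a profile would be the positions to compare against the annotated intergenic regions.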
The high degree of DNA similarity between the five sequenced phages and F8 allowed comparative ORF prediction for these phages. Based on mutual genome comparisons and aided by (t)blastx analysis, between 88 (LBL3) and 95 (LMA2) ORFs were predicted (Table 2, Fig. 2), including smaller conserved ORFs (< 150 bp, e.g. ORF 11.1), which are usually ignored in the annotation of phage genomes. For reasons of consistency and to ease comparison of the phage genomes, gene numbering of phage F8 was maintained for all genomes, and names of additional or unique genes are extended by a .1 suffix.
Table 2. Bioinformatic analysis of the phages belonging to the PB1-like viruses.
DNA primase [Burkholderia phage BcepNY3] Virulence-associated protein E [pfam 05272]
One transmembrane domain
Hypothetical protein ORF41[Pseudomonas phage 73]
As expected, most of the phage proteins (96%) are conserved among all sequenced phages, and only six true orphan genes were identified (Fig. 2). Since recently acquired ORFs are expected to display atypical codon usage and compositional bias (Lawrence et al., 2001), we calculated individual Nc values (effective number of codons) for all ORFs using CodonW software. In extremely biased ORFs, Nc can approach 20, while in unbiased ORFs it will approach 61. Among the PB1-like phages, the average Nc is 48 ± 8.5. All but one of the orphan genes (LBL3/ORFs 1.1 and 88.2, SN/ORFs 2 and 3 and LMA2/ORF 3) exhibit extreme Nc values below 30, which might indicate that these genes were recently acquired. ORFs present in more than one phage (e.g. ORF 14.1) have Nc values around 45, and may have lost their counterpart in phages PB1, LMA2, F8 and LBL3.
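To make the Nc scale concrete, the sketch below computes the effective number of codons for a single ORF using the standard homozygosity-based definition (Wright's Nc) and the standard genetic code. It is an illustration only and is not the CodonW implementation: the function and variable names are assumptions, ORFs too short to estimate every degeneracy class are rejected, and values above 61 are capped at 61.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous codon families, stop codons excluded.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    if _aa != "*":
        SYNONYMS.setdefault(_aa, []).append(_codon)


def effective_number_of_codons(orf):
    """Return Nc (between 20 and 61) for an in-frame coding sequence."""
    orf = orf.upper().replace("U", "T")
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    # Count sense codons only; stops and ambiguous codons are ignored.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Codon homozygosity F per amino acid, grouped by degeneracy class.
    f_by_class = {2: [], 3: [], 4: [], 6: []}
    for family in SYNONYMS.values():
        k = len(family)
        n = sum(counts[c] for c in family)
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; n < 2 is undefined
        s = sum((counts[c] / n) ** 2 for c in family)
        f = (n * s - 1) / (n - 1)
        if f > 0:
            f_by_class[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in f_by_class.items() if v}
    # If isoleucine is uninformative, estimate F3 from F2 and F4.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    if set(mean_f) != {2, 3, 4, 6}:
        raise ValueError("ORF too short to estimate Nc")

    nc = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)
```

An ORF that uses a single codon for every amino acid yields the minimum of 20, whereas an ORF that uses all synonymous codons equally reaches the cap of 61, which is the range described in the text.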
Genes are arranged in a typically compact manner with little intergenic space (on average 7.9%) and numerous overlapping start/stop codons. The genomes are organized into at least seven transcriptional blocks alternating on both strands. These are separated by (in some cases bidirectional) factor-independent terminators with stem-loop structures which are conserved perfectly between the phages (Fig. 2, Table 3). Since PB1 phages lack a phage-encoded RNA polymerase, they depend entirely on the host transcriptional machinery. However, the only recognizable σ70-specific promoters are located at the start of the structural region, upstream from ORFs 17 and 21 (Table 3). Additionally, conserved AT-rich boxes are found in intergenic regions spread throughout the genome; although no clear-cut consensus sequence could be derived, these motifs might be recognized by a putative phage sigma factor (Table 3).
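The intergenic fraction quoted above can be derived from ORF coordinates alone. The following is a minimal sketch under stated assumptions: coordinates are 1-based and inclusive, overlapping ORFs are merged so that shared bases are counted once, and the function name and toy coordinates are hypothetical rather than taken from the annotation.

```python
def intergenic_fraction(orfs, genome_length):
    """Return the fraction of the genome not covered by any ORF.

    orfs is an iterable of (start, end) pairs, 1-based and inclusive,
    given on either strand as start <= end.
    """
    merged = []
    for start, end in sorted(orfs):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of double counting.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    coding = sum(end - start + 1 for start, end in merged)
    return 1 - coding / genome_length


# Toy example (hypothetical coordinates): two overlapping ORFs and a third
# one separated by a 50 bp gap on a 1000 bp genome.
print(intergenic_fraction([(1, 300), (297, 600), (651, 1000)], 1000))
```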
Table 3. Consensus sequences of regulatory elements in PB1 phages.
Cluster of small proteins. A striking feature of the PB1 genomes is the clustering of a large number of small genes encoding hypothetical proteins. Three clusters of 8 kb (ORF1 through ORF16, except ORF4, Terminase), 4 kb (ORF77 through ORF91) and 1 kb (ORF60 through ORF64) are present, and encode small proteins of unknown functions. A similar single cluster of 20 kb, encoding 62 genes unrelated to PB1, is found in Burkholderia phage BcepF1 (NC_009015). Since regions involved in BcepF1 DNA replication and particle formation are closely related to PB1 (Fig. 2), this cluster largely accounts for the mutual differences in genome size and protein content.
Although no functional assignment could be made, seven proteins are predicted to have coiled coils, indicative of protein interactions with other phage or host proteins (see e.g. Crucitti et al., 2003). Two proteins (gp8 and gp10) carry signal peptides as well as two and three transmembrane domains, respectively, and are clearly targeted to the outer membrane. Hypothetically, and by analogy to coliphage T4 (Miller et al., 2003a), these proteins might have a role in exclusion of superinfection (Vallée and Cornett, 1972) or serve as a membrane anchor for DNA replication or particle assembly. Similar but completely unrelated large assemblies of small proteins with unknown functions are encountered in various phage families, such as the T4-like phages (Miller et al., 2003b), the φKMV-like phages (Ceyssens et al., 2006) and many cyanophages (e.g. Weigele et al., 2007). Besides the putative roles mentioned above, it has also been speculated that they may have a crucial role as accessory factors that bind to and subtly modify the specificity of host proteins so that they function appropriately during phage infection (Mann et al., 2005).
DNA replication machinery. Based on sequence similarity searches, only two enzymes involved in nucleotide metabolism could be identified: a kinase/phosphatase (gp57) and a thymidylate synthase (gp59). In contrast, a wide assortment of conserved replication factors was found. Unlike other lytic Pseudomonas phages such as gh-1, LUZ24 and φKMV, PB1 phages use the DNA polymerase III holoenzyme for DNA replication, the main DNA replicating enzyme in bacteria (Kelman and O'Donnell, 1995). For this, PB1 phages supply two main subunits of the catalytic core. The alpha subunit (gp55) catalyses the polymerization reaction and contains two conserved motifs (342PDIDIDF and 492LLKIDALG). The polymerase III epsilon subunit (gp56) is a DEDDh-type 3′−5′ exonuclease responsible for the proofreading activity of the polymerase. The three conserved sequence motifs ExoI (7DTE), ExoII (95NLPFD) and ExoIII (156HALDD) (Bernad et al., 1989) are also perfectly conserved in the phage protein. Despite being ubiquitous among prokaryotes, this subunit is encoded by only two other phages (Pseudomonas phage F116 and Staphylococcus phage 47). The presence of phage-encoded DNA primase (gp74) and helicase (gp53) suggests that replication elongation is independent of the host machinery.
Phage particle proteins. Similarity studies delineate the genomic region involved in particle formation from ORF17 to ORF45 (Table 2), and suggest a relationship to myophage BcepB1A, which itself is marginally related to the Bcep781 group of phages (Summer et al., 2006). One-dimensional 12% SDS-PAGE of twice CsCl-purified particles of 14-1, LBL3 and LMA2 showed a highly conserved band pattern, with only slight variations in the 14–20 kDa region (Fig. 3A). This conservation is reflected by the more than 95% nucleotide identity among PB1 phages in this region (Fig. S2). The only drop in sequence similarity is found in ORF44, encoding the presumed tail fibre. ESI-MS/MS analysis on PB1, SN and LBL3 led to the experimental identification of 22 predicted proteins, reaching sequence coverages up to 56% (Table 4). The limited variation in SDS-PAGE patterns between the PB1-like phages can be attributed to fluctuating amounts of low-molecular-weight (MW) proteins gp30 (15.8 kDa) and gp22 (16.4 kDa). Gene products 22, 23, 29 and 30 were identified as major building blocks of the phage particle, and no N-terminal peptides of gp23 were found, hinting at post-translational cleavage of this major head protein. Post-translational processing can also be suggested for gp21, which migrated at an estimated mass of 15 kDa, much lower than predicted based on its DNA sequence (50 kDa). During MS analysis, no peptides corresponding to the C-terminus of this protein were detected, supporting this hypothesis.
Table 4. Summary of the mass spectrometric analysis of PB1, SN and LBL3.
No. identified peptides
AA coverage (%)
Minor head protein [pfam 09714, 2e-15]
Minor head protein [pfam 04233, 3e-08]
N-terminus conserved [COG 3566, 5e-16], no C-terminal peptides found
Member of Baseplate_J superfamily [COG 3299, 9e-3]
Similar to various tail fibre proteins
The high-MW structural protein gp38 (94.3 kDa) carries a transglycosylase domain (residues 490–600), including an N-acetyl-d-glucosamine binding site and the catalytic residue E512. The muralytic activity of gp38 of phages 14-1 and LBL3 was demonstrated using a zymogram assay, showing specific degradation of peptidoglycan of autoclaved P. aeruginosa cells upon renaturation of gp38 (Fig. 3B). These murein hydrolases are widespread in the virions of bacteriophages and assist efficient DNA transfer to the host cell, although they are in many cases not required for phage infection (Rydman and Bamford, 2000). The combination of this hydrolase activity with a major structural component of the phage is a typical solution to prevent enzyme diffusion in order to constrain its activity and preserve the viability of the cell (Moak and Molineux, 2004). The exact location of gp38 in the phage particle cannot be derived from this functional analysis, since muralytic activity has been associated with the baseplate of T4 (Kao and McClain, 1980), the internal virion protein of T7 (Molineux, 2001), the tape measure protein of TM4 (Piuri and Hatfull, 2006) and the tail fibre protein of T5 (Boulanger et al., 2008).
Curiously, gp66 was also detected as a structural protein in both the PB1 and LBL3 particles. This 33 kDa protein is encoded in the DNA replication region, and 50% of its residues are strongly basic (K, R) or strongly acidic (D, E) amino acids. This unusual and apparently highly charged protein can be linked to a bacterial cell wall-associated hydrolase, but its function in phage infection/development remains unknown.
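The proportion of strongly charged residues in a protein such as gp66 can be checked directly from its amino acid sequence. The snippet below is a minimal sketch; the function name and the toy peptide are assumptions, and the actual gp66 sequence is not reproduced here.

```python
def charged_fractions(protein):
    """Return (basic, acidic) residue fractions for a one-letter protein string.

    Basic residues are K and R; acidic residues are D and E.
    """
    protein = protein.upper()
    if not protein:
        raise ValueError("empty protein sequence")
    basic = sum(protein.count(residue) for residue in "KR")
    acidic = sum(protein.count(residue) for residue in "DE")
    return basic / len(protein), acidic / len(protein)


# Toy peptide (hypothetical): half of the residues are charged.
basic, acidic = charged_fractions("KRDEAAGG")
print(basic + acidic)
```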
Another remarkable feature of PB1-like phages is the observed variation in the number of deoxyguanylate residues present in a G-rich stretch at the 5′ end of ORF42 (bp 32–42), which encodes a 44.8 kDa protein belonging to the Baseplate_J family (COG 3299). Sequencing of individual (shotgun) clones encompassing this region identified between 7 and 9 G's in LMA2, while this number varied between 9 and 10 in both PB1 and LBL3. Primer walking on purified phage DNA produced mixed signals downstream of this G-stretch (Fig. S3). As a consequence, a −1 or +1 frameshift is necessary for translation of the full-length baseplate protein for a part of the phage population. To our knowledge, this kind of variation between 8 and 10 G residues has only been reported for the bpm gene of the almost identical Bordetella phages BPP-1, BMP-1 and BIP-1 (Liu et al., 2004). Programmed translational frameshifting is well documented in dsDNA bacteriophages (e.g. Fortier et al., 2006), but occurs mostly at the C-terminus, to produce two overlapping proteins in appropriate ratio (Zimmer et al., 2003). In this case, a frameshifting event at the N-terminus might serve as a control point to regulate the levels of gp42 in the course of phage infection.
Host lysis. All analysed phages use a classic two-component cell lysis system. The endolysin (gp46) is encoded at the end of the structural module, and carries a single β-1,4-N-acetyl-d-glucosaminidase domain (Pfam pf00182, e = 4E-03) which presumably cleaves bonds in the aminosugar backbone of the peptidoglycan. The endolysins lack a secretory signal and a cell wall-binding domain, suggesting they accumulate in the cytoplasm, relying on their cognate holin to access the peptidoglycan layer (Young et al., 2000). Despite the lack of sequence similarity of gp50 to known proteins, its genomic location near the putative endolysin, its small size and three transmembrane domains with an N-terminal inside topology strongly suggest a function as a class I holin (Wang et al., 2000).
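The link between the length of the G-stretch and the required frameshift can be illustrated with simple modular arithmetic. In the sketch below, the function names and the toy sequence are hypothetical, and the run length that keeps ORF42 in frame is passed in as an explicit assumption, since it is not stated here which length corresponds to the in-frame variant.

```python
import re


def longest_g_run(seq):
    """Return the length of the longest uninterrupted run of G in seq."""
    runs = re.findall(r"G+", seq.upper())
    return max((len(run) for run in runs), default=0)


def compensating_frameshift(run_length, in_frame_length):
    """Return the ribosomal shift (0, +1 or -1) that restores the reading frame.

    in_frame_length is the run length assumed to leave the ORF in frame.
    One extra base calls for a +1 shift (skip a nucleotide); one missing
    base calls for a -1 shift (re-read a nucleotide).
    """
    offset = (run_length - in_frame_length) % 3
    return {0: 0, 1: +1, 2: -1}[offset]


# Toy example (hypothetical sequence, assuming 9 G's is the in-frame length).
variant = "ATGAC" + "G" * 10 + "TTC"
print(longest_g_run(variant), compensating_frameshift(longest_g_run(variant), 9))
```

Under this assumption, a population containing a mixture of run lengths would require a −1 shift for some genomes and a +1 shift for others, in line with the mixed sequencing signals described above.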
The PB1 viruses
The analysis of genome and particle content of five P. aeruginosa-infecting Myoviridae, originating from different European countries, revealed a tight evolutionary relationship between them. Together with the long-known phage F8 and BcepF1, isolated from agricultural soils in New York, they constitute the environmentally widespread genus of the PB1 viruses. This genus is marginally related to Burkholderia phage BcepB1A, and even more distantly to the Bcep781 viruses.
Considering only the PB1 viruses infecting P. aeruginosa, the extent of conservation is truly remarkable: all phages exhibit > 85% nucleotide identity and encode a completely conserved genome core region which extends from ORF4 to ORF88. When including the more distantly related phage BcepF1 in the comparison, genome conservation is limited to the modules involved in virion formation and DNA replication (ORF17 to 74).
In contrast to phages like λ, which can apparently easily replace essential genes with completely unrelated sequences that are functionally equivalent (Lawrence et al., 2002), this ability seems severely impaired in the PB1 viruses. Such limited lateral gene transfer and genetic mosaicism in the phage genomic core is also found in the T4 subgenus (Filée et al., 2006; Comeau et al., 2007), where it was linked to the great complexity of these phages, with their multitude of protein–protein interactions between the constituents of the virion (Leiman et al., 2003) and replication complex (Karam and Konigsberg, 2000).
It would be very interesting to find out whether PB1-like phages, like T4-related viruses, are also capable of successfully infecting a broad range of hosts. The small variation in genome size (65–72 kb) and the lack of mobile elements (introns/HNH endonucleases) might suggest however that drastic expansion of the core genome is not possible.
PB1 was purchased from The Félix d'Hérelle Reference Center for Bacterial Viruses. Phages LBL3, LMA2, SN and 14-1 were isolated by spotting of a cleared and filtered (0.45 μm) water sample on lawns of P. aeruginosa. The phages were amplified on solid plate cultures, concentrated by PEG8000 precipitation and purified by two successive CsCl gradient centrifugations. Phage DNA was isolated using the SDS-proteinase K protocol of Sambrook and Russell (2001). To determine the physical genomic ends, a total of 40 μg of genomic DNA was digested with Bal31 (0.5 units per μg) (New England Biolabs) at 30°C. Samples were removed at 0, 10, 20 and 40 min after the addition of the enzyme. All samples were purified by phenol-chloroform extraction and ethanol precipitation, and digested with SalI. The pH stability of the phage particles was tested by incubation of 10⁸ pfu in 1 ml of pH buffer (150 mM KCl, 10 mM KH2PO4, 10 mM sodium citrate, 10 mM boric acid) adjusted to different pH levels with NaOH or HCl, for 1 h at room temperature. After incubation, phages were plated out and titres were compared with non-treated samples. For electron microscopy, purified viruses were washed in 0.1 M ammonium acetate (pH 7.0) and pelleted by centrifugation for 1 h at 25 000 g. Phage particles were deposited on carbon-coated copper grids, stained with 2% uranyl acetate (pH 4.5) and examined in a Philips EM 300 electron microscope.
Phages LBL3, LMA2, 14-1 and SN were sequenced by a combination of shotgun sequencing (pJET vector, Fermentas) and primer walking. DNA of PB1 was submitted to the McGill University and Génome Québec Innovation Centre (Montréal, QC, Canada) and sequenced using 454 Technology. In silico characterization was largely carried out as described by Ceyssens and colleagues (2008); annotation of PB1 and 14-1 was performed using Kodon (Applied Maths, Austin, TX). Prediction of structural motifs and signal peptides was performed by using the COILS (Lupas et al., 1991) and SignalP (Bendtsen et al., 2004) algorithms.
Analysis of structural proteins
Phage proteins were extracted from purified virions (10¹¹ pfu) using a methanol/chloroform extraction (1:1:0.75, v/v/v). The protein pellet was re-suspended in SDS-PAGE loading buffer (Moak and Molineux, 2004) and boiled for 5 min before loading onto a 12% polyacrylamide gel. Either the whole lane (SN and LBL3) or specific bands (PB1) were analysed by ESI-MS/MS as described elsewhere (Lavigne et al., 2006). For zymogram analysis, gels were cast containing 0.01% SDS and 350 mg dried and autoclaved P. aeruginosa cells. After electrophoresis, zymograms were washed for 30 min with water and then soaked for 3 days at room temperature in three buffers containing 150 mM sodium phosphate, pH 6.0–7.0, 1% to 0.001% Triton X-100 and 0–50 mM MgCl2. Zymograms were stained for 3 h with 0.1% methylene blue in 0.001% KOH and destained with water. Peptidoglycan hydrolase activity was detected as a clear zone in a dark blue background.
We are grateful to Professor Hans-W. Ackermann for expert electron microscopy. P.-J.C. holds a predoctoral fellowship of the ‘Instituut voor de aanmoediging van Innovatie door Wetenschap en Technologie in Vlaanderen’ (I.W.T., Belgium). R.L. holds a postdoctoral fellowship of the ‘Fonds voor Wetenschappelijk Onderzoek-Vlaanderen’ (FWO-Vlaanderen, Belgium). This work was financially supported by the research council of the K.U.Leuven (BIL/05/46). Laboratories 1, 2 and 3 are part of the research community ‘Phagebiotics’, funded by the F.W.O. Vlaanderen. A.M.K. is supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada; and would like to acknowledge the technical assistance of Erika Lingohr and Andre Villegas.