Molecular analysis of Methanobacterium phage ΨM2


Thomas Leisinger. Tel. (1) 632 3324; Fax (1) 632 1148.


The methanogenic archaeon Methanobacterium thermoautotrophicum Marburg is infected by the double-stranded DNA phage ΨM2. The complete phage genome sequence of 26 111 bp was established. Thirty-one open reading frames (orfs), all of them organized in the same direction of transcription, were identified. On the basis of comparison of the deduced amino acid sequences to known proteins and by searching for conserved motifs, putative functions were assigned to the products of six orfs. These included three proteins involved in packaging DNA into the capsid, two putative phage structural proteins and a protein related to the Int family of site-specific recombinases. Analysis of the N-terminal amino acid sequences of three phage-encoded proteins led to the identification of two genes encoding structural proteins and of peiP, the structural gene of pseudomurein endoisopeptidase. This enzyme is involved in the lysis of host cells, and it appears to belong to a novel enzyme family. peiP was overexpressed in Escherichia coli, and its product was shown to catalyse the in vitro lysis of M. thermoautotrophicum cells. Comparison of the phage ΨM2 DNA sequence with parts of the sequence of the wild-type phage ΨM1 suggests that ΨM2 is a deletion derivative, which formed by homologous recombination between two copies of a direct repeat.


Because of their essential role in the final steps of anaerobic mineralization of organic matter and because they produce the fuel and greenhouse gas methane, the strictly anaerobic methanogens are among the best-studied Archaea with respect to biochemistry (Ferry, 1992; Weiss and Thauer, 1993) and genome analysis (Reeve, 1992). However, functional genetics of methanogenic Archaea is still in its infancy. For example, there is no information available about origins of DNA replication, and our understanding of mechanisms governing gene expression in these organisms is rudimentary. Systems for genetic transformation have been described for some methanogens (Gernhardt et al., 1990; Conway de Macario et al., 1996; Metcalf et al., 1997; Whitman et al., 1997), but not for Methanococcus jannaschii or Methanobacterium thermoautotrophicum ΔH, whose complete genome sequences are known (Bult et al., 1996; Smith et al., 1997). The need for systems to enable studies on the effects of genetic manipulation in vivo has led to an intensive search for extrachromosomal elements with potential for use as cloning vectors. As a result, the complete DNA sequences of plasmids of methanogens have been determined (Bokranz et al., 1990; Nölling et al., 1992; Stettler et al., 1994; Bult et al., 1996), and these efforts have led to the development of shuttle vectors replicating in Escherichia coli and in Methanococcus maripaludis (Tumbula et al., 1997) or in representatives of the genus Methanosarcina (Metcalf et al., 1997). In addition, some phages of methanogens, such as viruses of Methanobrevibacter smithii (Baresi and Bertani, 1984; Knox and Harris, 1986), Methanococcus voltae strains (Bertani, 1989; Wood et al., 1989) and Methanobacterium strains (Meile et al., 1989; Nölling et al., 1993) have been described. The best characterized among these archaeophages is phage ΨM1, a virus of the thermophilic methanogen Methanobacterium thermoautotrophicum Marburg.

Phage ΨM1 is a virulent, double-stranded DNA phage with a polyhedral head of 55 nm diameter and a tail of 210 nm in length (Meile et al., 1989). Phage particles have been shown by electron microscopy to contain 30.4 ± 1.0 kb of DNA, and restriction analysis indicated a genome size of 27.1 kb. The ΨM1 genome is thus circularly permuted, exhibits terminal redundancy of approximately 3 kb and is packaged from concatemeric precursors (Jordan et al., 1989). It was also noted that about 15% of the viral particles contain concatemers of the cryptic 4.5 kb plasmid pME2001 carried by M. thermoautotrophicum Marburg, the sole known host of phage ΨM1 (Meile et al., 1989). The latter observation has led to the postulation that ΨM1 is capable of generalized transduction, an assumption that has been verified experimentally for some genetic markers (Meile et al., 1990).
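The step from these two size estimates to circular permutation and terminal redundancy can be made explicit with a small sketch. It assumes an idealized headful mechanism in which every particle receives exactly 30.4 kb cut processively from a concatemer of 27.1 kb unit genomes, starting at coordinate 0; the function name and the fixed headful size are illustrative assumptions, not part of the original analysis.

```python
def headful_packages(genome_len, headful_len, n):
    """Return (start, end) unit-genome coordinates of n successive headfuls.

    Assumes idealized processive packaging from a concatemer: each headful
    begins where the previous one ended, and coordinates wrap around the
    unit genome length.
    """
    packages = []
    pos = 0
    for _ in range(n):
        start = pos % genome_len
        end = (pos + headful_len) % genome_len
        packages.append((start, end))
        pos += headful_len
    return packages


GENOME_BP = 27_100    # restriction-analysis estimate from the text
HEADFUL_BP = 30_400   # DNA content per particle from the text

# Each particle carries one unit genome plus a repeated stretch at its end.
redundancy_bp = HEADFUL_BP - GENOME_BP
packages = headful_packages(GENOME_BP, HEADFUL_BP, 3)
```

Under these assumptions the redundancy is 3300 bp (about 12% of the unit genome), and each successive particle starts 3300 bp further along the map, which is what a circularly permuted population looks like.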

During the propagation of phage ΨM1 under laboratory conditions, the spontaneous deletion mutant ΨM2 becomes predominant. It lacks a 0.7 kb fragment of DNA, which is located at co-ordinate 23.25 kb of the ΨM1 restriction map (Jordan et al., 1989) and in the following is designated as DR1. As ΨM2 is more stable than the wild-type phage ΨM1, the present study focuses on phage ΨM2 and the mechanism of its formation from its predecessor ΨM1. As a basis for comparison with other viral genomes and for gene expression studies in a methanogen, we report here the complete nucleotide sequence of archaeophage ΨM2 and the functional assignment of some of its genes.


Nucleotide sequence determination of the phage ΨM2 genome and of the supplementary DNA fragment DR1 of phage ΨM1

Both strands of the entire archaeophage ΨM2 genome and of the DNA element DR1 of the wild-type phage ΨM1 were sequenced with a redundancy of at least three (GenBank accession numbers AF065411 and AF065412 respectively). The assembled linear DNA sequence of ΨM2 has a length of 26 111 bp, and that of element DR1 of the wild-type phage ΨM1 extends over 692 bp. Their predicted restriction sites based on the determined DNA sequences were in full agreement with the published restriction map of phage ΨM1 (Jordan et al., 1989) as well as with additionally performed DNA digestion experiments (data not shown).

Properties of ΨM2 DNA

DNA digestions with the adenine-methylation-sensitive endonucleases ClaI, DpnI, EcoRI, EcoRV, MboI, PstI, SalI and XhoI as well as with the cytosine-methylation-sensitive restriction enzymes BamHI, HaeIII, HpaII, MspI, NaeI and SmaI indicated that ΨM2 DNA carries neither cytosine nor adenine methylation (data not shown).

The overall G + C content of the archaeophage ΨM2 genome is 46.3%. This is somewhat lower than the 48% G + C determined for M. thermoautotrophicum Marburg (Touzel et al., 1992), the sole known host of archaeophages ΨM1 and ΨM2. The G + C content of the ΨM2 genome is not evenly distributed, with four low G + C (< 40%) DNA regions of at least 600 nucleotides extended over parts of the open reading frames orf3, orf4 and orf6 and over the entire open reading frames orf5, orf29 and orf30.
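A scan of this kind can be sketched as a sliding-window calculation. The window length (600 nt) and the 40% cutoff are taken from the text; the function names, the one-nucleotide step and the merging of overlapping windows are assumptions made for illustration and may differ from the procedure actually used.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def low_gc_regions(seq: str, window: int = 600, threshold: float = 0.40):
    """Return merged (start, end) half-open intervals covered by windows
    of length `window` whose G+C fraction is below `threshold`."""
    seq = seq.upper()
    regions = []
    if len(seq) < window:
        return regions
    # Running G+C count of the current window, updated in O(1) per step.
    gc = sum(1 for base in seq[:window] if base in "GC")
    for start in range(len(seq) - window + 1):
        if start > 0:
            gc -= seq[start - 1] in "GC"            # base leaving the window
            gc += seq[start + window - 1] in "GC"   # base entering the window
        if gc / window < threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps the previous low-G+C interval: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Applied to a genome sequence string, `gc_fraction` gives the overall G + C content and `low_gc_regions` the candidate low G + C stretches, which can then be compared with the orf coordinates.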

About 75% of the regions with at least two perfect direct repeats (≥ 7 nucleotides) per 100 nucleotides were found to cluster in four regions of the ΨM2 genome, between nucleotides 2250 and 3400, 4180 and 4950, 13 630 and 14 690 and 23 400 and 25 590. All four regions contain orfs, and the first and the last ones coincide with two of the low G + C regions mentioned above. In the case of the region spanning nucleotides 4180–4950, it has been shown previously that it harbours the pac locus of phage ΨM2 (Jordan et al., 1989).
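One simple way to look for such repeat-dense regions is to index every 7-mer and ask which 100-nucleotide windows contain at least two that recur. The sketch below is an illustration under assumptions: it counts distinct recurring 7-mers in non-overlapping windows, so a single longer repeat contributes several overlapping 7-mers, and the original search may have been defined differently.

```python
from collections import defaultdict


def direct_repeats(seq: str, k: int = 7):
    """Map each k-mer that occurs at least twice on the given strand to the
    list of its 0-based start positions (perfect direct repeats only)."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def repeat_dense_windows(seq: str, k: int = 7, window: int = 100,
                         min_repeats: int = 2):
    """Start positions of non-overlapping windows that contain at least
    `min_repeats` distinct k-mers recurring within the same window."""
    hits = []
    for start in range(0, len(seq) - window + 1, window):
        if len(direct_repeats(seq[start:start + window], k)) >= min_repeats:
            hits.append(start)
    return hits
```

Merging adjacent hits would then give candidate clusters to compare with the four regions listed above.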

General features of the ΨM2 coding regions

By applying the search method outlined in Experimental procedures, 31 protein coding regions of at least 90 amino acids in length (= molecular mass of 10 kDa) were identified in the forward strand. Nine open reading frames identified in the reverse strand overlapped with putative genes in the forward strand. As their translation products showed no similarities with protein sequences stored in public databases, they were not taken into consideration further. All open reading frames on the forward strand of phage ΨM2 were preceded by a potential ribosome binding site with at least 45% identity to the proposed consensus sequence 5′-AGGAGGTGATC-3′ (Brown et al., 1989). Archaeal promoters are defined by a highly conserved box A (5′-WTAWW-3′) located 27 ± 4 bp upstream of the transcription start and by a box B consisting of a pyrimidine at the transcription start followed by a purine (Zillig et al., 1993). Based on these criteria, a promoter could be found for one-third of the open reading frames, suggesting that the putative genes orf3 to orf4, orf8 to orf10, orf12 to orf19 and orf20 to orf27 might be co-transcribed (see Fig. 1). Additional support for co-transcription of two of the four postulated gene clusters (orf3 to orf4 and orf8 to orf10) was provided by the identification of terminators, which are structurally identical to the ones found in archaeophage SSV1 of Sulfolobus shibatae (Zillig et al., 1993), following orf4 and orf10.
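The two sequence criteria used here, a minimum length of 90 codons and at least 45% identity to the 11-nucleotide ribosome binding site consensus (that is, at least 5 of 11 matching positions), can be sketched as follows. The start codons, stop codons and consensus come from the text; the function names, the choice of the first in-frame start codon and the 20-nucleotide upstream search distance are assumptions for illustration, not the published procedure.

```python
STARTS = {"ATG", "GTG", "TTG"}
STOPS = {"TAA", "TAG", "TGA"}
RBS_CONSENSUS = "AGGAGGTGATC"


def forward_orfs(seq: str, min_codons: int = 90):
    """Return sorted (start, end) half-open coordinates of forward-strand
    ORFs with at least `min_codons` codons before the stop codon.
    For each stop codon, the first in-frame start codon is used."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i
            elif codon in STOPS:
                if start is not None and (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # end includes the stop codon
                start = None
    return sorted(orfs)


def best_rbs_identity(seq: str, orf_start: int, upstream: int = 20) -> float:
    """Best ungapped identity between the consensus and any window of the
    same length within `upstream` nucleotides 5' of the start codon."""
    region = seq.upper()[max(0, orf_start - upstream):orf_start]
    n = len(RBS_CONSENSUS)
    best = 0.0
    for i in range(len(region) - n + 1):
        matches = sum(a == b for a, b in zip(region[i:i + n], RBS_CONSENSUS))
        best = max(best, matches / n)
    return best
```

An ORF would pass the filter described in the text when `best_rbs_identity(...) >= 0.45`. Scanning the reverse strand would need the reverse complement of the sequence, which is omitted here.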

Figure 1. Linear representation of the 26 111 bp archaeophage ΨM2 genome, showing the major restriction sites and the numbering system first used by Jordan et al. (1989). The locations of the pac site (Jordan et al., 1989) and of fragment DR1 of the wild-type phage ΨM1 (see below) are indicated by filled triangles. orfs are represented by boxes numbered as in Table 2. The shading and the vertical offset mark the gene location in the three possible reading frames. Probable or verified functions of the gene products of several predicted orfs are indicated. Positions of promoters (P) and terminators (T) identified by sequence comparison are shown.

At 46.7%, the G + C content of the ΨM2 orfs is almost identical to the overall G + C content of the ΨM2 genome (46.3%), but lower than the G + C content of 50.1% observed in the 76 M. thermoautotrophicum Marburg genes sequenced so far. The open reading frames of phage ΨM2 are initiated at the codons ATG (77%), GTG (13%) or TTG (10%) and terminated at TGA (45%), TAA (36%) or TAG (19%). This distribution of start and stop codons differs from the one in M. thermoautotrophicum Marburg genes (Table 1). The difference in codon usage between phage ΨM2 and its host is caused, on the one hand, by a higher preference for guanine and cytosine in the first position of a codon in M. thermoautotrophicum Marburg. On the other hand, it is caused by the marked host preference at the wobble position for cytosine over thymidine and by an A to G ratio of one. As a consequence, phage ΨM2 has a less biased codon usage with only one rare codon (defined as a codon representing ≤ 5% of the codons for a specific amino acid), whereas nine rare codons are observed in genes of M. thermoautotrophicum Marburg.
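The statistics in this paragraph (start and stop codon usage, G + C content at each codon position, and rare codons at the ≤ 5% cutoff) can be computed from a list of ORF sequences with a sketch such as the one below. The standard genetic code is assumed; the function names, the exclusion of the start and stop codon from the rare-codon count, and the treatment of unused codons as rare are illustrative assumptions and not necessarily the choices made for Table 1.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(orf: str):
    """Split an ORF into complete codons (a trailing partial codon is dropped)."""
    orf = orf.upper()
    return [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]


def start_stop_usage(orfs):
    """Relative frequencies of start codons and of stop codons."""
    n = len(orfs)
    starts = Counter(orf[:3].upper() for orf in orfs)
    stops = Counter(orf[-3:].upper() for orf in orfs)
    return ({c: k / n for c, k in starts.items()},
            {c: k / n for c, k in stops.items()})


def gc_by_position(orfs):
    """G+C fraction at codon positions 1, 2 and 3 over all ORFs."""
    counts, total = [0, 0, 0], 0
    for orf in orfs:
        for codon in codons(orf):
            total += 1
            for pos, base in enumerate(codon):
                counts[pos] += base in "GC"
    return [c / total for c in counts] if total else [0.0, 0.0, 0.0]


def rare_codons(orfs, cutoff: float = 0.05):
    """Sense codons accounting for <= cutoff of the codons of their amino
    acid, counted over internal codons only (start and stop excluded)."""
    usage = Counter(c for orf in orfs for c in codons(orf)[1:-1])
    per_aa = Counter()
    for codon, aa in CODON_TABLE.items():
        per_aa[aa] += usage[codon]
    return sorted(codon for codon, aa in CODON_TABLE.items()
                  if aa != "*" and per_aa[aa] > 0
                  and usage[codon] / per_aa[aa] <= cutoff)
```

Running the same functions on the phage ORFs and on a set of host genes would give the kind of side-by-side comparison summarized in Table 1.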

Table 1. Codon usage of archaeophage ΨM2 compared with M. thermoautotrophicum Marburg. The overall G + C content of all genes and the G + C content in each of the three positions of all codons is shown for the 31 predicted genes of archaeophage ΨM2, its three experimentally confirmed orfs and for the 76 identified genes of M. thermoautotrophicum Marburg. For the wobble position, the ratios of adenine to guanine and of thymine to cytosine are indicated. The last two columns compare the percentage distribution of the start and stop codons in the different organisms.

ORFs of ΨM2 with no known function

For determining the putative function of the predicted proteins of archaeophage ΨM2, their amino acid sequences were screened for similarities to sequences stored in public databases (see Experimental procedures). Table 2 lists the predicted orfs of archaeophage ΨM2, and Fig. 1 shows their arrangement on the genome map. Some 68% of the proposed proteins (21 ORFs) shared no significant similarity with any protein sequence stored in the public databases. For six of these proteins, encoded by orf7, orf24, orf25, orf26, orf27 and orf30, at least one membrane-spanning region is predicted by the program TMpred (Hofmann and Stoffel, 1993).

Table 2. General features of the putative orfs of archaeophage ΨM2. a. Experimentally confirmed ORF. Sp, SWISSPROT; pir, PIR database; t, TREMBL. The start/stop positions are calculated from the first nucleotide of the first codon to the last nucleotide of the stop codon, using the numbering system of Jordan et al. (1989). Column F specifies the reading frame in which a specific orf can be found. The molecular mass of the proteins was calculated using a program by Bjellqvist et al. (1993). The analysis of the protein coding regions was performed as described in Experimental procedures. In the cases in which a related protein could be identified, the similarity to this protein is indicated as the percentage of amino acid identity calculated using the program GAP (Wisconsin package 8.1.0., Genetics Computer Group). Based on this analysis, possible functions of the gene products in ΨM2 are predicted.

ORFs of ΨM2 to which probable functions can be assigned

Positioned to the right of the pac site, three predicted coding regions, orf8, orf9 and orf10 form a gene cluster (Fig. 1). The putative gene product ORF9 exhibits similarity to the large subunit of the terminase of Bacillus subtilis bacteriophage SPP1 (Chai et al., 1992). Unlike the bacteriophage SPP1 large terminase subunit, ORF9 encompasses a putative ATP-binding motif (motif A) and a nucleotide-binding pocket (motif B). In several E. coli and B. subtilis bacteriophages, these features are found in the small terminase subunit (Black, 1989). ORF8 of the archaeophage ΨM2 genome appears to encode a small terminase subunit. It exhibits limited similarity to the N-terminus of the small terminase subunit of prophage PBSX of B. subtilis (McDonnell et al., 1994). A conserved characteristic of small terminase subunits is an N-terminally located helix–turn–helix motif. In the small terminase subunit of PBSX and in ORF8, this motif is located between positions 21 and 42. However, the sequences of the two helix–turn–helix motifs share only weak similarities to each other.

orf10, the third gene of the cluster to the right of pac, encodes a probable portal protein, which is similar to the probable portal protein of Haemophilus influenzae phage HP1 (Esposito et al., 1996). The location and arrangement of orf8, orf9 and orf10 compared with the highly conserved DNA packaging genes in bacteriophages (Casjens, 1990) underline the assumption that these genes code for the enzyme complex necessary for the cutting, maturation and packaging of phage genomes into the head of phage ΨM2.

About 1000 bp downstream of the genes for the presumptive packaging complex (Fig. 1), orf12 to orf19 are arranged in close proximity to each other. ORF12 and ORF13 are highly similar to the late gene products XkdF (probability with BLAST: 3 × 10⁻¹⁰) and XkdG (probability with BLAST: 1 × 10⁻²²), respectively, of the B. subtilis prophage PBSX (McDonnell et al., 1994). The sequence property approach of Hobohm and Sander (1995) for searching protein databases revealed that ORF13 is similar to the major head protein of coliphage Φ80 (reliability of 87%). The isolation of structural proteins of phage ΨM2 gave additional indications that orf13 encodes a structural protein (see below). Therefore, orf12 to orf19 of phage ΨM2 may encode a set of structural proteins.

ORF21 is the largest putative gene product encoded by ΨM2 DNA, and it is proposed to represent a tail protein. In the middle of its 1186 amino acid sequence, ORF21 contains nine direct repeats of the sequence KIEFP. Database searches revealed that ORF21 has 19.9% amino acid identity to the minor tail protein of mycobacteriophage L5 (Hatfull and Sarkis, 1993) and 15.5% amino acid identity to the smooth muscle form of myosin heavy chain (Matsuoka et al., 1993). With a probability of 1 × 10⁻⁴ (BLAST), it is also homologous to the gene product XkdO of Bacillus prophage PBSX (McDonnell et al., 1994). These similarities are mainly found in the non-repetitive parts of ORF21. The repetitive part itself shows similarities with an adhesive mussel protein (probability with BLAST: 9 × 10⁻⁵).

Putative gene orf21 is immediately followed by orf22. Based on similarities and on motif searches, orf22 is thought to encode an ATP/GTP-binding protein.

ORF29 is the last putative gene product to which a probable function, that of a DNA integrase, could be assigned by similarity searches. ORF29 exhibits significant similarities with representatives of the phage λ site-specific DNA recombinase family in the highly conserved integrase regions (probabilities of 8 × 10⁻⁷ in BLAST searches). The λ family of integrases, also known as tyrosine recombinases, is conserved in organisms ranging from Archaea to yeast. Its members catalyse intermolecular DNA rearrangements without energy input through covalent binding of the target DNA to a nucleophilic tyrosine residue in the active site pocket of the enzyme. They share four strongly conserved residues, including the active site tyrosine, and they exhibit various levels of identity in the C-terminal portion of their amino acid sequences (Esposito and Scocca, 1997). The residual parts of the integrase proteins are variable and do not contribute to the activity. In Fig. 2, the catalytic core of the putative ΨM2 integrase is compared with the most similar sequences, the core of the recombinase xerC of H. influenzae (Fleischmann et al., 1995) and the transposase TnpA of Staphylococcus aureus transposon Tn554 (Murphy et al., 1985), as well as with the consensus sequence of the integrase family derived from 88 prokaryotic recombinases, including four proteins whose three-dimensional structure has been determined (Nunes-Düby et al., 1998). With the exception of the first Arg (located in box 1) of the conserved tetrad Arg-212–His-308–Arg-311–Tyr-342 (according to the λ integrase numbering), 85% of all conserved residues in the catalytic core of the Int family of proteins are also present in the C-terminal part of the proposed integrase of ΨM2.

Figure 2. Multiple sequence alignment of the conserved catalytic core of the putative ΨM2 integrase with various members of the λ integrase family and the consensus sequence derived from 88 prokaryotic recombinases (Nunes-Düby et al., 1998). The alignment considers the two regions of marked sequence similarity (Box I and Box II) and the sequence patches I–III, which together allow the identification of members of the Int family of site-specific recombinases. Amino acids boxed in black form the active centre of the integrases, whereas the open boxed (hydrophobic) and grey boxed (acid, basic, S/T, G/A) amino acids are conserved residues clustered around the active site pocket. The integrase sequences shown are from Haemophilus influenzae (xerC; accession number P44818), Staphylococcus aureus (Tn554; accession number 224807) and from archaeophage ΨM2 (ORF29; this work).

Experimental identification of structural proteins and the lytic enzyme of phage ΨM2

Separation of ΨM2 virion proteins by SDS–PAGE allowed the identification of three major protein bands with apparent molecular masses of 35 kDa, 20 kDa and 10 kDa, which were subsequently subjected to N-terminal sequencing. For each of these three bands, partial amino acid sequences of the two proteins encoded by genes orf13 and orf18 were obtained (see Table 3). For ORF13, the N-terminal sequences of the 20 kDa and the 10 kDa protein bands were identical with each other, but apparently both N-terminally and C-terminally processed when compared with the sequence found in the 35 kDa band (see Table 3). The size reduction of the 10 kDa band compared with the 20 kDa band must, therefore, result from further C-terminal processing. The same conclusion applies to the N-terminal sequences obtained for the three proteins derived from ORF18. Processing at the N- and C-termini of these proteins is consistent with their possible function as head proteins. In the process of forming the mature head, head proteins are often cleaved proteolytically (Black, 1989). Co-migration of two structural proteins and their processed products, as observed here, is reminiscent of coliphage HK97, for which it has been demonstrated experimentally that the head shell subunits are covalently cross-linked to each other (Popa et al., 1991). Similarly, discrepancy between the apparent molecular mass of (co-migrating) head shell proteins as determined by SDS gel electrophoresis and their predicted mass has been reported previously for several phages (Popa et al., 1991; Hatfull and Sarkis, 1993; Van Sinderen et al., 1996).

Table 3. N-terminal sequences of ΨM2 structural proteins. a. X stands for cycles without a callable amino acid, and an amino acid with uncertain identification is given in parentheses.

The deduced N-terminus of the protein encoded by orf28 exhibited similarity to the N-terminus (EVGLNEFLDMKKRYEDFK; P. Pfister, unpublished) of the pseudomurein-degrading enzyme produced by Methanobacterium wolfei (Fig. 3C). Stax et al. (1992) have shown previously that, similar to the M. wolfei enzyme, a partially purified preparation of the lytic enzyme in ΨM1 lysates cleaves the ε-Ala-Lys isopeptide bond of pseudomurein. The product of orf28 is thus a pseudomurein endoisopeptidase, and the structural gene of this enzyme will be referred to as peiP.

Figure 3. Experimental identification of the lytic enzyme pseudomurein endoisopeptidase of archaeophage ΨM2. A. The gene peiP coding for the pseudomurein endoisopeptidase was overexpressed in E. coli BL21 (DE3) cells by adding 0.2 mM IPTG at OD600 = 0.6 and subsequent growth overnight at 30°C (see Experimental procedures). Aliquots of 20 μg of the following crude extracts were loaded onto an SDS–PAGE gel: lane 1, E. coli BL21 (DE3) containing the cloned gene for the pseudomurein endoisopeptidase; lane 2, E. coli BL21 (DE3) with plasmid pET24a; and lane 3, E. coli BL21 (DE3). The arrowhead indicates the overexpressed pseudomurein endoisopeptidase. B. Lysis of M. thermoautotrophicum Marburg cells by the overexpressed pseudomurein endoisopeptidase during incubation at 60°C under anaerobic conditions (see Experimental procedures). The arrow indicates the time of addition of crude extract to the samples. In samples 1–3, 65 μg of the following E. coli crude extracts were injected: (1) E. coli BL21 (DE3) containing the cloned peiP gene; (2) E. coli BL21 (DE3) with plasmid pET24a; (3) E. coli BL21 (DE3). Sample 4 is a negative control containing only M. thermoautotrophicum Marburg cells. A decrease in optical density at 546 nm indicates cell wall-degrading activity. C. The experimentally determined N-terminal sequence of the pseudomurein endoisopeptidase of M. wolfei (P. Pfister, unpublished) and the deduced N-terminus of archaeophage ΨM2-encoded pseudomurein endoisopeptidase are compared. Identical amino acids are bold and underlined.

The gene peiP was cloned and overexpressed in E. coli grown under aerobic conditions (see Fig. 3A). To reactivate the oxygen-sensitive pseudomurein endoisopeptidase, crude extracts of E. coli were reduced overnight with 30 mM dithiothreitol (DTT) under a H2–CO2 atmosphere (80%:20%, v/v) before enzymatic assays. Figure 3B shows that the reactivated pseudomurein endoisopeptidase expressed in a heterologous host was able to lyse cells of M. thermoautotrophicum Marburg under anaerobic and reducing conditions.

The element DR1 of the wild-type phage ΨM1

As mentioned above, archaeophage ΨM2 resulted from a deletion event in the wild-type phage ΨM1. The DNA fragment DR1 of the wild-type phage, which is missing in ΨM2, was cloned in E. coli together with parts of its flanking regions. Subsequent sequencing showed that DR1 has a length of 692 bp and apparently represents an insertion at position 22 554 of the phage ΨM2 sequence (see Fig. 4). Upstream of DR1 and at its 3′ end, directly repeated 82-bp-long sequences were identified (repeat 2 in Fig. 4). This suggests that DR1 was lost by homologous recombination between the two copies of repeat 2.

Figure 4. Schematic representation of element DR1 of wild-type phage ΨM1 and the site of its deletion in the phage ΨM2 genome. A. Parts of element DR1 that are at least 97% identical with phage ΨM2 DNA are boxed in grey. The highly identical orfA and orf27 (78% amino acid identity) are shown by black arrows. Two major repeats are shown by arrows 1 and 2. Repeat 1 (17 bp) occurs twice in the element DR1, and one of its copies is located within a short region, Dra, which has no other homology to ΨM2 DNA. Repeat 2 of DR1 (B) and repeat 2 with some sequence 3′ of the deletion in the ΨM2 genome (C) are outlined. The deletion/insertion sites (vertical arrows) are located at the 3′ end of repeat 2. Box A of the putative promoter and the postulated ribosomal binding site of peiP are indicated in bold underlined or bold double underlined respectively.

Analysis of DR1 indicates that it may have resulted from a tandem duplication. The 692-bp supplementary element contains one putative gene, designated orfA. Gene orfA exhibits 84% identity at the DNA level with orf27, which in ΨM2 maps immediately upstream of the deletion site. It is likely that these duplicated regions of the ΨM1 genome diverged and that, subsequently, one of them (DR1) was deleted. As shown in Fig. 4C, this deletion event did not change either the promoter or the ribosome binding site of peiP in ΨM2.
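The proposed mechanism, loss of DR1 by recombination between two copies of a direct repeat, has a simple consequence that a toy model makes visible: one repeat copy and everything between the two copies are removed, so the deleted length equals the distance between the starts of the two copies. The sketch below uses short made-up sequences; the function name and the example strings are hypothetical and are not the ΨM1 sequences.

```python
def delete_between_repeats(seq: str, repeat: str) -> str:
    """Model recombination between the first two copies of a direct repeat:
    the product keeps a single repeat copy and loses the intervening DNA."""
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("need two non-overlapping copies of the repeat")
    # Join the sequence upstream of copy 1 to the sequence from copy 2 onward.
    return seq[:first] + seq[second:]


# Hypothetical toy genome: flank, repeat, unique insert, repeat, flank.
parent = "AAAA" + "GATTACA" + "CCCCC" + "GATTACA" + "TTTT"
derivative = delete_between_repeats(parent, "GATTACA")
deleted_bp = len(parent) - len(derivative)
```

In this toy case 12 bp are lost (a 7 bp repeat copy plus the 5 bp insert). By the same logic, an 82 bp repeat lying directly upstream of a 692 bp element that ends in the second copy would give a deletion of exactly 692 bp, which is consistent with the sizes reported above.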


The most extensively studied archaeal viruses are the temperate phages SSV1 of Sulfolobus shibatae and φH of Halobacterium salinarium, which have served as models for archaeal gene expression. Analysis of the phage transcripts produced upon induction of the prophage or by phage infection of the host has yielded information on the transcriptional organization and on the temporal expression of viral genes. Transcription start points were shown to be preceded by consensus sequences for archaeal promoters, and the organization of genes encoding functionally related proteins in transcription units has been demonstrated (Gropp et al., 1989; Palm et al., 1991). The entire 15.4 kb nucleotide sequence of phage SSV1 (Palm et al., 1991) as well as the 12 kb nucleotide sequence of the central region of phage φH (Gropp et al., 1992) have been reported. Sulfolobus shibatae, the host of SSV1, is a representative of the Crenarchaeota, which, together with the Euryarchaeota, form two major phylogenetic lineages among the Archaea (Woese et al., 1990). M. thermoautotrophicum Marburg, the host of phages ΨM1 and ΨM2, belongs to the Euryarchaeota. With respect to phylogenetic analysis, the nucleotide sequence of phage ΨM2 reported here thus fills a gap and offers the possibility for comparison of two complete viral genomes within the Archaea. However, as shown by the analysis of the ΨM2 nucleotide sequence reported in the Results, we found no indication of amino acid sequence similarity between ΨM2 proteins and the proteins encoded by the two archaeal phages SSV1 and φH. When sequence similarities of ΨM2 proteins to proteins in the databanks were observed, they were limited in most cases to proteins encoded by phages of Gram-positive bacteria.
It has been proposed that prokaryotic genomes consist of two different groups of genes: the deeply diverging informational genes and the more recently diverging operational genes that, in the course of evolution, have been subject to horizontal transfer (Rivera et al., 1998). According to this view and in line with the modular theory of phage evolution (Botstein, 1980), the phage ΨM2 genes are representatives of the functional class of operational genes.

Some of the genes identified on the ΨM2 genome encode functions that have to be postulated from what is already known about this virus. This applies to the proteins of the headful packaging machinery, which, in accordance with results of restriction analyses (Jordan et al., 1989), support packaging of concatemeric ΨM1 DNA in a processive series of cuts that are initiated at the pac site. ORF8 and ORF9 are thought to represent the small and the large subunit of a phage terminase that cuts at pac, and ORF10 is postulated to function as a portal protein through which phage DNA enters the procapsid. The genes encoding ORF8, ORF9 and ORF10 are contiguous, and the pac locus lies within the coding region of ORF8. This organization bears striking similarity to that of the packaging genes of the Salmonella typhimurium phage P22, a temperate phage whose encapsidation functions have been studied in some detail (Casjens, 1990).

In accordance with the detection of a lytic enzyme in ΨM1 lysates of M. thermoautotrophicum (Stax et al., 1992), the ΨM2 genome also contains peiP, the structural gene of pseudomurein endoisopeptidase. As database searches have not yielded proteins with amino acid sequence similarities to this enzyme, pseudomurein endoisopeptidase appears to define a novel enzyme family. Whereas about 12% of the orfs of M. thermoautotrophicum ΔH encode polypeptides with amino-terminal sequences consistent with signal peptides (Smith et al., 1997), pseudomurein endoisopeptidase lacks such a signal. In common with lysis systems of characterized bacteriophages (Young, 1992), this implies that ΨM2 must encode a holin, i.e. a protein enabling the passage of pseudomurein endoisopeptidase through the cytoplasmic membrane, so that it can reach its cell wall target. As holins encoded by different phages have no detectable sequence similarity, it is likely that the presumptive holin gene of the ΨM2 genome has escaped identification so far.

The finding that orf29 of the phage ΨM2 genome encodes a putative protein related to the Int family of site-specific recombinases was unexpected. Phage-encoded representatives of this class of enzymes are necessary for site-specific integration into and excision of viral genomes out of the genomes of their respective hosts in a variety of temperate bacteriophages (Nash, 1996) and in archaeophage SSV1 of S. shibatae (Muskhelishvili et al., 1993). They are not found in virulent phages. As infection of M. thermoautotrophicum Marburg with phage ΨM1 consistently led to lysis of the host and as none of the phage-resistant M. thermoautotrophicum mutants tested carried phage-related DNA, it was concluded that ΨM1 is a virulent phage (Meile et al., 1989). However, the presence of a putative integrase gene now argues for ΨM1 being a temperate phage. At the genome level, this is supported by the apparent absence of a gene for a DNA polymerase, a function encoded on all genomes of virulent, double-stranded DNA phages, and by the observation of generalized transduction of genetic markers by phage ΨM1 (Meile et al., 1990). Furthermore, Methanobacterium wolfei, a methanogen related to M. thermoautotrophicum Marburg, has been shown to carry a stably integrated defective prophage in its chromosome. DNA–DNA hybridization studies revealed that the genome of this element is closely related to phages ΨM1 and ΨM2, but encompasses only 80–90% of phage DNA, because it lacks or is not homologous in two non-contiguous segments of ΨM1 DNA (Stettler et al., 1995). It appears that one of these non-homologous regions covers part or all of the pseudomurein endoisopeptidase gene. It will now be interesting to examine the DNA sequences flanking the prophage DNA in M. wolfei. They might reveal direct repeats typically derived from integrative recombination of phage DNA with the host chromosome and thereby help us to identify the substrates of the phage ΨM2-encoded putative DNA integrase.

On the other hand, ORF29 of ΨM2 might be no integrase, as it shares the highest similarities with chromosomally encoded recombinases, such as XerC of H. influenzae (Fleischmann et al., 1995), which are necessary for the maintenance of replicons in a monomeric state and for correct plasmid inheritance. Most importantly, the protein coded by orf29 contains a Gly at the position of the first Arg of the conserved catalytic triad Arg-212–His-308–Arg-311 (according to the λ integrase numbering). In representatives of the Int family of site-specific recombinases, this amino acid change leads to loss of enzymatic activity (Nunes-Düby et al., 1998). Therefore, three hypotheses can be proposed to account for these observations: (i) the putative gene product of orf29 is an inactive integrase because of the change of the conserved Arg-212 into a Gly (according to the λ integrase numbering); (ii) ORF29 is active and required to separate dimeric phage replicons at replication, but does not fuse host and phage replicons to produce lysogens; (iii) ΨM1 and ΨM2 are temperate phages encoding the first example of an integrase that is active without the conserved arginine. If this is true, ΨM1 and ΨM2 may be unable to lysogenize their host under the laboratory conditions used for their propagation. Characterization of the putative integrase of ΨM2 is expected to expand our understanding of evolution within the Int family of site-specific recombinases.

Experimental procedures

Media, bacterial strains, phages and plasmids

Methanobacterium thermoautotrophicum Marburg (DSM2133) (Fuchs et al., 1978) was used for the propagation of bacteriophages ΨM1 (Meile et al., 1989) and ΨM2 (Jordan et al., 1989) by the liquid lysate technique described earlier (Meile et al., 1995). M. thermoautotrophicum Marburg was grown in serum flasks containing minimal medium (Schönheit et al., 1980) supplemented with 8 μM Na2WO4 and 5.8 μM Na2Se2O3. The flasks were gassed with 2 bar of H2–CO2 (80:20, v/v). E. coli strains DH5α (Hanahan, 1983), JM109, XL-1 blue (Stratagene) and BL21 (DE3) (Studier, 1989) were grown in liquid LB broth (Sambrook et al., 1989) or in LB broth solidified with 2% (w/v) agar. Ampicillin, kanamycin, Xgal (Sigma Chemical) and IPTG (Biosynth) were used at final concentrations of 100 μg ml−1, 25 μg ml−1, 0.002% (w/v) and 1 mM respectively. For the subcloning of ΨM2 DNA, plasmids pUC28/29 (Benes et al., 1993), pBluescript KS/SK (Stratagene) and pGEM-7Zf (Promega) were used. The pseudomurein endoisopeptidase gene was overexpressed in vector pET24a (Novagen).

DNA amplification and sequencing

Unless otherwise indicated, the standard methods of Sambrook et al. (1989) were used. Phage DNA was isolated as described previously (Jordan et al., 1989). Competent cells of E. coli were prepared according to Inoue et al. (1990) and transformed by the technique of Pope and Kent (1996). Restriction enzymes and T4 DNA ligase were used according to the instructions of the supplier (MBI Fermentas, Boehringer Mannheim). High-performance liquid chromatography (HPLC)-purified oligonucleotides used for polymerase chain reaction (PCR) and sequencing were purchased from Microsynth.

PCR reactions for DNA amplification were performed as follows: after heating the samples for 5 min at 94°C, the target DNA was amplified with 35 subsequent cycles at 94°C for 1 min, 40°C for 1 min and 72°C for 1 min with a 10 min extension. PCR fragments were then separated in and isolated from agarose gels.
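The cycling programme above can be written as a small data structure, which makes the total programmed time easy to estimate. The sketch below is illustrative only: it assumes that the 10 min extension is a single final hold at 72°C after the 35 cycles (the text does not say so explicitly) and it ignores ramp times between temperatures. All names are hypothetical.

```python
# Sketch of the PCR programme described above.
# Assumption: the 10 min extension is one final hold at 72 C after cycling;
# ramp times between temperatures are ignored.

INITIAL_DENATURATION = (94, 5 * 60)           # (temperature in C, seconds)
CYCLE_STEPS = [(94, 60), (40, 60), (72, 60)]  # denaturation, annealing, extension
N_CYCLES = 35
FINAL_EXTENSION = (72, 10 * 60)


def total_hold_seconds():
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]
```

Under these assumptions the programmed hold time is 5 min + 35 × 3 min + 10 min = 120 min.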

Both strands of the complete ΨM2 genome and of the ΨM1 fragment DR1 with its flanking regions, subcloned as overlapping fragments in E. coli, were sequenced by cycle sequencing. The double-stranded DNA was denatured for 5 min at 95°C followed by 60 cycles at 95°C for 36 s, 55°C for 36 s and 72°C for 84 s. For the DNA polymerization reactions, the enzyme Thermosequenase (Amersham) was used. The cycle sequencing reactions were loaded either on an automatic sequencer (ABI model 373A) or on a direct blotting electrophoresis apparatus (GATC) with subsequent membrane development using the DIG detection protocol of Boehringer Mannheim.

Analysis of the nucleotide sequence of phage ΨM2

Single sequences from sequencing runs were assembled with the program SEQMAN of the DNASTAR software package (DNASTAR), and the nucleotide sequence was translated in all six reading frames by TRANSLATE (Wisconsin Package 8.1.0, Genetics Computer Group). The search for orfs was performed manually using the following assumptions: an orf should encode a protein of at least 90 amino acids and be preceded by a ribosomal binding site with at least 45% identity to the proposed consensus sequence 5′-AGGAGGTGATC-3′ (Brown et al., 1989). Identified ORFs were then subjected to a homology search against all non-redundant GenBank translations and the databases PDB, SWISSPROT and PIR using the FASTA program (Wisconsin Package 8.1.0; Pearson and Lipman, 1988), different BLAST algorithms such as BASIC BLAST, GAPPED BLAST and PSI BLAST at NCBI (Altschul et al., 1990; 1997) and WU BLAST at EBI (Warren Gish, unpublished), as well as PROPSEARCH at EMBL (Hobohm and Sander, 1995). Different motif search routines for searching through the PROSITE database together with a transmembrane helix prediction program (TMpredict) (Hofmann and Stoffel, 1993) and a program to find helix–turn–helix motifs from Wisconsin Package 8.1.0 completed the analysis by the end of January 1998. Finally, the percentage similarity was calculated for the significant matches using the program GAP (Wisconsin Package 8.1.0, Genetics Computer Group).

Isolation and determination of structural proteins

Crude phage lysate, prepared as described above, was supplemented with NaCl to a final concentration of 0.5 M and stored for 1 h at 4°C. After clarification of the lysate by centrifugation (10 min at 9000 g, 4°C), the supernatant was incubated for 1 h at 37°C with 0.05 mg ml−1 DNase and 0.02 mg ml−1 RNase A. Phage particles were precipitated overnight at 4°C by the addition of PEG solution to 1× (5× PEG solution: 207 g of polyethylene glycol 6000; 6 g of dextran sulphate; 49.5 g of NaCl; 350 ml of water). Phage particles were collected by centrifugation for 20 min at 9000 g, 4°C, resuspended in 10 ml of SM buffer (5.8 g of NaCl; 2 g of MgSO4 × 7 H2O; 50 ml of 1 M Tris-HCl, pH 7.5, and 1 l of water) and then treated with 1 volume of chloroform to remove the excess polyethylene glycol. After a second centrifugation step, the water phase containing the phage particles was purified further by equilibrium centrifugation for 48 h at 40 000 g, 18°C in a CsCl gradient with a density adjusted to 1.4 g cm−3. The resulting gradient was collected in 1 ml fractions. The phage-containing fractions, determined by DNA isolation according to Jordan et al. (1989), were concentrated and washed by ultrafiltration through Vivaspin concentration tubes (molecular mass cut off at 10 kDa; Vivascience). Structural proteins were isolated by phenol extraction according to Sauvé et al. (1994) and subsequently subjected to SDS–PAGE according to the method of Schaegger and von Jagow (1987). The proteins were blotted onto a PVDF membrane (Millipore) with an electrophoretic transfer cell from Bio-Rad according to the manufacturer's instructions. After staining with Coomassie brilliant blue R, the protein bands were excised from the blot for N-terminal amino acid sequencing by automated Edman degradation.

Overexpression and activity measurement of the pseudomurein endoisopeptidase

The gene for the lytic enzyme of phage ΨM2 was amplified by PCR (conditions above) using the oligonucleotides Psi1 (5′-GGA GGG CCA CAT ATG AGA TC-3′; contains an NdeI site, CATATG) and Psi2 (5′-TGC CCA AGC TTC TTT TTT C-3′; contains a HindIII site, AAGCTT). The PCR product was digested with NdeI and HindIII and cloned into the expression vector pET24a (Novagen). For pseudomurein endoisopeptidase expression, an exponentially growing culture of E. coli BL21 (DE3) containing the cloned peiP gene was grown at 30°C to an OD600 of 0.6. At this stage, the culture was induced with 0.2 mM IPTG and incubated for 12 h. The E. coli cells were collected by centrifugation for 10 min at 5000 g, 4°C, washed once with 20 mM potassium phosphate buffer, pH 8.0, and broken in a French pressure cell. Clarified crude extract obtained after 10 min centrifugation at 16 000 g, 4°C, was moved into an anaerobic chamber (Coy Instruments) filled with a gas atmosphere of N2–H2 (95%:5%, v/v), reduced by the addition of 30 mM DTT, 2 mM MgCl2 and 2 bar H2–CO2 and stored at 4°C overnight.
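The restriction sites carried by the primers Psi1 and Psi2 can be located with a simple string search, which is a convenient check when the typographical marking of the sites is not available. The sketch below uses the primer sequences given above and the standard recognition sequences of NdeI (CATATG) and HindIII (AAGCTT). The function and variable names are hypothetical.

```python
# Primer sequences from the text, written without spaces (5' to 3').
PRIMERS = {
    "Psi1": "GGAGGGCCACATATGAGATC",
    "Psi2": "TGCCCAAGCTTCTTTTTTC",
}

# Recognition sequences of the two enzymes used for cloning.
SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}


def locate_sites(primers=PRIMERS, sites=SITES):
    """Map each primer name to {enzyme: 0-based index} for every site it contains."""
    found = {}
    for name, seq in primers.items():
        found[name] = {
            enzyme: seq.find(site)
            for enzyme, site in sites.items()
            if site in seq
        }
    return found
```

For these two primers the search finds the NdeI site at index 9 of Psi1 and the HindIII site at index 5 of Psi2 (0-based).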

The cell wall-degrading activity was monitored by the decrease in optical density at 546 nm of a cell suspension of M. thermoautotrophicum Marburg over time. The enzymatic assay was carried out anaerobically in a Varian Cary 1E spectrophotometer at 546 nm and at 60°C in stoppered 1.5 ml plastic cuvettes containing 1 ml of a buffer containing 20 mM potassium phosphate, pH 7.0, 30 mM DTT, 2 mM MgCl2, 10^8 M. thermoautotrophicum Marburg cells and 65 μg of E. coli crude extract.
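One simple way to summarize such a time course is the slope of a straight line fitted to the optical density readings. The text does not state how the decrease in optical density was quantified, so the following is only a sketch of one possible evaluation, assuming an approximately linear decrease over the chosen interval. The function name and any readings passed to it are hypothetical.

```python
def od_decrease_rate(times_min, od_values):
    """Least-squares slope of OD versus time, returned as a positive decrease per minute.

    Assumes an approximately linear decrease over the interval supplied.
    """
    n = len(times_min)
    if n < 2 or n != len(od_values):
        raise ValueError("need at least two paired (time, OD) readings")
    mean_t = sum(times_min) / n
    mean_od = sum(od_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (od - mean_od)
              for t, od in zip(times_min, od_values))
    return -sxy / sxx
```

A control cuvette without crude extract, evaluated in the same way, would give the background rate to subtract.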


We thank D. Esposito for helpful comments and advice. This work was supported by grants 31-40775.94 and 31-50593.97 from the Swiss National Foundation for Scientific Research.