Although the genomic sequences of a number of Archaea have been completed in the last three years, genetic systems in the sequenced organisms are absent. In contrast, genetic studies of the mesophiles in the archaeal genus Methanococcus have become commonplace following the recent developments of antibiotic resistance markers, DNA transformation methods, reporter genes, shuttle vectors and expression vectors. These developments have led to investigations of the transcription of the genes for hydrogen metabolism, nitrogen fixation and flagellin assembly. These genetic systems can potentially be used to analyse the genomic sequence of the hyperthermophile Methanococcus jannaschii, addressing questions of its physiology and the function of its many uncharacterized open reading frames. Thus, the sequence of M. jannaschii can serve as a starting point for gene isolation, while in vivo genetics in the mesophilic methanococci can provide the experimental systems to test the predictions from genomics.
Methanogens are ubiquitous microorganisms that catalyse the terminal step in the anaerobic food chain by reducing simple compounds to methane (reviewed in Zinder, 1993). Members of the genus Methanococcus are marine methanogens of low mol% G + C content that are limited to growth on either CO2 and H2 or formate. Nearly 30 isolates of the four mesophilic species have been described (Keswani et al., 1996). The mesophilic species are not closely related to the thermophilic and hyperthermophilic methanococci, which should probably be reclassified into a novel genus and family respectively.
The development of genetic systems in the methanococci is of special interest. Methanococci are Archaea, one of the three domains of life in addition to the Bacteria and Eukarya (Woese et al., 1990). Although Archaea are clearly a distinct line of descent (Woese et al., 1990), many aspects of their physiology and biochemistry are poorly understood in comparison with those of members of the other two domains (as most dramatically illustrated by Bult et al., 1996). Because they are one extreme of prokaryotic diversity, additional characterization of the Archaea will provide important information on the nature of the ancient ancestor of the prokaryotes. In addition, many features of archaeal transcription, translation and replication are similar to eukaryal processes (reviewed by Reeve, 1992). As the archaeal systems are expected to be simpler than the eukaryal systems, many basic aspects can be studied directly in the Archaea. Lastly, about 1% of the carbon fixed by plants each year is processed by methanogens. Because of its significance in the carbon cycle, the physiology and biochemistry of methanogenesis are also of special interest. Thus, many central issues in the biology of Archaea remain to be solved, and the methanococci may prove to be useful organisms in which to study these questions.
Until very recently, studies on methanogens and other Archaea have largely been performed without genetic methods. Most of the isolated genes were obtained by screening libraries with probes based upon purified proteins or, more rarely, by complementation of mutations in Escherichia coli (reviewed by Reeve, 1992). Genetic methods for methanogens have developed slowly for a number of reasons. Methanogens are strictly anaerobic lithotrophs. For many species, growth is slow, and cultivation is labour intensive. Like other archaea, methanogens lack sensitivity to many of the antibiotics that are used as genetic markers (Bock and Kandler, 1985). Because they use different expression signals from bacteria, resistance genes derived from the bacteria must be engineered with the archaeal transcription signals. Likewise, even broad-host-range vectors that are of great utility in bacteria do not replicate in archaea, presumably because of differences in DNA replication. In fact, although great progress has been made recently, very few genetic systems are available in Archaea (Noll and Vargas, 1997).
In spite of these difficulties, genetic systems have great potential in archaeal research. Although progress has been made in the last two decades, large gaps exist in our knowledge of archaeal physiology, molecular biology and biochemistry. The availability of genomic sequences provides one of the best estimates of how little we actually know. For instance, in M. jannaschii, functional assignments can be made for less than half of the open reading frames (ORFs) identified in the genomic sequence, and many of these assignments indicate only the general nature of the proposed genes and not specific functions (Bult et al., 1996). A metabolic reconstruction based upon this sequence contains numerous incomplete pathways, implying that many basic biosynthetic genes are yet to be identified (Selkov et al., 1997). Genetic techniques, representing some of the most powerful investigative tools in biology, will surely play an important role in elucidating the nature of these organisms. Given the tremendous gaps in our knowledge of archaea, the choice of the particular archaeon to study should be guided by the experimental systems available in the organism. Thus, rapid progress in addressing fundamental questions in archaeal biology can be made if experimentally tractable organisms are chosen for study.
Methanococci possess particular advantages for genetic studies. The methanococci are among the fastest growing mesophilic methanogens, and a typical generation time for autotrophic growth is 2 h. Unlike many methanogens, which are either strict autotrophs and take up organic substrates poorly or have very complex nutritional requirements, many methanococci are facultative autotrophs that readily take up amino acids and other organic substrates from the medium, which facilitates labelling studies and the isolation of mutants. Because of early work on media and protocol development, methanococci can be plated with efficiencies near 100% as well as cultivated in fermentors at the 400-litre scale. Unlike many methanogens that possess complex and poorly characterized cell envelopes, the cell walls of methanococci are composed of a simple protein S-layer, and cells lyse in low-ionic-strength buffer or very low concentrations of detergents (Whitman et al., 1986). This property facilitates isolation of DNA and other cellular components.
Current genetic tools and methods
A polyethylene glycol method optimized for Methanococcus maripaludis yielded over 10⁵ transformants μg⁻¹ of an integrative plasmid (Tumbula et al., 1994). Electroporation of Methanococcus voltae protoplasts stabilized with 1% BSA yielded nearly 200 transformants μg⁻¹ of circular plasmid DNA and 3500 transformants μg⁻¹ of linearized DNA (Patel et al., 1994; Berghofer and Klein, 1995; Jarrell et al., 1996a). Although a phage has been described in M. voltae, transduction has not yet been found (Wood et al., 1989; and unpublished data).
Markers, selections, reporter genes and vectors
Currently, the most widely used marker in the methanogens is the puromycin resistance marker, or pac cassette, developed by Klein, Sibold and coworkers (Gernhardt et al., 1990). The cassette consists of the puromycin transacetylase (pac) gene from Streptomyces alboniger linked to the transcriptional promoter and terminator from the methylreductase (mcr) operon of M. voltae. This cassette confers puromycin resistance to other methanococci as well as Methanosarcina spp. (Blank et al., 1995; Metcalf et al., 1997). A similar cassette conferring neomycin resistance in M. maripaludis has also been constructed (Argyle et al., 1995). Lastly, the introduction of a site-specific deletion into the hisA gene of M. voltae allowed for the selection of histidine prototrophy upon transformation with plasmids bearing the wild-type gene (Pfeiffer et al., 1998a). Thus, several genetic markers are available in the methanococci.
A self-replicating shuttle vector conferring puromycin resistance has been developed for M. maripaludis (Tumbula et al., 1997a). The vector, pDLT44, was constructed by ligating partial EcoRI digests of pURB500, a cryptic methanococcal plasmid from M. maripaludis C5, and pMEB.2, an E. coli pUC vector containing the pac cassette. pDLT44 transformed M. maripaludis with frequencies in excess of 10⁷ transformants μg⁻¹ DNA. These frequencies are sufficient to screen libraries of randomly cloned DNA. Recently, a strong promoter and ribosome binding site from the gene encoding the M. voltae histone have been introduced upstream of a multiple cloning site in pDLT44 (W. Gardner and W. B. Whitman, unpublished results). The resulting vector can express heterologous proteins that, because of their unusual coenzymes or oxygen sensitivity, are not expressed in an active form in bacteria or Eukarya. Currently, the host range of pDLT44 in methanogens is limited to M. maripaludis, and it fails to transform M. voltae (Tumbula et al., 1997a).
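The statement that these transformation frequencies suffice for library screening can be made concrete with the standard Clarke-Carbon estimate of how many independent clones are needed to cover a genome. The sketch below is illustrative only: the genome size and insert size are assumed round numbers that do not come from the studies cited here, and the function name is hypothetical.

```python
import math


def clones_needed(genome_bp: int, insert_bp: int, probability: float) -> int:
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - f), with f = insert/genome.

    Returns the number of independent random clones needed so that any given
    sequence is present in the library with the requested probability.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert must be shorter than the genome")
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# ASSUMED values for illustration only (not taken from the cited studies).
ASSUMED_GENOME_BP = 1_700_000   # rough size of a small methanogen genome
ASSUMED_INSERT_BP = 5_000       # average insert length in the library
TRANSFORMANTS_PER_UG = 10**7    # lower bound reported for pDLT44

n_clones = clones_needed(ASSUMED_GENOME_BP, ASSUMED_INSERT_BP, 0.99)
print(f"clones needed for 99% coverage: {n_clones}")
print(f"fraction of 1 ug yield required: {n_clones / TRANSFORMANTS_PER_UG:.2e}")
```

Under these assumptions the required number of clones is several orders of magnitude below the yield from a single microgram of vector DNA, which is consistent with the claim in the text.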
Because Archaea lack murein in their cell walls, the penicillin and cycloserine selections for auxotrophic mutants are ineffective. An alternative method of selection for non-growing cells is based upon the bactericidal activity of the nucleobase analogues azahypoxanthine, azaguanine and azauracil (Bowen and Whitman, 1987). Using this method, 10⁴-fold enrichments of acetate auxotrophs of M. maripaludis were obtained after mutagenesis with ethylmethane sulphonate (Ladapo and Whitman, 1990).
Three reporter genes have been designed for expression studies in methanococci. Klein and coworkers have demonstrated that the β-glucuronidase (uidA) gene from E. coli and the trehalase (treA) gene from Bacillus subtilis could be expressed in M. voltae (Beneke et al., 1995; Sniezko et al., 1998). In these constructions, the reporter gene is downstream of unique restriction sites for cloning the promoter of interest and upstream of a strong transcription terminator from the M. voltae mcr operon. In M. maripaludis, a β-galactosidase reporter system was based upon the lacZYA operon (Cohen-Kupiec et al., 1997). Colonies expressing β-galactosidase could be visualized after incubating plates under air, which kills methanogens. Therefore, colonies must be replica plated before visualization of β-galactosidase activity. All three reporter systems allow quantitative measurement of expression levels.
In vivo genetic studies
M. voltae contains four groups of hydrogenase genes (reviewed in Sorgenfrei et al., 1997). Two of these gene groups, fru and vhu, encode the selenium-containing F420-reducing and F420-independent hydrogenases respectively. The expression of these gene groups appears to be constitutive. Two additional gene groups, frc and vhc, encode the selenium-free homologues of fru and vhu. The frc and vhc gene groups are expressed divergently (Fig.1A). Transcripts of these genes are undetectable when selenium is present in the medium, but appear during growth in the absence of selenium (Berghofer et al., 1994). The 453 bp sequence between frc and vhc has been the focus of the first in vivo genetic studies in the methanogens (Beneke et al., 1995; Berghofer and Klein, 1995). Strains were constructed with the uidA reporter placed on either side of the frc–vhc intergenic region. These strains had β-glucuronidase activity only when cells were grown in the absence of selenium (Beneke et al., 1995). Thus, expression of the reporter was similar to that of the frc and vhc hydrogenases. Mutagenesis of putative control signals in this intergenic region is under way (Sorgenfrei et al., 1997). The presence of three heptameric direct repeats that overlap the transcriptional start site of frc is of particular interest (Fig.1A). Disruption of each of the three adjacent heptamer repeats caused partial derepression of β-glucuronidase from the frc promoter in the presence of selenium. Upon mutation of a fourth, isolated copy of this heptamer, little change in expression was observed. Removal of the 212 bp region containing these heptamers also caused derepression of vhc in the presence of selenium. An effect of the frc-proximal heptamers on vhc expression is an unusual feature of gene regulation in prokaryotes. It will be interesting to determine whether regulation of archaeal transcription also involves eukaryal features.
An unusual structural feature of the Vhu hydrogenase has been studied genetically (Pfeiffer et al., 1998b). Unlike the other three hydrogenases, whose large subunits are encoded by a single gene, the corresponding region of Vhu is encoded by two genes, vhuA and vhuU. Fusion of these genes did not impair the expression of active Vhu hydrogenase. Likewise, the vhuA–vhuU fusion had no effect on vhc and frc expression, disproving a hypothesis that VhuU was involved in negative regulation of the Vhc and Frc hydrogenases (Sorgenfrei et al., 1997).
The genetics of nitrogen fixation
Genes essential for N2 fixation in M. maripaludis have been identified by a transposon mutagenesis method that should be applicable to other cloned genes (Blank et al., 1995; Kessler et al., 1998). The pac cassette was introduced into a derivative of a mini-Mu plasmid, such that pac was flanked by the ends of the Mu transposon. The construction was used to mutagenize a lambda clone containing the nif operon (Fig.1B). Twelve independent insertions plus two directed insertions were isolated and transformed into M. maripaludis. The expected double cross-over events in the regions flanking the pac-Mu construct were verified by Southern analysis. Eight insertions prevented growth with N2 as the sole nitrogen source, while insertions downstream of the nif operon had no effect. Thus, the role of this operon in N2 fixation was confirmed.
Sequences involved in transcriptional regulation of the nifH system have also been identified (Cohen-Kupiec et al., 1997). In this study, the nifH promoter was fused to the E. coli lacZYA genes. Beta-galactosidase activity from this construct in ammonia-grown cells was the same as in promoterless constructions, but activity increased 25-fold when cells were grown on N2. The reporter system also allowed characterization of two sets of palindromic sequences near the nifH transcription start site (Fig.1B). Changing the upstream palindrome to a palindrome of a different sequence derepressed β-galactosidase activity during growth with ammonia. Mutations in the second palindrome had little effect. These and other experiments suggested that the first palindrome was similar to repressor binding sites common in bacteria. A major goal of future studies will be to confirm the regulatory mechanism by characterization of the repressor. Of special interest is whether or not transcriptional regulation in Archaea is unique or possesses strong similarities to bacterial or eukaryal regulation.
Archaeal flagellins are unrelated to bacterial flagellins in sequence, structure and assembly (reviewed in Jarrell et al., 1996b), and several lines of evidence suggest that they are homologous to bacterial type IV pili (Bayley and Jarrell, 1998). In M. voltae, the flagellar genes are arranged into two transcriptional units. The first unit contains only flaA, which is one of the four flagellin genes present. The second unit contains flaB1 to flaI and includes the remaining flagellin genes, flaB1, flaB2 and flaB3 (Fig.1C). The latter transcript also appears to be processed to a smaller transcript containing flaB1 and flaB2 (Kalmokoff and Jarrell, 1991). The first flagellin mutants in the Archaea were generated by an integration vector constructed from an internal fragment of the flaA gene (Jarrell et al., 1996a). The flaA mutant, although less motile, appeared to be identical to the wild type in both flagellar structure and molecular mass of the subunits. As the sequences of the flagellin genes are nearly identical, a mutant of flaB2 was also obtained using the same vector. The flaB2 mutant was non-flagellated and showed changes in the subunit molecular masses. These results suggest that at least one of the ORFs downstream of flaB2 is involved in the processing of the flagellins, but verification will await mutagenesis of the individual ORFs. Jarrell et al. (1996b) have proposed a model for flagellar assembly that will probably be tested by insertional mutation. As Archaea contain multiple flagellins, it will be interesting to determine how many are required for motility. This question could be addressed by mutagenesis of flaB1 and flaB3. The apparently unique assembly mechanism also deserves further study, as does the unknown nature of the modification of the flagellins. Fortunately, much structural data exist with which to compare genetic analyses (Jarrell et al., 1996b).
Impact of the genomic sequence of Methanococcus jannaschii
The availability of a complete archaeal genomic sequence of the hyperthermophile Methanococcus jannaschii begs for a genetic system in which hypotheses of gene function can be tested (Bult et al., 1996). Although various laboratories have predicted functions for 46–73% of the ORFs, many of these predictions are very general and do not relate to specific cellular activities (Kyrpides et al., 1996; Koonin et al., 1997). For instance, many of these predictions are limited to the identification of common protein motifs or similarity to a known family of enzymes. These predictions probably also differ greatly in their accuracy, and the true function of most of these genes will only be known after direct experimentation.
In the absence of a genetic system in M. jannaschii, the mesophiles M. maripaludis and M. voltae provide an opportunity to begin testing the functions of many of the sequence predictions. The rationales for this approach are strong. The physiology of the mesophiles appears similar to that of M. jannaschii, although the mesophiles are much easier to manipulate in the laboratory. On the amino acid level, genes in M. jannaschii and the mesophilic methanococci share about 60–80% sequence identity (W. Kim and W. B. Whitman, unpublished results). Although this level of sequence similarity is fairly low, similar to that between E. coli and Haemophilus influenzae, it is high enough for many purposes. Naturally, the genomic sequence of a mesophilic methanococcus would also advance these studies greatly. For instance, the studies described above have relied upon the clustering of functionally related genes to identify additional genes in a particular pathway. However, the genomic sequence of M. jannaschii suggests that many functionally related genes are not clustered (Bult et al., 1996; Bayley and Jarrell, 1998; Kunkel et al., 1998).
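The amino acid identity figures quoted above are derived from pairwise alignments of homologous proteins. As a minimal sketch of how such a percentage can be computed, the function below counts identical residues over the aligned, ungapped columns. This is only one of several conventions for the denominator and is not necessarily the one used in the unpublished comparison; the function name and the two short sequences are invented for illustration and are not real methanococcal proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned fragments (made up for this example).
fragment_a = "MKVLA-GT"
fragment_b = "MRVLACGS"
identity = percent_identity(fragment_a, fragment_b)
print(f"{identity:.1f}% identity")
```

In this toy alignment, five of the seven ungapped columns match, so the value falls within the 60–80% range described in the text.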
The need for functional analyses can be illustrated by a few examples. The initial analysis of the M. jannaschii genomic sequence failed to identify homologues to lysyl- and cysteinyl-tRNA synthetases (Bult et al., 1996). Even though ORF MJ0539 of M. jannaschii possessed little overall sequence similarity to other cysteinyl-tRNA synthetases, Koonin et al. (1997) identified motifs characteristic of class I tRNA synthetases and predicted that MJ0539 encoded the missing cysteinyl-tRNA synthetase. Using degenerate primers to screen a genomic library of M. maripaludis, Ibba et al. (1997) isolated the homologue of MJ0539. Upon expression in E. coli, the gene product possessed lysyl-tRNA synthetase activity and not cysteinyl-tRNA synthetase activity. This conclusion was confirmed by purification of lysyl-tRNA synthetase from M. maripaludis and comparison of its N-terminal amino acid sequence with MJ0539. All previously described lysyl-tRNA synthetases have been class II synthetases, and the methanogen enzyme is the first example of a class I lysyl-tRNA synthetase. The larger implication of these results is that the specificities of the aminoacyl-tRNA synthetases were still in evolutionary flux at the time of the divergence of the three domains of life (Ibba et al., 1997).
Similarly, sequence analyses of M. jannaschii identified two ORFs encoding the large subunit of acetohydroxyacid (acetolactate) synthase, an essential enzyme in branched-chain amino acid biosynthesis (Bult et al., 1996). Purification of the enzyme and sequencing of the gene from the mesophile Methanococcus aeolicus revealed that only one of these ORFs (MJ0277) possessed high sequence similarity to the authentic enzyme (Bowen et al., 1997). The other ORF, MJ0663, appeared to be a member of a large family of enzymes that included acetohydroxyacid synthase as well as pyruvate decarboxylase, pyruvate oxidase and glyoxylate carboligase. However, the actual activity of MJ0663 could not be predicted from comparison with the sequences of the other enzymes in this family. Genetics could be used to inactivate this gene to determine its physiological function.
Four out of seven of the genes for the common aromatic amino acid biosynthetic pathway are apparent in M. jannaschii (Bult et al., 1996). It is possible that the sequences of the missing three enzymes are merely too highly diverged from those in bacteria to be recognizable. However, a recent isotope-labelling study suggested that the first steps of the aromatic amino acid biosynthetic pathway in the methanogens might actually differ from the bacterial pathway (Tumbula et al., 1997b). This result could be confirmed by isolation of aromatic amino acid auxotrophs in M. maripaludis. Currently, strategies are being developed for the isolation of these mutants (Whitman et al., 1997). Mutants of M. maripaludis can be produced by transformation with a library of small genomic fragments cloned into a non-replicating vector containing the pac cassette. Upon recombination, plasmids integrate into the gene specified by the cloned fragment. If the insertion inactivates the gene, the mutant can be enriched by selection with base analogues. Upon isolation of the mutant, the plasmid can then be recovered by electroporation into E. coli. The genotype of the mutant can then be identified by sequencing the insert, and the plasmid can be transformed back into M. maripaludis to confirm the phenotype. This approach has already been used to isolate a number of acetate auxotrophs (Whitman et al., 1997) and should have general utility.
Conclusions and future directions
Over the last 10 years, efforts in a number of laboratories have developed a set of powerful and efficient genetic techniques for use in the study of mesophilic methanococci. With the advent of genomics, these techniques could be used to address fundamental questions in archaeal research. In addition to the problems discussed above, a large number of potential targets of molecular genetic studies have emerged from analyses of the M. jannaschii genomic sequence (Bult et al., 1996; Kyrpides et al., 1996; Selkov et al., 1997). For instance, in M. jannaschii, genes of a common pathway are seldom clustered in operons. The rationale for operon structure or the lack thereof should become clearer as the identification of more ORFs is confirmed and in vivo regulation studies are performed. Although labelling and enzyme studies indicate that CO2 is fixed via the Ljungdahl–Wood pathway of acetyl-CoA biosynthesis, archaeal genomes contain a homologue to RuBisCO, the key enzyme of the Calvin cycle. Knock-out mutants in mesophilic methanococci could address the physiological role of this enzyme and the early evolution of this pathway. Koonin et al. (1997) have proposed that cysteine biosynthesis in M. jannaschii may occur by a reversal of the eukaryal cysteine catabolic pathway. Insertional mutagenesis of the genes encoding the enzymes for the proposed pathway would be a straightforward test of this hypothesis. Similarly, it may be possible to isolate auxotrophs for methanogenic coenzymes and identify the genes involved in the biosynthesis of these unique compounds. It is expected that additional genetic studies in the methanococci will reveal unique molecular mechanisms in Archaea and shed light on the nature of ancient organisms. Finally, as more functional studies are performed, we expect to encounter more examples of the pitfalls of over-reliance on interdomain sequence analyses.
This work was supported in part by a US Department of Energy grant DE-FG02-97ER20269 to W.B.W.