Evolution of the soluble diiron monooxygenases

Authors

  • Joseph G Leahy,

    Corresponding author
    1. Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL 35899, USA
      *Corresponding author. Tel.: +1 (256) 824-6371; Fax: +1 (256) 824-6305, E-mail address: leahyj@uah.edu
  • Patricia J Batchelor,

    1. Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL 35899, USA
  • Suzanne M Morcomb

    1. Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL 35899, USA

Abstract

Based on structural, biochemical, and genetic data, the soluble diiron monooxygenases can be divided into four groups: the soluble methane monooxygenases, the Amo alkene monooxygenase of Rhodococcus corallinus B-276, the phenol hydroxylases, and the four-component alkene/aromatic monooxygenases. The limited phylogenetic distribution of these enzymes among bacteria, together with available genetic evidence, indicates that they have been spread largely through horizontal gene transfer. Phylogenetic analyses reveal that the α- and β-oxygenase subunits are paralogous proteins and were derived from an ancient gene duplication of a carboxylate-bridged diiron protein, with subsequent divergence yielding a catalytic α-oxygenase subunit and a structural β-oxygenase subunit. The oxidoreductase and ferredoxin components of these enzymes are likely to have been acquired by horizontal transfer from ancestors common to unrelated diiron and Rieske center oxygenases and other enzymes. The cumulative results of phylogenetic reconstructions suggest that the alkene/aromatic monooxygenases diverged first from the last common ancestor for these enzymes, followed by the phenol hydroxylases, Amo alkene monooxygenase, and methane monooxygenases.

1 Introduction

Oxygenases comprise a structurally, biochemically, and evolutionarily disparate group of enzymes that incorporate dioxygen into specific organic or inorganic substrates, and function in catabolic and detoxification reactions in both prokaryotic and eukaryotic organisms [1]. One important group of bacterial oxygenases includes the soluble methane monooxygenases (sMMOs), the phenol hydroxylases, the toluene monooxygenases, and the alkene monooxygenases. These are soluble, multicomponent enzymes which utilize dioxygen to catalyze the initial hydroxylation or epoxidation step in pathways for the oxidation of their respective hydrocarbon substrates, and require NAD(P)H as an electron donor [2–7]. From available structural information acquired to date, it appears that these enzyme systems all contain three or four components: a dimeric hydroxylase protein composed of two or three subunits in a (αβγ)2 or αβ quaternary structure, an NADH oxidoreductase with an N-terminal chloroplast-type ferredoxin domain and a C-terminal reductase domain with FAD- and NAD(P)-ribose binding regions, a small effector or coupling protein with no prosthetic groups, and in some cases, a Rieske-type ferredoxin protein [4–11]. Structural and genetic analyses of the sMMOs, the toluene 2-, 3-, and 4-monooxygenases, the phenol hydroxylase of Pseudomonas sp. strain CF600, and two alkene monooxygenases have shown that in each of these enzymes the α-oxygenase subunit contains a carboxylate-bridged diiron center at the active site which is homologous to the similarly bridged diiron-oxo clusters present in the R2 component of ribonucleotide reductase and stearoyl-ACP Δ9 desaturase [3,12–21]. Fox et al. [17] proposed that oxygenases containing this type of diiron center comprised an evolutionarily related group of proteins, the class II diiron proteins, to be distinguished from the unrelated class I diiron proteins exemplified by hemerythrin. 
(These form classes I and III, respectively, of the four categories of diiron protein delineated by Nordlund and Eklund [19].) Sequence analyses of isofunctional subunits between the sMMOs, alkene monooxygenases, toluene monooxygenases, and phenol hydroxylases support the further contention that they form a unified catalytic subclass within this group [22], which we will refer to as the soluble diiron monooxygenases.

The members of this family of oxygenases have been the subject of extensive research in recent years, largely because of interest in the nature of the diiron center at the active site of these enzymes and its function in catalysis, as well as the structure, arrangement, and biochemistry of electron transport chain and regulatory subunits which serve to deliver electrons to the oxygenase protein. From an ecological and environmental point of view, the study of the diiron monooxygenases has been motivated by their importance in expanding the substrate range of pseudomonads and other bacteria to include a diverse array of hydrocarbons, among them such hazardous compounds as benzene and trichloroethylene (TCE). The sMMOs have been the focus of particular attention because of the critical role that methane plays both in the carbon cycle [23] and as a greenhouse gas [24]. As a result of previous studies, we have begun to gain a better understanding of the structure of these oxygenases, their mechanisms for catalysis and substrate specificity, the organization of genes encoding monooxygenase subunits, and the degree of similarity among isofunctional subunits between different enzymes and different groups of enzymes. In this paper, we analyze the evolution of the soluble diiron monooxygenases, reviewing the data made available from prior investigations and extending these findings through an analysis of sequence identities, sequence alignments, and phylogenetic relationships for the α-oxygenase, β-oxygenase, NADH oxidoreductase, and ferredoxin components. We report here for the first time conclusive evidence that the α- and β-oxygenase genes arose from an ancient gene duplication event, and use this information both as a means of identifying the roots of the phylogenetic tree for the various oxygenase subunits and in determining the order of divergence of enzyme subfamilies. 
The role of horizontal gene transfer in the distribution of these enzymes is also discussed, as is the phylogenetic relationship between the electron transfer components and the isofunctional subunits of the membrane-bound diiron hydroxylases, the Rieske center non-heme iron oxygenases, and other enzymes.

2 Enzyme structure, function, and distribution

2.1 The sMMOs

The soluble diiron monooxygenases can be divided into four groups of enzymes based on their structure, and, to a lesser extent, their substrate specificity. This classification is rooted in the phylogenetic relationships between the enzymes in these groups, as will be discussed in detail later in this paper. One group is composed of the sMMOs, which are structurally and biochemically the most well-characterized members of this oxygenase family. These enzymes are unique among the diiron monooxygenases in their capacity to oxidize the highly stable methane molecule to methanol, the first step in the pathway for the utilization of methane by methanotrophs [23]. They contain a dimeric hydroxylase protein with three subunits in the (αβγ)2 stoichiometry, together with the reductase and effector proteins as discussed earlier. The sMMOs from the type X methanotroph Methylococcus capsulatus Bath [8–11,25], and the type II methanotrophs Methylosinus trichosporium OB3b [12,13,26] and Methylocystis sp. strains M [27,28] and WI 14 [29] have been purified and characterized. X-ray crystal structures have been reported for the hydroxylase components from the Mc. capsulatus Bath [16,30] and Ms. trichosporium OB3b [31] strains. Nuclear magnetic resonance (NMR) solution structures have been solved for the effector proteins from these strains [32,33], and for the ferredoxin domain of the reductase component from Mc. capsulatus Bath [34]. Structural analysis has shown that each α-oxygenase subunit of the hydroxylase (protein A or MMOH) contains a μ-hydroxo-bridged binuclear iron center that is the site of activation of both dioxygen and methane [16,35]. The reductase (protein C or MMOR) transfers electrons from NADH through its FAD and [2Fe–2S] cofactors to the active site [10]. The effector protein (protein B or MMOB) is a small α/β protein that is structurally and genetically related to putidaredoxin [32]. 
It plays a number of roles in catalysis, including effects on electron transfer and the rate and regioselectivity of substrate oxidation [11,36–42]. The primary component interactions are those of MMOR with the α- and β-oxygenase subunits of the hydroxylase and of MMOB with the α-oxygenase subunit [37,43]. Available data suggest that the γ-oxygenase subunit provides further stabilization of contacts between MMOB and the hydroxylase, while MMOB and MMOR are both required for conformational changes which follow the binding of substrate and which serve to bring the reductase close to the oxygenase active site [41]. Dioxygen bound at the diiron center is reductively activated to hydroxylate methane, with one oxygen atom forming the hydroxyl group on methanol and the other being reduced to water [44,45].

A gene of unknown function, orfY, is found immediately upstream of the oxidoreductase genes in all of the operons encoding sMMOs which have been described thus far. Recent work by Merkx and Lippard [46] has shown that orfY is expressed in Mc. capsulatus Bath and that it interacts with MMOH, although its precise role has yet to be described.

The sMMOs exhibit generally broad specificities, catalyzing the insertion of oxygen into a number of different hydrocarbon compounds. The sMMOs from Mc. capsulatus Bath, Ms. trichosporium OB3b, and Methylocystis sp. strain WI 14 oxidize a diverse array of hydrocarbons, including carbon monoxide, substituted methanes, alkanes, cycloalkanes, alkenes, haloalkenes, ethers, and aromatic and heterocyclic hydrocarbons [29,47–50]. Lipscomb [51] has attributed the lack of specificity of these enzymes to the tremendous oxidizing power required for the hydroxylation of the highly stable methane molecule. As far as is known, these transformations are adventitious and provide no benefit to the methanotrophs that express sMMOs. These reactions are of practical interest, however, in that the co-oxidation of hazardous compounds such as TCE yields products that decompose under abiotic conditions or are mineralized by other bacteria in the same community [52]. Consequently, there has been intense interest in the use of methanotrophs for the bioremediation of TCE-contaminated environments (see for example [53]).

Unlike the evolutionarily unrelated particulate methane monooxygenases (pMMOs), which are present in all methanotrophs [54], the sMMOs have been identified in relatively few strains, although these strains are phylogenetically diverse and include type I, II, and X methanotrophs [55]. The pMMOs are membrane-bound, copper-containing enzymes related to ammonia monooxygenases [56,57], and exhibit a narrower substrate specificity than do the sMMOs [49]. The pMMOs are expressed and active only under conditions of copper sufficiency, while the sMMOs are expressed only when concentrations of copper are limiting [58]. Murrell et al. [58] have suggested that the sMMOs may provide a competitive advantage to methanotrophs residing in copper-depleted environments such as peat bogs. Given the apparently scattered phylogenetic distribution of sMMOs among the methanotrophs, it seems likely that the sMMOs represent ‘alternative’ methane oxygenases which have been acquired through horizontal gene transfer, and which enable recipients to colonize a wider range of habitats than would be possible with the pMMO alone.

2.2 The alkene monooxygenase of Rhodococcus corallinus B-276

At present, the only known member of this second group of soluble diiron monooxygenases is the alkene monooxygenase of Rh. corallinus B-276, which was isolated on propene as a source of carbon and energy [59]. We will refer to this enzyme as the ‘Amo alkene monooxygenase’ to reflect its genetic nomenclature and to distinguish it from ammonia monooxygenase. This enzyme has been purified and characterized, and shown to contain hydroxylase, reductase, and effector subunits which are assembled in the same fashion as the sMMOs [4]. The structure of the hydroxylase differs in containing only the α- and β-subunits, which are arranged in an αβ configuration. The component subunits of the enzyme are homologous to their counterparts in the sMMOs [60], and the α-oxygenase was shown by identification of iron cluster ligands and by electron paramagnetic resonance spectroscopy to contain a similar binuclear iron center [20,61]. The alkene monooxygenase and the sMMOs also share the ability to epoxygenate alkenes, but while the alkene monooxygenase carries out stereospecific epoxidation reactions exclusively [20,62], the sMMOs yield both alcohols and racemic mixtures of epoxides as products [47,49]. The alkene monooxygenase also exhibits a far narrower substrate specificity, with the purified enzyme having been shown to oxidize C3 and C4 1- and 2-alkenes, styrene, and various chloroalkenes, but not 1-hexene or ethane [4,63]. There is, however, significant interest in this enzyme as a tool for the synthesis of optically active epoxides [20,62].

2.3 The phenol hydroxylases

This group of enzymes consists of three-component oxygenases that hydroxylate phenolic substrates to the corresponding catechols. We will designate this group ‘phenol hydroxylases’ for convenience, acknowledging that phenol has also been shown to be hydroxylated by flavoprotein hydroxylases from both bacteria [64,65] and fungi [66], as well as by enzymes in another group of diiron monooxygenases discussed later. Structural and genetic analyses of two of these enzymes, the Dmp phenol hydroxylase from Pseudomonas sp. strain CF600 [3,67,68], and the toluene o-monooxygenase, Tom, from Burkholderia cepacia G4 [5], have shown that these enzymes contain an (αβγ)2 hydroxylase protein with a diiron center, an FAD/[2Fe–2S] reductase, and an effector protein which is required for catalysis. NMR structural determinations for the Dmp effector protein, P2, showed it to have secondary structural elements similar to those of the homologous MMOB (protein B) effector of the sMMOs [69]. Both Dmp and Tom catalyze the oxidation of phenol and certain methyl-substituted phenols as the initial stage in the catabolism of these compounds [67,70]. The Tom enzyme, however, is also able to incorporate an oxygen atom into hydrocarbons with an unactivated benzene nucleus, and sequentially oxidizes toluene to o-cresol and 3-methylcatechol in the initial steps of the toluene o-monooxygenase pathway [5]. The enzyme oxidizes a variety of other substrates as well, including diethyl ether, TCE, the three isomers of dichloroethylene, vinyl chloride, and naphthalene [71–74]. The broad specificity of the Tom enzyme has prompted investigations with B. cepacia G4 and its derivatives as potential agents for bioremediation applications [75–77].

A number of phenol hydroxylases with homology to Dmp and Tom have been identified among pseudomonads and Acinetobacter spp. [78–85]. Little is known of their precise structure except what can be deduced from sequence analyses, however, and the substrate specificity of these enzymes is not well delineated. Some, such as the toluene/benzene 2-monooxygenase of B. cepacia JS150 [81], and the phenol hydroxylases of Comamonas testosteroni R5 and Ralstonia eutropha E2 [85], exhibit a substrate range similar to the Tom enzyme and are able to hydroxylate benzene, toluene, and other unactivated aromatic hydrocarbons. Interestingly, dimethyl sulfide has been shown to be a substrate for some of the phenol hydroxylases, including Dmp, Tom, and the hydroxylase from C. testosteroni TA441 [86].

2.4 The four-component alkene/aromatic monooxygenases

The fourth group of soluble diiron monooxygenases contains a mixed assemblage of four-component alkene monooxygenases and aromatic ring monooxygenases that exhibit overlapping substrate specificities. The archetypal member of this group is the toluene 4-monooxygenase (Tmo) from Pseudomonas mendocina KR1, identified by Pikus et al. [6] as the first aromatic oxygenase in this family of enzymes. Results of extensive structural analyses have shown that the Tmo consists of an (αβγ)2 hexameric hydroxylase with a diiron hydroxo-bridged center, and NADH reductase and effector proteins homologous to those of the sMMOs [6,87,88]. This enzyme is distinct, however, in that it also contains a separate Rieske-type ferredoxin component akin to the ferredoxins of the three-component aromatic ring dioxygenases [6,89]. This small, soluble, iron–sulfur protein functions in the transfer of electrons from reductase to hydroxylase, and was shown by NMR to contain the conserved iron ligands characteristic of other Rieske ferredoxins [89]. The effector protein of Tmo apparently mediates effects on catalysis through conformational changes in the hydroxylase as it does in other oxygenases [88], and has also been shown to play a role in regiospecificity and the efficiency of electron flow coupled with hydroxylation [90]. Characterization by NMR revealed a secondary structure topology similar to those of the effectors from the sMMOs and Dmp, but all were found to differ significantly in three-dimensional structure [88].

Tmo was originally investigated for its role in the regiospecific hydroxylation of toluene to produce p-cresol, the initial step in the 4-monooxygenase pathway of P. mendocina KR1 [87]. Subsequent studies have shown it to have a fairly broad substrate specificity, oxidizing acetanilide, chlorobenzene, ethylbenzene, TCE, 1,2-dichloroethane, chloroform, and C3–C8 alkenes, but not phenolic compounds [91–94]. Aromatic monooxygenases related to Tmo have since been identified in other pseudomonads; these include the toluene 3-monooxygenase (Tbu) of Ralstonia pickettii PKO1 [18], the toluene/o-xylene monooxygenase (Tou) of P. stutzeri OX1 [95], and the phenol hydroxylase (Phl) of R. eutropha JMP134 [96]. Based on sequence analyses, all were deduced to have the same four-component structure as Tmo, and all have attracted interest because of their ability to degrade TCE and other chlorinated aliphatic hydrocarbons [91,96–100]. Tbu, like Tmo, hydroxylates unactivated aromatic compounds but not phenols, and catalyzes the regiospecific oxidation of toluene at the 3 carbon [101,102]. The Tou enzyme differs from Tbu and Tmo in two respects: it hydroxylates both unactivated aromatic hydrocarbons and phenolic compounds, and it hydroxylates toluene with relaxed regiospecificity, yielding a mixture of o-, m-, and p-cresols [95,103]. Less is known for the Phl phenol hydroxylase, which has not been reported to hydroxylate unactivated aromatics.

Two alkene monooxygenases have also been identified which exhibit the four-component structure characteristic of this group. One, the Aam alkene monooxygenase from Xanthobacter sp. strain Py2, has been purified and resolved into an iron-containing (αβγ)2 hydroxylase, FAD-containing reductase, Rieske ferredoxin, and putative effector protein [7]. Sequence analyses revealed that these components [104] and those of the related isoprene monooxygenase from Rhodococcus sp. strain AD45 [105] exhibit a higher degree of similarity to the isofunctional proteins of Tmo than to the Amo alkene monooxygenase discussed earlier. Both enzymes are of commercial interest because of their ability to produce optically active epoxides from some substrates [105,106]. The specificity of the Aam alkene monooxygenase overlaps with that of the aromatic monooxygenases in the group, having been shown not only to oxidize unsubstituted and chlorinated alkenes up to six carbons, but also benzene, toluene, and phenol [104].

3 Genetic organization of operons

Table 1 lists the soluble diiron monooxygenases that were analyzed for the present study, together with the source organisms, the organization of the operons that encode the oxygenase subunits for each enzyme, and their genomic location. The order of genes in operons reflects the structural relationships between these enzymes, i.e. the order is identical for enzymes within each of the four structural groups of oxygenases outlined above, but differs between groups. It can be concluded that there is a strong evolutionary basis for these subdivisions, and that the divergence of these enzymes into functional groups was associated with extensive gene rearrangement. Gene order data do not, however, reveal any clear relationships between the groups. It is notable, though, that the oxidoreductase gene is always the last of the oxygenase genes encoded, which suggests that expression of the oxidoreductase in the absence of the other subunits may be detrimental to the cell.

Table 1.  Subdivisions and enzymes of the soluble diiron monooxygenase family, source strain, and organization and genomic location of genes encoding oxygenase subunits

While the oxygenase-encoding genes are in most cases tightly clustered, some of the operons contain significant gaps between coding regions. This is most apparent for the touABCDEF operon, which contains a 187-bp gap between the ferredoxin and effector genes. The sMMO operons also contain one major gap, but the position of the gap varies. In the type II methanotrophs Ms. trichosporium OB3b and Methylocystis sp. strains M and WI 14, a relatively large gap appears between the α-oxygenase and β-oxygenase genes, whereas in the type I methanotrophs Methylomonas sp. strains KSPIII and KSWIII and the type X methanotroph Mc. capsulatus Bath, the gap is between the γ-oxygenase and orfY genes. These gaps may reflect a recent acquisition or rearrangement of genes within the operon, whereby the coding regions on either side of the gap have not yet fully ‘coalesced’. For the sMMOs, the positions of these gaps also provide evidence of phylogenetic relationships, with the type II enzymes and type I and X enzymes clearly forming distinct groups on this basis.

In many cases, the operons that encode the soluble diiron monooxygenases contain genes that encode other functions, although these functions are generally related to the catabolism of the oxygenase substrate. The dmp operon, for example, contains 15 genes, with the first gene, dmpK, encoding a protein that appears to be involved in the insertion of the iron atoms into the active site of the oxygenase protein [119]. The dmpQBCDEFGH genes follow the oxygenase-encoding genes directly and encode enzymes of the meta-pathway for the dissimilation of catechol [120]. All of the other phenol hydroxylase operons contain a gene homologous to dmpK, and most have been shown to contain at least some of the genes of the meta-pathway [78–80,83–85]. In the case of the tbu operon encoding the toluene 3-monooxygenase, the gene encoding the positive transcriptional regulator of the operon, TbuT, is located downstream of the oxygenase genes and is itself regulated by read-through transcription [121].

4 Genomic location, base composition, and evidence for horizontal gene transfer

Of the 23 diiron monooxygenase operons analyzed for the present study, 12 have been shown to be chromosomally encoded and six plasmid-encoded (Table 1). Given the prominent role of extrachromosomal DNA elements in the horizontal transfer of genes in the prokaryotes [122], it can be surmised that the plasmidic genes have been acquired through this mechanism. The chromosomally encoded genes may also have a similar origin, however, as evidenced by the restricted phylogenetic distribution of these enzymes among bacterial species and strains, a pattern described earlier for the sMMOs in methanotrophs. Stated another way, there is little evidence for persistent vertical transmission of the soluble diiron monooxygenases, which would be manifested in the retention of these enzymes as species or genus characters (see discussion by Ochman et al. [123]). It is likely, rather, that these genetic loci were originally acquired via plasmidic transfer or other means and subsequently integrated into the chromosome.

It should also be emphasized that prokaryotic genes are constantly in flux from chromosome to plasmid and vice versa in response to selective pressures that optimize genomic location for the fitness of the organism. According to the local adaptation hypothesis [124,125], genes which are important for adaptation to local environments are more likely to be plasmid-encoded, while essential ‘housekeeping genes’ tend to be chromosomally encoded. There is a cost associated with mobility, however, so as conditions stabilize in a particular ecosystem, ‘local’ adaptations may become ‘long-term’ adaptations, and genes that were present on plasmids may be relocated to the chromosome to reduce energetic costs [126,127]. Consequently, the genomic location of genes encoding the soluble diiron monooxygenases may reflect not only the history and mechanism of gene acquisition, but also the stability of the environment for the organism.

The evidence for horizontal transfer is particularly intriguing for the genes encoding the alkene monooxygenases of Rh. corallinus B-276 and Xanthobacter sp. strain Py2, both of which are encoded on linear ‘megaplasmids’ (185 and 320 kb, respectively) [63,118]. While both enzymes catalyze the epoxidation of alkenes, they are structurally distinct and are expressed in bacteria that are not closely related (Rhodococcus is an actinobacterial genus and Xanthobacter is a member of the α-Proteobacteria [128]). The plasmid encoding the Aam alkene monooxygenase, pEK1, was also shown to contain the genes that encode epoxide metabolism, including a gene, xecG, which may be involved in the biosynthesis of coenzyme M. Coenzyme M acts as a nucleophile for the catalysis of epoxide ring opening in both Xanthobacter sp. strain Py2 and Rh. corallinus B-276, and thus is essential for alkene metabolism in these organisms [129,130]. XecG was found to exhibit a high degree of sequence identity with counterparts in the methanogens Methanobacterium thermoautotrophicum and Methanococcus jannaschii, as well as in Bacillus subtilis, suggesting the possibility of trans-domain gene and/or plasmid transfer from Archaea to Bacteria, or vice versa [131].

Apart from the phylogenetic distribution of genes and their location on extrachromosomal elements, support for horizontal gene transfer can also be derived by comparison of the GC base composition of the genes of interest with the average genomic composition for the organism. Recently acquired sequences can be identified as having GC compositions that reflect codon usage in the donor genome [132]. Among the soluble diiron monooxygenases, four of the 23 operons analyzed, tom, tmo, tou, and iso, contain genes with lower GC composition than the average for the genome. (Base compositions obtained from references cited in Table 1, or calculated from deduced amino acid sequence data using the MacVector program, version 6.01, Oxford Molecular Group, Beaverton, OR, USA. Appropriate genomic base compositions were obtained from [133–138].) All can be concluded to have originated from phylogenetically distinct organisms, with the tom and iso operons still retained on plasmids that were the likely vehicle of transmission, and the tou operon having since integrated into the chromosome. Another interesting observation which emerges is that for many of the gene clusters which encode the oxygenases (sometimes as part of a larger operon), one or more interior genes exhibit a lower GC composition than the first and last of the genes in the cluster. This pattern, which is most evident in the amo, phl, phh, tom, tbm, phc, pox, and tmo operons, and the mmo operon of Mc. capsulatus Bath, suggests either that these gene clusters were acquired by horizontal transfer with subsequent amelioration (adjustment of codon usage [139]) occurring more rapidly at the termini of the operon than at the interior, or that the interior genes were acquired separately from the genes at the termini and at a later evolutionary time. The former seems to be a more likely possibility, given that these variations are often no more than 3%.
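The base-composition comparison described above can be illustrated with a short sketch. It is not the procedure used for this study (which relied on the MacVector program and published genomic values); the function names and the flagging threshold are hypothetical and were chosen only for demonstration.

```python
def gc_percent(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    # Count only unambiguous bases; ambiguity codes and gap symbols are ignored.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_deviation(gene_seq: str, genome_gc_percent: float) -> float:
    """Gene GC% minus the genome-average GC% (negative: gene is GC-poorer)."""
    return gc_percent(gene_seq) - genome_gc_percent


def is_atypical(gene_seq: str, genome_gc_percent: float,
                threshold: float = 5.0) -> bool:
    """Flag a gene whose GC% departs from the genome average.

    The threshold (in percentage points) is a hypothetical cutoff,
    not a value taken from the study.
    """
    return abs(gc_deviation(gene_seq, genome_gc_percent)) >= threshold


if __name__ == "__main__":
    toy_gene = "ATGCATGCAT"   # invented sequence, for illustration only
    toy_genome_gc = 65.0      # invented genome average
    print(gc_percent(toy_gene), gc_deviation(toy_gene, toy_genome_gc))
```

Applying the same function to each gene of an operon in turn would expose the pattern noted above, in which interior genes differ in GC composition from the genes at the termini.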

5 Approach to phylogenetic analyses

To further investigate the evolution of the soluble diiron monooxygenases, the phylogenies of the α- and β-oxygenase, oxidoreductase, and ferredoxin proteins were assessed from deduced amino acid sequences for the enzymes listed in Table 1. The γ-oxygenase subunits were not considered in these analyses because the Amo alkene monooxygenase does not contain this protein, and the γ-oxygenase subunits of the remaining three groups do not exhibit significant amino acid similarity. The effector proteins were excluded because preliminary attempts to infer their phylogeny yielded trees for which topologies were poorly supported and therefore not useful for consideration of evolutionary relationships.

Three different approaches were used for phylogenetic analysis of each of the subunits. Pairwise identity scores were calculated for all possible pairs of isofunctional sequences using the GAP algorithm of the GCG Wisconsin Package (Version 10.0, Accelrys, Burlington, MA, USA). Multiple sequence alignments were carried out using the PILEUP program from the same package. Finally, phylogenetic trees were inferred by maximum likelihood analysis using PROTML from the MOLPHY 2.3b3 package [140]. The JTT model of amino acid substitutions was used [141] with adjustment to observed amino acid frequencies [142]. Bootstrap values were estimated by the RELL method [143] implemented in PROTML. Optimal trees were identified by branch rearrangement (-r option of PROTML) imposed on an initial neighbor-joining tree [144]. Alternative topologies with constraints on specific nodes were also evaluated by maximum likelihood. Trees were drawn using the DRAWTREE program from the PHYLIP 3.5c package [145]. For the oxidoreductase and ferredoxin proteins, homologous sequences for isofunctional subunits from unrelated enzymes were included in multiple sequence alignments and phylogenetic tree constructions to better delineate tree topologies and to clarify the evolutionary relationship between these proteins.
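As an illustration of the first of these approaches, a pairwise identity score can be computed from two sequences that have already been aligned. The sketch below is not the GAP algorithm itself, which also performs the alignment; it assumes that identity is counted only over columns in which neither sequence has a gap, a convention that varies between programs. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Columns containing a gap in either sequence are skipped (an assumed
    convention); the remaining columns form the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented nine-column alignment: eight ungapped columns, six identical.
    print(percent_identity("MKTA-YEHL", "MRTAGYDHL"))
```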

6 Comparisons of α-oxygenase sequences

Table 2 summarizes for each of the four groups of soluble diiron monooxygenases the pairwise identity scores for the α-oxygenase subunits. Mean identity scores within each of the groups range from 58.36 to 86.31% (exclusive of the Amo alkene monooxygenase), and are significantly higher than are the mean scores for any of the between-group comparisons, which vary from 21.45 to 35.27%. It can be concluded that the α-oxygenase subunits of each group bear a closer evolutionary relationship to one another than to those of other groups, and that these groups comprise evolutionary subfamilies as previously indicated from gene order data. While further analysis of the data reveals few clues as to the relationships between these subfamilies, it is apparent that the Amo alkene monooxygenase and the sMMOs are more closely related to one another (mean identity score of 35.27%) than they are to the other groups (mean scores of 21.45–26.75%).

Table 2.  Pairwise identity data for α-oxygenase subunits, expressed as mean values for comparisons of deduced amino acid sequences within and between designated subfamilies
  1. 100% identities resulting from same sequence comparisons were excluded from the data used for these calculations. Sequence data were obtained from GenBank under the accession numbers listed in Table 1.

| | sMMOs | Amo alkene monooxygenase | Phenol hydroxylases | Alkene/aromatic monooxygenases |
|---|---|---|---|---|
| sMMOs | 86.31±8.72 | 35.27±0.76 | 21.45±1.86 | 22.37±1.58 |
| Amo alkene monooxygenase | | 100.00±0.00 | 24.97±1.56 | 23.65±2.10 |
| Phenol hydroxylases | | | 71.80±10.87 | 26.75±2.08 |
| Alkene/aromatic monooxygenases | | | | 58.36±14.25 |
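The within- and between-subfamily summaries in Table 2 can be sketched in a few lines of code. This is an illustration only, not the GAP procedure: it assumes the sequences are already aligned to equal length, it defines identity over columns in which neither sequence has a gap, and it treats the ± value as a population standard deviation, all of which are assumptions. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev

def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def group_summary(aligned, groups):
    """Mean and spread of pairwise identity within and between groups.

    aligned maps a sequence name to its aligned sequence; groups maps a
    sequence name to its subfamily label. Self-comparisons never occur
    because combinations() only yields distinct pairs.
    """
    scores = {}
    for n1, n2 in combinations(sorted(aligned), 2):
        key = tuple(sorted((groups[n1], groups[n2])))
        scores.setdefault(key, []).append(
            percent_identity(aligned[n1], aligned[n2])
        )
    return {k: (mean(v), pstdev(v)) for k, v in scores.items()}

# Toy aligned sequences (hypothetical, not taken from GenBank).
toy_alignment = {
    "seqA1": "MKT-LEEH",
    "seqA2": "MKTALEDH",
    "seqB1": "MRTALDQH",
}
toy_groups = {"seqA1": "A", "seqA2": "A", "seqB1": "B"}
summary = group_summary(toy_alignment, toy_groups)
```

Each key of `summary` is a pair of subfamily labels, so `("A", "A")` corresponds to a diagonal cell of the table and `("A", "B")` to an off-diagonal cell.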

Fig. 1 shows the alignment of deduced amino acid sequences for the α-oxygenase subunits, with sequences divided into subfamilies. Essential residues that have been conserved in evolution of these proteins have been previously identified by Rosenzweig et al. [16], Elango et al. [31], and Coufal et al. [22] and are as follows. All of the sequences contain the two copies of the iron ligand residue sequence pattern E.E/DX2H characteristic of the soluble diiron monooxygenases, ribonucleotide reductase, and the stearoyl-ACP Δ9 desaturases [17,19]. Relative to MmoX of Mc. capsulatus Bath, these ligands are at positions corresponding to E114, E144 (D for PhhN), H147, E209, E243, and H246. Other conserved or nearly conserved residues are as follows: D143, R146, S238, D242, and R245, which participate in hydrogen bonding between C and F helices; T213 and N214, either or both of which may be involved in proton delivery to the active site; A117 and G250, which are small residues that are retained in tightly packed regions of the protein; the ‘canyon’ residues Y67, K74, L321, G325, and P329, that may dock with protein B or the oxidoreductase components; the ‘handle’ residues A224, G228, and D229, which may function in binding with another subunit to alter the conformation of the E/F helices; Y292, W371, Y376, and P371, which are also possible docking residues; and P424, G443, P461, and Y464, which are believed to interact with the γ-oxygenase subunit.

Figure 1.

Sequence alignments of the proteins corresponding to the α-oxygenase subunit, with sequences grouped by subfamily. Residues which are conserved in two or more subfamilies are boxed in black. Residues that are conserved within a single subfamily are boxed in gray. Sequences were obtained from GenBank under the accession numbers listed in Table 1. Abbreviations for MmoX sequences: M, Methylocystis sp. strain M; WI14, Methylocystis sp. strain WI 14; OB3b, Ms. trichosporium OB3b; KSPIII, Methylomonas sp. strain KSPIII; KSWIII, Methylomonas sp. strain KSWIII; Bath, Mc. capsulatus Bath. Abbreviations for annotated residues: D, docking residues; H, hydrophobic residues surrounding the diiron center; S, structural residues involved in hydrogen bonding of α-helices A and C; *, iron cluster ligands; FeA, iron cluster ligands for FeA iron center; FeB, iron cluster ligands for FeB iron center; closed triangles, residues conserved in ribonucleotide reductases and stearoyl-ACP Δ9 desaturases; open triangles, residues conserved in stearoyl-ACP Δ9 desaturases. Other labels describe the location or possible function of the residues.

A number of hydrophobic residues that surround the active site are positioned in the region from L110 to I239; these are generally not conserved but in some cases may play an important role in substrate recognition and binding, and in the regiospecificity of oxidation. L110, for example, was shown by Rosenzweig et al. [43] to be specifically involved in substrate ‘gating’ for MmoX. Similarly, the mutation of V106 (at the same position) to A in TomA3 was demonstrated by Canada et al. [116] to increase the ability of the toluene o-monooxygenase to hydroxylate large multi-ring aromatic compounds as substrates, presumably because of the greater access to the catalytic site afforded by the smaller alanine gate. The G103, Q141, and F205 residues of TmoA have all been identified as determinants for the regiospecificity of the toluene 4-monooxygenase, with mutations in G103 and F205 enhancing selectivity for ortho- and meta-hydroxylations, respectively [90,146].

While the universal conservation of structural and catalytic residues among the α-oxygenase subunits clearly substantiates the grouping of the soluble diiron monooxygenases into a single enzyme family, a consideration of the insertions/deletions among the sequences and the extensive conservation of residues within the four subfamilies of enzymes (i.e. residues boxed in gray) confirms the evolutionary basis for these divisions. Based on the depicted alignment, the sMMOs, Amo alkene monooxygenase, phenol hydroxylases, and alkene/aromatic monooxygenase sequences contain three, four, five, and eight deletions, respectively, which are unique to but universally conserved in their subfamilies. (Alternatively, these net deletions may be interpreted as insertions common to all of the subfamilies except the subfamily missing the particular sequence.) The sMMOs as a group exhibit the greatest conservation of sequence, possibly due to greater constraints on active site geometry required for methane oxidation. At the other extreme, the alkene/aromatic monooxygenases are a more disparate group in terms of substrate specificity, and this is reflected in the significantly lower degree of sequence conservation among the enzymes in this group.

7 Comparisons of β-oxygenase sequences

By contrast with the α-oxygenase subunits, the evolution of the β-oxygenase subunits is freed from the necessity of retaining specific active site ligands or of preserving precise tertiary structure around an active site. It would therefore be expected that these sequences would exhibit greater divergence than those of the α-oxygenase subunits, and this is apparent from both pairwise identity data (Table 3) and multiple sequence alignments (Fig. 2). Divergence is so extensive between the sMMOs and both the phenol hydroxylases and alkene/aromatic monooxygenases that for some pairings the sequences cannot be aligned. Mean pairwise identity scores within subfamilies are also generally lower than for the α-oxygenase subunits, ranging from 68.51% for the sMMOs to 49.88% for the alkene/aromatic monooxygenases. The alignments show that some residues are absolutely conserved, and these can be inferred to play structurally or functionally important roles for the enzymes. These are D100, P101, and D185, which are believed to be involved in intersubunit interactions, the interior residues W218 and R228, and the surface residue N313 (positions relative to MmoY of Mc. capsulatus Bath) [22]. Overall, however, the conservation of residues within subfamilies is much less extensive than for the α-oxygenase proteins, with the sMMOs again exhibiting the highest proportion of identical residues.

Table 3.  Pairwise identity data for β-oxygenase subunits, expressed as mean values for comparisons of deduced amino acid sequences within and between designated subfamilies
  1. 100% identities resulting from same sequence comparisons were excluded from the data used for these calculations. An asterisk indicates that one or more pairs of sequences could not be aligned; these results were also excluded from the calculations. Sequence data were obtained from GenBank under the accession numbers listed in Table 1.

| | sMMOs | Amo alkene monooxygenase | Phenol hydroxylases | Alkene/aromatic monooxygenases |
|---|---|---|---|---|
| sMMOs | 68.51±16.71 | 29.04±1.13 | 21.52±2.21* | 21.58±1.68* |
| Amo alkene monooxygenase | | 100.00±0.00 | 20.97±1.13 | 24.97±1.53 |
| Phenol hydroxylases | | | 53.14±17.71 | 23.91±2.47 |
| Alkene/aromatic monooxygenases | | | | 49.88±15.06 |

Figure 2.

Sequence alignments of the proteins corresponding to the β-oxygenase subunit, with sequences grouped by subfamily. Sequences were obtained from GenBank under the accession numbers listed in Table 1. Abbreviations and boxing of residues are as described for Fig. 1. Labels describe the location or possible function of the residues.

8 Phylogeny of the α- and β-oxygenase subunits

Interestingly, the X-ray crystal structure of the sMMO from Mc. capsulatus Bath reported by Rosenzweig et al. [16] revealed the α- and β-oxygenase subunits of this enzyme to be extraordinarily similar in structure, with 10 α-helices displaying nearly identical folds. Coufal et al. [22] suggested that the parallel structure of the subunits was indicative of an evolutionary relationship, with the β-oxygenase subunit preserving structural features of an α-oxygenase subunit precursor. These subunits may, then, be paralogous proteins, having arisen from an ancient gene duplication of an ancestral diiron protein. With the α-oxygenase subunit maintaining a catalytic role, the β-oxygenase subunit has diverged from and co-evolved with the α-oxygenase subunit to assume primarily a structural role. However, as reported by Rosenzweig et al. and Coufal et al., the α- and β-oxygenase subunits of Mc. capsulatus Bath do not exhibit detectable homology in their primary sequence, so the precise phylogenetic relationship between these proteins has not been established.

In the present study, we sought to further assess the possibility that the α- and β-subunits are paralogous proteins with a common evolutionary origin. Our initial approach was to determine whether any degree of sequence homology could be discerned between pairs of α- and β-oxygenase subunits, both from the same enzyme and from different enzymes. Pairwise comparisons of deduced amino acid sequences were done as before, using entire β-oxygenase subunit and truncated α-oxygenase subunit sequences. The results of these analyses (Table 4) show that many of the phenol hydroxylases and alkene/aromatic monooxygenases contain α- and β-oxygenase subunits that bear significant homology to one another, thereby providing evidence of a common ancestor for these proteins. No homology was detected for the corresponding subunits of either the sMMOs or the Amo alkene monooxygenase, indicating that their sequences may have diverged more extensively and are evolutionarily more distant from one another. The results also show that DmpN exhibits homology with nearly all of the β-oxygenase subunits, while TouE is homologous to almost all of the α-oxygenase subunits, suggesting that these two proteins may have diverged the least (among known sequences) from the common ancestor.

Table 4.  Results of pairwise comparisons of the deduced amino acid sequences of oxygenase β-subunits with α-subunits AmoC, DmpN, MmoX (of Methylomonas sp. strain KSPIII), and TouA; of oxygenase α-subunits with β-subunits AmoA, DmpL, MmoY (of Methylomonas sp. strain KSPIII), and TouE; and of oxygenase α- and β-subunits of the same enzyme
  1. Sequence data were obtained from GenBank under the accession numbers listed in Table 1. Preliminary analyses revealed no detectable homology between the C-terminus of α-subunits and the β-subunits, so α-subunit sequences were truncated at the C-terminus to yield polypeptides of approximately the same length as the longest β-subunits as a means of facilitating alignments for sequence comparisons. Depicted values are therefore relative and not precise measures of sequence identity. The Dmp, Tou, and Methylomonas sp. strain KSPIII Mmo subunits were selected for comparisons of α- and β-subunits between enzymes because preliminary analyses showed that these proteins exhibited the highest identities for these comparisons for their respective subfamilies.

Strain; β-Subunit; % Identity to: AmoC, DmpN, MmoX, TouA; α-Subunit; % Identity to: AmoA, DmpL, MmoY, TouE; % Identity for α- and β-subunits of same enzyme
KSPIII  MmoY  21.82  MmoX  20.13  18.39
KSWIII  MmoY  21.82  MmoX  20.13  18.39
WI 14  MmoY  21.28  MmoX  18.58
M  MmoY  20.91  MmoX  18.24
OB3b  MmoY  19.63  21.80  MmoX  18.89
Bath  MmoY  21.22  MmoX  21.97
B-276  AmoA  21.45  16.78  AmoC  20.78
CF600  DmpL  24.22  20.13  DmpN  21.45  24.22  21.82  22.87  24.22
P35X  PhhL  24.22  20.13  PhhN  21.67  23.67  22.19  23.02  23.67
H  PhlB  23.55  24.22  20.13  PhlD  21.67  23.67  21.58  23.37  23.67
20B  DsoB  21.43  16.57  DsoD  23.20  25.78  23.10  21.12  20.92
NCIB8250  MopL  21.43  16.57  MopN  23.20  25.78  23.10  20.79  20.59
G4  TomA1  19.00  21.89  TomA3  20.20  22.03  17.33
R5  PhcL  20.00  PhcN  22.04  22.38  21.60
TA441  AphL  19.66  AphN  22.04  22.34  21.62
JS150  TbmB  18.67  TbmD  21.50  21.09
E2  PoxB  PoxD  21.33  21.95  23.39
OX1  TouE  20.78  22.87  18.39  22.76  TouA  16.78  22.76  22.76
AD45  IsoE  20.28  24.90  IsoA  19.80  17.94  21.34
Py2  AamE  20.20  22.26  22.88  19.60  AamA  20.33  16.72  18.57
KR1  TmoE  17.07  23.23  18.09  18.75  TmoA  18.58  22.10  16.61
PKO1  TbuA2  20.69  21.58  23.49  19.80  TbuA1  17.90
JMP134  PhlO  18.56  20.27  20.60  19.11  PhlK

A multiple sequence alignment was used to confirm the homology of all of the α- and β-oxygenase subunit sequences, and to identify amino acids and structural features of the common ancestral protein that are retained in both subunits. A truncated alignment of four selected α-oxygenase subunit and four β-oxygenase subunit proteins is shown in Fig. 3. For the depicted proteins, there are 34 residues that are conserved in at least two of the α-oxygenase subunits and at least two of the β-oxygenase subunits. The Y67 residue (relative to MmoX), which resides in the ‘canyon’ region and is thought to dock other proteins, is conserved in all of the α-oxygenase subunits and in all the β-oxygenase subunits except for four phenol hydroxylases (data for other sequences not shown). Remarkably, vestigial residues of the catalytic diiron centers, such as the iron ligands E114, E209, and H246, are still present in some of the β-oxygenase subunits. Also notable is the conservation among the β-oxygenase subunits of the α-oxygenase subunit hydrogen-bonding residues D143, R146, D242, and R245, and ‘handle’ residues A224, G228, and D229. The validity of the overall alignment of the two subunits is confirmed by the nearly perfect superimposition of six of the analogous α-helices (A–F) from the α- and β-oxygenase subunits of the sMMO from Mc. capsulatus Bath [16].

Figure 3.

Sequence alignments of selected α- and β-oxygenase subunits, taken from the global alignments of the 23 α- and β-oxygenase subunit sequences from Figs. 1 and 2, respectively. Residues which are conserved in at least two of the α-oxygenase subunits and in at least two of the β-oxygenase subunits are boxed in black. α-Helices identified by Rosenzweig et al. [16] for the α- and β-oxygenase subunits of the sMMO from Mc. capsulatus Bath are boxed and labelled.

Maximum likelihood analysis was used to construct a phylogenetic tree of the deduced amino acid sequences for the α- and β-oxygenase subunits, which is depicted in Fig. 4. (Note that in earlier work we described the unlinked and unrooted trees for the α- and β-oxygenase subunits [147], and that a similar α-oxygenase subunit tree appeared more recently in a paper by Kahng et al. [148].) In the phylogeny depicted here, the α- and β-oxygenase subunits form two distinct trees which are joined by well-supported internal nodes (bootstrap values of 100 and 91%, respectively). Each of the trees branches into four clades which correspond to the subfamilies of soluble diiron monooxygenases. Taken together, the results of this analysis support the notion that the α- and β-oxygenase subunits are paralogous proteins and share a common ancestor. In view of the homology of the α-oxygenase subunit to both the ribonucleotide reductases and the stearoyl-ACP Δ9 desaturases, as well as the retention of some of the original iron cluster ligands in the present-day β-oxygenase subunits, it can also be surmised that these proteins originated with the duplication of a gene encoding an ancestral carboxylate-bridged protein with a diiron center. Subsequent evolution could have led to the loss of the active site in the β-oxygenase subunit, as discussed earlier, and the simultaneous radiation and co-evolution of both subunits to yield variations in substrate specificity. The non-catalytic β-oxygenase subunits have evidently diverged more rapidly than the catalytic α-oxygenase subunits, as predicted by the results of pairwise identity comparisons and sequence alignments.

Figure 4.

Unrooted maximum likelihood phylogenetic tree of α- and β-oxygenase subunits of the soluble diiron monooxygenases. Sequences were obtained from GenBank under the accession numbers listed in Table 1. Two hundred and twenty shared residues were analyzed after exclusion of all gaps from the alignment. The tree was obtained using methods described in the text. Numbers at the nodes represent values for local bootstrap probabilities, which are excluded for very short branches. A slash (/) between sequence designations indicates the sequences were identical or nearly identical after editing of gaps.

Through the reciprocal rooting of the trees for the α- and β-oxygenase subunits, the approximate location of the ‘root’ of both trees can be identified, and time directionality established from this point. In the β-oxygenase subunit tree, the alkene/aromatic monooxygenases diverge first, with later divergence of a phenol hydroxylase lineage from a branch leading to the Amo alkene monooxygenase and the sMMOs. The Amo alkene monooxygenase branches off well before the sMMOs, which are evolutionarily the most distant from the putative common ancestor. The topology of the α-oxygenase subunit tree differs from that of the β-oxygenase subunit in the position of the alkene/aromatic monooxygenases, which diverge between the Amo alkene monooxygenase/sMMO and phenol hydroxylase lineages. While there is excellent bootstrap support for the monophyly of the Amo alkene monooxygenase/sMMOs and for each of the two other subfamilies, the node from which the alkene/aromatic monooxygenases diverge has relatively low bootstrap support. However, a RELL analysis of a total of 16 different tree topologies showed the depicted tree to have the greatest RELL bootstrap value (92.81%). In the second highest scoring tree (RELL bootstrap of 2.55%), the Amo alkene monooxygenase/sMMO lineage also diverges prior to the other clades in the α-oxygenase subunit half of the tree. These analyses support the topology of the phylogenetic tree as depicted in Fig. 4, and it can be concluded that the incongruous position of the alkene/aromatic monooxygenase lineage in α- and β-oxygenase subunit trees is not likely to be an artifact. The simplest explanation for these results is that a gene ‘swapping’ (recombination) event occurred after the divergence of alkene/aromatic monooxygenases from the other lineages, but prior to the radiation of the alkene/aromatic monooxygenase clade. 
There are two possibilities for this scenario: either the divergence of the β-oxygenase subunits is out of place as a result of a progenitor operon acquiring a β-oxygenase gene which diverged before the split between the Amo alkene monooxygenase/sMMOs and phenol hydroxylases, or the α-oxygenase subunit is out of place as a result of a progenitor operon acquiring an α-oxygenase gene from an early phenol hydroxylase-like operon. The latter explanation is inherently more appealing because it implies that the progenitor alkene/aromatic monooxygenase may have exchanged its catalytic subunit for a homolog diverging later from the α-oxygenase subunit lineage, perhaps one with superior catalytic properties. In the alternative case, an exchange would have entailed the acquisition of a more ‘primitive’ β-oxygenase subunit, which seems unlikely.

Taken together, the evidence from the inferred phylogeny suggests that the soluble diiron monooxygenases evolved from a common ancestral enzyme capable of catalyzing the hydroxylation of alkene and aromatic hydrocarbons, with divergent evolution yielding enzymes of varying degrees of substrate specificity and regiospecificity. This contention is supported by the observation that at least some enzymes in all four subfamilies have been shown to oxidize alkenes, while for three of the four subfamilies aromatic compounds can serve as substrates. Moreover, enzymes with broad specificity that may retain the traits of the ancestral enzyme still exist, including the Aam alkene monooxygenase, the Tom (toluene o-) monooxygenase, and the Tou (toluene/o-xylene) monooxygenase, the latter of which hydroxylates aromatic hydrocarbons without regiospecificity. By contrast, the Tbu (toluene 3-) monooxygenase and Tmo (toluene 4-) monooxygenase that diverge after Tou (in the catalytic α-oxygenase subunit tree) are regiospecific monooxygenases that do not hydroxylate phenolic substrates, an indication that substrate specificity and regiospecificity evolved more recently. The phenol hydroxylases may have diverged into two groups of enzymes with different substrate specificities, the broader specificity Tom, Tbm, Aph, and Phc enzymes all having been shown to hydroxylate both activated and unactivated aromatic hydrocarbons, and the others (Dmp, Phh, Phl, Dso, Mop) known only to hydroxylate phenolic aromatics. The sMMO subunits branch off at points which are furthest from the hypothetical common ancestor, indicating that the ability to oxidize methane was a relatively recent evolutionary event, perhaps requiring more extensive evolution to ‘fine-tune’ the geometry of the diiron center to enable methane to be oxidized. The divergence of the non-methane-oxidizing Amo alkene monooxygenase before the sMMOs also suggests that this capability had not been acquired by that point in time.

In comparing the phylogeny of the soluble diiron monooxygenases to previously described organismal phylogenies [128], the available data indicate that the enzymes of different subfamilies have in most cases been independently assorted to various groups of Proteobacteria and Actinobacteria (and likely other groups as well) through horizontal transfer. This notion is supported by the distribution of the phylogenetically distinct phenol hydroxylases and four-component aromatic monooxygenases among closely related or identical species of β- and γ-Proteobacteria, as well as by the acquisition of different alkene monooxygenases by two strains of Rhodococcus spp. The evidence for horizontal rather than vertical transmission of all of these enzymes, as discussed earlier, is the restricted distribution of the enzymes, i.e. they do not appear to have been retained by any entire species and may be ‘homeless’ enzymes that are passed from strain to strain to support adaptations to local environments. Horizontal transfer is not random, however, and occurs with greater frequency among closely related species because of mechanisms that limit uptake or integration of foreign DNA [149]. This is clearly the case for the phenol hydroxylases, for which one lineage is distributed among the β-Proteobacterial genera Burkholderia (Tbm and Tom) and Comamonas (Aph and Phc), and the other lineage among the γ-Proteobacterial genera Pseudomonas (Dmp, Phh, and Phl) and Acinetobacter (Dso and Mop). Similarly, the type II sMMOs of the α-Proteobacteria (Ms. trichosporium OB3b, Methylocystis strains) form a monophyletic group to the exclusion of the type I (Methylomonas strains) and X (Mc. capsulatus Bath) enzymes of the γ-Proteobacteria.

9 Phylogeny of the oxidoreductase components

Table 5 summarizes the pairwise identity scores for the oxidoreductase components. Sequence identities are lower within subfamilies than for the catalytic α-oxygenase subunits (mean scores of 41.34–63.55%), but are in most cases higher between subfamilies (mean scores of 28.87–34.20%). The higher identity scores between subfamilies are most likely associated with the conservation of residues in the NAD(P)-binding, FAD-binding, and ferredoxin domains, which are required for electron transfer in these proteins.

Table 5.  Pairwise identity data for oxidoreductase components, expressed as mean values for comparisons of deduced amino acid sequences within and between designated subfamilies
  1. 100% identities resulting from same sequence comparisons were excluded from the data used for these calculations. Sequence data were obtained from GenBank under the accession numbers listed in Table 1.

| | sMMOs | Amo alkene monooxygenase | Phenol hydroxylases | Alkene/aromatic monooxygenases |
|---|---|---|---|---|
| sMMOs | 59.54±18.25 | 30.36±3.52 | 28.99±1.85 | 28.87±3.17 |
| Amo alkene monooxygenase | | 100.00±0.00 | 34.20±1.22 | 32.46±2.40 |
| Phenol hydroxylases | | | 63.55±11.99 | 30.22±1.34 |
| Alkene/aromatic monooxygenases | | | | 41.34±10.50 |

A phylogenetic tree of the oxidoreductase components was inferred from deduced amino acid sequences, using maximum likelihood analysis as before. Unlike the α- and β-oxygenase subunits, the oxidoreductase components are known to be closely related to isofunctional subunits of other oxygenases, all being members of the ferredoxin-NADP+ reductase family [117,150–153]. To clarify the relationships between these proteins, a number of homologous sequences were identified using the TFASTA program from the GCG program suite (Accelrys), and included in the multiple sequence alignment from which the tree was constructed. With reference to the classification system Batie et al. [150] developed for the Rieske center non-heme iron oxygenases (originally ‘dioxygenases’), the resulting phylogeny (Fig. 5) is composed of class IB-type and class III-type reductase components. Both class IB and III reductases are defined on the basis that they contain FAD and a chloroplast-type [2Fe–2S] center, while the class III proteins require an additional Rieske center ferredoxin as a separate subunit. Among the Rieske center oxygenases, the toluate dioxygenase reductase and its relatives and the 2-oxo-1,2-dihydroquinoline 8-monooxygenase reductase have been previously placed in class IB, and the naphthalene dioxygenase reductase and its relatives in class III [154]. Under the same classification, the reductases of the membrane-bound diiron monooxygenases (exemplified by the xylene monooxygenase) fall in class IB, as do the reductases of the Amo alkene monooxygenase, phenol hydroxylases, and the sMMOs. The alkene/aromatic monooxygenases contain a separate ferredoxin and therefore can be included in class III.

Figure 5.

Unrooted maximum likelihood phylogenetic tree of oxidoreductase components of the soluble diiron monooxygenases and homologous reductases from other oxygenases. Sequences for the soluble diiron monooxygenases were obtained from GenBank under the accession numbers listed in Table 1. Abbreviations, accession numbers, and references for other sequences are as follows: AntC, anthranilate dioxygenase of Acinetobacter sp. strain ADP1, AF071556 [154]; BenC, benzoate 1,2-dioxygenase of Acinetobacter sp. strain ADP1, M76990, M62649 [152]; CarAd, carbazole 2,3-dioxygenase of P. stutzeri OM1, AB001723 [155]; CbdC, 2-halobenzoate 1,2-dioxygenase of B. cepacia 2CBS, X79076 [156]; DntAa, 2,4-dinitrotoluene dioxygenase of Burkholderia sp. strain DNT, U62430 [157]; HybA, salicylate 5-hydroxylase of P. aeruginosa sp. strain JB2, AF087482 [158]; NahAa, naphthalene 1,2-dioxygenase of P. putida G7, M83949 [159]; NtdAa, 2-nitrotoluene dioxygenase of Pseudomonas sp. strain JS42, U49504 [160]; OxoR, 2-oxo-1,2-dihydroquinoline 8-monooxygenase of P. putida 86, Y12654 [161]; PahAa, polyaromatic hydrocarbon dioxygenase of P. putida OUS82, AB004059 [162]; PhnAa, 3,4-dihydroxyphenanthrene dioxygenase of A. faecalis AKF2, AB024945 [163]; XylA, xylene monooxygenase of P. putida PaW1, D63341 [164]; XylZ, toluate 1,2-dioxygenase of P. putida PaW1, M64747 [165]. Two hundred and twelve shared residues were analyzed after exclusion of all gaps from the alignment. The tree was obtained using methods described in the text. Tree branches were drawn to emphasize the similarity in topology to the β-oxygenase subunit tree depicted in Fig. 4. Dashed lines indicate lineages for enzymes other than the soluble diiron monooxygenases. Numbers at the nodes represent values for local bootstrap probabilities, which are excluded for very short branches. A slash (/) between sequence designations indicates the sequences were identical or nearly identical after editing of gaps. 
The phylogenetic positions of the reductase subunits of the methanesulfonic acid monooxygenase of Methylosulfonomonas methylovora M2 ([166], accession number AF091716) and Marinosulfonomonas methylotropha TR3 (Baxter et al. [167], accession number AF360864) are not depicted; these proteins diverge at a point between the clade containing NahAa and the clade composed of the alkene/aromatic monooxygenases.

While the phylogenetic reconstruction of the reductases is unrooted, the approximate position of a root can be inferred from the bifurcation of the tree into monophyletic branches of class IB and class III reductases. This inference implies the existence of an ancestral reductase that diverged into proteins that function in concert with a separate ferredoxin component (class III), and those that deliver electrons directly to the catalytic component of the enzyme (class IB). The reductases of these two classes were apparently distributed to various progenitors of three major groups of phylogenetically unrelated oxygenases, namely, the soluble diiron monooxygenases, the xylene monooxygenases and their relatives, and at least three distinct subfamilies of Rieske center oxygenases. Within the class IB lineage, the xylene monooxygenase reductase and its relatives diverged first, followed by the reductases of the Amo alkene monooxygenase, the phenol hydroxylases, the toluate dioxygenase and relatives, the 2-oxo-1,2-dihydroquinoline 8-monooxygenase, and finally the sMMOs. Among the class III reductases, naphthalene dioxygenase and its relatives were first to diverge, followed by the four-component alkene/aromatic monooxygenases and the methanesulfonic acid monooxygenases of Methylosulfonomonas methylovora M2 [166] and Marinosulfonomonas methylotropha TR3 [167]. (The methanesulfonic acid monooxygenases occupy a relatively distant position from their point of divergence, and their phylogenetic position is not depicted because their inclusion in the phylogeny decreases the resolution and bootstrap support of the tree.) Interestingly, the reductases from two Rieske center oxygenases, the CarAd subunit from the carbazole dioxygenase of P. stutzeri OM1 [155] and the PhnAa subunit from the 3,4-dihydroxyphenanthrene dioxygenase of Alcaligenes faecalis AKF2 [163], are close relatives of the Tmo and Tou reductases, respectively. 
This finding suggests that Car and Phn are composite enzymes which have acquired their reductases by horizontal transfer from the unrelated alkene/aromatic monooxygenases.

With time directionality established from the point of divergence of the class IB and class III reductases, the inferred phylogeny of the oxidoreductase components of the soluble diiron monooxygenases is remarkably similar to the portion of the β-oxygenase subunit tree which branches from the last common ancestor. This lends further credence to the earlier contention that the incongruous topology of the α-oxygenase subunits reflects the occurrence of an early gene swapping event in the evolution of the alkene/aromatic monooxygenases. It should be noted that the oxidoreductase tree does differ from both the α- and β-oxygenase subunit trees in the position of AmoD, which emerges in a lineage near the postulated root of the tree, and well before the divergence of the phenol hydroxylases and the sMMOs. This variance clearly suggests that the α- and β-oxygenase subunits of the Amo alkene monooxygenase were acquired more recently than the oxidoreductase, likely through another recombination event.

An abridged alignment of the deduced amino acid sequences of the oxidoreductases is shown in Fig. 6, with sequences divided into ‘subfamilies’ on the basis of clades derived from the inferred phylogeny. All are related to FNR and contain a two-domain module, one domain binding flavin and the other binding pyridine nucleotide. Essential residues in these and the N-terminal chloroplast ferredoxin-like domains have been discussed previously by Neidle et al. [152], Correll et al. [168], and Byrne et al. [18] and are labelled on the figure. Interactions with FAD are mediated by a conserved isoalloxazine ring-binding motif RX(F/Y)S and the phosphate-binding sequence GX2(S/T). NAD(P) is bound by the ribose-binding sequence GGXGX2–3P and the nicotinamide-binding sequence YX(A/C)G. Other conserved residues include the ferredoxin domain motif CX2GXCX2CX6GX18–22LXC, which coordinates the [2Fe–2S] center, and the G145, Q146, and P204 residues (numbering relative to MmoC of Mc. capsulatus Bath), for which the specific functions are not known. There is no apparent conservation of residues which are unique to either the class IB or class III reductases, indicating that evolutionary divergence of the two groups has not resulted in marked variations in structure that are specific to their respective classes. Moreover, few residues are unique to the various subfamilies, which is not particularly surprising given that the reductases probably do not play an important role in substrate specificity, and can therefore tolerate greater sequence variations at residues which do not participate in electron transfer reactions, cofactor/coenzyme binding, or protein–protein interactions. The extent of this variation is particularly apparent in the case of the alkene/aromatic monooxygenases, which contain only five residues that are conserved and unique to their subfamily.
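The consensus motifs quoted above can be read as simple patterns, where X is any residue, (A/B) is either residue, and a subscript range is a run of that many arbitrary residues. The following is a minimal sketch, not the method used in this study, of how such motifs might be transcribed into regular expressions and located in a protein sequence. The dictionary keys, the function name `find_motifs`, and the toy sequence in the usage note are all hypothetical.

```python
import re

# Motifs from the text, transcribed by hand into regular expressions.
# X -> "."   (A/B) -> "[AB]"   Xn-m -> ".{n,m}"
# The dictionary keys are illustrative labels, not names used in the paper.
MOTIFS = {
    "FAD_ring": r"R.[FY]S",                              # RX(F/Y)S
    "FAD_phosphate": r"G.{2}[ST]",                       # GX2(S/T)
    "NADP_ribose": r"GG.G.{2,3}P",                       # GGXGX2-3P
    "NADP_nicotinamide": r"Y.[AC]G",                     # YX(A/C)G
    "FeS_ligands": r"C.{2}G.C.{2}C.{6}G.{18,22}L.C",     # CX2GXCX2CX6GX18-22LXC
}


def find_motifs(sequence):
    """Return {label: [(start, matched_text), ...]} using 1-based start positions."""
    hits = {}
    for label, pattern in MOTIFS.items():
        # Wrapping the pattern in a lookahead reports overlapping matches too.
        regex = re.compile(f"(?=({pattern}))")
        hits[label] = [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]
    return hits
```

As a usage example, calling `find_motifs("MRAFSAAGTTSAAGGAGAAPAAYACG")` on this invented 26-residue string reports one hit for each of the four short motifs and none for the [2Fe–2S] ligand motif. Short patterns such as GX2(S/T) will match by chance in real proteins, so hits are only meaningful when checked against an alignment such as the one in Fig. 6.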

Figure 6.

Sequence alignments of the proteins corresponding to the oxidoreductase components of soluble diiron monooxygenases and other oxygenases, with sequences grouped by subfamily. Sequences were obtained from GenBank under the accession numbers listed in Table 1 and Fig. 5. Boxing of residues is as described for Fig. 1. Abbreviations are provided in Fig. 5. Labels describe the hypothesized function of conserved domains.

10 Phylogeny of the ferredoxin components

The Rieske-type ferredoxin components of the alkene/aromatic monooxygenases exhibit an average sequence identity of 48.97±12.54% to one another, and are homologous to the ferredoxins that are components of the Rieske center non-heme iron oxygenases, cytochromes, and other enzymes. The maximum likelihood tree of the deduced amino acid sequences of the ferredoxins from the alkene/aromatic monooxygenases, together with a number of the most closely related ferredoxins (as determined from TFASTA results), is depicted in Fig. 7. To determine the position of the root of the tree containing these sequences, a sequence from the homologous but distantly related cytochrome ferredoxins was used as an outgroup. Preliminary analyses revealed that the cytochrome bc1 ferredoxin from Paracoccus denitrificans, FbcF [175], was the closest known relative to the sequences in this group, and it was therefore included in the phylogeny for this purpose.
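An average identity such as the one quoted above is a summary over all pairs of sequences in the group. The sketch below shows one common way to compute such a figure from a gapped multiple alignment. It assumes that the ± value is a sample standard deviation and that columns containing a gap in either sequence are ignored; the text does not state which conventions were actually used. The function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_identity(a, b, gap="-"):
    """Percent identity of two aligned sequences, skipping columns with a gap in either."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def identity_summary(aligned):
    """Mean and sample standard deviation of identity over all sequence pairs.

    `aligned` maps a sequence name to its gapped sequence; at least three
    sequences are needed so that the standard deviation is defined.
    """
    values = [percent_identity(a, b) for a, b in combinations(aligned.values(), 2)]
    return mean(values), stdev(values)
```

For example, `identity_summary({"s1": "ACDE", "s2": "ACDF", "s3": "AC-F"})` averages the three pairwise values 75%, 66.7%, and 100% for these invented sequences. Other definitions of identity (for instance, dividing by the length of the shorter sequence) give different numbers, so the convention should be stated alongside any reported value.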

Figure 7.

Unrooted maximum likelihood phylogenetic tree of the Rieske-type ferredoxin components of the alkene/aromatic monooxygenases and homologous proteins from other enzymes. Sequences for the alkene/aromatic monooxygenases were obtained from GenBank under the accession numbers listed in Table 1. Abbreviations, accession numbers, and references for other sequences are as follows: BpdB, biphenyl dioxygenase of Rhodococcus sp. strain M5, U27591 [169]; BphA3, biphenyl dioxygenase of P. pseudoalcaligenes KF707, M83673 [170]; BedB, benzene dioxygenase of P. putida ML2, AF148496, U08463, U25434, AF143810, L04642, L04643 [171]; CarAc – CB3, carbazole 2,3-dioxygenase of Sphingomonas sp. strain CB3, AF060489 [172]; CarAc – OM1, carbazole 2,3-dioxygenase of P. stutzeri OM1, AB001723 [155]; CmtAd, p-cumate 2,3-dioxygenase of P. putida F1, U242215 [173]; CumA3, cumene dioxygenase of P. fluorescens IP01, D37828 [174]; DntAb, 2,4-dinitrotoluene dioxygenase of Burkholderia sp. strain DNT, U62430 [157]; FbcF, ferredoxin subunit of the cytochrome bc1 complex of Paracoccus denitrificans, X05799 [175]; HcaA3, 3-phenylpropionate dioxygenase of Escherichia coli K-12, Y11070 [176]; HybD, salicylate 5-hydroxylase of P. aeruginosa sp. strain JB2, AF087482 [158]; IpbA3, isopropylbenzene dioxygenase of Pseudomonas sp. strain JR1, U53507 [177]; MocE, demethylase of Sinorhizobium meliloti L5-30, AF076471 [178]; NahAb, naphthalene 1,2-dioxygenase of P. putida G7, M83949 [159]; NsaA3, 1,2-dihydroxynaphthalene dioxygenase of Sphingomonas sp. strain BN6, U65001 [179]; NtdAb, 2-nitrotoluene dioxygenase of Pseudomonas sp. strain JS42, U49504 [160]; PahAb, polyaromatic hydrocarbon dioxygenase of P. putida OUS82, AB004059 [162]; PhnAb, 3,4-dihydroxyphenanthrene dioxygenase of A. faecalis AKF2, AB024945 [163]; PsbAd, p-cumate 2,3-dioxygenase of Rhodopseudomonas palustris No. 7, AB022919 [180]; S. solfataricus P1, archaeal Rieske-type ferredoxin of S. solfataricus P1, AB047031 [181]; TcbAc, chlorobenzene dioxygenase of Pseudomonas sp. strain P51, U15298, M61114 [182]; TodB, toluene 2,3-dioxygenase of P. putida F1, J04996 [183]. Seventy-eight shared residues were analyzed after exclusion of all gaps from the alignment. The tree was obtained using methods described in the text. Tree branches were drawn to emphasize the similarity in topology to the oxidoreductase subunit tree depicted in Fig. 5. Dashed lines indicate lineages for enzymes other than the soluble diiron monooxygenases. Numbers at the nodes represent values for local bootstrap probabilities, which are excluded for very short branches. A slash (/) between sequence designations indicates the sequences were identical or nearly identical after editing of gaps. The phylogenetic positions of the ferredoxin subunits of the methanesulfonic acid monooxygenase of Methylosulfonomonas methylovora M2 ([166], accession number AF091716) and Marinosulfonomonas methylotropha TR3 ([167], accession number AF354805) are not depicted; these proteins diverge at a point between the S. solfataricus P1 ferredoxin and the clade composed of the alkene/aromatic monooxygenases.

The resulting phylogeny bifurcates at the postulated root between a monophyletic group of ferredoxins from Rieske center oxygenases and a second lineage containing the ferredoxins of a heterogeneous group of enzymes that includes the alkene/aromatic monooxygenases. In the upper branch as depicted on the phylogeny, it appears that the ferredoxins from the class IIB Rieske center oxygenases diverged first from a point near the last common ancestor for these proteins (class IIB reductases contain FAD and require a separate Rieske-type ferredoxin). Diverging later are a clade containing the ferredoxin components of the ‘composite’ carbazole dioxygenase of P. stutzeri OM1 and the 3,4-dihydroxyphenanthrene dioxygenase of A. faecalis AKF2 discussed earlier, followed by the p-cumate dioxygenases. The topology of the lower branch is similar to the phylogeny of the oxidoreductase components, with the ferredoxins of NahAb and its relatives emerging near the root, and the corresponding subunits of the methanesulfonic acid monooxygenases and alkene/aromatic monooxygenases diverging later. (Note that the sequences for the methanesulfonic acid monooxygenases were again excluded because their inclusion decreased the resolution and bootstrap support for the tree.)

Two other groups of Rieske-type ferredoxins are derived from the lower branch of the inferred phylogeny, both diverging after the clade that contains NahAb and its relatives. The first group consists of the ferredoxin components of demethylases involved in rhizopine metabolism by the rhizobia, represented by the MocE sequence from Sinorhizobium meliloti L5-30 [178]. These demethylases appear to be another example of composite enzymes, since they contain a catalytic subunit which is most closely related to that of the xylene monooxygenase, but require both an oxidoreductase and a Rieske ferredoxin as electron transfer components [178,184]. Emerging from a position between the points of divergence of MocE and the alkene/aromatic monooxygenases is a group of ferredoxins of unknown function that have been identified by genomic sequencing of the archaeon Sulfolobus solfataricus, represented in this phylogeny by the sequence from S. solfataricus P1 [181]. Additional phylogenetic analyses (not shown) indicate that these proteins are most closely related to the ferredoxin components of the nitrite reductases from Mesorhizobium loti ([185], accession numbers AP003000 and BA000012) and Staphylococcus carnosus ([186], accession number AF029224). Because of their position in the phylogeny and relationship to other bacterial ferredoxins, it can be concluded that the S. solfataricus proteins were probably acquired by trans-domain horizontal gene transfer from the Bacteria to the Archaea.

An abridged multiple sequence alignment containing the deduced amino acid sequences of these ferredoxins is shown in Fig. 8, with sequences again divided into subfamilies on the basis of phylogenetic clades. Inspection of the alignment reveals that the cysteine and histidine residues which act as ligands for the two Rieske [2Fe–2S] clusters [187] are contained within a CXHX16–17CX2H motif that is present in all of the sequences. Other conserved residues include G163 and P169 (numbered relative to FbcF), for which the functions are not known. As with the oxidoreductases, the ferredoxins are electron transfer proteins that do not play a role in substrate specificity and therefore do not exhibit extensive conservation of residues that are unique to subfamilies; that is, significant sequence variation is tolerated at residues not involved in electron transfer reactions, cofactor binding, or component interactions.

Figure 8.

Sequence alignments of the proteins corresponding to the ferredoxin components of soluble diiron monooxygenases and other enzymes, with sequences grouped by subfamily. Sequences were obtained from GenBank under the accession numbers listed in Table 1 and Fig. 7. Boxing of residues is as described for Fig. 1. Abbreviations are provided in Fig. 7. Labels describe the hypothesized function of conserved domains.

11 Summary of phylogenetic evidence

11.1 Origins and divergence of the soluble diiron monooxygenases

The results of phylogenetic analyses strongly suggest that the genes encoding the soluble diiron monooxygenases were assembled through a combination of gene duplication and the acquisition of appropriate electron transfer and other components, much in the same manner as the operons encoding the V/F/A ATPases [188–190] and bacterial luciferases [191] are known to have evolved. The catalytic α- and non-catalytic β-oxygenase subunits for the soluble diiron monooxygenases are clearly derived from the duplication of a gene encoding a progenitor carboxylate-bridged diiron protein, possibly a close relative of the ribonucleotide reductases, which are known to be ancient enzymes [192]. In addition to the striking structural similarity of the two subunits, their primary amino acid sequences bear significant homology to one another, with some of the non-catalytic β-oxygenase subunits even retaining vestigial active site residues as evidence of the gene duplication event. The oxidoreductase and ferredoxin components for these enzymes, on the other hand, were most likely acquired by horizontal transfer. The reductases share a common ancestor with the isofunctional components of the membrane-bound diiron monooxygenases, the class IB and class III Rieske center non-heme iron oxygenases, and the methanesulfonic acid monooxygenases, indicating that all were acquired from the same genetic ‘pool’ of these electron transfer proteins. In a similar way, the Rieske-type ferredoxin components of the alkene/aromatic monooxygenases have clearly originated from an ancestor common to the ferredoxins of the cytochromes, the ferredoxin-requiring Rieske center oxygenases, the methanesulfonic acid monooxygenases, a putative nitrite reductase from the archaeon S. solfataricus P1, and demethylases of the rhizobial species. 
A comparison of phylogenies revealed highly similar evolutionary histories for the oxidoreductase and ferredoxin components of the class III Rieske center oxygenases (naphthalene dioxygenase and its relatives) and the alkene/aromatic monooxygenases, suggesting that these subunits were brought together relatively early in evolution, and were subsequently co-acquired as a genetic ‘module’ by the antecedents of these two groups of enzymes. The phylogeny of the effector proteins was not addressed in this study, but they have been shown to be homologous to the putidaredoxins and are therefore likely to be descended from the ancestor of both groups. The γ-oxygenase subunits of the different subfamilies of these enzymes do not show significant sequence similarity, suggesting that they were acquired independently after the divergence of the subfamilies, or were acquired earlier and have diverged to the extent that no detectable similarity remains.

Phylogenetic evidence further suggests that the last common ancestor of the soluble diiron monooxygenases was capable of oxidizing both alkene and aromatic hydrocarbons, with subsequent divergence yielding four subfamilies of enzymes with overlapping substrate specificities. Reconstructions using α- and β-oxygenase subunit and oxidoreductase component sequences suggest that the alkene/aromatic monooxygenases diverged first, but that they may have later acquired an early phenol hydroxylase-like α-oxygenase subunit through horizontal transfer. In this light, it is tempting to speculate that the ancestral enzyme may have contained a separate ferredoxin component, and that the requirement for this subunit was lost in the phenol hydroxylase, Amo alkene monooxygenase, and sMMO lineages which emerged in that order after the alkene/aromatic monooxygenases. Ultimately, the alkene/aromatic monooxygenases diverged into distinct alkene and aromatic monooxygenase clades based on their role in metabolism, with both the Tou (toluene/o-xylene) monooxygenase and the Aam alkene monooxygenase retaining broad specificities and the ability to oxidize alkenes, phenols, and unactivated aromatic hydrocarbons. Further evolution of this group conferred both regiospecificity and a narrower substrate specificity to some enzymes, namely the Tmo (toluene 4-) monooxygenase and the Tbu (toluene 3-) monooxygenase. In the phenol hydroxylase subfamily, bifurcation of the lineage has apparently resulted in what appears to be a segregation of broad specificity phenol hydroxylases of the β-Proteobacteria (exemplified by the Tom [toluene o-] monooxygenase) from narrower specificity enzymes of the γ-Proteobacteria (exemplified by the Dmp phenol hydroxylase). The Amo alkene monooxygenase, like the alkene/aromatic monooxygenases, contains subunits with differing evolutionary histories. 
The oxidoreductase component is relatively ancient and diverges very near the postulated last common ancestor for these proteins, whereas the α- and β-oxygenase subunits were not acquired until after the split between the phenol hydroxylase and sMMO lineages. The sMMOs diverge into at least two lineages which correspond to bacterial phylogenies, the type I and X enzymes of the γ-Proteobacteria comprising one group and the type II enzymes of the α-Proteobacteria the second. Based on the phylogenetic position of the sMMOs at a point that is furthest from the last common ancestor of the soluble diiron monooxygenases, it appears likely that the acquisition of the ability to oxidize methane was a relatively recent evolutionary event in this family of enzymes.

11.2 Role of horizontal gene transfer in the evolution of the soluble diiron monooxygenases

The results of recent genomic sequencing studies have revealed that the Bacteria and Archaea are chimeric entities, containing patchwork mosaic genomes with a mixture of ancestral DNA sequences and those acquired more recently by horizontal transfer from other organisms [123,127,193]. The prokaryotic genome is continually exposed to new sequences through horizontal gene transfer events, providing a mechanism for the rapid acquisition of novel capabilities, and thereby driving species adaptation and divergence [123]. In this context, and based on their scattered phylogenetic distribution, the soluble diiron monooxygenases can be concluded to be ‘homeless’ enzymes whose genes have been successfully transmitted through horizontal transfer but are not maintained permanently in bacterial lineages. Instead, they appear to have been passed among the methanotrophs, the pseudomonads and Acinetobacter spp., the Actinobacteria, and other bacteria in response to the presence of appropriate hydrocarbon substrates in the environment. For the methanotrophs, the sMMOs may serve as useful adjuncts to the pMMOs that allowed them to colonize a broader range of environments. For the metabolically versatile pseudomonads of the β- and γ-Proteobacteria, the phenol hydroxylases and toluene monooxygenases have been coupled with the enzymes of ortho- and meta-cleavage pathways to enable a wide variety of aromatic compounds to be utilized as growth substrates. In a similar way, the acquisition of alkene monooxygenases may provide a means of broadening the substrate range for Rhodococcus and other genera, although at this time little is known of the distribution of these enzymes.

Many of the soluble diiron monooxygenases appear to have been acquired relatively recently through horizontal gene transfer, as indicated by aberrant GC base compositions and/or the plasmid location of the operons encoding these enzymes. These include the toluene o-monooxygenase, toluene 4-monooxygenase, toluene/o-xylene monooxygenase, toluene/benzene monooxygenase, Amo alkene monooxygenases, isoprene monooxygenase of Rhodococcus sp. strain AD45, and phenol hydroxylases of P. putida CF600 and P. putida H. The evidence is less compelling for the phenol hydroxylases of P. putida P35X and C. testosteroni R5, and the sMMO of Mc. capsulatus Bath, all of which are encoded by operons that contain one or more interior genes with lower GC composition than the genome. The remaining enzymes are either chromosomally-encoded or the genomic location of the operon is unknown, and the GC composition does not differ significantly from the genome. These enzymes may have been acquired recently from related species with similar base compositions, or sufficient time may have passed for amelioration of the sequences to have occurred. In any case, a chromosomal location for these operons is not likely to be indicative of persistent vertical transmission, but rather of chromosomal integration facilitated by selective pressures associated with the availability of hydrocarbon substrate vs. the cost of maintaining the genes on a plasmid. The chromosomal location of all of the sMMO genes characterized thus far may, in this light, reflect the availability of methane for the methanotrophs in their habitats.
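The GC comparison invoked above amounts to computing the percentage of G and C bases in an operon and setting it against the value for the host genome. The sketch below illustrates that arithmetic only. It is not the procedure used in this study, it sets no threshold for what counts as an "aberrant" composition, and the function names are hypothetical.

```python
def gc_percent(dna):
    """Percentage of G+C among unambiguous bases (A, C, G, T); case-insensitive."""
    counted = [base for base in dna.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


def gc_deviation(operon, genome):
    """Difference in GC percentage (operon minus genome), in percentage points."""
    return gc_percent(operon) - gc_percent(genome)
```

A negative value from `gc_deviation` would correspond to an operon, or an interior gene within it, with lower GC composition than the genome. Whether a given difference is meaningful depends on the natural variation in GC content across the genome in question, which this sketch does not estimate.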

12 Concluding remarks

For the current study, the inferences of phylogenetic relationships among the soluble diiron monooxygenases were based almost entirely on the sequences of enzymes that have been identified in the pseudomonads, Acinetobacter spp., the methanotrophs, and Rhodococcus spp. It is by no means clear that these enzymes are limited to the bacteria in these groups, since our interest in these groups is based in part on the fact that most are readily cultivated and have some value as biocatalysts or as agents of bioremediation. Given that two of the enzymes in this family were identified in Rhodococcus spp., it seems likely that further work will uncover numerous additional examples among the Actinobacteria. We must also consider the possibilities not only of trans-‘division’ horizontal transfers of genes encoding these enzymes among the Bacteria, but also of trans-domain exchanges between the Bacteria and the Archaea. In light of these observations, the findings of the present study may only serve as the tip of the iceberg with regard to the diversity of the soluble diiron monooxygenases. Clearly, too, much additional work remains to clarify patterns and mechanisms for horizontal gene transfer, particularly with regard to the part played by linear plasmids as conveyors of the alkene monooxygenases, and the largely unknown role of transposons. While genomic sequencing studies may cast further light on the distribution and spread of these enzymes in prokaryotes, their itinerant nature may allow their genes to escape detection through this route. It is more likely that a true assessment of their importance may only be obtained through PCR- or probe-based molecular ecological studies that allow the sequences encoding these enzymes to be identified in both culturable and non-culturable organisms from a variety of environments.

Acknowledgements

We gratefully acknowledge Miklos Muller, Tetsuo Hashimoto, Pat Dash, Russell Malmberg, Shinji Yasuhira, and Arlin Stolzfus for their guidance in the use of PROTML, and Joseph Felsenstein for assistance with PHYLIP. We also thank Malcolm Shields, Alan Harker, and William Hickey for making data available prior to publication, and Stephanie Aikman, Jeremy Smith, and Stanley Patti for assistance with sequence retrieval and analyses.
