Multiple copies of ammonia monooxygenase (amo) operons have evolved under biased AT/GC mutational pressure in ammonia-oxidizing autotrophic bacteria


*Corresponding author. Present address: Department of Biology, University of Louisville, Louisville, KY 40292, USA. Tel.: +1 (502) 852-6771; Fax: +1 (502) 852-0725; E-mail:


The recent availability of complete sequences of ammonia monooxygenase (16 amoA, 5 amoB and 5 amoC gene sequences) and particulate methane monooxygenase (2 pmoA, pmoB and pmoC gene sequences each) genes allowed for a detailed analysis of their relatedness. Nucleotide sequence analysis was performed in order to identify the origins of the nearly identical operon copies within a given nitrosofier/methanotroph strain. Our data suggest that amo-homologous gene evolution has occurred in individual strains (orthology) under biased AT/GC pressure rather than by horizontal transfer. The multiple operon copies within individual strains are the result of operon duplication (paralogy). While the near identity of the multiple operon copies makes it impossible to determine whether paralogous gene expansion occurred in the last common ancestor of ammonia oxidizers or after speciation took place, we conclude that the duplication events were not recent events. We propose that the elimination of third basepair degeneracy between copies within one organism is implemented by a rectification mechanism resulting in concerted evolution.


Ammonia monooxygenase (AMO) is a three-subunit enzyme [1] which is expressed from 1, 2 or 3 copies of polygenic operons in ammonia-oxidizing autotrophic bacteria [2–4]. AMO has a key function in autotrophic ammonia catabolism where it catalyzes the first step in the oxidation of ammonia resulting in the production of hydroxylamine [5]. Hydroxylamine is subsequently oxidized to nitrite by the periplasmic enzyme hydroxylamine oxidoreductase [5].

Recent analysis of AMO-encoding DNA sequences revealed a near identity of the individual amoA gene copies in Nitrosospira sp. NpAV while the similarity between amoA genes from different strains was significantly lower [3]. In that article, we hypothesized that the almost completely identical multiple amoA gene copies were products of gene duplication rather than horizontal transfer. Because of the lack of third base pair degeneracy, we hypothesized that the duplication had occurred either recently or that the high level of sequence identity was maintained by a rectification mechanism [3]. In our effort to generate an amo gene database, a number of complete amo gene and operon sequences have been recently determined [6] and deposited into GenBank (Table 1). This and the deposit of two complete DNA sequences of the pMMO-encoding pmo operons [7] made it possible to analyze gene sequences from different organisms with consideration of the effect of mutational drift in the third codon positions. We included the pmo operon sequences in our analysis because of indications that AMO and pMMO are evolutionarily related [8]. While the operon nature of the pmoCAB and amoCAB gene clusters has been confirmed [4, 9], participation of all three subunits A, B and C in the functional hetero-multimeric enzyme has been recently shown only for pMMO [10]. In this paper we demonstrate that the evolution of amo-homologous genes (whose multi-subunit expression products, AMO and pMMO, can oxidize ammonia) in β- and γ-proteobacterial nitrosofying chemolithoautotrophs occurred under AT/GC mutational pressure.

Table 1. Nucleotide sequences of genes encoding subunits of ammonia monooxygenase for various pure culture strains of ammonia-oxidizing autotrophic soil bacteria
Organism | Locus | GenBank accession number
Nitrosospira sp. NpAV | amoAB1 | AF032438, U38250
 | amoCAB2 | AF016003, U20644
 | amoCAB3 | U92432, U72981, U38251
Nitrosospira sp. Np39-19 | amoA1 | AF042170
Nitrosospira briensis C-128 | amoA1 | U76553
Nitrosovibrio tenuis Nv-12 | amoA1 | U76552
Nitrosolobus multiformis ATCC 25196 | amoAB1 | U91603
Nitrosolobus multiformis Nm-24C | amoAB1 | AF042171
Nitrosomonas europaea ATCC 19178 | amoAB | L08050, AF058691, AF058692
 | amoC1 | U96180, AF058691
Nitrosomonas eutropha C-91 | amoAB2 | U72670
Nitrosococcus oceanus C-107 | amoCAB | U96611
Methylococcus capsulatus Bath | pmoCAB1 | L40804

2 Materials and methods

Complete genes or operons encoding Amo polypeptides were cloned and sequenced as described elsewhere [1–3, 6]. These sequences (see also Table 1) and sequences of amo genes from Nitrosomonas europaea [4, 11] and pmo genes from Methylococcus capsulatus (Bath) (GenBank accession numbers L40804 and U94337) were aligned using Sequencher 3.0 software (GeneCodes, Ann Arbor, MI, USA) on a Power Macintosh (Apple, Cupertino, CA, USA). Individual sequences were analyzed for (G+C) content of the ORF and of the third codon position. Selected sequence pairs, namely Nitrosospira sp. NpAV and N. europaea as well as Nitrosococcus oceanus and M. capsulatus (Bath), were analyzed for codon usage and substitution as described in the following text and in cited references.
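
The two quantities named above, the (G+C) content of an ORF and of its third codon positions, can be sketched as follows. This is a minimal illustration, not the procedure used with the alignment software: the function names are hypothetical, ambiguous bases are simply ignored, the stop codon is counted as part of the ORF, and the toy sequence is invented.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in counted) / len(counted)


def gc3_fraction(orf: str) -> float:
    """(G+C) fraction at the third codon positions of an in-frame ORF."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    # Every third base, starting at index 2, is a third codon position.
    return gc_fraction(orf[2::3])


# Toy sequence, invented for illustration only (not an amo gene).
toy_orf = "ATGGCCAAATTTGGCTAA"
print(f"ORF (G+C): {gc_fraction(toy_orf):.3f}")
print(f"3rd-position (G+C): {gc3_fraction(toy_orf):.3f}")
```

The same `gc_fraction` helper could be applied to an intergenic spacer, where no reading frame is involved.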


3.1 Correlation between the (G+C) contents of amo-homologous genes and their third codon positions

Nucleotide sequence comparison between homologous operon copies within one strain confirmed the lack of third base pair degeneracy in all three operon member genes (amoCAB/pmoCAB) from Nitrosospira (3 operons), Nitrosolobus (3 operons), Nitrosomonas (2 operons), and Methylococcus (2 operons) strains (data not shown, refer to GenBank deposits in Table 1). The amo gene sequences from Nitrosococcus and Nitrosovibrio could not be compared for third base pair degeneracy because only one sequence was available for each. These observations and the high level of sequence similarity among amo operons rather than just the amoA genes led us to ask two questions once again: (i) was the near identity of the multiple amo operon copies a product of operon duplication or horizontal transfer; and (ii) if duplication, did the duplication event occur rather recently? Our search for answers employed the neutral theory of evolution [12], which holds that functionally less important parts of the genome evolve faster than more important ones. Muto and Osawa [13] used ribosomal protein-encoding genes in their analysis to demonstrate that the genomic (G+C) content of bacteria is related to their phylogeny. In an expansion of this work, Osawa et al. [14] were able to link directional mutational pressure and codon usage. This concept has been recently applied to discuss the molecular evolution of the ubiquitous functional protein, catalase [15]. While the purpose of Klotz et al. [15] was merely to demonstrate that species-specific codon bias will affect any phylogenetic comparison that is solely based on DNA sequences, it provided a good example for use of this concept to identify horizontal transfer of protein genes between habitat-sharing bacteria (Fig. 3 in [15]). Thus, we believed that a (G+C) content analysis of ammonia monooxygenase-encoding genes would help us to answer our first question. Analyses of codon usage and substitution patterns were undertaken in order to determine the direction of the mutation pressure.

Because of AMO's key function in catabolism of ammonia oxidizers, the subunit peptides, AmoC, AmoA and AmoB, must have evolved under high functional pressures; in particular the identified catalytic subunit, AmoA [16]. Consequently, saturation with multiple substitutions in the third codon position should have been limited to predominantly silent mutations allowing only for modest levels of conservative or other missense mutations. If amo genes have evolved under a biased AT/GC pressure, then amo genes and the intergenic sequences should reveal a bias to either higher (A+T) or higher (G+C) content of the third codon positions in nitrosofiers with AT-rich or GC-rich genomes, respectively. We found that this is, indeed, the case: the correlation of (G+C) content between the amo-homologous genes and the third codon positions of these genes using complete sequences from the organisms listed in Table 1 generated a strictly linear relationship. This is depicted in Fig. 1 for the available complete sequences of amoA/pmoA and amoB/pmoB genes (the regression for amoC/pmoC is linear as well; however, with a different slope, data not shown). Because the identity levels between individual amo/pmo gene copies are very high (e.g., substitution values per total number of nucleotides for the three operon copies from Nitrosospira sp. NpAV are: amoC 6/813, amoA 5/825 and amoB 11/1248), only one copy has been selected for analysis from strains with multiple operon copies. The fact that the regression analysis of the data in Fig. 1 did not produce any outlying data points suggests that a recent acquisition of amo genes by horizontal transfer is highly unlikely and that gene (operon) duplication is the most likely cause for the existence of multiple amo operon copies.

Figure 1.

Correlation of GC content between the complete amo genes and the third codon positions of these amo genes. The symbol for the source organism of each amoA and amoB gene is numbered. The straight full lines were determined by linear regression; the dashed line indicates the theoretical position for the case that gene and third codon GC contents were identical. Mean values of GC content were calculated for multiple nearly identical copies of amo genes. The numbers refer to the following organisms (number of amo gene copies used for the calculations is given in parentheses): 1: Nitrosospira sp. NpAV (3); 2: Nitrosospira sp. 39–19 (3); 3: Nitrosospira briensis C-128 (1); 4: Nitrosospira tenuis Nv-12 (1); 5: Nitrosolobus multiformis C-71 (3); 6: Nitrosomonas eutropha C-91 (2); 7: Nitrosomonas europaea ATCC19178 (1); 8: Nitrosococcus oceanus C-107 (1); 9: Methylococcus capsulatus Bath (2).
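
The kind of regression and outlier check described for Fig. 1 can be sketched with an ordinary least-squares fit. This is an illustration under stated assumptions only: the function names and the residual tolerance are hypothetical, and the input values are synthetic placeholders, not the data points plotted in Fig. 1.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def off_line_points(xs, ys, tolerance):
    """Indices of points whose residual from the fitted line exceeds tolerance."""
    slope, intercept = fit_line(xs, ys)
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if abs(y - (slope * x + intercept)) > tolerance]


# Synthetic placeholder values in percent, NOT measurements from Fig. 1.
gene_gc = [45.0, 48.0, 51.0, 54.0, 57.0, 60.0]
third_gc = [40.0, 46.0, 52.0, 58.0, 64.0, 70.0]
slope, intercept = fit_line(gene_gc, third_gc)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print("off-line points:", off_line_points(gene_gc, third_gc, tolerance=10.0))
```

A gene acquired recently from a donor with a different genomic (G+C) content would be expected to show up as an index in the returned list; the tolerance that separates such a point from ordinary scatter is a judgment call and is an arbitrary value here.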

3.2 Codon substitution patterns

Based upon the correlation shown in Fig. 1, we selected the amo operons of the β-proteobacteria Nitrosomonas europaea (NEU) and Nitrosospira sp. NpAV (NAV) as well as the amo and pmo operons of the γ-proteobacteria Nitrosococcus oceanus (NOC) and Methylococcus capsulatus (Bath) (MCB), respectively, for further (G+C) content analysis. Additionally, all four strains have different genomic (G+C) contents of 51% (NEU), 54% (NAV), 48% (NOC) and 61% (MCB) [17]. These differences are also reflected in the (G+C) contents of the ORFs (Fig. 1). Data and results of our analyses are presented in Tables 2 and 3.

Table 2. G+C(a) substitutions at 968 homologous codon sites of AMO protein-encoding genes between Nitrosospira sp. NpAV and Nitrosomonas europaea
Codon position | Silent(b) gain | Silent(b) loss | Silent(b) none(d) | Conservative(c) gain | Conservative(c) loss | Conservative(c) none(d) | Others gain | Others loss | Others none(d) | Total
AmoA total | 78 | 13 | 25 | 3 | 3 | 5 | 14 | 9 | 8 | 158
AmoB total | 131 | 9 | 20 | 15 | 6 | 14 | 35 | 23 | 20 | 273
AmoC total | 74 | 24 | 11 | 5 | 5 | 3 | 14 | 5 | 3 | 144
Of the 276 AmoA, 420 AmoB and 272 AmoC codon sites, 118, 147 and 128, respectively, are identical.
(a) Number of codons that gain or lose G+C in Nitrosospira sp. NpAV as compared to N. europaea.
(b) Silent (synonymous) codon substitution.
(c) Conservative amino acid substitutions are K/R, L/I, L/V, I/V, S/T, A/G, E/D, Q/N, and F/Y.
(d) No gain or loss (A↔T; G↔C).

Table 3. G+C(a) substitutions at 932 homologous codon sites of pMMO and AMO protein-encoding genes between Methylococcus capsulatus (Bath) and Nitrosococcus oceanus
Codon position | Silent(b) gain | Silent(b) loss | Silent(b) none(d) | Conservative(c) gain | Conservative(c) loss | Conservative(c) none(d) | Others gain | Others loss | Others none(d) | Total
AmoA total | 46 | 14 | 25 | 15 | 4 | 11 | 34 | 13 | 18 | 180
AmoB total | 75 | 24 | 33 | 25 | 6 | 20 | 68 | 45 | 33 | 329
AmoC total | 51 | 9 | 21 | 15 | 3 | 9 | 44 | 20 | 21 | 193
Of the 247 AmoA/pMmoA, 418 AmoB/pMmoB and 267 AmoC/pMmoC codon sites, 67, 89 and 74, respectively, are identical.
(a) Number of codons that gain or lose G+C in Methylococcus capsulatus (Bath) as compared to Nitrosococcus oceanus.
(b) Silent (synonymous) codon substitution.
(c) Conservative amino acid substitutions are K/R, L/I, L/V, I/V, S/T, A/G, E/D, Q/N, and F/Y.
(d) No gain or loss (A↔T; G↔C).

3.2.1 The β-proteobacteria Nitrosomonas europaea and Nitrosospira sp. NpAV (see Table 2)

We compared a total of 968 homologous codon sites in the amo operons from Nitrosomonas europaea and Nitrosospira sp. NpAV. Among the 968 codons, 393 (41%) codons were identical and 575 (59%) codons were substituted. Of the 575 substituted codons, 369 codons (64%) in NAV gained (G+C), 97 codons (17%) in NAV lost (G+C) while 109 codons (19%) did not show a gain or loss in (G+C) in NAV over (A+T) in NEU. As expected, 385 (67%) codon substitutions led to silent (synonymous) mutations with 347 (60%) having occurred in the 3rd position. Only one third of the substitutions led to conservative (10%) and other (23%) changes in the primary structures.
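
The classification behind these counts can be sketched as follows: each pair of aligned codons is labeled silent, conservative or other, and as a gain, loss or no change in (G+C). This is a minimal illustration under assumptions of ours, not the authors' procedure: the function names are hypothetical, the standard genetic code is used, the conservative pairs are taken from the footnote to Table 2, and the (G+C) change is judged from the net G+C count of the whole codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Conservative amino acid pairs as listed in the footnote to Table 2.
CONSERVATIVE = {frozenset(p) for p in
                ("KR", "LI", "LV", "IV", "ST", "AG", "ED", "QN", "FY")}


def gc_count(codon):
    return sum(b in "GC" for b in codon)


def classify(ref_codon, query_codon):
    """Return (effect, gc_change) for two aligned codons, or None if identical."""
    ref = ref_codon.upper().replace("U", "T")
    qry = query_codon.upper().replace("U", "T")
    if ref == qry:
        return None
    aa_ref, aa_qry = CODON_TABLE[ref], CODON_TABLE[qry]
    if aa_ref == aa_qry:
        effect = "silent"
    elif frozenset((aa_ref, aa_qry)) in CONSERVATIVE:
        effect = "conservative"
    else:
        effect = "other"
    # Assumption: gain/loss is the net change in G+C count of the query codon.
    delta = gc_count(qry) - gc_count(ref)
    gc_change = "gain" if delta > 0 else "loss" if delta < 0 else "none"
    return effect, gc_change


def tally(ref_codons, query_codons):
    """Count (effect, gc_change) classes over two aligned codon lists."""
    labels = (classify(r, q) for r, q in zip(ref_codons, query_codons))
    return Counter(label for label in labels if label is not None)


# Invented codon pairs for illustration only.
print(tally(["CTA", "AAA", "GCT", "ATG"], ["CTG", "AGA", "ACT", "ATG"]))
```

With NEU as the reference and NAV as the query, the keys of the returned counter would correspond to the gain, loss and no-change columns of Table 2 under the assumptions stated above.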

Comparison of all NAV amoCAB nucleotide sites (2877) to NEU showed a net gain of 301 (10%) (G+C) over (A+T). The (G+C) content of the intergenic amoC-amoA spacer was found to be 61.5% in NAV and 47.6% in NEU, yielding an average of 13% more (G+C) in the NAV spacer (identity levels between the spacers among individual NAV operon copies are 2 substitutions per 226 nt). This represents a faster evolution (13 vs. 10%) of the spacer in NAV towards a higher (G+C) content than the peptide coding ORFs, which is in agreement with the neutral theory of molecular evolution [12].

3.2.2 The γ-proteobacteria Methylococcus capsulatus and Nitrosococcus oceanus (see Table 3)

By comparing Methylococcus capsulatus to Nitrosococcus oceanus, we identified a total of 932 homologous pmo/amo gene codon sites (in the 3 ORFs). Of these sites only 230 (25%) codons were identical and 702 (75%) codons were substituted. Among the 702 substituted codons, 373 codons (53%) in MCB gained (G+C), 138 codons (20%) in MCB lost (G+C) while 191 codons (27%) did not show a gain or loss in (G+C) in MCB over (A+T) in NOC. Of the 702 substituted codons, only 298 (43%) codon substitutions led to silent (synonymous) mutations with 274 (39%) having occurred in the 3rd position. More than half of the substitutions led to conservative (15%) and other (42%) changes in the peptide sequences.
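
The totals quoted above can be rechecked from the "total" rows of Table 3. The sketch below assumes that each row lists gain, loss and no-change counts for silent, conservative and other substitutions, in that order; this reading of the table and the names used in the code are our assumptions. Percentages are printed to one decimal, so they may differ slightly from the rounded values in the text.

```python
# "Total" rows of Table 3, read as (gain, loss, none) triples for silent,
# conservative and other substitutions. This reading is an assumption.
TABLE3 = {
    "AmoA": ((46, 14, 25), (15, 4, 11), (34, 13, 18)),
    "AmoB": ((75, 24, 33), (25, 6, 20), (68, 45, 33)),
    "AmoC": ((51, 9, 21), (15, 3, 9), (44, 20, 21)),
}
CATEGORIES = ("silent", "conservative", "other")
CHANGES = ("gain", "loss", "none")


def summarize(table):
    """Sum the counts by substitution category and by (G+C) change."""
    by_category = dict.fromkeys(CATEGORIES, 0)
    by_change = dict.fromkeys(CHANGES, 0)
    for row in table.values():
        for category, triple in zip(CATEGORIES, row):
            for change, count in zip(CHANGES, triple):
                by_category[category] += count
                by_change[change] += count
    return sum(by_category.values()), by_category, by_change


total, by_category, by_change = summarize(TABLE3)
print(f"substituted codons: {total}")
for name, count in {**by_category, **by_change}.items():
    print(f"{name:>12}: {count:3d} ({100 * count / total:.1f}%)")
```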

Comparison of all MCB pmoCAB nucleotide sites (2763) to NOC showed a net gain of 303 (11%) (G+C) over (A+T). The (G+C) content of the intergenic pmoC-pmoA spacer was found to be 61.1% in MCB, while the amoC-amoA spacer in NOC has only 35% (G+C). Because the (G+C) contents of the MCB spacer and the pmo ORFs are comparable to that of its genome, it is more effective to say that there is approximately 26% less (G+C) in the NOC spacer. This represents a faster evolution (26 vs. 23%) of the spacer in NOC towards a higher (A+T) content than the peptide coding ORFs. This data set indicates that the evolution of amo-homologous genes (whose expression products, AMO and pMMO, can oxidize ammonia) in γ-proteobacterial chemolithoautotrophs occurs under directed AT/GC mutational pressure.

3.3 Codon usage

Codon usage for Amo peptides was determined for amo operons from NAV and NEU (data not shown). As expected from the codon substitution analysis, NEU has a preference for codons with more (A+T) while NAV prefers (G+C)-rich codons. In β-proteobacterial Amo ORFs, the preferred translational stop codon is UAA; however, the preceding in-frame stop codon is predominantly UGA. Both codons are recognized by release factor RF2, which is believed to be the dominant RF in bacteria [14]. Interestingly, UAG and UGA are used as ORF stop codons in γ-proteobacterial amo and pmo genes (NOC, MCB) but rarely in the γ-proteobacterium E. coli [14].
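
A codon usage count of the kind summarized above can be sketched in a few lines. The function names are hypothetical, the sequence is an invented toy ORF rather than an amo gene, and RNA input (U) is simply converted to DNA letters (T).

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_usage(orf):
    """Count how often each codon occurs in an in-frame ORF."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return Counter(orf[i:i + 3] for i in range(0, len(orf), 3))


def terminal_stop(orf):
    """Return the stop codon that ends the ORF, or None if it has none."""
    last = orf.upper().replace("U", "T")[-3:]
    return last if last in STOP_CODONS else None


# Toy ORF, invented for illustration only.
toy_orf = "AUGAAAAAAGCCUAA"
print(codon_usage(toy_orf).most_common())
print("stop codon:", terminal_stop(toy_orf))
```

Comparing such counters between two strains, codon by synonymous codon, is one way to see a preference for (A+T)-rich or (G+C)-rich codons.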

4 Discussion and conclusions

The genes encoding the hetero-multimeric enzymes, AMO and pMMO, in chemolithoautotrophic bacteria have been analyzed for their (G+C) contents and compared with respect to their conservation in coding information and codon usage. We conclude from this analysis that the amoCAB/pmoCAB gene clusters in β- and γ-proteobacteria, the operon member genes as well as the spacer regions, have evolved under a biased AT/GC mutational pressure. The presented data do not support horizontal amo operon (gene) exchange. As yet unknown reasons have initiated paralogous evolution of amo and pmo operons within individual organisms (strains). Because of the near identity of the multiple operon copies, this is most likely due to gene cluster formation (operon duplication) rather than translocation; however, this will have to be confirmed by pulsed-field gel electrophoresis experiments. Unfortunately, the near identity of the multiple copies prevents a successful calculation of when paralogous evolution began with respect to speciation. Is the presently obvious relatedness between amo and pmo genes from β- and γ-proteobacterial nitrosofiers and methanotrophs [8] a reflection of divergent, orthologous amo-homologous gene evolution that started with speciation, or did the enzymatic systems for chemolithotrophic energy production from ammonium or methane evolve convergently from enzymatic predecessors in separate photosynthetic ancestors as proposed by Teske et al. [18]? The latter hypothesis on convergent evolution of non-homologous amo (and pmo) operons was based upon a close 16S rRNA-phylogenetic link of the chemolithoautotrophs to photosynthetic proteobacterial species and their similar organization of membrane structural arrangements [18]. By citing several references, Teske et al. [18] suggested that chemolithoautotrophy may have arisen repeatedly from photosynthesis by independent conversion. While such a conversion seems theoretically possible, the data presented by Teske et al. [18] do not contradict the hypothesis that the last common ancestor to ammonia and methane oxidizers contained an amo-homologous locus. The classification based upon 16S rDNA tells us that all ammonia- and methane-oxidizing chemolithoautotrophs are members of the large assemblage of the proteobacteria ([18] and references therein). Recent studies using both 16S and 23S rRNA as well as the sequences of bacterial elongation factor Tu and the β-subunit of ATPase suggest that the β-proteobacteria may be considered as a subgroup of the γ-proteobacterial subdivision [19]. Hence it is possible that specific types of membrane structural arrangements suitable for key proteins in photosynthesis and chemolithotrophy have evolved with further speciation. In contrast to the original morphological classification of the nitrosofiers, which was largely based upon differences in membrane structural arrangements, strains in the genus Nitrosospira turned out to be phylogenetically (16S rDNA) more closely related to Nitrosolobus than to Nitrosomonas [18, 20]. Polyhedral inclusion bodies that house multiple molecules of the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCo), the carboxysomes, have been identified in all cyanobacteria and in many chemolithoautotrophs belonging to the α-, β- and γ-proteobacteria including ammonia oxidizers [21]. The amino acid sequences of the involved proteins are highly similar and the encoding operons are believed to be homologous [21] despite the diverse phylogenetic position of the organisms (e.g. strains in the genus Thiobacillus).
Hence, we support the first hypothesis of divergent, orthologous amo(pmo)-homologous operon evolution because of the following reasons: (i) the amo (and pmo) ORFs share a high level of sequence similarity at the amino acid and nucleotide sequence levels [1–4, 6, 8]; (ii) the organization of genes in the amo operons is identical with no spacer between the amoA and amoB ORFs in β-proteobacterial nitrosofiers while the amo genes in the γ-proteobacterial nitrosofier Nitrosococcus oceanus are arranged like the pmo genes of the γ-proteobacterial methanotroph, Methylococcus capsulatus; (iii) all amo operons are succeeded by another conserved ORF, ORF4, of yet unknown function; and (iv) the similarity among amo (and pmo) genes has extended to the level of amo (pmo) operon transcription [4, 9]. Furthermore, phylogenetic analyses based on AmoA amino acid and 16S rDNA sequences yielded congruent trees [8, 22] which supports the concept of orthologous evolution rather than horizontal transfer of amoA genes in chemolithoautotrophs.

Instead of successive gene expansion within individual strains, it is more likely that the last common ancestor already contained multiple copies of nearly identical amo-homologous genes. This means that the present variable copy number of amo operons is a result of copy loss rather than gene expansion. It is known that biased mutational pressure towards high (A+T) coincides with genomic economization, hence the reduction of the genome by discarding non-essential genes [14]. Given the more steady (though oligotrophic) supply of nutrients in marine environments, the habitat of the (A+T)-rich Nitrosococcus oceanus, reduction in amo operon copy number could be explained as the result of genome economization. It is striking that none of the investigated nitrosofier genomes (including N. oceanus) contained pseudo-loci of any of the three amo genes or a copy of amoA or amoB that was not accompanied by the other operon member genes [1, 3, 6].

If the variable amo operon copy number in individual strains is the result of genome economization rather than expansion, it appears that amo operon copy loss has occurred at once or, if gradually by genetic drift, a rather long time ago. We believe that a gradual loss of an amo operon after a loss-of-function mutation in any of the subunits would have constituted a disadvantage to the organism. The cloning of amo and pmo genes is notoriously difficult [10] and the cloning of complete amo genes and their expression from plasmid clones has been, so far, accomplished only a few times [1, 3]. Even fragments of amo and pmo genes appear to be toxic to heterologous expression hosts such as Escherichia coli [10]. This observed phenomenon of amo-homologous gene toxicity is most likely due to the nature of their expression products: Nguyen et al. [10] reported that pMMO constitutes 60–80% of isolated membrane protein in M. capsulatus under high copper availability. It could therefore be concluded that nitrosofier genomes lack amo pseudogenes or incomplete amo operons in order to prevent (i) displacement of functional Amo peptides from the membrane by non-functional Amo peptides and (ii) domain formation which may lead to membrane instability.

It is further noteworthy that Amo proteins in cells with more operon copies are more similar in primary structure to one another than to those in cells with fewer copies [6]. These observations suggest that reverse paralogy (reduction in copy number) has had a dynamic effect on orthologous gene divergence in that it reduced the rectification potential and efficiency, thereby allowing more genetic drift of functional amo genes. If this were correct, we could propose that the sequence of triple copy amo operons of the Nitrosospira type is more similar to that of the last common ancestor amo operon(s) than to those of double or single copy amo operons in β- and γ-proteobacterial nitrosofiers or the pmo operons of methanoautotrophs. This hypothesis is supported by analysis of amoA gene sequence-based phylogenetic trees [6, 8, 22] revealing that the multi-copy amo operon ammonia oxidizers cluster closely together, hence they are characterized by short branch lengths.

The mechanism responsible for multiple copy rectification could be the mechanism responsible for implementation of biased AT/GC mutational pressure and the removal of incomplete (unrepairable) amo loci. In conjunction with the previous paragraph, this system appears to be more efficient in the multi-copy amo operon ammonia oxidizers where not only one but two operons collectively can serve as master copies to correct a mutated third operon, thereby reducing the pace of evolution (shorter branch lengths). While no such mechanism has yet been identified, the presence of nearly identical genes in other prokaryotes whose products represent key catabolic enzymes indicates that rectification systems reside in some prokaryotes. Examples of nearly identical genes in other prokaryotes are the RuBisCo (rbcLS) gene clusters in thiobacilli (two copies with ≥99% nucleotide sequence identity in T. ferrooxidans [23] and T. neapolitanus [21]) and the monomethylamine transferase (mttBC) operons (two copies with ≥95% nucleotide sequence identity) in Methanosarcina barkeri MS [24]. None of these sequences has been analyzed for directed AT/GC mutational pressure on their evolution, nor has the mechanism of multiple copy rectification been investigated. While the existence of multiple nearly identical operons is rather rare in prokaryotes, they seem to encode crucial catabolic enzymes. The hope is that the information provided here will be valuable for future studies addressing multi-copy gene rectification in prokaryotes.


This work was supported by grants from NSF (IBN-9628556 to M.G.K., IBN-9527919 to J.M.N.) and the USDA NRI-CGP (#9600839 to J.M.N., #9604332 to M.G.K.). Helpful discussions with M.E. Lidstrom, J.M. Shively and M. Wagner are acknowledged.