Structure, function and evolution of microbial adenylyl and guanylyl cyclases

Summary

Cells respond to signals of both environmental and biological origin. Responses are often receptor mediated and result in the synthesis of so-called second messengers that then provide a link between extracellular signals and downstream events, including changes in gene expression. Cyclic nucleotides (cAMP and cGMP) are among the most widely studied of this class of molecule. Research on their function and mode of action has been a paradigm for signal transduction systems and has shaped our understanding of this important area of biology. Cyclic nucleotides have diverse regulatory roles in both unicellular and multicellular organisms, highlighting the utility and success of this system of molecular communication. This review will examine the structural diversity of microbial adenylyl and guanylyl cyclases, the enzymes that synthesize cAMP and cGMP respectively. We will address the relationship of structure to biological function and speculate on the complex origin of these crucial regulatory molecules. A review is timely because the explosion of data from the various genome projects is providing new and exciting insights into protein function and evolution.

Introduction

Discovery of the second messenger cAMP (adenosine 3′,5′-cyclic monophosphate) and the molecular dissection of its role in hormone action (Sutherland, 1972) paved the way for a new era in biological research. This soon led to the finding that cGMP (guanosine 3′,5′-cyclic monophosphate) is also an important player in signal transduction. cAMP is synthesized by adenylyl cyclase (AC) from ATP, and cGMP is synthesized by guanylyl cyclase (GC) from GTP. Both are degraded to nucleoside monophosphates by cyclic nucleotide phosphodiesterases (PDEs). The levels of cAMP and cGMP in the cell are tightly balanced by the opposing actions of purine nucleotide cyclases and PDEs, both of which can occur in multiple isoforms, often with distinct subcellular locations. This arrangement facilitates fine tuning of cellular responses. In their roles as eukaryotic second messengers, cAMP and cGMP can bind to and activate a number of intracellular receptors; these include cyclic nucleotide-dependent protein kinases, cyclic nucleotide-gated ion channels, PDEs, a Rap1 guanine nucleotide-exchange factor (EPAC) (de Rooij et al., 1998) and GAF domains. The protein kinases are key regulators of these signalling pathways, and their activation triggers a cascade of downstream events that leads to a cellular response. In many prokaryotes, the biological activity of cAMP is mediated through its role in transcriptional activation.
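The balance between cyclase and PDE activity described above can be illustrated with a deliberately simple rate model. The Python sketch below is not taken from the literature reviewed here: it assumes a constant rate of synthesis by the cyclase and first-order degradation by the PDE, and the function names, parameter names and arbitrary units are all hypothetical choices made for this example.

```python
def steady_state_level(synthesis_rate: float, degradation_constant: float) -> float:
    """Level at which synthesis and degradation cancel out.

    Assumed model: d[cNMP]/dt = synthesis_rate - degradation_constant * [cNMP].
    """
    return synthesis_rate / degradation_constant


def simulate(synthesis_rate: float, degradation_constant: float,
             dt: float = 0.01, steps: int = 5000, start: float = 0.0) -> float:
    """Integrate the assumed model with simple Euler steps and return the final level."""
    level = start
    for _ in range(steps):
        # Net change = cyclase output minus PDE-dependent breakdown.
        level += (synthesis_rate - degradation_constant * level) * dt
    return level


if __name__ == "__main__":
    # Arbitrary illustrative values, not measured rates.
    print(steady_state_level(2.0, 1.0))
    print(simulate(2.0, 1.0))
```

Under these assumptions, doubling the degradation constant halves the steady-state level, which is one way to picture how multiple cyclase and PDE isoforms could tune the response.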

A classification system for purine nucleotide cyclases has been proposed (Danchin, 1993), which defines three distinct groupings, mainly on the basis of sequence. This has recently been extended to six classes (Cotta et al., 1998; Sismeiro et al., 1998; Tellez-Sosa et al., 2002) and will probably be expanded further as genome sequencing progresses through additional phyla. This review will deal primarily with microbial cyclases in class III, which encompasses both ACs and GCs. Class III cyclases (‘the universal class’) are defined on the basis of conserved motifs (Fig. 1), and are found in both prokaryotes and eukaryotes. Their definitive features are the sequence and structure of the catalytic site where critical residues have been largely maintained throughout evolution. This reflects a common mechanism for purine binding and for synthesis of the respective cyclic nucleotide. In contrast, there can be considerable variability in the topology and physiological roles of these enzymes. This combination of conserved and divergent features provides a useful tool to address phylogenetic relationships and to question the origin and evolution of these important signalling molecules. The other five cyclase classes, which contain only ACs, have so far been identified exclusively in prokaryotes, where a given species may often possess more than one class.

Figure 1.

Alignment of the purine-binding pockets of class III microbial adenylyl and guanylyl cyclases with mammalian cyclases. The alignments show the diagnostic residues that distinguish ACs (green) from GCs (red). These positions are marked above by an asterisk. The sequences are in three distinct groups: GCs (upper), ACs with a purine-binding aspartate (lower), ACs in which this residue is replaced by a serine or threonine (middle). The catalytic domains used are indicated: C, the single catalytic domain; C1, the N-terminal catalytic domain; C2, the C-terminal catalytic domain. In each case, these domains contain the asparagine and arginine pair required for catalysis (highlighted in light blue, marked above with a black triangle). Residues identical in 11 or more sequences are highlighted in grey to reflect the conservation between ACs and GCs across phyla. The secondary structure inferred from rat type II AC is shown in orange beneath the sequence. The sequences used are as follows. GCs: human membrane GC 2D (NP_000171); Chlamydomonas hypothetical GC (genie.76.4); Paramecium GC1 (CAB44361); Plasmodium PfGCβ (CAD52725); Dictyostelium GCA (CAB42641); Dictyostelium sGC (AAK92097); Synechocystis Cya2 (NP_440289). ACs: Chlamydomonas hypothetical AC (genie.445.7); Paramecium AC1 (CAD60410); Plasmodium PfACα (AAO64441); Plasmodium PfACβ (NP_704518); Dictyostelium ACR (AAD50121); Anabaena CyaB1 (BAA13998); Mycobacterium Rv1319c (NP_215835); Euglena PACα (BAB85619); Mycobacterium Rv1625c (O30820); Trypanosoma brucei GRESAG4.1 (CAD21883); Saccharomyces CYR1 (P08678); Neurospora CR-1 (Q01631); Dictyostelium ACA (Q03100); Trichodesmium ‘ammonia permease’ (ZP_00074053); Mycobacterium Rv1264 (Q11055); rat type II (AAA40682).

The topology and catalytic mechanism of class III cyclases

Broadly speaking, class III cyclases fall into two main topological groups: cytosolic proteins and those that are integral to the cell membrane and contain at least one transmembrane (TM) helix. The mammalian G protein-dependent ACs are the best characterized in terms of structure and catalytic mechanism, and the resulting data have provided the basis for investigating these aspects in microbial cyclases. Nine different mammalian isoforms have now been identified, each of which has a distinct tissue distribution and/or regulatory specificity. These and all the G protein-dependent ACs of higher eukaryotes conform to the same basic structure, consisting of two catalytic domains (C1 and C2) each preceded by a set of six TM helices (Fig. 2A). Activation occurs indirectly when a ligand binds to a G protein-coupled receptor, which in turn interacts with heterotrimeric G proteins, subunits of which bind to the AC and stimulate synthesis of cAMP. GCs with a similar topology have also been identified in protozoans. These enzymes may be bifunctional as they have an associated N-terminal P-type ATPase-like domain with additional multiple TM helices (Fig. 2B) (Linder et al., 1999; Carucci et al., 2000).

Figure 2.

Topology of class III adenylyl and guanylyl cyclases.
A. The membrane-localized G protein-dependent ACs of higher eukaryotes. The heterodimeric nature of the catalytic core formed by the C1 and C2 catalytic domains is shown.
B. The membrane-bound GCs, such as those found in Plasmodium and Paramecium. These proteins have a putative ATPase domain located at the N-terminus that may have a regulatory function.
C. The soluble ACs. Although the heterodimeric nature of catalytic domains is similar to that of the membrane-localized ACs, these enzymes are not closely related in evolutionary terms (Fig. 4).
D. The soluble GCs, such as the mammalian enzyme, that are formed of two distinct subunits.
E. The receptor-type GCs, typically found in higher eukaryotes.
F. The receptor-type ACs of trypanosomes, which form homodimers.
Class III cyclases can also vary in the number of transmembrane helices that they possess and in the number and type of regulatory domains (Fig. 3).

Crystallography (Tesmer et al., 1997; Zhang et al., 1997), mutagenesis (Tang et al., 1995; Yan et al., 1997; Sunahara et al., 1998; Tucker et al., 1998) and modelling (Liu et al., 1997) studies have provided crucial information on the structure of the active site and the catalytic mechanism of mammalian ACs and GCs. Amino acids that are essential for catalysis, substrate binding, metal ion binding and for defining purine specificity have been identified (Fig. 1). The conserved biochemical function of each of these amino acids in microbial isoforms has been confirmed by mutagenesis studies and by resolution of the three-dimensional structure of a protozoan AC (Bieger and Essen, 2001). The most highly conserved of these residues across phyla are the asparagine/arginine pair required to stabilize the transition state of the enzyme (highlighted in light blue in Fig. 1). Also highly conserved are lysine or glutamate residues that determine adenine or guanine binding respectively. Other residues diagnostic of purine-binding specificity are the aspartate (in ACs) or a cysteine (in GCs), which are located at the positions identified in Fig. 1. These residues are highly conserved in higher eukaryotes, although they are less highly conserved among lower eukaryotes and prokaryotes (see below).
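The residue rules in this paragraph can be summarised as a small lookup. The Python sketch below is illustrative only: the function name `predict_substrate` and its two-residue input are assumptions made for this example, and real assignments rest on full alignments (Fig. 1) and biochemical confirmation. Any combination other than the two canonical pairs is reported as atypical rather than guessed.

```python
def predict_substrate(first: str, second: str) -> str:
    """Predict purine specificity from the two diagnostic residues.

    first  -- one-letter code at the lysine/glutamate position
    second -- one-letter code at the aspartate/cysteine position
    """
    pair = (first.upper(), second.upper())
    if pair == ("K", "D"):
        # Lysine plus aspartate: the pattern described for ACs.
        return "AC (ATP)"
    if pair == ("E", "C"):
        # Glutamate plus cysteine: the pattern described for GCs.
        return "GC (GTP)"
    # Everything else needs experimental evidence, so no call is made.
    return "atypical"


if __name__ == "__main__":
    print(predict_substrate("K", "D"))  # AC (ATP)
    print(predict_substrate("E", "C"))  # GC (GTP)
    print(predict_substrate("K", "T"))  # atypical
```

The third call shows why the fallback matters: a lysine paired with a threonine does not match either canonical pair, so the sketch declines to assign a substrate.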

The class III AC active site has a wreath-like configuration (Tesmer et al., 1997; Zhang et al., 1997) that is formed by dimerization, an event essential for activity. In the mammalian G protein-dependent ACs, the catalytic site is made up of a heterodimer constituted by the intramolecular association of the C1 and C2 domains. These form a pseudosymmetrical structure with two non-identical pockets, only one of which is able to bind a substrate molecule (Fig. 2A). The second pocket may have a regulatory function and is able to bind the diterpene forskolin, a non-physiological activator of mammalian ACs. This type of conformation (but not necessarily the forskolin-binding capacity) is also found in the phylogenetically distinct soluble ACs, such as those of the mammals and the Plasmodium PfACβ (Fig. 2C). The catalytic sites of mammalian soluble GCs are also formed by heterodimeric association but, in this case, the interacting catalytic domains are contributed by distinct polypeptides (Fig. 2D). The third major type of cyclase active site, which includes those of mammalian receptor GCs (Fig. 2E) and trypanosome ACs (Fig. 2F), is formed by homodimeric interaction. In these molecules, essential motifs are present in each subunit, and their interaction gives rise to two identical pockets, both of which are capable of binding to a substrate molecule (Fig. 2E and F). Although structurally similar, these enzymes share only a distant evolutionary origin (Fig. 4).

Figure 4.

Cluster analysis of microbial class III cyclases identifies four distinct subgroups. The catalytic domains of ACs and GCs from several organisms were aligned using CLUSTALW and analysed by the TreeTop phylogenetic tree prediction tool (Moscow State University). Bootstrap values are indicated for each node on the rectangular cladogram and were derived using the ‘Cluster algorithm’. The following sequences were used: human soluble AC (AAF65931); partial sequence from Anopheles gambiae hypothetical AC (XP_314782); Plasmodium PfACβ (NP_704518); Chloroflexus hypothetical AC (ZP_00018205); Dictyostelium soluble GC (AAK92097); Trypanosoma brucei GRESAG4.1 (CAD21883); Saccharomyces CYR1 (P08678); Candida AC (AAG18428); Neurospora CR-1 (Q01631); Plasmodium PfGCβ (CAD52725); rat type II AC (AAA40682); human retinal GC2D (NP_000171); Dictyostelium ACG (Q03101); Mycobacterium Rv1625c (O30820); Plasmodium PfACα (AAO64441); Chlamydomonas hypothetical AC (genie.445.7); Dictyostelium ACB (AAD50121); Spirulina CyaC (T17197); Anabaena CyaB1 (BAA13998); Synechocystis Cya2 (GC) (NP_440289). Protein sequences were obtained from the NCBI protein database unless otherwise stated.

The AC catalytic mechanism, which is common to all the above types, is analogous to that of DNA polymerases and involves two metal ions (normally Mg2+) in a nucleophilic attack of the 3′ hydroxyl ion of the ribose moiety on the α-phosphate of a nucleoside triphosphate. This mechanistic similarity is reflected by a common structural fold (Artymiuk et al., 1997; Zhang et al., 1997), which contains two conserved aspartate residues that facilitate metal ion binding. The catalytic mechanism of the class II ACs (see below) predicted from the crystallographic analysis of the Bacillus anthracis EF (oedema factor) is also proposed to constitute in-line nucleophilic attack on the α-phosphate (Drum et al., 2002), but only a single metal ion is thought to be involved.

The active site within the class III AC and GC catalytic domains has been highly conserved, constrained by the need to bind ATP or GTP and to convert it to the corresponding cyclic nucleotide. However, over the course of evolution, a wide variety of regulatory domains has become associated with the catalytic regions (Fig. 3). The biological roles of most of these are unknown, but the ingenious and diverse strategies by which the regulatory domains act to modulate signal transduction so that cells respond appropriately to environmental changes or biochemical stimuli are becoming better understood. The following sections will illustrate the diversity of microbial cyclases and include examples from several different phyla. Space precludes a comprehensive survey of all known isoforms.

Figure 3.

The regulatory domains of microbial cyclases. A range of different functional domains has become associated with class III purine nucleotide cyclases, presumably by independent gene fusion events. The key (boxed) shows the cyclase-associated motifs (predicted by a CD search using the NCBI Protein BLAST tool) and identifies TM helices and catalytic domains. The catalytic domain that contains most of the residues important for catalysis (usually C2) is coloured blue and the auxiliary catalytic domain (usually C1), where present, is coloured purple. The length of each protein is to scale (size bar indicates 250 amino acids).
A. The GC isoforms depicted are the Dictyostelium soluble GC (sGC) (AAK92097), the Paramecium bifunctional GC1 (CAB44361), a hypothetical Chlamydomonas GC (genie.76.4), the human retinal GC2D (NP_000171) and the Cya2 from Synechocystis (NP_440289).
B. The AC isoforms depicted are the Ras-activated CYR1 from Saccharomyces (P08678), the ‘ammonia permease’ from Trichodesmium (ZP_00074053), the photoactivated PACα subunit from Euglena (BAB85619), a hypothetical Chloroflexus AC (ZP_00018205), the PfACα of Plasmodium (AAO64441), Anabaena CyaB1 (BAA13998), the ACG of Dictyostelium (Q03101) and the hypothetical AC Rv1319c from Mycobacterium (NP_215835). The motifs identified in the CD search using the NCBI BLAST tool are as follows: HAMP (histidine kinases, adenylyl cyclases, methyl-binding proteins and phosphatases) domain; PAS (period clock protein, aryl hydrocarbon receptor and single-minded protein) domain; GAF (cGMP-dependent PDEs, adenylyl cyclases and formate hydrogen lyase transcriptional activator) domain; CHASE (cyclases, histidine kinases, associated sensory extracellular) domain; PP2C (protein phosphatase 2C)-like domain and ANP (atrial natriuretic peptide) receptor.

Prokaryotic class III cyclases

In terms of structure, bacterial class III ACs have been studied in less detail than those of eukaryotes. Most were originally identified by functional complementation of AC-deficient strains of Escherichia coli. However, genome projects (particularly of pathogenic microorganisms) have identified numerous hypothetical proteins that can tentatively be included in this AC class. The genome of Mycobacterium tuberculosis, for example, contains 15 putative class III AC genes, a surprisingly large number. None of these has yet been assigned a biological function, and the role of cAMP in this organism has not been defined. The mycobacterial enzymes can be subdivided into three types based on predicted topology. Members of the first group partly resemble G protein-dependent AC isoforms of higher eukaryotes (Fig. 2A). They correspond to half of one of these enzymes, with a single catalytic domain and six predicted TM helices. One of these mycobacterial ACs (Rv1625c) has been expressed in both E. coli and mammalian cells. Membrane insertion occurred in both systems, and AC activity has been demonstrated. This molecule might be a progenitor of the mammalian membrane-associated ACs, and it has even been suggested that it may have been spread by horizontal gene transfer (Guo et al., 2001).

The majority of the other mycobacterial ACs are similar to those of the related Gram-positive actinobacterium Streptomyces, and one (Rv1264) has been characterized biochemically (Linder et al., 2002). This enzyme has two distinct domains. The residues necessary for activity are in the C-terminal catalytic region and are analogous to those of mammalian ACs, an observation that has been confirmed by mutagenesis studies (K296 and D365, substrate definition; D256 and D300, metal ion co-ordination; R376, transition state stabilization). The N-terminal region appears to have an autoinhibitory role. The third type of AC (e.g. Rv1319c and Rv1320c) is part of a subclass comprising ACs from diverse bacterial species. In Rv1319c, there are six predicted TM helices associated with a HAMP motif (see legend to Fig. 3), a conserved feature of several different proteins predicted to have a role in signal transduction.

Of the prokaryotic class III cyclases, those of the Gram-negative cyanobacteria are the best characterized. In these photosynthetic organisms, cAMP levels have been shown to change in response to several environmental conditions including levels of light, oxygen, pH and nitrogen (Hood et al., 1979; Kasahara and Ohmori, 1999). Reflecting this diversity of function, the genome of the filamentous cyanobacterium Anabaena, for example, encodes six class III AC isoforms of various topologies. Of these, CyaB1 is of particular interest (Fig. 3B). It has a C-terminal catalytic portion and an N-terminal regulatory region, which incorporates both a PAS motif (a ubiquitous small molecule receptor) and two GAF domains (see legend to Fig. 3). Uniquely, activation of this enzyme is mediated by the binding of cAMP (rather than cGMP) to one of the GAF domains. An aspartate residue that is highly conserved in the purine-binding domain of ACs from higher eukaryotes (Fig. 1) is replaced in CyaB1 by a threonine residue that is essential for maximal activity (Kanacher et al., 2002). Recently, it has been shown that bicarbonate ions can activate CyaB1, and that this responsiveness is dependent on the presence of the threonine residue (and also the critical lysine in the purine-binding pocket, K646) (Cann et al., 2003). The presence of a threonine at a corresponding position has also been noted in ACs from diverse phyla (Muhia et al., 2003) (Fig. 1), and bicarbonate responsiveness has been demonstrated in some of these, including the CyaB of the myxobacterium Stigmatella aurantiaca, the Rv1319c of M. tuberculosis (Cann et al., 2003) and the soluble AC of mammals (Chen et al., 2000). This may be a general feature of ACs with a threonine at this position.

The CyaC of the cyanobacterium Spirulina platensis is another bicarbonate ion-responsive AC (Chen et al., 2000). In this organism, cAMP is involved in the regulation of gliding movement and respiration (Kasahara and Ohmori, 1999), and at least six class III AC genes have been isolated by complementation (Kasahara et al., 2001). From a functional perspective, it is interesting that CyaC has an N-terminal region reminiscent of a bacterial two-component regulatory system. This comprises a transmitter and two receiver domains, and data suggest that CyaC undergoes an autophosphorylation reaction characteristic of a hybrid sensory kinase (Kasahara and Ohmori, 1999). In this type of reaction, the initial step involves phosphorylation of a histidine residue in the transmitter domain, followed by transfer of the phosphoryl group to an aspartate residue in the receiver domain. Activation of CyaC is dependent on this process. There are also a small number of cyanobacterial ACs that have an aspartate rather than a threonine residue in the purine-binding pocket, which share a high overall similarity to the catalytic domains of mammalian ACs. One of these occurs in Trichodesmium and has an unusual N-terminus that is highly related to an ammonia permease (>40% identity over 500 amino acids). This putative regulatory domain has up to 12 TM helices (Fig. 3B).
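The histidine-to-aspartate phosphotransfer described for CyaC in this paragraph can be written out as a two-step state change. The Python sketch below is a toy model, not a description of the enzyme's kinetics: the class and method names are hypothetical, the two receiver domains are collapsed into one for simplicity, and treating cyclase activity as a direct consequence of receiver phosphorylation is an assumption made only to mirror the statement that activation depends on this process.

```python
from dataclasses import dataclass


@dataclass
class HybridKinaseCyclase:
    """Toy model of a cyclase carrying transmitter (His) and receiver (Asp) domains."""
    his_phosphorylated: bool = False
    asp_phosphorylated: bool = False

    def autophosphorylate(self) -> None:
        # Step 1: the transmitter domain phosphorylates its own histidine.
        self.his_phosphorylated = True

    def transfer(self) -> None:
        # Step 2: the phosphoryl group moves from the histidine to the aspartate.
        # Without a phosphorylated histidine there is nothing to transfer.
        if self.his_phosphorylated:
            self.his_phosphorylated = False
            self.asp_phosphorylated = True

    @property
    def cyclase_active(self) -> bool:
        # Assumption: activity is modelled as requiring the phosphorylated receiver.
        return self.asp_phosphorylated


if __name__ == "__main__":
    enzyme = HybridKinaseCyclase()
    enzyme.transfer()                # no effect: histidine not yet phosphorylated
    print(enzyme.cyclase_active)     # False
    enzyme.autophosphorylate()
    enzyme.transfer()
    print(enzyme.cyclase_active)     # True
```

The order of the two steps is the point of the sketch: calling `transfer` before `autophosphorylate` leaves the enzyme inactive.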

The presence of GCs in prokaryotes has been controversial, but it has been reported that cyanobacteria contain relatively high levels of cGMP compared with other bacteria (Herdman and Elmorjani, 1988). In bacteria, all but one of the class III cyclases investigated have been shown to have AC activity. The exception to this is the product of the cya2 gene of the cyanobacterium Synechocystis, which is predicted to be a GC. The critical amino acid residues within the purine-binding pocket of Cya2 (glutamate and glycine) are compatible with GC activity (Fig. 1). Furthermore, gene disruption reduces the ability of mutants to produce cGMP (Ochoa De Alda et al., 2000). The N-terminal domain of Cya2 shares similarities with the corresponding region of the Dictyostelium cyclase ACG (Fig. 3A and B), which is an osmosensor (see below). In both cases, examination of the sequence identifies a potential transmembrane sensor that coincides with an extracellular sensory domain (CHASE) characteristic of a variety of proteins, including some ACs and histidine kinases. Two other putative class III cyclases can be identified in Synechocystis that have residues predictive of ACs. One of these enzymes (Cya1) has a role in motility. Cya1 mutants have decreased levels of cAMP and are immotile (Terauchi and Ohmori, 1999).

Other classes of prokaryotic AC

The other five classes of prokaryotic AC (Danchin, 1993; Cotta et al., 1998; Sismeiro et al., 1998; Tellez-Sosa et al., 2002) share little similarity apart from an ability to synthesize cAMP. Class I ACs (‘the enterobacterial class’) are highly conserved in the enterobacteria, which include human pathogens such as E. coli, Salmonella typhimurium and Yersinia pestis. They are also present in some related Gram-negative bacteria (e.g. Vibrio cholerae and Aeromonas hydrophila). Class I ACs are characterized by two functionally distinct regions: an N-terminal catalytic domain and a C-terminal regulatory domain, each with diagnostic signature sequences. The important residue in the regulatory domain is a histidine, which is thought to be phosphorylated, leading to activation (Danchin, 1993). In the well-characterized glucose-sensitive CyaA of E. coli, low levels of glucose lead to increased synthesis of cAMP, which binds to and activates the catabolite gene activator protein (CAP). The CAP–cAMP complex then binds to specific DNA sequences and promotes transcription of numerous genes, including those of the lac operon.

In Bordetella pertussis (a Gram-negative bacterium that causes whooping cough) and Bacillus anthracis (a Gram-positive bacterium that causes anthrax), a second class (‘the toxic class’) of AC forms components of the exotoxins that cause disease pathology. In B. pertussis infections, this AC is important in the early colonization of the respiratory tract and induces apoptosis of infected macrophages. It is a bifunctional protein that has an N-terminal cyclase domain and a large C-terminal haemolysin region. This haemolytic activity stems from an ability to form cation-selective membrane channels that facilitate translocation of the cyclase catalytic domain into the host cell cytoplasm. Here, it is activated by calmodulin and synthesizes cAMP (for a review, see Ladant and Ullmann, 1999). The related toxin of B. anthracis comprises three plasmid-encoded components: a calmodulin-activated AC, also known as the oedema factor (EF), reflecting the tissue swelling that it causes; the lethal factor (LF), which is a metalloprotease that inactivates a family of mitogen-activated protein kinase kinases (MAPKK); and the protective antigen (PA), which forms a pore allowing access of the other two components into the host cell. EF can be activated by host-derived calmodulin, and the mechanism involved has been elucidated by structural studies (Drum et al., 2002). Calmodulin is essentially ‘hijacked’ by EF, thereby disrupting normal signalling pathways (Stubbs, 2002). The activated AC produces abnormally high levels of intracellular cAMP, with deleterious consequences. The catalytic site of EF is structurally distinct from that of class III ACs, although there are some similarities in residues involved in substrate binding and catalysis, suggesting convergent evolution (Drum et al., 2002).

Class II ACs were originally designated the ‘calmodulin-activated toxic class’ (Danchin, 1993). However, it is now clear that some members of this group are not activated by calmodulin, an example being an AC in Pseudomonas aeruginosa that is translocated by the type III secretion system into the host cell cytosol. This virulence determinant is activated by an alternative host cell factor (Yahr et al., 1998). P. aeruginosa also contains at least one class III AC, which is a receptor-like enzyme that is involved in activation of the type III secretion system (Wolfgang et al., 2003).

Two AC genes have been identified in the Gram-negative bacterium Aeromonas hydrophila. One (cyaA) is very similar to the class I enterobacterial ACs. The second (cyaB) is not similar to any proteins of known function and has been assigned to a fourth class of AC (Sismeiro et al., 1998). This enzyme has unusual biochemical properties with optimal activity at 65°C and pH 9.5. Disruption of both genes demonstrated a role for cAMP in motility, but deletion of the class IV cyaB alone did not give a detectable phenotype or reduce cAMP production. However, introduction of the cyaB gene into cyaA or cyaA/cyaB null mutants resulted in restoration of cAMP production.

Another structurally distinct type of AC (class V) has been isolated from the anaerobic bacterium Prevotella ruminicola by screening genomic clones in E. coli AC-deficient mutants (Cotta et al., 1998). The sequence has no similarity to other AC genes. Earlier work had shown that, of the anaerobic bacteria tested, P. ruminicola was the only species with detectable levels of cAMP. Finally, an additional form of AC (class VI) has recently been identified in Rhizobium etli. Cyclic nucleotides had previously been implicated in metabolic functions in the rhizobia, and numerous class III AC genes have been identified (including a remarkable 26 in Sinorhizobium meliloti). This novel class of AC was identified by functional complementation, but no phenotype was observed on disruption of the gene (Tellez-Sosa et al., 2002). The deduced protein sequence is different from any other AC, but similar hypothetical genes are present in the genomes of several related bacteria. A recent database search has also shown a possible homologue in the genome of the thermophilic photosynthetic bacterium Chloroflexus (30% identity over 350 residues).

Protozoan class III cyclases

Representatives of each of the major topological types of class III cyclase (Fig. 2) have been identified in the protozoans. These enzymes share many structural and biochemical properties that have been conserved throughout evolution. However, they also display many unique features that reflect the extent of phylogenetic separation of these organisms and the diverse functions that have been assumed by the cyclic nucleotide signal transduction pathways. One of the best characterized protozoan regulatory systems is in the social amoeba Dictyostelium. Here, five distinct cyclase enzymes have been identified, and the corresponding signalling pathways have been shown to play a major role in a variety of functions ranging from differentiation to chemotaxis (for reviews, see Roelofs and Van Haastert, 2002; Saran et al., 2002).

One of these enzymes (ACA, aggregation-specific AC) is a G protein-dependent AC that is structurally analogous to its counterparts in higher eukaryotes with two sets of six TM domains, each followed by a catalytic domain (Fig. 2A). It is activated by the binding of extracellular cAMP to one of a family of G protein-coupled receptors. The use of cAMP as both a first and a second messenger appears to be unique to Dictyostelium. ACA is involved in the characteristic chemotactic aggregation of amoebae to form a multicellular slug under starvation conditions (Pitt et al., 1992). A second Dictyostelium isoform (ACG, germination-specific AC) has a single catalytic domain, two predicted TM helices and an extracellular receptor domain. Homodimers of ACG catalytic domains give rise to two identical active sites in a manner that resembles the mammalian receptor GCs and the receptor-type ACs of trypanosomes (Fig. 2E and F). ACG is involved in spore germination and is activated by high osmolarity; whether the CHASE domain is involved in this is not known (Fig. 3B) (van Es et al., 1996). The third AC (ACB or ACR) is a soluble enzyme with a single C-terminal catalytic region. The N-terminal regulatory sequences resemble those of certain cyanobacterial ACs and have similarity to histidine kinase and response regulator domains (Fig. 3A). This isoform is expressed mainly in the multicellular stage (Meima and Schaap, 1999) and is involved in spore maturation (Soderbom et al., 1999).

In Dictyostelium, evidence suggests that cGMP is involved in chemotaxis. Two distinct GCs have been identified. Both are activated in vivo by extracellular cAMP, acting indirectly through a G protein-coupled receptor, and by the chemoattractant folic acid via an unknown receptor (Roelofs et al., 2001a). The catalytic domain of one of these GCs (sGC) has a high level of similarity to those of cyanobacterial ACs and is most closely related to a small group of soluble ACs (with a pair of catalytic domains) that includes the mammalian bicarbonate-responsive enzyme (Figs 2C and 4). The amino acids in the two purine-binding motifs of the C2 catalytic domain of the Dictyostelium sGC are atypical (Fig. 1). A CLUSTAL analysis of all known examples of this form of ‘soluble’ cyclase identified the likely substrate-binding residues: a glutamine, rather than the glutamate that is normally diagnostic of GCs; and an alanine instead of the cysteine residue that is invariant in higher eukaryotic GCs. An alternative alignment has been reported, suggesting that an aspartate replaces the diagnostic glutamate residue (Roelofs et al., 2001a). Both these arrangements are consistent with GC activity, and gene disruption experiments have demonstrated that this enzyme is a functional GC (Roelofs et al., 2001a). sGC also possesses an extensive C-terminal region, which resembles complex bacterial kinases that have a modular structure and includes an ATPase-like domain (Fig. 3A).

Unusually, the second Dictyostelium GC (GCA) has a topology similar to that of G protein-dependent ACs, a configuration also found in the GCs of the alveolates such as the malaria parasite Plasmodium falciparum and the ciliates Paramecium and Tetrahymena (Fig. 2B) (see below). There are two of these GCs in P. falciparum (PfGCα and PfGCβ). They are closely related and probably arose by gene duplication. A unique feature of these enzymes is that the motifs required for catalysis and substrate binding present in the C2 domain of G protein-dependent ACs are present in the C1 domain of the protozoan GCs and vice versa. Heterologous expression of the protozoan cyclases has confirmed that these AC-like molecules have GC activity, as predicted from the sequence of the purine-binding motifs (Linder et al., 1999; Carucci et al., 2000; Roelofs et al., 2001b). A further unusual feature, present in the Plasmodium and ciliate GCs, but not in Dictyostelium GCA, is the presence of an N-terminal P-type ATPase-like domain (Figs 2B and 3A) with an additional 10 predicted TM helices. Certain key residues, required for a fully functional ion-transporting ATPase, are absent from this bifunctional form of GC. Therefore, this domain may act as a receptor or transporter, with a functional linkage to regulation of GC activity.

In Plasmodium, there is evidence that increased levels of cGMP can stimulate exflagellation. This is an important step in gametogenesis and involves the release of highly motile flagellated male gametes within the insect vector. The process is also triggered by a decrease in temperature, an increase in pH and by a mosquito-derived factor, xanthurenic acid. Increased levels of cGMP in gametocyte membranes treated with xanthurenic acid suggest that its gametocyte activation properties may be mediated by the cGMP signalling pathway (Muhia et al., 2001).

The cAMP signalling pathway has also been implicated in sexual stage development (gametocytogenesis) (Kaushal et al., 1980; Read and Mikkelsen, 1991) in P. falciparum, and initial studies on the ACs involved have revealed some interesting properties. There are two ACs in P. falciparum, but they are not closely related (Fig. 4). The first, PfACα, has a membrane-associated N-terminal domain, perhaps of regulatory function, that has up to six TM helices (Fig. 3B). Intriguingly, several mRNA splice variants derived from PfACα are detectable in gametocytes and have the potential to encode proteins with different numbers of TM helices (Muhia et al., 2003). The N-terminal domain of the protein has high levels of similarity to voltage-gated potassium channels (Weber et al., 2004), having the characteristic six membrane-spanning helices (S1–S6). The analogous S4 region of the Aeropyrum pernix voltage-dependent K+ channel isoform (KvAP), which has been analysed by crystallography, is a voltage sensor containing a series of positively charged residues at every third position (Jiang et al., 2003). This feature is conserved in PfACα, as is a putative pore region that has a signature sequence of K+ selective ion channels. However, the location of this pore region (after the S6 domain) is distinct from that of the other K+ channels, where it occurs between S5 and S6. This bifunctional structure implies that electrical fluctuations resulting from changes in environmental ion concentrations could be coupled to cAMP synthesis in these organisms.
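The S4-like pattern described above, positively charged residues recurring at every third position, lends itself to a simple sequence scan. The following is a minimal sketch only: the function name, the threshold of four repeats and the toy peptide are illustrative assumptions, not the method used in the cited studies and not a real PfACα or KvAP sequence.

```python
def find_charge_repeats(seq, min_repeats=4, step=3, positive="KR"):
    """Return (start_index, count) for each maximal run of positively
    charged residues (Lys/Arg by default) spaced `step` positions apart.

    `min_repeats` is an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    hits = []
    for i, aa in enumerate(seq):
        if aa not in positive:
            continue
        # Skip residues that merely continue a run that started earlier.
        if i >= step and seq[i - step] in positive:
            continue
        count = 1
        j = i + step
        while j < len(seq) and seq[j] in positive:
            count += 1
            j += step
        if count >= min_repeats:
            hits.append((i, count))
    return hits


# Hypothetical toy peptide with R/K at 0-based positions 4, 7, 10 and 13.
toy_segment = "LAVLRVIRLVRVFKLAG"
print(find_charge_repeats(toy_segment))
```

A scan of this kind only flags candidate regions; assigning a voltage-sensing role would still depend on alignment with characterized channels and on experimental data.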

Sequences analogous to PfACα are the only ACs so far identifiable in the ciliates, organisms in which cyclic nucleotides play an important role in motility. In Paramecium, increased levels of cAMP in the cilia affect the direction of ciliary beating and result in forward swimming, whereas increased levels of cGMP lead to ciliary reversal and backward swimming (for a review, see Schultz and Klumpp, 1993). The synthesis of cyclic nucleotides is intimately associated with ion currents, and hyperpolarization of the cilia membranes (and an outward K+ current) leads to increased AC activity. Furthermore, the purified ciliate enzyme has both AC and voltage-independent K+ channel activity (Schultz et al., 1992). The predicted bifunctional nature of the ciliate ACs is in agreement with these findings. In contrast, a depolarizing Ca2+ inward current is linked to increased cGMP formation (Schultz and Klumpp, 1994).

The second Plasmodium AC (PfACβ) is most closely related to the small subclass of soluble ACs (Figs 2C and 4). This enzyme has a pair of cyclase catalytic domains, although it appears to lack the C-terminal ATPase-like domain predicted in the soluble ACs of Chloroflexus, the spirochaete Leptospira, mammals and the sGC of Dictyostelium (Fig. 3A and B). However, experiments with biochemically active recombinant protein are required to confirm the substrate specificity of PfACβ. Orthologues of the P. falciparum AC genes are present in the genomes of several apicomplexans including other Plasmodium species, Toxoplasma, Eimeria, Theileria and Cryptosporidium.

A major goal of research in the cyclase field is to relate the structure of this diverse family of molecules to function, particularly the mechanisms of regulatory control. To date, structural analysis has been confined to the catalytic regions of the G protein-dependent ACs from mammals, the receptor-type ACs of trypanosomes and the B. anthracis EF (Tesmer et al., 1997; Zhang et al., 1997; Bieger and Essen, 2001; Drum et al., 2002). The structures of the catalytic domains of two ACs from the African trypanosome Trypanosoma brucei have been solved in a monomeric, catalytically inactive form (Bieger and Essen, 2001). Normally, these enzymes are active as homodimers (Fig. 2F). These studies found considerable similarity between the catalytic domains of the trypanosome and mammalian enzymes. However, there is a unique sequence insertion (the Δ-subdomain) near to the active site of trypanosome ACs that is thought to be involved in allosteric control. The non-physiological reducing agent dithiothreitol (DTT) can bind to this site although, in vivo, the site may be specific for other small regulatory molecules. All trypanosomatid ACs have this hydrophilic insertion (Taylor et al., 1999). Trypanosome ACs, which are encoded by multicopy genes (several hundred in the case of T. brucei), lack the residues required for binding to heterotrimeric G proteins. This, coupled with the overall structure of the proteins (Fig. 2F) and the absence of the relevant gene sequences, strongly suggests that these ACs are G protein independent.

Cyclic AMP has been implicated in the growth and differentiation of both T. brucei and the American trypanosome Trypanosoma cruzi. For example, two distinct peaks of AC activity have been detected in T. brucei during differentiation from the bloodstream form to the procyclic stage of the parasite life cycle (Rolin et al., 1993), a process that normally occurs in the insect vector. There is also strong evidence that cAMP might be involved in the development of long slender bloodstream forms into the stumpy forms that are preadapted for transmission to the tsetse fly (Vassella et al., 1997). In T. cruzi, there are data to suggest a role for cAMP in the transformation of the insect-form epimastigotes into the infectious metacyclic trypomastigotes (Fraidenraich et al., 1993). However, some of the early experiments exploring the role of AC in the T. cruzi life cycle were designed on the assumption that the parasite enzyme was analogous to G protein-dependent ACs and would respond similarly to activators and inhibitors. These data may therefore be open to reinterpretation. None of the trypanosomatid cyclase genes examined so far encodes enzymes with motifs characteristic of GCs.

In another flagellated protozoan, Euglena gracilis, there are two cytosolic light-responsive ACs that influence locomotion (Fig. 3B). These photosynthetic organisms rapidly reverse swimming direction in response to blue light, and the paraflagellar body, an organelle situated at the base of the flagellum, is thought to be responsible. Two paraflagellar flavoproteins, related to a family that functions as blue light receptors in other species, have remarkable properties (Iseki et al., 2002). Not only do they each possess a pair of putative flavin-binding domains, but they both contain a region with high levels of similarity to some bacterial ACs. The purified proteins display high levels of AC activity after stimulation with blue light. The biological role of these molecules was further confirmed following downregulation of both genes by RNA interference. This resulted in the disappearance of both the paraflagellar body and the photophobic response. An association between increased cAMP levels and flagellar motility has also been observed in metazoan sperm and Chlamydomonas (Iseki et al., 2002).

Fungal and algal class III cyclases

cAMP is involved in the differentiation of many fungal species (for a review, see D’Souza and Heitman, 2001). The fungal ACs have a structure that is highly conserved between species (Fig. 3B), including a single C-terminal catalytic domain, a serine/threonine protein phosphatase-like domain, leucine zipper motifs and an N-terminal Ras-associating domain. Some of these enzymes are known to be activated by heterotrimeric G proteins, even though they lack the characteristic TM domains of mammalian ACs.

The Saccharomyces cerevisiae cAMP signalling pathway is involved in several important processes including nutrient sensing, growth, metabolism, stress resistance and morphogenic switching. The single AC gene (cyr1) is essential, and deletion causes arrest in the G1 phase of the cell cycle (Matsumoto et al., 1982). Increases in cAMP are stimulated in response to glucose and intracellular acidification. Activation by glucose occurs via a G protein-coupled receptor and a Gα subunit (Gpa2) (Kubler et al., 1997; Lorenz and Heitman, 1997). This pathway may also be activated by nitrogen starvation (Lorenz and Heitman, 1997; Xue et al., 1998). Unusually, the S. cerevisiae AC is regulated by Ras proteins (Ras 1 and Ras 2; Toda et al., 1985) (Fig. 3B), which sense environmental nutrient levels and regulate cell cycle progression. Ras 2 also interacts with the mitogen-activated protein kinase (MAPK) system, and a close association with the cAMP pathway seems to be a general feature of this system in fungi. In the fission yeast Schizosaccharomyces pombe, cAMP mediates the effects of glucose on gluconeogenesis and spore germination, although the pathway is not essential. Mutants lacking the cyr1 gene display aberrant mating behaviour, which is normally repressed by nutrient levels. The enzyme is activated by the Gα protein (Gpa2), but not by Ras proteins.

The single AC of the filamentous fungus Neurospora crassa (CR-1) has a similar predicted structure to that of the S. cerevisiae AC (Fig. 3B). Deletion of cr-1 prevents cAMP synthesis, and mutants exhibit a number of developmental irregularities, such as short aerial hyphae and increased thermotolerance (Ivey et al., 2002). In N. crassa, AC can be positively regulated by the Gα subunit, and this activity is involved in a range of functions including female fertility, stress responses and apical extension (Ivey et al., 2002).

cAMP signalling is also important in pathogenic fungi because of its role in pathogenesis and virulence. In Cryptococcus neoformans, this pathway is involved in sensing nutrient levels that control mating and in the production of virulence factors. Mutants that lack the single AC gene are viable and are still able to undergo budding growth. However, they are sterile, do not produce two inducible virulence factors and are avirulent in animal models (Alspaugh et al., 2002). The human pathogen Candida albicans transforms from a budding form to a hyphal form, a process that has been implicated in pathogenesis. Deletion of the AC gene reduces growth rate and results in an inability to transform to the hyphal form. Mutant cells are also avirulent in an animal model (Rocha et al., 2001). As in S. cerevisiae, this AC is thought to be regulated by a Ras homologue.

In fungal plant pathogens, cAMP signalling has received considerable attention (for a review, see Lee et al., 2003). Well-characterized mutants are available for all the key players in this signal transduction pathway in the corn smut fungus Ustilago maydis. In this organism, cAMP has a role in the morphological transitions that accompany mating and pathogenesis. The U. maydis AC has a similar structure to the S. cerevisiae enzyme (Fig. 3B) and is also stimulated by a Gα subunit. In Magnaporthe grisea, which infects a number of important plant crops (including rice, barley and millet), cAMP mediates the formation of the appressorium, which is required for host attachment and penetration. AC mutants lose their ability to develop appressoria and are unable to penetrate rice leaves. They also show reduced vegetative growth and are sterile (Choi and Dean, 1997).

Fungi lack obvious GC genes. In contrast, examination of a draft genome sequence of the unicellular alga Chlamydomonas reinhardtii has revealed at least 20 predicted GC paralogues that are surprisingly similar to those of higher eukaryotes. However, the C. reinhardtii GCs are structurally unrelated to any of the protozoan isoforms (Fig. 2B). The catalytic domains of all the algal enzymes are very similar, but the enzymes fall into at least two topological groups. Most are predicted to be soluble GCs, but at least two contain four TM domains. Furthermore, these membrane-associated isoforms also have GAF domains (Fig. 3A), which have not so far been detected in GCs of any other organism. The large number of potential GC isoforms suggests a complex and important regulatory role for the cGMP signalling pathway, although as yet there is little published information on functional aspects. The only AC that is identifiable in the C. reinhardtii genome is most related to the ciliate and apicomplexan enzymes that have six predicted TM domains (Fig. 3B). It is known that a flagellar AC activity is involved in activation and fertilization of gametes of opposite mating types in C. reinhardtii (Pasquale and Goodenough, 1987). Interactions between sex-specific adhesive molecules on the flagella of gametes are coupled to activation of a G protein-independent AC by flagellar protein kinases (Zhang et al., 1991; Zhang and Snell, 1993). However, linkage of this activity to the putative C. reinhardtii AC gene (Fig. 1) awaits experimental verification.

Evolution of class III cyclases

Microbial cyclases have a complex evolutionary history that cannot easily be inferred from taxonomic relationships. An understanding of the process must take into account instances of gene duplication and fusion, loss of isoform lineages, possible lateral gene transfer events, changes in substrate specificity and the effects of functional diversification. Whereas the six classes of prokaryotic cyclase (I–VI) do not have a common origin, all eukaryotic cyclases (both AC and GC) appear to have evolved from the class III bacterial ACs. Figure 4 shows a protein cluster analysis of a wide range of class III cyclases, based on catalytic domain sequences. In the case of heterodimers, we used the domain containing the asparagine/arginine catalytic pair that is ubiquitous to class III cyclases. Analysis of both chains of a heterodimer has been described elsewhere (Roelofs and Van Haastert, 2002). For illustrative purposes, we have restricted our analysis to a relatively small number of sequences. Four distinct clades can be identified: the ‘prokaryotic-type’ ACs, the ‘soluble’ ACs, the ‘fungal-type’ ACs and the ‘major eukaryotic cyclase lineage’. These monophyletic subgroups could have diverged from a common ancestor soon after the appearance of the first eukaryote, or they may have a deeper origin, with each being derived from distinct prokaryotic class III enzymes that were possessed by the eukaryotic progenitor.

The active sites of the ‘soluble’ ACs are formed from heterodimeric catalytic domains (Fig. 2C) that themselves display considerable evolutionary divergence (Roelofs and Van Haastert, 2002). These sequences have been identified in a small, but diverse collection of species: mammals, apicomplexans, Anopheles mosquitoes and the bacterium Chloroflexus (Muhia et al., 2003), although a related form is also present in the spirochaete Leptospira. The soluble ACs are also distinctive in that one of the diagnostic purine-binding residues is typically a threonine or serine residue rather than an aspartate (Fig. 1). The absence of genes encoding this subclass of cyclase from most organisms probably resulted from selective loss in situations where the enzyme did not perform an essential function (Roelofs and Van Haastert, 2002), although acquisition of the gene by lateral transfer cannot be excluded. The soluble GC of Dictyostelium (sGC) is a member of this subgroup and appears to have evolved independently of other eukaryotic GCs.

Cyclases belonging to the ‘prokaryotic-type’ subgroup are found mainly in bacteria and, as with the soluble ACs, are characterized by the presence of a serine or threonine residue in the purine-binding pocket, instead of an aspartate. Several eukaryotic microorganisms also possess members of this subgroup [e.g. Giardia (EAA39187), Chlamydomonas (genie 445.7), Plasmodium PfACα and Dictyostelium (ACB)], although these species belong to taxa that are not closely related. The Cya2 of the cyanobacterium Synechocystis also clusters with the prokaryotic type and is the only member predicted to have GC activity.

The ‘fungal’ and trypanosomatid ACs form a third subgroup (Fig. 4). These enzymes function as homodimers and have an aspartate residue in the purine-binding pocket (Fig. 1). Given that divergence of the phylogenetic groups that encompass fungi and trypanosomatids was an early event in eukaryotic evolution (Baldauf, 2003), the clustering of these ACs is perhaps unexpected. As with the soluble and prokaryotic-type ACs, a possible explanation is that these ACs are derived from a progenitor AC that has since been lost by most eukaryotic lineages. Interestingly, the trypanosomatids and fungi do not contain GCs, and members of this AC subgroup are the only type of cyclase that has been identified in these organisms. The fungal and trypanosomatid ACs do, however, differ in features including topology (the former are soluble and the latter membrane localized) and G protein or Ras dependence (see above).

The fourth class III subgroup is widespread in eukaryotes and has diverged further in terms of structure and function. This group encompasses mammalian soluble and membrane-bound GCs, as well as G protein-dependent ACs. The active site of this latter group is constituted by interaction of the C1 and C2 domains that are widely held to have originated by gene duplication. Subsequent fusion of the genes is thought to have given rise to these membrane-associated mammalian ACs (Fig. 2A). Gene fusion also offers an explanation for the origin of the alveolate membrane-localized GCs (Fig. 2B). This event appears to be distinct from the fusion that produced the mammalian ACs. This can be inferred from the way that the catalytic domains are reversed in the protozoan enzymes. We have also observed that the catalytic domains of Dictyostelium membrane-localized GC (GCA) do not cluster with those of the alveolate GCs, suggesting that this enzyme arose from another independent gene fusion event. The single catalytic domain of a small number of bacterial ACs (e.g. Mycobacterium Rv1625c and the cyanobacteria Nostoc and Trichodesmium) also has significant similarity to this subgroup of cyclases, including the purine-binding aspartate. This high level of similarity may identify a group of bacterial cyclases that is derived from the same progenitor as the major eukaryotic cyclase lineage.

The catalytic domains of ACs and GCs are closely related, and substrate specificity can be switched by mutating residues in just two positions (Sunahara et al., 1998; Tucker et al., 1998) (Fig. 4). However, during evolution, the conversion of AC to GC has arisen relatively infrequently, presumably because of constraints imposed by functional coupling to regulatory mechanisms (Roelofs and Van Haastert, 2002). There is evidence that GCs evolved independently from ACs on three occasions: first, to produce the major eukaryotic GC lineage; second, from a ‘soluble AC’ to produce Dictyostelium sGC; and, finally, in cyanobacteria to give rise to the cya2 gene of Synechocystis. The high level of sequence similarity between the cytosolic and membrane-localized receptor-type GCs of higher eukaryotes (Fig. 2D and E) suggests that they share a recent GC ancestor. However, the topological similarity between these receptor GCs and the ACs of trypanosomes (Fig. 2E and F) is not reflected in the sequences of the catalytic domains, which are not closely related.
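The two-position logic discussed in this review can be summarized as a small lookup. The sketch below is a deliberately simplified heuristic built only from the residue patterns mentioned in the text (glutamate plus cysteine as diagnostic of GCs; aspartate, or a serine/threonine substitution, in the purine-binding pocket of ACs). The function name and labels are hypothetical, and a sequence-based call of this kind is no substitute for biochemical confirmation of substrate specificity.

```python
def classify_pocket(first, second):
    """Rough label for a class III cyclase from its two purine-binding
    residues (one-letter codes). Illustrative heuristic only.

    first  -- residue at the position occupied by glutamate in typical GCs
    second -- residue at the position occupied by cysteine in typical GCs
              (aspartate, serine or threonine in the ACs discussed here)
    """
    first, second = first.upper(), second.upper()
    if first == "E" and second == "C":
        return "GC-like"
    if second == "D":
        return "AC-like"
    if second in ("S", "T"):
        return "AC-like (Ser/Thr variant)"
    # e.g. the glutamine/alanine pair reported for Dictyostelium sGC
    return "atypical"


print(classify_pocket("E", "C"))
print(classify_pocket("Q", "A"))
```

Returning "atypical" rather than forcing a call reflects cases such as Dictyostelium sGC, where GC activity was established by gene disruption rather than predicted from the motif.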

An alternative classification system for class III ACs has been proposed recently (Linder and Schultz, 2003). This system contains similarities to that presented here (Fig. 4) (grouping of the ‘fungal’ and ‘major eukaryotic’ subclasses), but also some differences. Most notable is the grouping of the ‘soluble’ ACs, which contain the aspartate to serine/threonine substitution, with bacterial ACs that also have this substitution. Our analysis suggests that the soluble cyclases are phylogenetically distinct.

Concluding remarks

Microbial purine nucleotide cyclases display enormous diversity in terms of structure, function and regulatory specificity. This reflects a complex evolution in which the cyclases and their signal transduction pathways have been recruited to perform a myriad of different roles. It is clear from the output of the various genome projects that we are only now beginning to appreciate fully the widespread nature and complexity of these signalling systems. This is particularly the case with pathogenic microorganisms, in which cyclic nucleotide signalling pathways have been shown to impinge on virulence and pathology, either directly through their role within the pathogen or indirectly through interference with cyclic nucleotide levels in host cells. The major structural and biochemical differences that have been identified between the purine nucleotide cyclases of microbial pathogens and their hosts could offer considerable scope for drug development.

Acknowledgements

We are grateful to Spencer Polley for advice on phylogenetics and help with Fig. 4, and also to Martin Taylor, Pauline Schaap and Brendan Wren for constructive criticism of the manuscript. We acknowledge the support of the Wellcome Trust (University Award to D.B., Ref. 058038; Prize Fellowship to David Muhia, Ref. 062531 and Project grant to D.B. and J.K., Ref. 047241). Finally, we thank the sequencing centres (Sanger Institute, TIGR and Stanford University) who completed the Plasmodium falciparum genome project and have generated data on other apicomplexan genomes, and also others involved in the Chlamydomonas reinhardtii genome project. (These sequence data were produced by the US Department of Energy Joint Genome Institute, http://www.jgi.doe.gov/, and are provided for use in this publication only.)
