Correspondence: Martin Welker, Technische Universität Berlin, Institut für Chemie, AG Biochemie, Franklinstrasse 29, 10587 Berlin, Germany. e-mail: email@example.com
Cyanobacterial secondary metabolites have attracted increasing scientific interest owing to the bioactivity of many compounds in various test systems. Among the known structures, oligopeptides are often found with many congeners that share conserved substructures while being highly variable in others. A major part of the known oligopeptides are of non-ribosomal origin and can be grouped into classes with conserved structural properties. Thus, the overall structural diversity of cyanobacterial oligopeptides only seemingly suggests an equally high diversity of biosynthetic pathways and respective genes. For each class of peptides, some of which have been found in all major branches of the cyanobacterial evolutionary tree, homologous synthetases and genes can be inferred. This implies that non-ribosomal peptide synthetase genes are a very ancient part of the cyanobacterial genome and presumably have evolved by recombination and duplication events to reach the present structural diversity of cyanobacterial oligopeptides. In addition, peptide synthetases would appear to be an essential part of cyanobacterial evolution and physiology. This review presents an overview of the biosynthesis of cyanobacterial peptides and the corresponding gene clusters, the diversity of structural types and the structural variations within peptide classes, and the implications for the evolution and plasticity of biosynthetic genes and the potential function of cyanobacterial peptides.
In the last two decades, a large number of cyanobacterial metabolites have been isolated and characterized from cultured strains and field samples. So far, more than 600 peptides or peptidic metabolites have been described from various taxa. The continuing and rising interest stems both from the surveillance of aquatic systems, especially where toxic compounds in mass developments (so-called blooms) are a rising public health concern, and from the diverse bioactivities of unique structures with potential pharmacological implications.
Cyanobacterial secondary metabolites represent a vast diversity of structures (Moore, 1996; Burja et al., 2001; Staunton & Weissman, 2001; Harrigan & Goetz, 2002) isolated from a variety of taxa and geographic origins. The occurrence and structures of secondary metabolites among the subsections have been evaluated recently by applying multivariate statistical analyses (Guyot et al., 2004). More than 80 structural archetypes of compounds have been defined, occurring in more than 30 genera of all five subsections (Boone & Castenholz, 2001) (corresponding to orders in other taxonomic schemes). To date, most compounds have been isolated from Oscillatoriales and Nostocales, followed by Chroococcales and Stigonematales, whereas very few metabolites are yet known from Pleurocapsales. However, this distribution reflects the availability of strains and exploitable biomass from natural habitats rather than the actual potential of genera in the respective subsections to synthesize secondary metabolites. For example, Lyngbya sp. (Oscillatoriales) and Microcystis sp. (Chroococcales) are easily collected or cultured in amounts that allow the isolation of compounds in the ppm range, whereas for Pleurocapsa this would be much more laborious and time consuming.
A major part of cyanobacterial secondary metabolites are peptides or possess peptidic substructures. The majority of these oligopeptides are assumed to be synthesized by NRPS (non-ribosomal peptide synthetase) or NRPS/PKS (polyketide synthase) hybrid pathways on the basis of particular structures that are not achievable by ribosomal peptide synthesis. Recently, however, the biosynthetic pathway for a group of cyclic peptides with thiazole moieties, the patellamides A–C, has been shown to start with a ribosomally synthesized peptide that is modified post-translationally (Schmidt et al., 2005). A similar biosynthetic pathway might produce other types of cyanobacterial peptides, as will be discussed below.
The present review presents a short introduction to NRPS principles, followed by an overview of currently known genes and gene clusters for peptide biosynthesis in cyanobacteria. In the second part, an overview is given of the structural diversity of cyanobacterial peptides that will be grouped in biologically meaningful classes. Thirdly, we review the data on peptide distribution in cyanobacterial taxa and in diverse habitats and discuss the hypotheses on the function of these metabolites.
Non-ribosomal peptide synthetases – a short introduction
Non-ribosomal peptide synthetase genes generally encode multi-module proteins, but genes encoding single modules or domains can be found as well. For a single minimal module, some 3–3.5 kbp (kilobase pairs) of genetic sequence is required, thus making some multi-module NRPS genes the largest known genes (Finking & Marahiel, 2004).
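The size estimate above lends itself to a back-of-the-envelope calculation. The following Python sketch assumes only the range of 3–3.5 kbp per minimal module quoted in the text; the function names and the module counts in the example are illustrative and do not refer to any particular gene.

```python
# Rough size estimate for an NRPS gene from its number of minimal modules.
# Only the 3-3.5 kbp per-module range comes from the text; the function
# names and the example module counts are illustrative assumptions.

KBP_PER_MODULE_MIN = 3.0
KBP_PER_MODULE_MAX = 3.5


def estimate_gene_size_kbp(n_modules):
    """Return (low, high) estimated coding length in kbp for n_modules."""
    if n_modules < 1:
        raise ValueError("n_modules must be at least 1")
    return (n_modules * KBP_PER_MODULE_MIN, n_modules * KBP_PER_MODULE_MAX)


def kbp_to_residues(kbp):
    """Convert a coding length in kbp to amino acid residues (3 bp per codon)."""
    return int(kbp * 1000 // 3)


if __name__ == "__main__":
    for n in (1, 4, 7):  # hypothetical module counts
        low, high = estimate_gene_size_kbp(n)
        print(f"{n} module(s): {low:.1f}-{high:.1f} kbp, "
              f"about {kbp_to_residues(low)}-{kbp_to_residues(high)} residues")
```

The estimate ignores integrated modifying domains and linker regions, so it should be read as a lower-bound guide rather than a prediction.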
The recognition of domains in silico is generally straightforward: a BLAST search is followed by further characterization through the assignment of conserved, domain-specific core motifs in the protein sequences (Konz & Marahiel, 1999).
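The motif-assignment step can be illustrated with a simple pattern scan. The Python sketch below is a minimal example under stated assumptions: the two patterns and the test sequence are hypothetical placeholders written in the style of short consensus motifs, not the published core motifs, which should be taken from the literature for any real analysis.

```python
import re

# Hypothetical core-motif patterns, for illustration only. They are written
# in the style of short consensus motifs but are NOT the published ones.
TOY_MOTIFS = {
    "motif_1": r"TGD[LI]",
    "motif_2": r"NGK[VL]DR",
}


def find_core_motifs(protein_seq, motifs=TOY_MOTIFS):
    """Return {motif name: [0-based start positions]} for each pattern.

    re.finditer reports non-overlapping matches, scanned left to right.
    """
    seq = protein_seq.upper()
    hits = {}
    for name, pattern in motifs.items():
        hits[name] = [m.start() for m in re.finditer(pattern, seq)]
    return hits


if __name__ == "__main__":
    toy_seq = "MAAYRTGDLVRWLPNGKLDRQALPA"  # made-up sequence
    print(find_core_motifs(toy_seq))
```

In practice, the order and spacing of the motif hits along the sequence, rather than any single hit, is what supports the assignment of a domain type.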
Main functional domains
Adenylation domains catalyze the specific activation of carboxyl groups of amino acids, imino acids or hydroxy acids, as well as various carboxylic acids. Adenylation domains provide the primary specification step for the amino acid sequence of the completed peptide. This is achieved by the geometry of a binding pocket in the enzyme that only allows a specific amino acid to enter into the catalytic site. An analysis of the phenylalanine binding pocket of the activating domain of the first module of gramicidin S synthetase has led to an amino acid contact residue code permitting the prediction of substrates in NRPS adenylation domains (Stachelhaus et al., 1999; Challis et al., 2000; Lautru & Challis, 2004). This specificity-conferring code has been confirmed in a variety of correlations of NRPS genes with known peptide product structures, and may make it possible to predict unknown products (Challis & Ravel, 2000). Although this non-ribosomal code is a good predictor of substrate selection, it is not the only mechanism of control. Thiolation domains, as well as condensation domains, are also involved in the specific formation of a particular amino acid sequence (von Döhren et al., 1999; Lautru & Challis, 2004). Further support for this has been provided by studies on heterologously expressed adenylation domains, for which amino acid-specific adenylation can be tested by an ATP-PPi-exchange assay (Dieckmann et al., 1995). One example is BarD, which incorporates L-leucine but also activates 3-chloro-leucine and valine (Chang et al., 2002). The leucine-specific adenylation domain of McyB of Microcystis aeruginosa also activates isoleucine and valine, but these have never been observed in microcystins (Sielaff, 2004). Likewise, the first adenylation domain of NosA activates Val, Ile and Leu when it is expressed in Escherichia coli, but Leu is not found in nostopeptolide (Hoffmann et al., 2003).
In cyanobacteria, about 200 adenylation domains have been identified in nucleotide sequences so far. They are generally integrated in NRPS systems or represent acyl-CoA synthetases. Upon alignment, 10 core motifs (A1–A10) can be easily identified in most cyanobacterial adenylation domains representing consensus sequences that can also be found in fungal systems (Konz & Marahiel, 1999).
The key role of thiolation domains is the transport of intermediates, which requires specific interaction with the activating adenylation domain and the corresponding condensation domains for aminoacyl and peptidyl elongation cycles. In cases of intermediate modifications, the transport also requires interactions with epimerization domains, methyltransferase domains, oxidation domains, reduction domains, or thioesterase domains in terminating cyclization reactions (Weber et al., 2000).
Aminoacylation or acylation of the ‘swinging arm’ cofactor 4′-phosphopantetheine is considered the covalent transport principle in NRPSs and PKSs.
These domains are generally identified by the conserved 4′-phosphopantetheine attachment site as signature sequence, which is post-translationally modified by protein phosphopantetheinyl transferases (see below).
The condensation domain of about 450 amino acids has been functionally characterized in the gramicidin S/tyrocidine synthetase systems (Stachelhaus et al., 1998). The current functional interpretation proposes, by analogy to the ribosomal system, that an aminoacyl and a peptidyl site (A-site and P-site) receive the activated intermediates (von Döhren et al., 1999). The aminoacylated carrier proteins (thiolation domains) resemble charged tRNAs, and the condensing site, the peptidyl transferase region.
As a prototype of a C-domain, the crystal structure of an isolated C-domain of the vibriobactin biosynthetic system, VibH, has been determined (Keating et al., 2002). The VibH-structure revealed a novel topology, and is a monomer consisting of two subdomains. Alignments confirm the structure to be representative of the NRPS condensation domains, the related epimerization domains, and cyclocondensation domains. The downstream carrier, which transports the initiating acyl residue or the peptidyl intermediate, will bind to the C-terminal face of this domain with the pantetheinyl arm extending into the solvent channel. The upstream carrier with the acceptor compound, usually an aminoacyl residue generally binding in trans to the condensation domain, would approach from the opposing open end of the domain, and both pantetheinyl arms would extend into the solvent channel to facilitate peptide bond formation (Keating et al., 2002).
A survey of about 160 cyanobacterial condensation domains reveals that their core sequences are very similar to those derived from Bacillus domains. Upon alignment with Clustal, domains group into functionally related types, and not by subsection or genus. This has been observed before and agrees with similar analyses of adenylation and thiolation domains (von Döhren et al., 1999). Obvious clusters are the related heterocyclization and epimerization domains and functionally related domains of systems producing homologous peptides.
Heterocyclization domains catalyze peptide bond formation and the cyclization of cysteine, serine or threonine side chains to the respective heterocycles. This cyclodehydration reaction requires either the N-acyl-aminoacyl or the respective peptidyl intermediate. This domain type was first identified in the bacitracin system (Konz et al., 1997) and the reactions have been studied in detail in the vibriobactin, pyochelin and epothilone systems (Patel et al., 2003). Peptides containing heterocycles are fairly common among cyanobacteria, e.g. in various Cys-containing cyclopeptides, barbamide, curacins or cyclamides. The respective domains from the barbamide and curacin systems are known, as well as similar domains in several orphan biosynthetic clusters of Anabaena PCC 7120, Nostoc punctiforme ATCC 73102 and Crocosphaera watsonii. Thiazole formation, however, is not restricted to NRPS pathways, as has been shown for patellamides (Schmidt et al., 2005).
Almost all cyanobacterial systems characterized so far, including PKS systems, contain a terminating thioesterase domain. A comparative multiple sequence alignment analysis (with Clustal) of the currently available domains reveals that microcystin (McyC) and nodularin (NdaB)-linked domains are a special group, presumably due to the unusual cyclization reaction catalyzed between Adda and the last amino acid in the linear peptide sequence (data not shown).
Integrated modifying domains
Epimerization domains and amino acid racemases
Epimerization domains largely resemble condensation domains and can be identified by their slightly different signature sequences (Konz & Marahiel, 1999). Their function is to epimerize aminoacyl and peptidyl intermediates at the thioester stage. This reaction is reversible, and these intermediates are thus in an equilibrium state between both isomers. The following reaction, usually a condensation reaction, is involved in the control of stereospecificity to select the D-isomer (Stachelhaus & Walsh, 2000; Luo et al., 2001).
Not all D-configured amino acids, however, are formed by this reaction. Some adenylation domains specifically accept D-residues, which have to be supplied by corresponding amino acid racemases. This is illustrated in microcystin biosynthesis, where D-Glu is supplied as a direct precursor, whereas Ala is epimerized by the respective module (Tillett et al., 2000; Sielaff et al., 2003).
A formylation domain was first identified by sequence comparison in the anabaenopeptilide biosynthetic cluster (Rouhiainen et al., 2000). The N-terminal region of ApdA shows similarities to methionyl-tRNA formyltransferases, which depend on the co-substrate formyl tetrahydrofolate. The protein region following these approximately 400 amino acids shows similarities to condensation domains and is linked to the first adenylation domain. Other formylated non-ribosomal peptides include linear gramicidin (Kessler et al., 2004).
N-methylated peptide bonds in non-ribosomally formed peptides originate from N-methyl transfer to thiol-attached amino acids by N-methyltransferase domains. This was first demonstrated in fungal systems by sequencing the enniatin synthetase gene. N-methyltransferase domains are integrated in the adenylation domain between the core motifs A8 and A9 (Haese et al., 1993; Patel & Walsh, 2001). This domain with a size of about 450 amino acid residues (55 kDa) shares some sequence similarities with a heterologous family of S-adenosyl-L-methionine (SAM)-dependent methyltransferases, including DNA methyltransferases (Velkov & Lawen, 2003). N-methylation does not seem to be obligatory for the following condensation reaction (Glinski et al., 2001).
A comparative analysis of adenylation domains containing N-methyl-transferase inserts with homologous domains without inserts reveals a high sequence identity, also in the regions adjacent to the insert between core sequences A8 and A9. This implies that N-methylation can be regarded as a function to be gained by domain insertion, or lost as well (Schauwecker et al., 2000).
Oxidation domains of about 200 amino acids with homology to NAD-binding proteins are inserted in adenylation domains between the core motifs A8 and A9. Examples include myxobacterial systems forming epothilone (Polyangium cellulosum, EpoB; Julien et al., 2000), myxothiazol (Stigmatella aurantiaca, MtaC and MtaD; Silakowski et al., 1999) or tubulysin (Angiococcus disciformis, TubB; Sandmann et al., 2004). Homologous sequences can be found in two orphan NRPS genes in Anabaena PCC 7120 and Crocosphaera watsonii.
In epothilone biosynthesis, this domain catalyzes the oxidation of the methylthiazolinyl-intermediate to the methylthiazolylcarboxy-intermediate (Chen et al., 2001). In the barbamide biosynthetic system, no oxidation domain is present in the respective adenylation domain of BarG, although the peptide contains a terminal thiazole. It has been suspected that BarI and BarJ are involved in oxidative decarboxylation and conversion of thiazoline (Chang et al., 2002).
Various non-ribosomal peptides are known to contain a reduced C-terminal carboxyl group, and the respective terminal alcohol functions are also found in polyketide structures. These originate from a two-step reduction via the aldehyde, catalyzed by an NADPH/NADH-dependent catalytic domain, which thus releases the final carrier-bound thioester intermediate (Gaitatzis et al., 2001; Schracke et al., 2005).
Reductase domains of about 400 amino acids in size show significant similarity to several related proteins, such as nucleoside-diphosphate-sugar epimerases, flavonol reductase/cinnamoyl-CoA reductase and other NADPH dependent enzymes. In nostocyclopeptides, the final peptidyl intermediate is reduced to a linear aldehyde cyclizing with the N-terminal tyrosine to form a stable imine bond (Becker et al., 2004).
Phosphopantetheine-protein transferases (PPTs) are well known in bacterial systems, and their genes are often contained within biosynthetic clusters (Lambalot et al., 1996; Walsh et al., 1997). Sfp, a PPT located in the surfactin cluster of Bacillus subtilis (Quadri et al., 1998), is well characterized. The 26-kDa protein modifies a variety of carrier proteins and domains, including acyl carrier domains and aryl carrier domains (Reuter et al., 1999). It is thus possible to charge CoA-thioesters directly onto apo-enzymes to investigate, for example, the specificity of modification and condensation reactions or to generate new products in vitro (Weinreb et al., 1998; Belshaw et al., 1999; Sieber et al., 2003).
A BLAST survey of cyanobacteria based on the Sfp structure shows PPTs in NRPS-containing strains (Anabaena PCC 7120, Anabaena variabilis ATCC 29413, C. watsonii WH 8501, N. punctiforme PCC 73102 and Trichodesmium erythraeum IMS101), but also in NRPS-free strains (Gloeobacter violaceus PCC 7421, Prochlorococcus marinus SS120, Synechococcus elongatus PCC 6301 and Synechocystis sp. PCC 6803).
Methylation of hydroxyl groups (O-methyltransferases) or methylene groups is catalyzed by autonomous methyltransferases, which can be readily identified by a set of sequence motifs involved in S-adenosyl-methionine (SAM) binding. The role of McyJ in O-methylation of Adda in microcystin biosynthesis has been confirmed by gene disruption, which led to the production of des-methyl-Adda-microcystin (Christiansen et al., 2003). O-methyltransferases involved in microcystin formation share more than 80% identity, whereas other methyltransferases like ApdE or an unidentified Prochlorococcus enzyme, both of unknown function, share only about 40% of the amino acid residues around the SAM-binding region, as can be inferred from a BLAST search with McyJ. This group of modifying enzymes is thus fairly diverse with respect to the substrates encountered.
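Identity values such as those quoted above depend on how identity is counted. As a minimal sketch, the following Python function computes percent identity over the ungapped columns of a pairwise alignment; this is only one of several conventions (alignment tools may instead divide by the full alignment length, gaps included), and the function name and example sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. Other definitions of identity use a different
    denominator, so values are comparable only within one convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Made-up aligned fragments, not real methyltransferase sequences.
    print(round(percent_identity("ACDE-GHIKL", "ACDQFGHIKV"), 1))
```

Restricting the comparison to a region, such as the residues around a binding motif, amounts to passing only the corresponding alignment columns to the function.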
These types of enzymes resemble Zn-dependent dehydrogenases and have been found in cyanobacteria in the nostopeptolide cluster and the nostocyclopeptide cluster, where they are involved in methyl-proline formation from leucine together with delta1-pyrroline-5-carboxylic acid reductase (P5C reductases; Luesch et al., 2003). Similar enzyme pairs are found in as yet unidentified clusters of N. punctiforme and A. variabilis.
Chlorine is found in 22% of cyanobacterial metabolites (Guyot et al., 2004), but little information is available about respective halogenases. So far, vanadium haloperoxidases known from bromination of metabolites from marine algae have not been found in cyanobacteria. Enzymes involved in the halogenation of aromatic side chains, especially of Tyr and Trp, contain an NAD binding motif and are known from Pseudomonas, Xanthomonas, Myxococcus and Streptomycetes.
The putative halogenase, ApdC, within the anabaenopeptilide synthetase cluster of Anabaena 90, is presumably involved in the chlorination of a tyrosine residue, but has so far no cyanobacterial homologs (Rouhiainen et al., 2000).
A set of enzymes of the barbamide biosynthetic cluster, BarB1, BarB2 and BarC, is involved in leucine chlorination (Chang et al., 2002). They show similarities to phytanoyl-CoA dioxygenase, belong to a group of putative 2-oxoglutarate iron-dependent halogenases and have also been identified in various strains of Oscillatoria spongeliae from the marine sponge Dysidea (Lamellodysidea) herbacea, known as a source of halogenated peptides (Faulkner et al., 1994).
Non-integrated thioesterases are involved in deacylation of pantetheine thiols, activating NRPS systems primed with acetyl CoA, or reactivating mischarged and thus stalled carrier domains (Yeh, 2004; Yeh et al., 2004; Sieber & Marahiel, 2005). Only McyT in the microcystin cluster of Planktothrix is directly linked to NRPS genes, though it is absent in other microcystin clusters. Six other similar thioesterases found so far in cyanobacterial genomes are not parts of the detected orphan NRPS biosynthetic clusters.
Non-ribosomal peptide synthetase and polyketide synthase genes in cyanobacteria
The high diversity of cyanobacterial secondary metabolites and their chemical structures indicates the presence of diverse NRPS and PKS gene clusters in cyanobacterial genomes, though only a minor part has been sequenced so far. A peculiarity of cyanobacterial secondary metabolite biosynthesis is the frequently observed mixing of NRPS and PKS genes, often within a single open reading frame (see below).
One approach to estimate the potential of secondary metabolite biosynthesis of cyanobacterial taxa is the search for NRPS and PKS genes by degenerate primer PCR. Conducting such a study, Christiansen et al. (2001) confirmed the presence of NRPS genes in 75% of 146 axenic cyanobacterial strains of all subsections. Nonetheless, no homologous genes were detected in a number of genera, mainly in the Chroococcales (Cyanothece, Gloeobacter and Gloeothece and the genetically diverse Synechococcus and Synechocystis strains). A similar analysis has been carried out for stromatolite communities (Burns et al., 2005) where diverse PKS gene fragments were identified. NRPS genes have been identified in a symbiotic Prochloron strain that could not be cultivated. This indicates that some metabolites that have been attributed to the ascidian host may in fact be produced by the symbiont (Schmidt et al., 2004).
Hence, it is reasonable to assume a wide distribution and a high diversity of NRPS and PKS clusters in cyanobacterial genomes. The increasing number of completed genomic sequences is also valuable for the study of natural product biosynthesis.
Genomic sequence data
A wealth of sequence data on cyanobacterial NRPS/PKS genes is available today, although the published sequences may represent only a small part of cyanobacterial genes for secondary metabolite synthesis. Fourteen complete or nearly complete genomes are available to date (Table 1).
Table 1. NRPS genes in cyanobacterial genomes. The numbers of genes containing at least one condensation domain (COG1020 or pfam00668) and the genome size in Mbp are given. In unfinished genomes, numbers are estimates
It is also remarkable that especially species with small genomes like Synechocystis do not contain NRPS/PKS genes, whereas large genomes like Nostoc or Crocosphaera contain numerous clusters. In such genomes, NRPS/PKS genes may constitute more than 5% of the genomic sequence, comparable to prominent actinobacteria (Streptomyces clavuligerus and Streptomyces avermitilis), firmicutes (Bacillus subtilis) or myxobacteria (Myxococcus xanthus, Sorangium cellulosum).
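To give a feeling for what a share of 5% means in absolute terms, the following Python sketch converts between a genome fraction and a sequence length. The 5% threshold is taken from the text; the genome size used in the example is a hypothetical round number, not a value from Table 1, and the function names are illustrative.

```python
def kbp_for_genome_fraction(genome_mbp, fraction=0.05):
    """Sequence length in kbp that corresponds to `fraction` of the genome."""
    if genome_mbp <= 0:
        raise ValueError("genome_mbp must be positive")
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must lie between 0 and 1")
    return genome_mbp * 1000.0 * fraction


def genome_fraction(nrps_pks_kbp, genome_mbp):
    """Fraction of the genome occupied by nrps_pks_kbp of NRPS/PKS sequence."""
    if genome_mbp <= 0:
        raise ValueError("genome_mbp must be positive")
    return nrps_pks_kbp / (genome_mbp * 1000.0)


if __name__ == "__main__":
    hypothetical_genome_mbp = 9.0  # assumed size of a large genome
    kbp = kbp_for_genome_fraction(hypothetical_genome_mbp)
    print(f"5% of a {hypothetical_genome_mbp} Mbp genome is about {kbp:.0f} kbp")
```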
For most NRPS clusters in genomic sequences, however, a peptide product has not been identified. Only for N. punctiforme ATCC 29133 (syn. PCC73102) could a gene cluster be identified as the nostopeptolide cluster (nosA-D) previously described from another Nostoc strain (Hoffmann et al., 2003) by sequence homology and the chemical detection of the peptide (Hunsucker et al., 2004). Both clusters are nearly identical except for an epimerization domain in NosC of Nostoc ATCC 29133 that is lacking in Nostoc GSV224. For all other NRPS clusters, no product has been identified so far and the corresponding gene clusters can be considered orphan clusters.
Considering the high number of individual NRPS metabolite pathways, it is likely that the number of clusters contained in a single cyanobacterial genome has a limit. On the basis of genome sequences, a number of three to five NRPS or NRPS/PKS clusters seems to be exceeded only rarely. This corresponds well with the up to four peptide classes detected in single colonies of Microcystis (Welker et al., 2004a) or strains of Planktothrix (Welker et al., 2004b), whereas at least twice that number of peptide classes is produced by both genera as a whole.
Known cyanobacterial non-ribosomal peptide synthetases
The supply of two direct precursor amino acids of microcystin, D-Glu and N-Me-D-Asp, and the origin of phenylacetate have not been determined. The amino acid racemase McyF included in the cluster is apparently not involved in the process (Sielaff et al., 2003). The function(s) of McyI, a phosphoglycerate dehydrogenase homologue, are also unclear. Further, the mechanism for the formation of the important dehydro-Ala residue by dehydration of a seryl-intermediate is not known. By analogy with heterocyclization domains, a special condensation–dehydration has been suspected, but evidence is still missing. An ABC transporter, McyH, proved to be essential for microcystin production, linking export and synthesis (Pearson et al., 2004). Similar genes for export proteins that are usually required for the biosynthetic processes have been found in other NRPS systems (von Döhren, 2004; Finking & Marahiel, 2004; Sieber & Marahiel, 2005). Although essential for NRPS function, a 4′-phosphopantetheine protein transferase (PPT) gene is not part of the cluster.
The absence of a PPT gene or genes involved in precursor supply has parallels in NRPS clusters described in various prokaryotes and eukaryotes. Direct precursors utilized by NRPS systems are often primary metabolites and do not need to be provided by the respective cluster. On the other hand, it has frequently been observed that genes for the biosynthesis of precursors are cluster constituents, such as the PKS system providing Adda for microcystin.
Cloning and sequencing of the microcystin biosynthetic clusters from Microcystis (Chroococcales), Planktothrix (Oscillatoriales) and Anabaena (Nostocales) revealed a highly conserved set of multidomain proteins accounting for the same basic reaction steps (Tillett et al., 2000; Christiansen et al., 2003; Rouhiainen et al., 2004). Differences in the clusters have been found with respect to the arrangement of genes, the localization and orientation of promoter regions and the content of genes not directly involved in the peptide assembly. However, the structural organization of the biosynthetic NRPS/PKS genes, including their modular arrangement, has been conserved (Fig. 1). A sequence analysis comparing key regions of the three microcystin synthetases from the different genera with the respective 16S rRNA gene sequences and a fragment of the DNA-dependent RNA polymerase (rpoC1) indicates the co-evolution of the complete gene set of these synthetases in different subsections (Rantala et al., 2004). This hints at an ancient existence of complete sets of biosynthetic genes, predating the eukaryote lineage. A comparison with nodularin biosynthetic genes supports the close relation of these systems and suggests that the nodularin biosynthetic cluster evolved from the microcystin cluster by domain deletion (Moffitt & Neilan, 2004; Rantala et al., 2004). Indeed, the two amino acids following the dehydro-residue in position 3 in microcystins are missing in nodularin. The respective modules corresponding to parts of McyA and McyB are lacking, and the remains are fused into the two-module synthetase NdaA. All other genes have orthologues in the microcystin cluster.
To explain the patchy occurrence of microcystins and other peptides, horizontal gene transfer has been discussed as a possible mechanism for the distribution of biosynthetic clusters. Horizontal gene transfer is well known within the frame of pathogenicity islands that often contain biosynthetic clusters (Dobrindt et al., 2004; Hochhut et al., 2005). The uptake of such DNA fragments can dramatically increase pathogenicity and thus the range of host interactions. Another mechanism of diversity increase is the horizontal transfer of fragments of biosynthetic clusters, such as domains or sets of domains, which may be acquired by DNA uptake followed by recombination. Such modifications have been documented in the microcystin clusters from M. aeruginosa (Tanabe et al., 2004) as supposed genetic exchange within a species, but have also been inferred between diverse prokaryotes (Lopez, 2003) on the basis of a careful analysis of the epothilone cluster.
The presence of transposase genes close to all three mcy-clusters is intriguing in this respect. Several insertion sequence (IS) elements have been described in cyanobacteria, including M. aeruginosa (Mlouka et al., 2004).
Anabaenopeptilide (cyanopeptolin) synthetase
Anabaenopeptilides (Fujii et al., 1996) are members of the cyanopeptolin class of cyanobacterial peptides (see below). The anabaenopeptilide synthetase gene cluster from Anabaena 90 contains three NRPS genes (apdA, B and D) and a putative halogenase (apdC) thought to be involved in chlorination of a Tyr residue (Rouhiainen et al., 2000).
Two remarkable features of the anabaenopeptilides are N-formylation and the unusual amino acid 3-amino-6-hydroxy-2-piperidone (Ahp). The initiating reaction, formylation of a Gln residue, is carried out by a formyl-transferase domain, first described in this system.
Gln is also the predicted substrate of the adenylation domain in position 2 (see section below on Cyanopeptolins), which is proposed to be linked to the nitrogen of the adjacent Thr by a new type of domain inserted into the Thr activating A-domain between the motifs A8 and A9, then generating Ahp. This insert has about 30% amino acid identity with protein arginine N-methyltransferases, but so far it has not been found in any other NRPS systems.
In addition, there are two genes of yet unclear functions, a methyltransferase (apdE) and a putative acyl carrier protein reductase gene (apdF). The methyltransferase gene shows 40–43% identity to genes of unknown insect and sponge symbionts and a similar SAM-binding protein in Prochlorococcus marinus. The microcystin biosynthesis-associated O-methyl transferase, McyJ, of Anabaena 90, M. aeruginosa and Planktothrix agardhii all have all only 27% identity when compared to ApdF.
Nostopeptolides are produced by the terrestrial cyanobacterium Nostoc sp. GSV224 and are branched acylated octapeptides with a heptapeptide lactone structure (Golakoti et al., 2000). An unusual component is leucyl acetate, which is derived from a leucyl intermediate by acetate addition, thus linking NRPS and PKS systems directly. The ring structure consists of 7 amino (imino) acids and one acetate unit. The nos gene cluster contains three NRPS genes (nosA, c and D), one PKS gene (nosB) for the acetate insertion and two genes, zinc-dependent long-chain dehydrogenase (nosE) and a delta(1)-pyrroline-5-carboxylic acid reductase (nosF), involved in the formation of 4-methyl-proline (Luesch et al., 2003), as well as an ABC transporter (nosG; Hoffmann et al., 2003). The unassigned orf5 encoding a 265-amino acid protein has been found in other NRPS clusters as well (nostocyclopeptide cluster and an orphan cluster in A. variabilis ATCC 29413). A detailed analysis of the amino acid binding sites of the peptide synthetases showed the co-linearity of the protein template with the peptide sequence. The first adenylation domain expressed as fragment in E. coli showed a relaxed specificity, accounting for the presence of Val or Ile in nostopeptolide A and B. Activation of additional analogues like Leu, which are not incorporated in the peptide, indicated additional control mechanisms.
Nostocyclopeptides are cycloheptapeptides produced by Nostoc ATCC 53789, isolated from a lichen (Golakoti et al., 2001). Interestingly, this Nostoc strain does not contain nostopeptolides, but it does produce a set of more than 25 cryptophycins, also produced by Nostoc GSV224. However, the respective 16S rRNA genes differ by 2.8%, reflecting a significant genetic distance. The 33-kb cluster contains two NRPS genes (ncpA and B) in a first operon, which assemble the peptide (Tyr-Gly-DGln-Ile-Ser-mPro-Leu/Phe)-S-NcpB. In a unique termination reaction, this peptide is reduced by the terminal reductase domain to a linear aldehyde that is subsequently captured intramolecularly by the amino group of the N-terminal amino acid residue tyrosine to form a stable imine bond (Becker et al., 2004). The second operon contains five additional genes, with ncpC, ncpD and ncpE being orthologues to the respective nos-genes involved in methyl-Pro supply, NcpF resembling an ABC-transporter (77 kDa) and the peptidase NcpG with homologies to D-amino acid specific hydrolases. A putative transposase is located downstream of ncpA.
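The co-linearity of template and product described for these synthetases can be illustrated with a small sketch: if each module accepts a set of substrates, the possible linear products are the ordered combinations of one substrate per module. The module list below is taken from the NcpA/NcpB peptide quoted above; the function name and data layout are illustrative assumptions and do not correspond to any published prediction tool.

```python
from itertools import product

# Substrates per module, in template order, as quoted in the text for
# NcpA/NcpB: Tyr-Gly-DGln-Ile-Ser-mPro-Leu/Phe. A tuple with more than one
# entry models an adenylation domain with relaxed specificity.
NCP_MODULES = [
    ("Tyr",), ("Gly",), ("DGln",), ("Ile",), ("Ser",), ("mPro",), ("Leu", "Phe"),
]


def predicted_peptides(modules):
    """Return every linear peptide a co-linear template could assemble."""
    return ["-".join(choice) for choice in product(*modules)]


if __name__ == "__main__":
    for peptide in predicted_peptides(NCP_MODULES):
        print(peptide)
```

The sketch only enumerates combinations; it does not model the additional control mechanisms that prevent activated analogues from being incorporated.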
Barbamide (Orjala & Gerwick, 1996) is one of about 200 bioactive cyanobacterial metabolites of marine origin (Burja et al., 2001; Gerwick et al., 2001), many of which have been isolated from Lyngbya majuscula samples. In the barbamide biosynthetic cluster (bar), 12 genes were identified and are transcribed in two coinciding polycistronic mRNAs (Chang et al., 2002). The biosynthesis of barbamide has several unique features, including a trichloro-Leu as starter unit, which is deaminated, extended by a diketide with E-double bond formation, and heterocyclization and decarboxylation of the terminal Cys.
The chlorination of Leu has been proposed to occur at the thioester intermediate level (attached to the carrier protein BarA) in a complex involving BarB1, BarB2 and BarC (Sitachitta et al., 2000a). The reaction is likely to be similar to the chlorination of Thr in syringomycin biosynthesis, thought to be catalyzed by the related proteins SyrB2 and SyrC (Guenzi et al., 1998). The substrate binding pocket of the respective stand-alone adenylation domain BarD has been shown to accept Leu, 3-chloro-Leu and Val. The BarE adenylation domain specifically accepts 3-chloro-Leu, but it is not clear whether 3-chloro-Leu is a free intermediate, or is channelled somehow between BarA and BarE (Chang et al., 2002).
Curacins (Gerwick et al., 1994; Yoo & Gerwick, 1995) are polyketides with a single cysteine converted to thiazolidine and are produced by strains of Lyngbya majuscula. Curacin A is a potent cancer cell toxin interacting with the colchicine drug binding site on microtubules (Wipf et al., 2004). The 64-kb cluster contains 14 genes, including eight monomodular PKS and one PKS-NRPS hybrid (CurF) with a heterocyclization domain, a Cys activating domain and a thiolation domain (Chang et al., 2004). Preceding the NRPS module is a unique gene cassette that contains an HMG-CoA synthase likely responsible for formation of the cyclopropyl ring. A highly unusual feature of CurA is three tandem acyl carrier proteins, followed by an adjacent module of autonomous domains (CurB-E). This particular region is similar to the corresponding region in the biosynthetic gene cluster of another Lyngbya polyketide, jamaicamide (Edwards et al., 2004).
The lyngbyatoxins are potent skin irritants with a prenylated indolactam structure derived from Val and Trp (Edwards & Gerwick, 2004). The biosynthetic gene cluster cloned from a field sample of Lyngbya majuscula spans 11.3 kbp and encodes a two-module NRPS (LtxA), a P450 mono-oxygenase (LtxB), an aromatic prenyltransferase (LtxC) and an oxidase/reductase protein (LtxD). LtxC has been expressed in E. coli and shown to catalyze the transfer of a geranyl group to (–)-indolactam V as the final step in the biosynthesis of lyngbyatoxin A.
Ribosomal peptide synthesis of complex peptides
Though the majority of cyanobacterial peptides have been shown to be synthesized non-ribosomally, the recent characterization of the patellamide biosynthesis cluster and its heterologous expression (Long et al., 2005; Schmidt et al., 2005) indicates that complex and modified peptides may nonetheless be synthesized independently of NRPS enzymes.
Patellamides, pseudosymmetrical cyclo-octapeptides with each substructure having the sequence thiazole-nonpolar amino acid-oxazoline-nonpolar amino acid, are moderately cytotoxic and reverse multidrug resistance. In patellamides, the primary sequence is encoded in a rather small gene, patE, which has been identified by a tblastn search of the draft genome sequence, querying for all eight possible peptides that could lead to the formation of the cyclic structure. This gene contains both octapeptide sequences of patellamides A and C. The translated peptide of 71 amino acids is processed by proteolytic cleavage, cyclization and heterocyclization. Respective genes, a protease (patA), a possible adenylating enzyme-hydrolase hybrid (patD) and an oxidoreductase-protease hybrid (patG), immediately surround patE. These genes and the organization of the cluster are reminiscent of the lantibiotic and microcin biosynthetic machinery, which has been characterized in other bacteria (Garneau et al., 2002; Rebuffat et al., 2004; Chatterjee et al., 2005). A similar cluster has been found in the genome of Trichodesmium erythraeum IMS101 (Schmidt et al., 2005).
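The reason for querying eight peptides is that a cyclic octapeptide can be opened at any of its eight peptide bonds, so eight linear sequences (the rotations of one another) close to the same ring. The following sketch generates these rotations and looks for them in an already translated protein sequence. It is a simplified stand-in for a tblastn search, which works on nucleotide databases and tolerates mismatches; the sequences used here are hypothetical placeholders, not the patellamide or PatE sequences.

```python
def rotations(cyclic_peptide):
    """Return all linear sequences that close to the same cyclic peptide."""
    n = len(cyclic_peptide)
    return [cyclic_peptide[i:] + cyclic_peptide[:i] for i in range(n)]


def find_precursor_hits(cyclic_peptide, protein):
    """Map each rotation found in `protein` to its 0-based start index."""
    hits = {}
    for query in rotations(cyclic_peptide):
        position = protein.find(query)  # exact match only, unlike tblastn
        if position != -1:
            hits[query] = position
    return hits


if __name__ == "__main__":
    # Hypothetical one-letter sequences chosen for illustration only.
    toy_cyclic = "ACDEFGHK"
    toy_protein = "MNKKLLP" + "EFGHKACD" + "SYDG"
    print(rotations(toy_cyclic))
    print(find_precursor_hits(toy_cyclic, toy_protein))
```

Exact substring matching is used to keep the sketch short; a real search would also have to account for residues that are later modified, such as Cys and Thr converted to heterocycles.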
Structural diversity of cyanobacterial peptides
To date, some 600 cyanobacterial peptides have been described. New peptide structures have been given names that are not included in a naming system and thus many structurally similar peptides have names that do not reflect these similarities. Peptide names eventually chosen by the authors often refer to the taxon from which the new compound has been isolated or to the geographic locality where the sample was taken from (e.g. micro-, anabaeno-, kasumig-, banyas-) combined with suffixes referring to structural properties (e.g. -peptin, -peptilide, -cyclin, -cyclamide). Thus, for clearly confined groups of similar peptides, a multitude of names can exist. For example, the peptides aeruginopeptin 917S-C (Harada et al., 2001), anabaenopeptilide 90-A (Fujii et al., 1996), cyanopeptolin S (Jakobi et al., 1995), symplostatin 2 (Harrigan et al., 1999), hofmannolin (Matern et al., 2003a), microcystilide A (Tsukamoto et al., 1993), micropeptin 88-A (Ishida et al., 1998a), nostocyclin (Kaya et al., 1996), oscillapeptilide 97-B (Fujii et al., 2000), oscillapeptin F (Itou et al., 1999a), scyptolin A (Matern et al., 2001), somamide A (Nogle et al., 2001) and tasipeptin A (Williams et al., 2003) are all cyclic depsipeptides of one peptide class, called cyanopeptolins in this review. The suffixes to the peptide names refer to the strain number (e.g. anabaenopeptilide 90-A from Anabaena 90; Fujii et al., 1996), the origin of a bloom sample (e.g. micropeptin T from lake Teganuma; Kodani et al., 1999), or the mass (e.g. micropeptin SF995, where the letters SF refer to the central pond of Tel Aviv Safari Park and 995 to the mass in Da; Banker & Carmeli, 1999), or are given in alphabetical order (e.g. anabaenopeptins A through K).
Part of the diversity of names for similar peptides is attributable to the nearly simultaneous publication of peptide structures. This is the case for anabaenopeptin A (Harada et al., 1995), ferintoic acid A (Williams et al., 1996) and oscillamide Y (Sano & Kaya, 1995), for example. In one case, the same name, anabaenopeptin G, was given to two different structures that had been isolated from P. agardhii and published in the same year: one molecule has a mass of 908.5 Da and an amino acid sequence of [Tyr-MIle-Hty-Ile-Lys]-CO-Arg (Erhard et al., 1999) and the other has a mass of 929.5 Da and an amino acid sequence of [Ile-MHty-Hty-Ile-Lys]-CO-Tyr (Itou et al., 1999b). The opposite has occurred, too: aeruginopeptin 917S-A (Harada et al., 2001) has exactly the same structure as the previously published microcystilide A ([Tyr-Ahp-Leu-MTyr-Ile-O-Thr]-Gln-Hpla; Tsukamoto et al., 1993). Other misleading overlaps can arise with different types of peptides produced by other organisms: microcin SF608 (Banker & Carmeli, 1999), an aeruginosin from a Microcystis bloom, shares its name with microcin J25, a 21-residue, ribosomal peptide antibiotic from E. coli (Blond et al., 1999), and aeruginosin A, a pigment from Pseudomonas aeruginosa (Holliman, 1969), shares its name with the cyanobacterial aeruginosins, which are linear tetrapeptides (Murakami et al., 1995).
All naming efforts are driven by the need to have a unique name for a unique structure and any system has to guarantee this. Unfortunately, there is currently no naming system for cyanobacterial peptides, except for microcystins, where an effort to standardize the naming of structural variants was made at a stage when the number of described variants was much lower than it is today (Carmichael et al., 1988). Nonetheless, this naming system proved to be very valuable and, most importantly, also applicable to new variants without the addition of too many prefixes and suffixes, at most something like [Asp3,ADMAdda5,Dhb7]Mcyst-RR for a variant of Mcyst-RR isolated from a Nostoc strain (Beattie et al., 1998).
For other classes of peptides, it appears to be much harder to design a scheme corresponding to that used for microcystins, as the number of main variable positions in the molecules is higher than the two in microcystins (see below) and amino acid modifications are much more variable. In anabaenopeptins (Harada et al., 1995), for example, a class of cyclic hexapeptides (see below), all positions are variable except for a conserved lysine. As will be discussed later, we have to be aware of the possibility that the number of known structural variants in any peptide class is only a minor proportion of the total number of naturally produced congeners. Thus, any naming system has to ensure that further variants to be described will fit in the system.
In addition to the variability of amino acids at particular positions in the peptide molecules, various modifications of amino and other organic acids can occur, again complicating the introduction of a practical naming scheme. In microcystins in general, only N- or O-methylation is common as an amino acid modification, whereas in aeruginosins, for example, additional chlorination, sulphation and hydroxylation have been described. Further, the number of variable, non-proteinogenic organic acids is high in cyanopeptolins, whereas in microcystins the non-proteinogenic amino acids are in conserved positions. Since a one-letter code is restricted to 26 amino acids, it is not applicable when residues like formic, acetic, glyceric, butanoic, hexanoic, or octanoic acids and others have to fit into the same scheme.
In summary, we think that a simple but efficient naming system such as that used for microcystins is not applicable to most other peptide classes and that other systems have to be developed. This is, however, not the aim of the present paper, which is focused on the genetics and biochemistry of cyanobacterial oligopeptide synthesis. The basic classification of known peptide structures in biologically meaningful groups, as presented here, may be a good starting point for a comprehensive nomenclature.
Building blocks of cyanobacterial peptides
One distinct characteristic of NRPS biosynthetic pathways is the possibility to combine proteinogenic amino acids with non-proteinogenic amino acids, fatty acids, carbohydrates and other building blocks into complex molecules. It further allows modifications of proteinogenic amino acids from epimerization to the formation of heterocycles or dehydration.
As a basis of cyanobacterial peptide synthesis, all proteinogenic amino acids in l-configuration can be found. However, certain amino acids are rarely incorporated (like methionine) or have been reported only once (like histidine; Ishida et al., 2000b). All l-amino acids can, in principle, be epimerized and incorporated in d-configuration. Further, homo-variants of many proteinogenic amino acids are common, for example homotyrosine (Hty) or homoserine (Hse). Catalyzed by N-methyl transferase domains, N-methylation at the α-amino nitrogen is a common modification. N-methylation has been reported for all amino acids in cyanobacterial peptides (except proline), whereas O-methylation is less common and restricted to amino acids with free hydroxy groups.
A number of modifications of proteinogenic amino acids have been reported: hydroxylated derivatives like hydroxy-Leu or hydroxy-Pro; dehydrated and reduced amino acids like dehydrated Ser, then named dehydro-alanine (Dha) or Cys and Thr reduced to thiazoles and oxazoles by heterocyclization and reduction, respectively.
Halogenation has been reported for aromatic amino acids as well as for aliphatic structures. Aromatic chlorination is most common for Tyr and Trp and has not been reported for Phe. Bromination predominantly occurs – with one exception (Ishida et al., 1999) – in peptides from marine environments.
Amino acid derivatives that also occur in the primary metabolism have been reported for multiple types of peptides: hydroxy-phenyl lactic acid (Hpla, a tyrosine derivative), 2-hydroxy-4-methylvaleric acid (leucic acid, a leucine derivative), 2-hydroxy-3-methylpentanoic acid (2-hydroxy-3-methylvaleric acid or isoleucic acid, an isoleucine derivative), or agmatine (an arginine derivative). Unfortunately, identical building blocks are sometimes not abbreviated identically. For example, the acronyms Hmpa as well as Hmva are used for isoleucic acid. Similarly, a threonine derivative (see section on Microcystins and nodularins below) is designated as 2-amino-2-butenoic acid (Aba), dehydro-homoalanine (Dhha), or dehydrobutyrine (Dhb).
A large variety of fatty acids are building blocks of cyanobacterial peptides. The simplest cases are unbranched aliphatic fatty acids like hexanoic or octanoic acid (HA and OA, respectively). Modifications of these simple fatty acids include methylation (branching), hydroxylation, or amination. By amination, β-amino acids can be formed, like the complex Adda moiety in microcystins.
Last but not least, the high diversity of modified fatty acids makes cyanobacterial peptides and non-ribosomal peptides in general a structurally extremely diverse type of metabolite. Fatty acids often have a pronounced influence on physico-chemical properties of peptides, e.g. by influencing the hydrophobicity.
The following classification is based on the molecular structures, irrespective of the original source of individual congeners (Table 2). It includes a major part of the known cyanobacterial peptides but by no means all. Many individual structures are, at present, the only representatives of other peptide classes, with more members potentially to be discovered.
Table 2. Main classes of cyanobacterial peptides as described in the text. Synonyms refer to names in original publications. As producing organisms, the taxa from which the respective peptides have been originally isolated are listed; homologue peptides can be found in other taxa. The number of variants reflects the structural variability of known congeners in early 2005
Homologue peptides from organisms other than cyanobacteria.
The peptides grouped in an individual peptide class are thought to be synthesized by homologous NRPS/PKS systems or ribosomal operons encoded in gene clusters with high sequence similarity. As shown for microcystins, the overall organization of a gene cluster coding for structural congeners in different taxa can be different even though the individual genes are clearly homologous (Rantala et al., 2004).
For each peptide class, the structure of a representative peptide is shown (generally the first one that has been described). In the flat formula, stereochemistry is not considered but it is mentioned in the text. For more detailed information, the reader is referred to the original publications.
Further, a schematic structure is given that lists all amino acid and other moieties that have been found at particular positions in the peptides of that class. When further modifications of amino acids have been observed, this is indicated in italics preceding the corresponding positions. All amino acids are abbreviated by standard three-letter codes and other abbreviations will be explained in the text. It has to be emphasized that not all possible combinations have been found in cyanobacterial samples. Representatives of described aeruginosins are, for example, aeruginosin 98-A (Murakami et al., 1995): ClHpla-Ile-SuChoi-Agmatine or aeruginosin 89-A (Ishida et al., 1999) Su,ClHpla-Leu-Choi-Argininal. New aeruginosins could be predicted to have, for example, the amino acid sequences ClHpla-Tyr-Choi-Argininol or ClHpla-Leu-Choi-Agmatine, but they have not yet been described.
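The way a schematic structure implies more sequences than have actually been described can be sketched as a simple enumeration: one moiety is picked per position and the result is compared with the set of published sequences. The moiety lists below contain only the residues named in the examples above, not the full content of the schematic figures, and the variable and function names are illustrative assumptions.

```python
from itertools import product

# Moieties per position, limited to those named in the examples in the text.
POSITIONS = [
    ("ClHpla", "Su,ClHpla"),                 # N-terminal Hpla derivative
    ("Ile", "Leu", "Tyr"),                   # variable position 2
    ("Choi", "SuChoi"),                      # Choi, optionally sulphated
    ("Agmatine", "Argininal", "Argininol"),  # C-terminal arginine derivative
]

# Sequences quoted in the text as described aeruginosins.
DESCRIBED = {
    "ClHpla-Ile-SuChoi-Agmatine",    # aeruginosin 98-A
    "Su,ClHpla-Leu-Choi-Argininal",  # aeruginosin 89-A
}


def candidate_sequences(positions):
    """Enumerate every combination of one moiety per position."""
    return ["-".join(combo) for combo in product(*positions)]


def undescribed(positions, described):
    """Return the combinations that are not in the described set."""
    return [seq for seq in candidate_sequences(positions) if seq not in described]


if __name__ == "__main__":
    candidates = candidate_sequences(POSITIONS)
    print(len(candidates), "combinations,",
          len(undescribed(POSITIONS, DESCRIBED)), "not yet described")
```

The enumeration is purely combinatorial and ignores any biosynthetic constraints, so it should be read as an upper bound on what the schematic allows rather than as a prediction of naturally occurring congeners.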
Because of the high number of individual peptides in some peptide classes, we could not cite all original publications, only those articles that refer to particular features of individual peptides. This does not, of course, imply that the publications not cited individually are in any way less important; the selection was made solely for practical reasons.
Aeruginosins
The linear peptides of this class are characterized by a derivative of hydroxy-phenyl lactic acid (Hpla) at the N-terminus, the amino acid 2-carboxy-6-hydroxyoctahydroindole (Choi) and an arginine derivative at the C-terminus (Fig. 2) (Murakami et al., 1995). The biosynthesis is achieved putatively by an NRPS (K. Ishida & E. Dittmann, personal communication; N. Tandeau de Marsac & M. Welker, unpublished).
The C-terminal arginine derivatives are agmatine, derived from Arg by decarboxylation (e.g. in microcin SF608; Banker & Carmeli, 1999), argininol, derived from Arg by reduction of the carboxy group to an alcohol (e.g. in aeruginosin 298A; Ishida et al., 1999) or argininal, derived from argininol by cyclization (e.g. in aeruginosin 102A; Matsuda et al., 1996a). At position 2, variable amino acids such as Tyr (Hty), Phe, Leu, or Ile can be incorporated that are predominantly in d-configuration. In an individual strain, however, aeruginosins with the amino acid in position 2 in either l- or d-configuration can be found (Ishida et al., 1999).
Hpla is a compound that is readily available for NRPS from the tyrosine metabolism and has been found to be in d-configuration in most congeners. Choi has been synthesized in vitro from tyrosine but it is not yet clear whether tyrosine is the precursor during peptide biosynthesis (Valls et al., 2001). Recently, the chemical synthesis of aeruginosins has been accomplished (Valls et al., 2002). Chlorination (indicated by Cl) and sulphation (Su) can occur at the Choi (e.g. aeruginosin 205-A; Shin et al., 1997) or Hpla (e.g. aeruginosin 101; Ishida et al., 1999) residues, but have never been observed simultaneously at both positions in an individual peptide. In one aeruginosin, Hpla was found to be brominated (aeruginosin 98-C; Ishida et al., 1999), which is remarkable as bromine was not detectable in the natural environment or in the culture medium. Glycosylation with a pentose sugar (xylose) has been described for a variant isolated from Planktothrix (aeruginosin 205-A; Shin et al., 1997). In Planktothrix such a glycosylation seems to be common (Welker et al., 2004b), whereas it has not been observed in Microcystis so far. At present, 27 variants have been published (Table 2). However, mass spectral analyses of strains and bloom samples indicate a large number of further structural variants, which differ mainly in chlorination and sulphation rather than in amino acid sequence, as is the case in other peptide classes. Aeruginosins have been isolated from Microcystis and Planktothrix, and variants with mPro instead of Choi (spumigin; Fujii et al., 1997b) from Nodularia. Aeruginosins also have similarities to dysinosins, linear tetrapeptides from a dysideid sponge (Carroll et al., 2002), and to suomilide (Fujii et al., 1997a) and banyasides (Ploutno & Carmeli, 2005), peptides from Nodularia and Nostoc, respectively.
Microginins
This class of linear peptides is characterized by a decanoic acid derivative, 3-amino-2-hydroxy-decanoic acid (Ahda) and a predominance of two tyrosine units at the C-terminus (Fig. 3) (Okino et al., 1993a). Microginins vary in length from four (e.g. microginin 91A; Ishida et al., 2000a) to six (e.g. microginin 299C; Ishida et al., 1998b) amino acids with the variability occurring at the C-terminal end. Position 2 is most variable with seven different amino acids reported, while in the two following positions, three to four different amino acids have been reported. N-methylation can occur at positions 1, 3 and 4 (Ishida et al., 1998b). Aliphatic chlorination has been reported for Ahda (Kodani et al., 1999) and in some cases also as dichlorination at the terminal carbon atom (Ishida et al., 1998b). Aromatic chlorination has not been observed at any of the (homo)tyrosine units. The putative gene cluster coding for microginin synthetase has been sequenced in strain Microcystis HUB 5.3 (Kramer et al., 2000) but a corresponding knock-out mutant has not yet been produced. Ahda formation is achieved by a PKS enzyme complex and Ahda presumably is the starting unit of microginins.
Microginins sensu stricto have been found in blooms and strains of Microcystis and Planktothrix so far. Two peptides, carmabins A and B, isolated from Lyngbya are similar but have different decanoic acid derivatives (dimethyl decanoic acid and oxo-dimethyl decanoic acid; Hooper et al., 1998). Nostoginins have been isolated from Nostoc with an N-terminal 3-amino-2-hydroxy-octanoic acid, Ahoa (Ploutno & Carmeli, 2002).
Anabaenopeptins
These cyclic peptides are characterized by a lysine in position 2 and the formation of the ring by an N-6-peptide bond between Lys and the carboxy group of the amino acid in position 6 (Fig. 4) (Harada et al., 1995). A side chain of one amino acid unit is attached to the ring by a ureido bond formed between the α-N of Lys and the α-N of the side chain amino acid. All other positions in the ring and side chain are variable, with three to five amino acids reported for the respective positions. The amino acid in position 5 is N-methylated. Methionine in position 3 has been reported as an S-oxygenated variant (nodulapeptin B) (Fujii et al., 1997b). Homo-variants of tyrosine and phenylalanine can be found in positions 4 and 5 (Murakami et al., 1997; Reshef & Carmeli, 2002). A putative respective NRPS gene cluster has been found in the genome of Anabaena strain 90 (K. Sivonen & L. Rouhiainen, personal communication).
Biosynthesis presumably starts with the side chain amino acid that forms a pseudo-C-terminus when the ureido bond is formed. Ring closure is then accomplished by the peptide bond formation between the C-terminal carboxy-group in position 6 and the 6-amino group of lysine.
The mass range of anabaenopeptins spans from 759 Da for anabaenopeptin I ([Leu-MAla-Hty-Val-Lys]-CO-Ile; Murakami et al., 2000) to 956 Da for oscillamide C ([Phe-MHty-Hty-Ile-Lys]-CO-Arg; Sano et al., 2001).
All amino acids, except the Lys in position 2, are in l-configuration.
Anabaenopeptins have been reported from cyanobacteria isolated from a variety of habitats: freshwater (Harada et al., 1995), terrestrial (Reshef & Carmeli, 2002) and brackish water (Fujii et al., 1997b) and also from marine sponges (konbamide and keramide A from Theonella sp.; Kobayashi et al., 1991a, b). As sponges host a broad variety of prokaryotic symbionts, these peptides may well be produced by cyanobacteria rather than by the sponge itself (Harrigan & Goetz, 2002; Piel, 2004). Indeed, two congeners of anabaenopeptins differ only in hydroxylation of tryptophan: mozamide A ([Phe-MhoTrp-Leu-Val-Lys]-CO-Ile; Schmidt et al., 1997) was isolated from a theonellid sponge, whereas plectonemid A ([Phe-MTrp-Leu-Val-Lys]-CO-Ile; Müller et al., 2005) originated from a culture of Plectonema sp.
Cyanopeptolins
This class of cyclic peptides is characterized by the amino acid 3-amino-6-hydroxy-2-piperidone (Ahp) and the cyclization of the peptide ring by an ester bond of the β-hydroxy group of threonine with the carboxy group of the terminal amino acid (Fig. 5) (Martin et al., 1993). In two cases, the threonine unit is substituted by a hydroxy methyl proline unit and the ring-closing ester bond is formed with this hydroxy group (Nostopeptins; Okino et al., 1997). The general type of this peptide class is thus a branched peptidolactone.
A side chain of variable length is attached via the amino group of the threonine unit. Two major types of side chains are common: one consisting of one or two amino acids and an aliphatic fatty acid from formic (e.g. in anabaenopeptilide 202-A; Fujii et al., 1996) to octanoic acid (e.g. in micropeptin A; Okino et al., 1993b) and one with a glyceric acid unit at the N-terminus (e.g. cyanopeptolin S; Jakobi et al., 1995). The glyceric acid can be attached directly to the threonine in position 1 or to an amino acid side chain (e.g. in A90720A; Bonjouklian et al., 1996). Sulphation and O-methylation of the glyceric acid have been observed (e.g. in oscillapeptins A–C; Itou et al., 1999a). A branched side chain has been reported for scyptolin B, where two alanine-butanoic acids are joined to a threonine in position S1 (Matern et al., 2001). In several variants, hydroxyphenyl lactic acid (Hpla) in the side chain has been reported (microcystilide A, Tsukamoto et al., 1993; aeruginopeptin 95A/B, Harada et al., 1993). Other non-proteinogenic amino acids or hydroxy acids in cyanopeptolins are: tetrahydro-tyrosine (H4Tyr, also called hydroxy-cyclohexenyl alanine, HcAla) in position 2 (e.g. aeruginopeptin-95B, Harada et al., 1993; micropeptin 88-D, Ishida et al., 1998a); kynurenine in position 5 (micropeptin SD999; Reshef & Carmeli, 2001); hydroxy-methyl valeric acid (Hmv) in the side chain (hofmannolin; Matern et al., 2003a); amino-butenoic acid (Aba or Dhb) in position 2 (somamide A; Nogle et al., 2001).
The biosynthetic gene cluster has been sequenced in Anabaena 90 (Rouhiainen et al., 2000) and another one is present in the Microcystis PCC7806 genome (Martin et al., 1993; Bister et al., 2004) (N. Tandeau de Marsac, personal communication). The arrangement of the genes and the structure of the peptides suggest that biosynthesis starts with the side chain and that the final step is the ring closure between the amino acid in position 6 and threonine in position 1. All amino acids are in l-configuration in all cyanopeptolins described so far. With regard to the highly variable side chain, the gene clusters would be expected to show much variation in the corresponding genes.
All positions in the ring, except threonine and Ahp, can be occupied by variable amino acids. However, the number of amino acids that have been reported for individual positions varies from two in position 6 to nine in position 2. In position 5, an aromatic amino acid is found in all variants; position 6 is occupied by neutral amino acids. Position 2 of the ring can be occupied by a broad variety of amino acids like aromatic, basic, aliphatic and hydroxy amino acids. Dhb, which is common in the microcystins of Planktothrix, has also been reported in this position (Harrigan et al., 1999).
In all variants, the amino acid in position 5 is N-methylated. Derivatization by O-methylation can occur when a free hydroxy group is available, as in tyrosine (e.g. anabaenopeptilide 90-A; Fujii et al., 1996). Chlorination has been reported for position 5 when it is occupied by tyrosine but no dichlorination has been found so far (micropeptin 478-A, Ishida et al., 1997a; scyptolin A/B, Matern et al., 2001).
The high structural variability of cyanopeptolins is also reflected by the wide range of molecular masses spanning from 770 Da for tasipeptin A (Williams et al., 2003) to 1181 Da for oscillapeptin B (Itou et al., 1999a).
Cyanopeptolin-type peptides have been isolated from Chroococcales, Oscillatoriales and Nostocales. Further, one congener was reported from Dolabella, a marine herbivorous gastropod, suggesting a cyanobacterial origin (Harrigan et al., 1999).
The total synthesis of one congener, micropeptin T-20, has been recently achieved (Yokokawa et al., 2005).
Microcystins and nodularins
Microcystins (originally described as cyanoginosins; Botes et al., 1984, 1985) and nodularins are characterized by the amino acid (2S,3S,8S,9S)-3-amino-9-methoxy-2,6,8-trimethyl-10-phenyldeca-4,6-dienoic acid (Adda), glutamate and an aspartate derivative at positions 5, 6 and 3, respectively, of the ring (Fig. 6). The aspartate derivative is referred to as d-erythro-2-methyl-iso-aspartate (DmiA). Other d-amino acids in most structural variants are d-Ala in position 1 and d-Glu in position 6. The numbering of the particular positions was assigned before the biosynthetic pathway had been discovered (Carmichael et al., 1988) and thus does not reflect the sequence of chain elongation during biosynthesis (Tillett et al., 2000). Two positions show high variability, namely positions 2 and 4, whereas all other positions are more conserved. For this reason, the nomenclature of microcystins was revised at an early stage and it was proposed to name variants according to the two most variable positions by applying the one-letter code for amino acids, e.g. microcystin-LR for the variant with leucine in position 2 and arginine in position 4 (Carmichael et al., 1988). Position 7 is occupied in most variants by dehydro alanine (Dha) or, when methylated, methyl-dehydro alanine (Mdha), which originates from dehydration of a seryl-intermediate. Several variants still have a native Ser in this position (Namikoshi et al., 1992b). In Planktothrix and Nostoc, an analogous threonine derivative, 2-amino-2-butenoic acid (Aba or Dhb), can frequently be found in this position. While N-methylation is observed in many Dha7-variants (then Mdha), it has never been found for Dhb-variants in microcystins, indicating a lack or deletion of the N-methyl-transferase domain in the respective module of McyA. In nodularins, the Dhb moiety is methylated and an N-methyl-transferase domain has been reported for NdaA (Moffitt & Neilan, 2004).
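The naming convention can be made concrete with a short sketch that builds a variant name from the residues in positions 2 and 4, with deviations at other positions given as a bracketed prefix in the style of names quoted elsewhere in this review. The function name, the argument layout and the restriction to the 20 standard amino acids are assumptions made for illustration; this is not an official nomenclature tool.

```python
# Standard three-letter to one-letter amino acid codes.
ONE_LETTER = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def microcystin_name(pos2, pos4, modifications=()):
    """Build a name such as 'microcystin-LR' from three-letter residue codes.

    `modifications` is an iterable of (moiety, position) pairs describing
    deviations at other ring positions, e.g. ("Asp", 3). Residues outside
    the standard one-letter code raise KeyError in this simplified sketch.
    """
    code = ONE_LETTER[pos2] + ONE_LETTER[pos4]
    prefix = ""
    if modifications:
        ordered = sorted(modifications, key=lambda item: item[1])
        prefix = "[" + ",".join(f"{moiety}{pos}" for moiety, pos in ordered) + "]"
    return f"{prefix}microcystin-{code}"


if __name__ == "__main__":
    print(microcystin_name("Leu", "Arg"))
    print(microcystin_name("Arg", "Arg",
                           [("Dhb", 7), ("Asp", 3), ("ADMAdda", 5)]))
```

Sorting the modifications by ring position keeps the prefix in a fixed order regardless of how the caller lists them.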
Dhb can have E- or Z-configurations, by which toxicity is influenced (Blom et al., 2001). Similar isomers have been reported for the conjugated double bond in the Adda side chain, though only in photochemical experiments and not as compounds in vivo, so far (Harada, 1996).
At the Adda side chain, the O-methyl group can be lacking (Namikoshi et al., 1992a) or be substituted by an acetyl group (Namikoshi et al., 1990; Sivonen et al., 1992). Considering all possible variability at the individual moieties, it is not surprising that new variants can still be found in various strains (Grach-Pogrebinsky et al., 2004; Oksanen et al., 2004; Welker et al., 2004b), although nearly 90 structural variants have already been described. The methylation at two positions alone allows four possible variants of any microcystin-XZ: [Asp3]microcystin-XZ, [Asp3,Dha7]microcystin-XZ, [Dha7]microcystin-XZ, together with the ‘native’ compound. With the possible variations in other positions, like O-methylation of the Glu6-moiety, the potentially high number of structural variants is evident. Nonetheless, despite the structural variability in field samples as well as in isolated strains, a few variants are dominant and most structural variants occur only in low concentrations (Fastner et al., 1999; Welker et al., 2004b). Chlorination or sulphation has never been observed in microcystins.
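The count of four variants follows from two independent yes/no choices (2 × 2 = 4), and each further independent modification doubles the number. A minimal sketch of this enumeration is given below; the function name and the (label, position) representation are illustrative assumptions.

```python
from itertools import combinations


def modification_variants(base_name, optional_changes):
    """List names for every subset of independent modifications.

    `optional_changes` holds (label, position) pairs, e.g. ("Asp", 3) for
    the demethylated aspartate derivative in position 3.
    """
    ordered = sorted(optional_changes, key=lambda change: change[1])
    names = []
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            prefix = ""
            if subset:
                prefix = "[" + ",".join(f"{label}{pos}" for label, pos in subset) + "]"
            names.append(prefix + base_name)
    return names


if __name__ == "__main__":
    for name in modification_variants("microcystin-XZ", [("Asp", 3), ("Dha", 7)]):
        print(name)
```

With n independent optional modifications the function returns 2**n names, which illustrates how quickly the number of conceivable variants grows once positions such as Glu6 are included.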
Biosynthesis in microcystins and nodularins starts with the formation of the Adda moiety by an NRPS/PKS hybrid enzyme, presumably with phenylacetate as starter unit (Moore et al., 1991) (and thus with position 5). The d-configuration of alanine in position 1 is achieved by an epimerization domain in McyA while the d-configuration of glutamate (position 6) and DmiA is achieved by a separate racemase (Sielaff et al., 2003).
Cyclamides
Various structures have been described with characteristic thiazole and oxazole moieties thought to be cysteine and threonine derivatives, respectively (Fig. 7), as shown by nuclear magnetic resonance techniques for another type of peptide, barbamide from Lyngbya (Williamson et al., 1999). Corresponding moieties are most likely formed from native amino acids by cyclodehydration and subsequent oxidation to form the heterocycle. In typical peptides of this class, e.g. nostocyclamide (Todorova et al., 1995), thiazole/oxazole units occur in alternation with unmodified amino acids to form a cyclic hexapeptide. In one case, westiellamide (Prinsep et al., 1992), the molecule is built exclusively of alternating oxazole and valine residues, whereas in other congeners, all six moieties are different from each other (as in raocyclamides) (Admi et al., 1996). In the hexapeptides with thiazole moieties, not all cysteine/threonine units are dehydrated, as in banyascyclamides A and B (Ploutno & Carmeli, 2002), where the threonine moiety in position 1 is unaltered. In other peptides, only one thiazole moiety is found, as in ulongamides A–F (Luesch et al., 2002), and instead of further thiazole/oxazole units, proteinogenic amino acids or lactic acid and amino methyl-hexanoic acid are incorporated. The naming within this peptide class is inconsistent: the names of the peptides are linked either to the producing organism (e.g. nostocyclamide) or to the origin of the sample or strain (e.g. banyascyclamide). The biosynthesis and respective genes are not known and therefore the numbering of amino acid residues is arbitrary.
The first peptides of this class, however, were described from the marine ascidian Lissoclinum bistratum. Bistratamides (Degnan et al., 1989a; Foster et al., 1992) were thought to be produced by the symbiotic Prochloron sp. rather than by the ascidian itself. Recently, Schmidt et al. (2004) found gene sequences in Prochloron indicating the presence of NRPS clusters that might be involved in the biosynthesis of cyclic peptides.
Peptides recently isolated from the ascidian Didemnum molle, didmolamides A and B (Rudi et al., 2003), have exactly the same (flat) structure as banyascyclamides A and C (Ploutno & Carmeli, 2002), respectively.
Further peptides with similar alternating thiazole/oxazole units, patellamides (Ireland et al., 1982) and lissoclinamides (Degnan et al., 1989b), have been isolated from Lissoclinum. In patellamides A–C, four thiazole/oxazole units and four alternating amino acids form a cyclic octapeptide, whereas the heptapeptides lissoclinamides 1–4 are built of three thiazole/oxazole and four amino acid units. Recently, it has been shown that patellamides are synthesized ribosomally from a linear octapeptide by post-translational modification (Schmidt et al., 2005) and thus a similar biosynthetic pathway may be responsible for cyclamide formation (see section on Ribosomal peptide synthesis of complex peptides, above).
The largest known cyanobacterial oligopeptides are the microviridins (Fig. 8) (Ishitsuka et al., 1990). This group is characterized by the multicyclic structure established by secondary peptide and ester bonds and a side chain of variable length. The main peptide ring consists of seven amino acids with an ester bond between the 4-carboxy group of aspartate (position 10) and the hydroxy group of threonine (position 4) and a peptide bond between the 6-amino group of lysine (position 6) and the 4-carboxy group of glutamate (position 7).
The members of this class of peptides all share these features and variations are primarily due to substitutions in the side chain and at position 5 in the ring. However, in many natural samples and isolated strains, peptides with fragment mass spectra similar to those of microviridins have been detected. This indicates a much greater structural variability than suggested by the number of congeners isolated so far (Fastner et al., 2001; Welker et al., 2004a, b). A recently isolated variant, microviridin J, proved to be toxic to the planktonic crustacean Daphnia, while other structural variants were inactive (Rohrlack et al., 2004).
All amino acids in microviridins are in l-configuration and the only non-proteinogenic unit is the N-terminal acetic acid. Therefore, it could be that microviridins are synthesized ribosomally and that the tri-cyclic structure is completed by post-translational modifications similar to those that have been described for other prokaryotic peptides, e.g. microcin J25 (Blond et al., 1999).
Cryptophycins are a class of cyclic depsipeptides that is special in that most of the reported congeners have been isolated exclusively from Nostoc and the majority from a single strain (Fig. 9a) (Schwartz et al., 1990; Golakoti et al., 1994, 1995) and not, as for other peptide classes, from a wide taxonomic range of cyanobacteria. They are composed of two hydroxy-acid units (a valeric acid derivative and a derivative of phenyl-octenoic acid) and two amino acid units (a tyrosine derivative and a β-amino acid). Structural variability arises mainly from the chlorination of the tyrosine unit and an optional epoxy group at the phenyl-octenoic acid.
Cryptophycins show cytotoxicity toward various tumor cell lines and are potential candidates for anticancer drugs (Edelman et al., 2003).
Microcolins and mirabimids
These linear peptides have been isolated from Lyngbya and Scytonema, respectively, and are characterized by a C-terminal pyrrolin-2-one moiety (Fig. 9b) (Carmeli et al., 1991a; Koehn et al., 1992). Mirabimids have an N-methylated or acetylated N-terminal amino acid. Microcolins and majusculamide D (Moore & Entzeroth, 1988) possess a modified octanoic acid (dimethyl-OA).
Tantazoles and mirabazoles
Tantazoles A, B, F and I and mirabazoles A–C (Fig. 9c) (Carmeli et al., 1990, 1991b) have been isolated from Scytonema mirabile. These peptides are composed nearly exclusively of (methylated) thiazole and oxazole units forming linear tetra- and pentapeptides, respectively. A similar compound, thiangazole, has been isolated from the myxobacterium Polyangium (Kunze et al., 1993).
More than half of the known peptides can be assigned to the major peptide classes described above, whereas the remaining peptides cannot be grouped in larger classes with many structural variants. For most of these peptide types, only a few congeners are known and these often have been isolated as minor compounds from the same strain or sample. It is beyond the scope of this review to present all known peptides in detail and several peptide types will be mentioned as examples with a focus on structural peculiarities.
Thiazole and oxazole moieties are reported for several types of cyanobacterial peptides with structural properties too specific to assume a homology to the cyclic thiazole/oxazole peptides mentioned above. Nonetheless, the thiazole formation may well be homologous in the corresponding NRPS enzymes. Lyngbyabellin B (Luesch et al., 2000a) is a cyclic hexapeptide containing two thiazoles and a modified octanoic acid (2-dimethyl,3-hydroxy,7-dichloro-OA). Aeruginosinamide (Fig. 10a) (Lawton et al., 1999b), a linear tetrapeptide, contains a C-terminal thiazole and an N-terminal leucine with a di-isoprenylated amino group. These features are similar to those of virenamide A, a peptide isolated from the ascidian Diplosoma (Carroll et al., 1996). The linear tetrapeptide barbamide (Fig. 10b) contains a C-terminal thiazole (Orjala & Gerwick, 1996) and a triply chlorinated fatty acid moiety probably derived from a tri-chloro leucine (Sitachitta et al., 2000a). Apramides A–G (Luesch et al., 2000b) are linear nonapeptides characterized by a C-terminal thiazole and a modified, N-terminal 7-octenoic or 7-octynoic acid unit. Wewakazole from Lyngbya is a cyclic undecapeptide with three (methyl-)oxazole moieties (Nogle et al., 2003).
A number of cyclic deca- and undecapeptides have been reported to possess a Dhb moiety and hydroxy amino acids. Examples are puwainaphycins A–E from Anabaena (Gregson et al., 1992), lipopeptides with a modified stearic or palmitic acid; laxaphycins A–E from Anabaena laxa (Frankmölle et al., 1992) (Fig. 11a) with hydroxy-amino acids; hormothamnin A from Hormothamnion (Gerwick et al., 1992), which has the same structure as laxaphycin A except for the configuration of Dhb; and lobocyclamides A–B (MacMillan et al., 2002), which differ from similar laxaphycins in two amino acids. Calophycin from Calothrix sp. (Moon et al., 1992) possesses a 2-hydroxy-3-amino-4-methylpalmitic acid (Hamp) similar to that in puwainaphycin E.
Further cyclic deca- and undecapeptides are, for example, kawaguchipeptins A (Fig. 11b) and B, undecapeptides from Microcystis that differ in two Trp moieties modified by prenyl-groups (Ishida et al., 1996, 1997b). Oscillatorin (Sano & Kaya, 1996) (Fig. 11c) is a cyclic decapeptide with an unusual amino acid, oscillatoric acid, which is a prenylated tryptophan derivative.
Some 50 peptides have been isolated that do not fit in any of the peptide classes or types described above. The smallest known cyanobacterial peptide is radiosumin, built of two acetylated amino acids derived from p-aminophenylalanine (Fig. 12a) (Matsuda et al., 1996b). A particular type of peptides is the aeruginoguanidines, tripeptides built of two arginine moieties and a tyrosine-like amine. The tyrosine-like amine is triply sulphated whereas the arginine moieties are modified by prenyl or geranyl groups (Ishida et al., 2002). Kasumigamide (Fig. 12b), a linear pentapeptide, is built entirely from non-proteinogenic amino acids like β-alanine or the tryptophan derivative Ahipa (4-amino-3-hydroxy-5-indolylpentanoic acid; Ishida & Murakami, 2000).
In a number of cyclic peptides, it is not possible to find any particularly characteristic feature. When several congeners are described, they have often been isolated from a single sample. The smallest peptides of this incoherent group are antanapeptins (A–D, Fig. 12c) isolated from Lyngbya with a characteristic 3-hydroxy-2-methyl-octynoic acid and a 2-hydroxyisovaleric acid unit (Nogle & Gerwick, 2002). The largest mono-cyclic peptide is malevamide C (Fig. 12d) (Horgen et al., 2000) with 14 (amino) acid units. In malevamide C, a modified octanoic acid is also present, 3-amino-2-methyl-7-octynoic acid, which was also found in pitipeptolides A and B, cyclic heptapeptides with dihydroxy octynoic and octenoic acid, respectively (Luesch et al., 2001). A hydroxy-dimethyl octynoic acid is also found in yanucamides A and B together with a hydroxy-isovaleric acid (Sitachitta et al., 2000b), peptides isolated from a Lyngbya/Schizothrix assemblage that resemble kulolides, peptides isolated from a nudibranch gastropod (Reese et al., 1996).
Distribution and function of peptides in cyanobacteria
Regarding the structural diversity of cyanobacterial peptides the question arises as to how common particular peptides or peptide types are with respect to their taxonomic and geographic distributions. When this question is asked, it always has to be kept in mind that the biosynthesis of non-ribosomal peptides requires a significant part of the cell's energy and nutrient resources. As a rule of thumb, any single amino acid incorporated in a non-ribosomal peptide requires genetic information of about 4–5 kbp. For highly modified amino acids or building blocks that are synthesized by PKS-systems, this number can be substantially higher. The share of peptide synthetase enzymes in the cellular protein pool is unknown, but it can be assumed that the translation of the enzymes has significant costs for the cell.
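The rule of thumb translates into a simple lower-bound estimate of the coding capacity needed for a given peptide. The sketch below only restates that arithmetic. The function name and constant are assumptions introduced for illustration, and the result ignores the extra capacity needed for highly modified or PKS-derived building blocks.

```python
# Rule of thumb quoted in the text: about 4-5 kbp of genetic information
# per amino acid incorporated into a non-ribosomal peptide.
KBP_PER_RESIDUE = (4, 5)


def coding_size_kbp(n_residues):
    """Return the (low, high) estimate in kbp for a peptide of n_residues."""
    low, high = KBP_PER_RESIDUE
    return n_residues * low, n_residues * high


# A ring with seven positions, as numbered for microcystins.
estimate = coding_size_kbp(7)
print(estimate)  # (28, 35)
```

On this estimate, a ring with seven positions already requires roughly 28–35 kbp, before any tailoring enzymes or PKS modules are counted.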
Further, only few data are available on the actual taxonomic and geographic distribution of individual peptides or peptide classes. Therefore, the data reviewed below should be considered as a first insight.
Taxonomic distribution of peptides
For most individual peptides, the distribution among cyanobacterial taxa is basically unknown and the only existing reference is the taxon from which the respective peptide has been isolated for the first time. However, summarizing the data available from the original publications already indicates that oligopeptides that are synthesized by NRPS can be found in genera from all sections of cyanobacteria (Christiansen et al., 2001). As can be expected for true secondary metabolites, these biosynthetic activities have a patchy distribution.
Soon after the structure elucidation of microcystins that made detection and quantification possible in many laboratories, it became evident that microcystins can be found worldwide and are produced by a broad variety of cyanobacterial genera. Today, it is of no surprise when microcystins are detected in field samples containing Microcystis, Planktothrix, Anabaena, or Nostoc, independent of the geographical origin of the samples (Chorus & Bartram, 1999). Higher microcystin concentrations most often are associated with higher biomass of toxigenic taxa and thus are more likely found in eutrophic than in clear lakes (Svrcek & Smith, 2004). A number of studies have dealt with the cellular microcystin content in cyanobacterial strains under various growth conditions (Sivonen, 1990; Utkilen & Gjolme, 1992; Rapala et al., 1997; Orr & Jones, 1998; Oh et al., 2000; Wiedner et al., 2003), with the presence of mcy-genes (Tillett et al., 2001; Bittencourt-Oliveira, 2003; Hisbergues et al., 2003; Via-Ordorika et al., 2004; Mbedi et al., 2005), or with both factors (Mikalsen et al., 2003). An important conclusion of these studies is that mcy-genes are present nearly exclusively in those strains in which microcystins can actually be detected. Exceptions to this general rule are rarely found in Microcystis (Kaebernick et al., 2001) but seem to be more common in Planktothrix rubescens, where natural mutants can make up to 10% of a population (Kurmayer et al., 2004). Globally, microcystins are constitutively present in mcy+-strains and not only when the synthesis is triggered by distinct environmental signals, such as can be observed for most other microbial NRPS systems (Du & Shen, 2001). As the structural variants actually synthesized as well as the cellular concentrations are more or less constant, independent of growth conditions, it is evident that there is a genetic rather than a physiological control of peptide production (Mikalsen et al., 2003). 
Moreover, the presence or lack of mcy-genes does not correspond to any phylogeny based on housekeeping genes (Neilan et al., 1995, 1997). Discussing the organization and sequences of the mcy-gene cluster in different taxa and the distribution within potentially toxigenic genera, Rantala et al. (2004) concluded that the mcy-gene cluster is a very ancient unit dating back to the common ancestor of modern Anabaena, Microcystis, Nostoc and Planktothrix. Within these genera, the distribution of mcy-genes was interpreted as the result of repeated and independent losses. Remarkably, in genera closely related to the ones mentioned above, no microcystins have been found yet, e.g. in Aphanizomenon, Synechocystis, or Limnothrix.
A similar distribution of biosynthesis gene clusters and respective peptides among cyanobacterial taxa and strains can reasonably be assumed for at least the main classes and types of cyanobacterial peptides. The data available are only fragmentary at present but they fit well into this picture. Anabaenopeptins, for example, are produced by strains of the genera Microcystis, Planktothrix and Aphanizomenon belonging to sections I, III and IV, respectively, but not by all strains of these genera (Fastner et al., 2001; Welker et al., 2003, 2004a, b). Interestingly, exactly the same structural variants (for example anabaenopeptin B) can be found in distantly related taxa. The cellular concentration in a producing strain, Anabaena 90, showed only moderate response to varying growth factors, in a range comparable to that found for microcystins (Repka et al., 2004).
Another peptide class with a very wide distribution is that of the cyanopeptolins (Table 2), which have been isolated from various environments and from diverse taxa (Anabaena, Microcystis, Planktothrix, Scytonema, Symploca, Nostoc, Lyngbya, Oscillatoria). Most of the structural variants originate from Microcystis and Planktothrix but this most likely does not reflect the global distribution of cyanopeptolins among cyanobacteria. However, considering the data available, the highest diversity of cyanopeptolins occurs in planktonic freshwater taxa. As with microcystins, very closely related strains with and without cyanopeptolin synthetases exist, e.g. in Microcystis (N. Tandeau de Marsac, personal communication; own unpublished data).
Other peptides such as microviridins, aeruginosins and microginins have been reported from a similar discontinuous array of cyanobacterial genera.
Figure 13 summarizes data on the distribution of major classes of cyanopeptides in sections, genera of section I, and strains of Microcystis. In sections I, III and IV, all classes of peptides can be found while for sections II (Pleurocapsales) and V (Stigonematales) little or no information is available, mainly due to the low number of available strains. Within section I, only the genus Microcystis has been found to produce oligopeptides so far. This is in accordance with Christiansen et al. (2001) and the data available from cyanobacterial genomes (see above). In individual strains of Microcystis, oligopeptides can be found in various combinations, already giving an impression of the chemotype diversity within a single genus. When individual peptides are considered rather than peptide classes, the number of possible chemotypes seems endless and indeed, when clones were analyzed as single colonies or filaments, or as cultured isolates, the number of peptide chemotypes by far exceeds the number of morphotypes (Fastner et al., 2001; Welker et al., 2004a, b).
Also in Microcystis, some strains do not produce any peptides [by high-performance liquid chromatography analysis supported by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MS) or liquid chromatography-MS/MS], whereas in others, peptides of up to four classes can be found (Czarnecki et al., 2006).
Among marine cyanobacteria in the genera Lyngbya and Symploca, a similar chemotype diversity is to be expected. Though respective studies on chemotype diversity are only in a beginning phase (Guyot et al., 2004; Thacker & Paul, 2004), the sheer number of peptides (and other metabolites) isolated from Lyngbya underlines the potential metabolic diversity in this genus (Shimizu, 2003).
The taxonomic distribution of oligopeptides in cyanobacteria, as assessed by chemical analyses, might not give a complete picture, partly because the number of known and characterized peptides is still low compared to the expected total structural diversity. Nonetheless, the data indicate that the production of oligopeptides is concentrated in certain genera among taxonomic sections. Within these genera, a multitude of peptide chemotypes exist where we assume that the chemotype directly reflects the genotype with respect to NRPS/PKS gene clusters.
Geographic distribution of peptides and peptide classes
As with the taxonomic distribution, only few data are available on the geographic distribution of cyanobacterial peptides, with the exception of microcystins. For microcystins, the available data indicate that the biosynthesis of these peptides is not restricted to any climatic zone or other geographic range. Microcystin-producing cyanobacteria can be found in tropical waters (Cuvin-Aralar et al., 2002) as well as in Antarctic samples (Hitzfeld et al., 2000), in large lakes (Brittain et al., 2000) as well as in shallow ponds (Welker et al., 2005), and at high altitude (Mez et al., 1997) as well as in coastal waters (Henriksen, 2001).
For other classes and types of peptides, the few available data indicate a similar global distribution. Certain peptides that have been isolated from a particular cyanobacterial taxon can be potentially found in samples and strains of this taxon, independently of the geographical origin. For example, peptides isolated from Japanese samples or strains, like kasumigamide or anabaenopeptins, can be found in Microcystis samples from several locations in Europe and Canada (Williams et al., 1996; Fastner et al., 2001; Barco et al., 2004; Welker et al., 2004a; own unpublished data). This does not exclude the possibility that other peptides might be restricted within certain geographic limits.
Co-production of peptides in individual strains
A major aspect of cyanobacterial peptides is the production of multiple peptide classes and congeners by individual strains. Combinations of individual peptides can be considered peptide fingerprints typical of individual clonal strains, allowing the distinction of morphologically indistinguishable strains as chemotypes (Fastner et al., 2001; Welker et al., 2004a, b). Congeners of a peptide class produced by one strain vary in a few amino acid positions, whereas other positions are conserved (Martin et al., 1993; Czarnecki et al., 2006). This indicates that certain positions apparently need to be preserved to retain bioactivity, and that natural selection likely prevents the persistence of non-active peptide variants. In addition to variable amino acids (and other organic acids) in peptides produced by an individual strain, further structural diversity can arise from modifications, like methylation (Harada et al., 1991) or halogenation (Murakami et al., 1995; Ishida et al., 1998b; Rouhiainen et al., 2000).
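The idea of a peptide fingerprint can be sketched as grouping strains by the exact set of peptides detected in them. The strain labels and detection data below are hypothetical and serve only to illustrate the grouping; they are not taken from any of the cited studies.

```python
from collections import defaultdict

# Hypothetical detection data: strain label -> set of peptides detected.
detected = {
    "strain_1": {"microcystin-LR", "anabaenopeptin B"},
    "strain_2": {"anabaenopeptin B", "microcystin-LR"},
    "strain_3": {"microcystin-LR"},
    "strain_4": set(),  # no peptides detected
}


def group_by_chemotype(detected):
    """Group strains that share exactly the same peptide fingerprint."""
    groups = defaultdict(list)
    for strain, peptides in detected.items():
        # A frozenset is hashable and ignores the order of detection.
        groups[frozenset(peptides)].append(strain)
    return dict(groups)


chemotypes = group_by_chemotype(detected)
for fingerprint, strains in chemotypes.items():
    print(sorted(fingerprint), strains)
```

In this toy data set, four strains that might share one morphotype fall into three chemotypes, one of which produces no detectable peptides.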
The positional control of amino acid specificity is well known from various examples of peptides found in other bacterial phyla or the kingdom Fungi (Kleinkauf & von Döhren, 1997). A positional control apparently acts on different levels of fidelity for a specific position. Thus, for example, an l-Leu residue in a specific position of a peptide might be exchanged with similar residues (such as Ile and Val), or with unrelated amino acids (such as Tyr and/or Arg) as in the case of microcystins. This is a remarkable property of cyanobacterial peptide synthetases because, in various heterotrophic bacteria and fungi, only exchanges by similar amino acids have been documented (e.g. the aromatic amino acids Tyr, Phe and Trp, or the aliphatic branched amino acids Leu and Ile; Kleinkauf & von Döhren, 1997). It has to be underlined that available data clearly indicate that there is only a single enzyme system in each strain producing a set of congeners. Besides the recognition and activation of specific amino acids by adenylation domains, the processing of the aminoacyl intermediates by condensation domains is also important for the selective incorporation of specific amino acids during chain elongation. The corresponding mechanisms, however, have not been intensively studied. A study by Belshaw et al. (1999) showed that the incorporation of the ‘wrong’ amino acid slowed down further peptide synthesis to a point where the production of corresponding variants became extremely unlikely. The production of congeners is thus related to the level of control in activation and processing reactions of each step in a biosynthetic pathway.
Amino acid modifications catalyzed by N-methyl transferases or halogenases (see above) are reactions that are not absolutely required for the biosynthetic process. Therefore, non-methylated or non-halogenated analogues are frequently observed. However, as has been shown in fungal systems, the rate of processing of non-methylated intermediates can be substantially reduced, so levels of respective analogues are quite variable (Billich & Zocher, 1990).
The number of congeners of a single peptide class that can be found in an individual strain can reach more than 10, although often a few congeners are dominant. This is well known for microcystins. Most toxigenic strains of Microcystis, for example, produce the variants Mcyst-LR, -RR and -YR, whereas other strains either have single variants or other combinations of variants (Lawton et al., 1999a; Rohrlack et al., 2001). In addition, the respective unmethylated variants (e.g. [Dha7]Mcyst-RR) may be present. Multiple microcystin variants have also been detected in Nostoc sp. (six variants in strain IO-102-l; Oksanen et al., 2004) or in P. agardhii (12 variants in strain Max 06; Welker et al., 2004b). Similar structural diversity is also observed for other peptide classes. At least 13 cyanopeptolins are produced by Microcystis HUB08B03 (Czarnecki et al., 2006), varying in three positions with a maximum of three different building blocks each. In many strains of Microcystis and Planktothrix, the same structural variants of anabaenopeptins are produced, namely anabaenopeptins A, B, F and oscillamide Y (own unpublished data).
In most cyanobacterial strains that produce oligopeptides, there is more than one class of peptides. Peptides of two or three classes can frequently be found in Microcystis, Planktothrix, Anabaena or Nostoc strains (Harada et al., 1993, 1995; Fujii et al., 1997b; Kodani et al., 1998). The number of peptide classes produced by an individual strain multiplied by the number of congeners actually produced results in a number of individual peptide structures of several dozens. In natural populations, the co-existence of distinct chemotypes can dramatically increase the number of peptides that can be found in a bloom sample.
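The counting argument can be made explicit in a short sketch. The helper names are assumptions introduced here, and the three-classes-by-ten-congeners case is a hypothetical strain chosen only to show how 'several dozens' arises. The figures of three variable positions, three building blocks each and 13 observed congeners are those quoted above for the cyanopeptolins of Microcystis HUB08B03.

```python
from math import prod


def max_congeners(blocks_per_position):
    """Upper bound on congeners if every variable position varies independently."""
    return prod(blocks_per_position)


def structures_per_strain(congeners_per_class):
    """Total individual peptide structures across all classes of one strain."""
    return sum(congeners_per_class)


# Three variable positions with at most three building blocks each.
upper_bound = max_congeners([3, 3, 3])
observed = 13  # cyanopeptolins reported for Microcystis HUB08B03
print(upper_bound, observed)  # 27 13

# Hypothetical strain: three peptide classes with ten congeners each.
total = structures_per_strain([10, 10, 10])
print(total)  # 30
```

Summing per class gives the same result as multiplying classes by congeners when every class has the same number of congeners, and also covers the more realistic case of unequal numbers.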
Functions of cyanobacterial peptides
From the distribution of particular peptides among clones of Microcystis, for example, it is evident that none of the peptides (peptide classes) is required by non-producing clones either for growth in the laboratory or to be competitive in natural environments. On the other hand, strains that grow under laboratory conditions for decades do not lose the ability to produce the peptides typical for that strain. Microcystis PCC7806, for example, was isolated in 1973 from a Dutch lake, deposited in the Pasteur Culture Collection as an axenic strain in 1978 and has since been the object of many studies on peptide production (Martin et al., 1993; Dittmann et al., 1997; Rohrlack et al., 1999a; Kaebernick et al., 2000; Tillett et al., 2000; Wiedner et al., 2003; Pearson et al., 2004). Under all culture conditions and in all laboratories, the strain produced the same peptides (microcystins and cyanopeptolins) except, of course, for mcy-knockout mutants. This contrasts with the production of gas vesicles, for which spontaneous mutants that lack the ability to produce them frequently occur (Mlouka et al., 2004). Paradoxically, the production of peptides in an individual strain seems to be selectively stabilized, even in axenic cultures, whereas other strains that completely lack the corresponding biosynthetic genes or the respective mutants do not seem to exhibit any severe disadvantage (Hesse et al., 2001; Kaebernick et al., 2001). Natural populations and communities are mixtures of producers and non-producers with respect to particular peptides or peptide classes (Fastner et al., 2001; Rohrlack et al., 2001; Welker et al., 2004a, b).
Several hypotheses on the function of peptides in the physiology and ecology of cyanobacteria have been discussed, mostly related either to grazing protection or to allelopathy.
The bioactivities exhibited by many cyanobacterial peptides towards mammalian (or vertebrate) test systems are often similar to effects observed in invertebrate animals that might be potential consumers of cyanobacteria. The toxicity of microcystins to mammals is caused in principle by the inhibition of protein phosphatases 1 and 2a, which are important enzymes in the intracellular regulatory mechanisms (Honkanen et al., 1994; Dawson, 1998). A similar inhibition has been demonstrated for protein phosphatases of Daphnia, the most important grazer in pelagic freshwater systems (DeMott & Dhawale, 1995). In due course, it has been shown that an intoxication of Daphnia upon ingestion of cyanobacterial cells is largely dependent on the microcystin content of the cells (Rohrlack et al., 1999a, b). On the other hand, no clear evidence has been produced that Daphnia intoxication plays a major role in plankton dynamics, whereas grazing resistance due to colony or filament formation has been recognized as important grazing protection (Hansson et al., 1998; DeMott et al., 2001; Kagami et al., 2002). Other cyanobacterial peptides have been reported to inhibit protein phosphatases (Sano et al., 2001), too, but respective IC50-values were at least an order of magnitude higher compared to that of microcystins (Honkanen et al., 1994).
Many cyanobacterial peptides have been studied for potential pharmaceutical applications and, in many cases, protease inhibitory activity has been found. Protease inhibition is known as an (inducible) grazing protection in terrestrial plants (Pena-Cortés et al., 1995; Bowles, 1998). For a number of cyanopeptolin-type peptides, inhibitory activity against serine/threonine proteases has been reported. In one case, hofmannolin (a cyanopeptolin), the interaction with elastase has been demonstrated by co-crystallization and X-ray spectroscopy, which revealed the importance of the Ahp-moiety for the inhibitory activity (Matern et al., 2003b). However, cyanopeptolins are not the only protease inhibitors among cyanobacterial peptides, and inhibitory activity has been reported for aeruginosins and microviridins (Shin et al., 1995, 1997; Ishida et al., 1999). For microviridin J, it has been shown that this peptide inhibits the molting of Daphnia and thus might reduce the grazing pressure of a population efficiently without directly killing the grazers (Rohrlack et al., 2004). Several studies demonstrated the effective inhibition of Daphnia proteases by cyanobacterial peptides (Agrawal et al., 2001, 2005; Rohrlack et al., 2003; von Elert et al., 2004; Czarnecki et al., 2006). From marine cyanobacteria, other feeding deterrents have been isolated that are highly modified peptides, like ypaoamide (Nagle & Paul, 1998).
Inhibition of photosynthetic activity by microcystins has been observed (Pflugmacher, 2002), indicating an allelopathic nature of microcystins. However, effective concentrations were generally higher than those expected in natural waters (LeBlanc et al., 2005). Indeed, microcystins are released to the surrounding water only in small amounts by vital cells (Welker et al., 2001). Other peptides have been tested for their allelopathic capacity, e.g. kasumigamide (Ishida & Murakami, 2000), but effective concentrations were also rather high compared to concentrations that can be assumed under field conditions. In freshwater systems, the most important adverse effect of cyanobacterial blooms on macrophytes and eukaryotic phytoplankton might be the reduction of light by shading (Casanova et al., 1999).
In terrestrial or benthic cyanobacteria, allelopathic metabolites could be more important as, in their respective habitats, the diffusion driven dilution is much slower. Such compounds have been isolated but they have mostly non-peptidic structures (Klein et al., 1995; Hagmann & Jüttner, 1996). Antifungal and anti-algal peptides have been isolated from terrestrial strains, but it remains obscure whether these peptides act as allelochemicals in situ (Todorova et al., 1995; Neuhof et al., 2005).
A further hypothesis relates cyanobacterial peptides to bacterial quorum sensing mechanisms (Kaebernick et al., 2000).
At present, only a few experimental studies have been published to resolve the ecological role of cyanobacterial oligopeptides and most of these studies were related to microcystins simply because these peptides were best studied and it is easier to obtain funding due to their problematic role in drinking water hygiene. For an understanding of the ecological/evolutionary role of cyanobacterial oligopeptides, two general observations are probably essential: firstly, the biosynthetic pathway of NRPS/PKS is a very ancient part of the (cyano)bacterial metabolism, i.e. cyanobacteria have synthesized oligopeptides long before higher plants or animals existed. Secondly, natural selection did not minimize the pool of peptide structures to a few very efficient ones but, on the contrary, obviously favoured the production of a vast array of individual structures.
Although a wealth of data on cyanobacterial peptides and the respective biosynthetic pathways has become available in the last decade, it is likely that we still know very little about these metabolites. Even less is known about their ecological or physiological functions, which are obscure at present or have only been hypothesized for a few peptide types. The homology of various NRPS genes and suspected intergenic recombination events suggest a similar function of various peptides, or at least a tight co-evolution.
New congeners of known peptide classes as well as entirely new peptides remain to be discovered together with their respective biosynthesis genes. Structural and chemical data will improve our understanding of the biosynthetic potential of cyanobacteria and the distribution of peptide types at the taxonomic and geographic levels. Genetic data will provide insights into the evolution of gene clusters responsible for the production of myriads of peptides that are often variations on a single theme, and will help to unveil the mechanisms of Nature's own combinatorial biosynthesis.
The collation of peptide, sequence and biochemical data and references was supported by Marcel Erhard (AnagnosTec GmbH, Germany), Jutta Fastner (Umweltbundesamt) and Karina Hesse (TU Berlin), and we are very thankful for this support. The work on this review was largely made possible through funding by the EU-project ‘Bioactive Peptides from Cyanobacteria’ (PEPCY). Comments on earlier drafts of the manuscript by Elke Dittmann and Annick Wilmotte were highly appreciated as were the comments made by two anonymous reviewers.