Ecology and genetic diversity of marine cyanobacteria
The question of whether cyanobacteria, the oldest oxygenic phototrophs on Earth, arose first in marine or freshwater ecosystems has long been debated. A recent analysis combining paleobiological data and phylogenetic comparisons of a wide range of cyanobacteria favoured the second hypothesis. Indeed, Blank & Sanchez-Baracaldo (2010) suggested that ancestral cyanobacteria, which were presumably unicellular and exhibited small cell sizes (<2.5 μm), appeared about 2.7 Gyr ago in freshwater and/or endolithic environments, while the colonization of coastal, brackish and marine environments occurred only about 2.4 Gyr ago (Fig. 1). This diversification of habitats was, however, a critical event for the evolution of life on Earth, as it is thought to have triggered the sudden rise in atmospheric oxygen that occurred 2.3 Gyr ago (Bekker et al. 2004). Nonetheless, the observation that marine groups i) do not form a monophyletic clade but are interspersed with freshwater species within the cyanobacterial radiation (Fig. 1) and ii) display a wide variety of morphologies, ecologies and habitats (Blank & Sanchez-Baracaldo 2010; Larsson et al. 2011) provides compelling evidence that several independent colonization events of saline waters by different cyanobacterial lineages have occurred during evolution.
Figure 1. Ancestral state reconstruction-relaxed molecular clock chronogram using a 16S rRNA-RpoC tree, showing that early cyanobacteria were nonmarine. Branch lengths, calculated by penalized maximum likelihood using nucleotide sequence alignments of SSU and rpoC, are proportional to age (the age scale in Gyr, at right-hand side, shows geologic eras). The approximate date of global oxygen rise (GOR) is denoted by a dotted line. Reconstructed ancestral states using maximum parsimony are indicated by circles at nodes, the colour of which corresponds to habitat (as specified in the insert at bottom left). SPM, clade containing Synechocystis, Pleurocapsa and Microcystis; PNT, clade containing Pseudanabaena, Nostocales, Trichodesmium; LPT, clade containing Leptolyngbya, Plectonema, Phormidium and Synechococcus sp. PCC7335; SynPro, clade containing Synechococcus, Prochlorococcus and Cyanobium. Sequenced genomes are indicated by stars. Redrawn and slightly modified from Fig. 5C in Blank & Sanchez-Baracaldo (2010), with permission from authors and publisher.
The taxonomic diversity of free-living, planktonic cyanobacteria in present-day marine waters is surprisingly low, with only four major genera known so far: the N2-fixers Trichodesmium and Crocosphaera and the nondiazotrophs Prochlorococcus and Synechococcus. Cultivated representatives of each of these genera have been sequenced (Table 1). It is also worth noting that the less ubiquitous planktonic cyanobacterium Nodularia spumigena, another N2-fixing species that forms toxic surface blooms in brackish coastal waters such as the Baltic Sea (Sivonen et al. 1989), has recently been sequenced, but its genome is yet to be described.
Table 1. Characteristics of sequenced marine photosynthetic organisms
|Lineage||Species||Strain (a.k.a.)||Genome size (Mbp)||G+C%||Sequencing centre||Genome status||GenBank accession no.||References|
|Cyanobacteria|| Acaryochloris marina ||MBIC11017 (AM1)||8.36||47.0||TGen||Complete|| CP000828 ||Swingley et al. (2008)|
|Acaryochloris sp.||CCMEE 5410||7.88||47.0||JCVI||WGS|| AFEJ01000511 ||Miller et al. (2011)|
|Acaryochloris sp.||HICR111A||8.37||47.0||UF GSC||WGS||JN585763 (partial)||Mohr et al. (2010a) Pfreundt et al. 2012|
| Calothrix rhizosoleniae ||SC01||11.50||44.0||JCVI||WGS||N/A||Unpublished|
| Crocosphaera watsonii ||WH0003||5.89||37.7||UCSC GSC||WGS|| AESD00000000 ||Bench et al. (2011)|
| Crocosphaera watsonii ||WH8501||6.24||37.1||JGI||WGS|| AADV00000000 ||Bench et al. (2011)|
|UCYN-A||Flow sorted cells||1.44||31.0||454 Life Sciences||Complete|| CP001842 ||Tripp et al. (2010)|
|Cyanobium sp.||PCC 7001||2.83||68.7||JCVI||WGS||ABSE00000000||Unpublished|
|Cyanothece sp.||ATCC 51142||5.46||37.9||WUSL GSC||Complete|| CP000806 ||Welsh et al. (2008)|
|Cyanothece sp.||ATCC 51472||5.40||37.9||JGI||WGS|| AGJC01000000 ||Unpublished|
|Cyanothece sp.||CCY0110||5.88||36.7||JCVI||WGS|| AAXW00000000 ||Unpublished|
|Leptolyngbya sp.||PCC 7375||8.903.93||47.8||JGI||Complete||N/A||Unpublished|
| Lyngbya majuscula ||3L||8.5||44.0||UCSD GSC||WGS|| AEPQ00000000 ||Jones et al. (2011)|
|Lyngbya sp.||PCC 8106 (CCY9616)||7.04||41.1||JCVI||WGS|| AAVU00000000 ||Unpublished|
| Microcoleus chthonoplastes ||PCC 7420||8.65||45.4||JCVI||WGS|| ABRS00000000 ||Unpublished|
| Nodularia spumigena ||CCY9414||5.32||41.3||JCVI||WGS|| AAVW00000000 ||Unpublished|
| Prochlorococcus marinus ||AS9601||1.67||31.3||JCVI||Complete|| CP000551 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||MIT 9202||1.69||31.1||JGI||WGS|| ACDW00000000 ||Unpublished|
| Prochlorococcus marinus ||MIT 9211||1.70||38.0||JCVI||Complete|| CP000878 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||MIT 9215||1.74||31.1||JGI||Complete|| CP000825 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||MIT 9301||1.64||31.3||JCVI||Complete|| CP000576 ||Kettler et al. (2007)|
|Prochlorococcus sp.||MIT 9303||2.70||50.0||JCVI||Complete|| CP000554 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||MIT 9312||1.71||31.2||JGI||Complete|| CP000111 ||Coleman et al. (2006) Kettler et al. (2007)|
|Prochlorococcus sp.||MIT 9313||2.40||50.7||JGI||Complete|| BX548175 ||Rocap et al. (2003) Kettler et al. (2007)|
| Prochlorococcus marinus ||MIT 9515||1.70||30.8||JCVI||Complete|| CP000552 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||NATL1A||1.86||35.0||JGI||Complete|| CP000553 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||NATL2A||1.84||35.1||JGI||Complete|| CP000095 ||Kettler et al. (2007)|
| Prochlorococcus marinus ||SS120 (CCMP1375)||1.75||36.4||JGI||Complete|| AE017126 ||Dufresne et al. (2003) Kettler et al. (2007)|
| Prochlorococcus marinus ||MED4 (CCMP1986)||1.66||30.8||JGI||Complete|| BX548174 ||Rocap et al. (2003) Kettler et al. (2007)|
| Prochlorococcus marinus ||UH18301||1.65||31.0||JCVI||WGS||N/A||Unpublished|
| Prochloron didemni a ||Cell sample P1: Palau||6.33||42.0||TIGR and IGS||WGS|| AGRF00000000 ||Donia et al. (2011a,b)|
| Prochloron didemni a ||Cell sample P2: Fiji||7.55||41.5||IGS||WGS|| AFSJ00000000 ||Donia et al. (2011a,b)|
| Prochloron didemni a ||Cell sample P3: Solomon||5.89||41.9||IGS||WGS|| AFSK00000000 ||Donia et al. (2011a,b)|
| Prochloron didemni a ||Cell sample P4: Papua New Guinea||5.69||41.9||IGS||WGS|| AGGA00000000 ||Donia et al. (2011a)|
|Synechococcus sp.||BL107||2.28||54.3||JCVI||WGS|| AATZ00000000 ||Dufresne et al. (2008)|
|Synechococcus sp.||CB0101||2.67||64.1||JCVI||WGS|| ADXL00000000 ||Unpublished|
|Synechococcus sp.||CB0205||2.43||62.3||JCVI||WGS|| ADXM00000000 ||Unpublished|
|Synechococcus sp.||CC9311||2.61||52.4||JGI||Complete|| CP000435 ||Palenik et al. (2006) Dufresne et al. (2008)|
|Synechococcus sp.||CC9605||2.51||59.2||JGI||Complete|| CP000110 ||Dufresne et al. (2008)|
|Synechococcus sp.||CC9902||2.23||54.2||JGI||Complete|| CP000097 ||Dufresne et al. (2008)|
|Synechococcus sp.||PCC 7002||3.40||49.2||PSU GSC||WGS|| CP000951 ||Unpublished|
|Synechococcus sp.||PCC 7335||5.96||48.2|| ||WGS|| ABRV00000000 ||Unpublished|
|Synechococcus sp.||RCC307||2.22||60.8||Genoscope||Complete|| CT978603 ||Dufresne et al. (2008)|
|Synechococcus sp.||RS9916||2.66||59.8||JCVI||WGS|| AAUA00000000 ||Dufresne et al. (2008)|
|Synechococcus sp.||RS9917||2.58||64.5||JCVI||WGS|| AANP00000000 ||Dufresne et al. (2008)|
|Synechococcus sp.||WH5701||3.04||65.4||JCVI||WGS|| AANO00000000 ||Dufresne et al. (2008)|
|Synechococcus sp.||WH7803||2.37||60.2||Genoscope||Complete|| CT971583 ||Dufresne et al. (2008)|
|Synechococcus sp.||WH7805||2.62||57.6||JCVI||WGS|| AAOK00000000 ||Dufresne et al. (2008)|
|Synechococcus sp.||WH8016||2.69||54.1||JGI||WGS||N/A||Unpublished|
|Synechococcus sp.||WH8102||2.43||59.4||JGI||Complete|| BX548020 ||Palenik et al. (2003) Dufresne et al. (2008)|
|Synechococcus sp.||WH8109||2.12||60.1||JCVI||WGS|| ACNY00000000 ||Unpublished|
| Trichodesmium erythraeum ||IMS101||7.75||34.1||JGI||Complete|| CP000393 ||Unpublished|
|Prasinophytes|| Ostreococcus lucimarinus ||CCE9901||13.2||60||JGI||WGS|| CP000581_CP000601 ||Palenik et al. 2007|
| Ostreococcus tauri ||OTH95||12.6||59||Montpellier France||WGS|| CR954201_CR954220 ||Derelle et al. 2006|
|Ostreococcus sp.||RCC809||12|| ||JGI|| || ||Unpublished|
|Micromonas sp.||RCC299||20.90||64||JGI||WGS|| ACCO00000000 ||Worden et al. 2009|
| M. pusilla ||CCMP 1545||21.90||65||JGI||WGS|| ACCP00000000 ||Worden et al. 2009|
| Bathycoccus prasinos ||BBAN7||18|| ||Genoscope||WGS|| ||Unpublished|
|Stramenopiles|| Phaeodactylum tricornutum ||CCAP1055/1||27||53.7||JGI||WGS|| ABQD01000000 ||Bowler et al. 2008|
| Thalassiosira pseudonana ||CCMP1335||32||47||JGI||WGS|| AAFD02000000 ||Armbrust et al. 2004|
| Fragilariopsis cylindrus ||CCMP1102||81|| ||JGI||WGS|| || http://genome.jgi-psf.org/Fracy1/Fracy1.home.html |
| Pseudo-nitzschia multiseries ||CLN-47||250|| ||JGI||Ongoing|| || http://www.jgi.doe.gov/genome-projects/ |
| Ectocarpus siliculosus ||Ec32||214||53.6||Genoscope||WGS||CABU01000001–CABU01013533, FN647682–FN649242, FN649726–FN649760||Cock et al. (2010a,b) |
| Aureococcus anophagefferens ||CCMP1984||56.70|| ||JGI||WGS|| ACJI00000000 || |
| Nannochloropsis gaditana ||CCMP526||29|| || ||WGS|| AGNI00000000 ||Radakovits et al. 2012|
|Rhodophytes|| Chondrus crispus || ||105|| ||Genoscope||WGS|| ||Unpublished|
| Porphyra umbilicalis || ||300–400|| ||JGI||Ongoing|| || http://www.jgi.doe.gov/genome-projects/ |
|Haptophytes|| Emiliania huxleyi ||CCMP1516||168|| ||JGI||WGS|| ||Unpublished|
Trichodesmium can form large colonies (typically 1–5 mm in diameter) composed of tens to hundreds of aggregated filamentous cells (5–20 μm in length; Capone et al. 1997; Post et al. 2002). This genus is also known to form dense, widespread blooms and remote sensing observations have shown that these are most frequent in the northern Arabian Sea, the western Indian Ocean and the southeastern Pacific, while such blooms occur less than 5–10% of the year in other tropical and subtropical waters (Westberry & Siegel 2006). Moreover, molecular studies have demonstrated that Trichodesmium is the dominant N2-fixing cyanobacterium in several areas such as the Atlantic ocean (Langlois et al. 2008; Goebel et al. 2010) and the South China Sea (Moisander et al. 2008), but may be outnumbered by other diazotrophs in the North and South Pacific ocean gyres (Church et al. 2005; Halm et al. 2012). Phylogenetically, this genus encompasses at least 4 distinct clusters, one of them comprising the noncolonial filamentous taxon Katagnymene, which was erroneously classified as a different genus based on a number of phenotypic differences with Trichodesmium (Lundgren et al. 2005; Hynes et al. 2012). Only one strain of this group, T. erythraeum IMS101, has been sequenced thus far (Table 1), but no formal description of this genome is available to date.
Crocosphaera can also form small colonies, with individual cells ranging from 2 to 8 μm in size. It is usually found in warm (>27°C) oligotrophic subsurface waters (Mazard et al. 2004; Campbell et al. 2005) and its abundance, as estimated by the number of taxon-specific nifH gene copies, is generally low (e.g. 61–460 cells/L in the tropical Atlantic; Goebel et al. 2010), although a record concentration of 8 × 10⁶ cells/L was reported in the South Pacific Ocean (Moisander et al. 2010). Interestingly, phylogenetic studies of natural populations and strains of Crocosphaera isolated from diverse areas revealed a very low level of genetic divergence, despite significant variability at the phenotypic level (Zehr et al. 2007; Webb et al. 2009), and this taxon therefore seemingly encompasses a single species ocean-wide, C. watsonii. Two strains (WH8501 and WH0003) with distinct cell sizes, growth temperature ranges and N2 fixation rates, isolated respectively from the South Atlantic and the North Pacific Oceans, have been sequenced thus far (Table 1), and comparison of their genomes confirmed a remarkably high genome-wide similarity at the nucleotide level (over 80% of each genome was >98% identical to the other strain), despite a large number of genome rearrangements, insertions or deletions (Bench et al. 2011).
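A statement such as "over 80% of each genome was >98% identical" combines two numbers: a per-region identity and the fraction of regions exceeding a threshold. The following Python sketch illustrates that arithmetic on toy data only. It is not the method used by Bench et al. (2011); the function names, the window size and the sequences are hypothetical, and real genome comparisons rely on dedicated whole-genome aligners rather than pre-aligned strings.

```python
def window_identities(aligned_a: str, aligned_b: str, window: int) -> list:
    """Percent identity for each non-overlapping window of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identities = []
    for start in range(0, len(aligned_a) - window + 1, window):
        a = aligned_a[start:start + window]
        b = aligned_b[start:start + window]
        # Count matching positions; a gap ('-') never counts as a match.
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        identities.append(100.0 * matches / window)
    return identities


def fraction_above(identities: list, threshold: float) -> float:
    """Fraction of windows whose identity is strictly above the threshold."""
    if not identities:
        raise ValueError("no windows to evaluate")
    return sum(1 for value in identities if value > threshold) / len(identities)


# Toy sequences invented for illustration: three identical windows, one divergent.
seq_a = "ACGTACGTAC" * 4
seq_b = "ACGTACGTAC" * 3 + "ACGTTTTTAC"

identities = window_identities(seq_a, seq_b, window=10)
share = fraction_above(identities, threshold=98.0)
print(identities, share)
```

In this toy case three of the four windows exceed the 98% threshold, so the reported share is 0.75; the same bookkeeping, applied to aligned genome blocks, underlies summary statements of the kind quoted above.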
Prochlorococcus and Synechococcus co-occur in the 45°N/S latitudinal band and are by far the most abundant cyanobacteria (and phytoplanktonic organisms in general) in the ocean. Prochlorococcus, the smallest known free-living phototroph (0.6–1.0 μm), dominates in warm, oligotrophic areas, with typical concentrations of 1–3 × 10⁸ cells/L in the subsurface (Chisholm et al. 1988; Zubkov et al. 1998; Johnson et al. 2006). In contrast, Synechococcus is most abundant in near-coastal waters and in areas enriched by local upwellings, often reaching concentrations as high as or higher than Prochlorococcus in these areas, while its cell concentrations dramatically decrease in nutrient-poor waters. These two picocyanobacteria are phylogenetically closely related to one another (Scanlan et al. 2009). Several pieces of evidence suggest that the monophyletic Prochlorococcus group arose fairly recently (possibly only 150 Myr old; Dufresne et al. 2005) and was derived from a Synechococcus-like ancestor, the main phenotypic traits distinguishing these two groups being their strikingly different light-harvesting antenna systems and pigmentation (Partensky et al. 1999; Ting et al. 2002). The intrageneric diversity within each of the latter two taxa is wide (Rocap et al. 2002; Fuller et al. 2003; Mazard et al. 2012) and genome sequences have been obtained for most of the major clades/ecotypes identified so far in both groups (Dufresne et al. 2003, 2008; Palenik et al. 2003, 2006, 2009; Rocap et al. 2003; Kettler et al. 2007; Scanlan et al. 2009). Prochlorococcus comprises distinct ecotypes physiologically and genetically adapted to either high light (HL) or low light (LL) and occupying different light niches within the euphotic layer in stratified, open ocean waters. Furthermore, each light niche may shelter several distinct lineages, namely HLI (a.k.a. eMED4) and HLII (a.k.a. eMIT9312) in the upper mixed layer, LLII, LLIII and LLIV (a.k.a. eSS120, eMIT9211 and eMIT9313, respectively) at the bottom of the euphotic zone and the LLI (a.k.a. eNATL) ecotype at intermediate depth (Johnson et al. 2006; Malmstrom et al. 2010). The mechanisms of maintenance of LLII-IV ecotypes in seemingly identical niches are still unclear. For HL clades, however, it was shown that they exhibit distinct growth temperature ranges and geographic distributions, with HLII preferentially thriving in warm, tropical and subtropical waters and HLI preferring cooler waters and extending to higher latitudes.
Marine Synechococcus spp. also display a wide genetic diversity, with three deeply branching groups (called subclusters 5.1–5.3) subdivided into a number of clades (Dufresne et al. 2008; Scanlan et al. 2009), with at least one sequenced representative for most of them (Table 1). Molecular studies have shown that the most abundant groups in tropical and temperate waters of the Atlantic and Indian Ocean belong to subcluster 5.1, with four dominant clades (I–IV; Zwirglmaier et al. 2008; Mella-Flores et al. 2011). Clades I and IV co-occur in temperate waters at high latitude (>30°N/S) often predominating in coastal waters, while clade II is found in warm waters at low latitudes (<35°N/S) and clade III in open ocean waters, but with seemingly no latitudinal restriction. Interestingly, members of subcluster 5.2 were recently found to dominate in subpolar waters of the North Pacific ocean (Huang et al. 2011b). Several other Synechococcus clades also occur in the field, but generally as minor components of the Synechococcus community, although they may sometimes occur at high concentrations at some sites suggesting that they are adapted to specific niches (Zwirglmaier et al. 2008; Choi et al. 2011; Huang et al. 2011b; Mella-Flores et al. 2011). However, field and culture data are often too scarce to precisely set their geographical distributions and ecological preferenda.
Compared to their planktonic counterparts, benthic cyanobacteria are much more diverse phylogenetically, due to the large variety of available niches in coastal environments, including intertidal or infralittoral areas that can be rocky, sandy or muddy. For instance, the much studied Cyanothece sp. ATCC 51142 was isolated from intertidal sands of the Texas Gulf coast (Reddy et al. 1993), Cyanothece sp. CCY0110 from the bottom of shallow waters around Zanzibar (L. J. Stal, pers. comm.) and Synechococcus sp. PCC7335 from a snail shell collected in the intertidal zone in the Gulf of California (Mexico). Many benthic cyanobacteria form dense microbial mats, while others live on the surface of macroalgae, seagrasses or mangrove roots (epiphytes) or even inside limestone rocks (endoliths). Intertidal mats are generally composed of filamentous, N2-fixing cyanobacteria that either possess cells specialized for this function (i.e. heterocysts, e.g. Calothrix or Scytonema) or are nonheterocystous (e.g. Lyngbya, Microcoleus, Phormidium or Schizothrix; Hoffmann 1999). Some species like Lyngbya aestuarii and Microcoleus chthonoplastes, the genomes of which were recently sequenced but are as yet unpublished (Table 1), have a ubiquitous distribution, while most others are found over a narrower latitudinal range. Because of their large diversity, benthic cyanobacteria have particularly good potential as sources of novel secondary metabolites (cyclic and linear peptides, guanidines, phosphonates, purines, lipids, macrolides, etc.) of industrial (biofuels) or pharmacological interest (e.g. cytotoxicity, inhibition of proteases).
Several marine cyanobacteria are also involved in symbiotic associations, often with benthic invertebrates. The endosymbiotic association between Prochloron and either ascidians or sponges has been particularly well studied (Munchhöff et al. 2007; Usher 2008). Phylogenetic analyses revealed a low specificity of this cyanobacterium for its hosts and a low genetic variation between individuals retrieved from different hosts, suggesting a lateral transmission of Prochloron cells between hosts (Munchhöff et al. 2007). Recently, several near complete Prochloron didemni genomes (Table 1), obtained by squeezing cells out of Lissoclinum patella didemnids collected from four remote islands of the South Pacific, were shown to display a remarkable level of synteny and more than 97% DNA sequence identity across 90% of genome length (Donia et al. 2011b). This strongly suggests that the Prochloron life cycle includes a free-living stage, during which cells are transported over long distances, a phenomenon that would contribute to the genetic homogenization of the population (Donia et al. 2011a). Interestingly, different cyanobacterial species may inhabit the same invertebrate host species and evolve complementary pigmentations, a strategy that is likely to reduce competition for light (Hirose et al. 2009). Indeed, Synechocystis trididemni, a close relative of Prochloron, contains large amounts of phycoerythrin and phycocyanin that absorb green and red-orange light, respectively, whilst the main pigments in Prochloron are chlorophylls (Chls) a and b, which absorb blue and red light. Acaryochloris, another atypical cyanobacterium frequently found in association with Prochloron, contains yet another major pigment, Chl d, a unique chromophore that absorbs near-infrared light (Miyashita et al. 1997). It is worth noting that Acaryochloris was initially thought to be an endosymbiont of ascidians (Miyashita et al. 2003), but a more recent analysis showed that it is in fact a free-living epiphyte of those invertebrates (Kuhl et al. 2005) and it has subsequently been retrieved from a variety of benthic environments, including from underneath the crust of coralline algae living in coral reefs (Behrendt et al. 2010; Mohr et al. 2010a). While Prochloron has never been cultivated, several Acaryochloris strains have been successfully brought into culture and three strains have been sequenced to date (Swingley et al. 2008; Mohr et al. 2010a; Pfreundt et al. 2012).
Besides the genomes of cyanobacterial isolates, a number of genomes of uncultivated cyanobacteria have recently been obtained using NGS technologies, including that of the above-mentioned endosymbiont Prochloron, the largest metagenome assembled so far (Donia et al. 2011a,b). Also noteworthy are the sequences of two novel Prochlorococcus HL subclades (HNLC1 and 2), characterized by a reduced set of genes encoding Fe-containing proteins. These lineages were found to specifically thrive in iron-limited, equatorial and tropical oceanic waters (Rusch et al. 2010; West et al. 2010). Only a ‘consensus genome’ was obtained for each of these lineages by assembling metagenomic data collected during the Global Ocean Sampling (GOS) expedition of the Sorcerer II (Rusch et al. 2010). A more sophisticated approach, combining flow cytometric cell sorting, whole genome amplification and massively parallel pyrosequencing of paired-end reads, was used to characterize the genetic information of an atypical, uncultivated, nitrogen-fixing planktonic cyanobacterium, called UCYN-A (i.e. unicellular cyanobacterial N2-fixer group A; Zehr et al. 2008; Tripp et al. 2010). Although free UCYN-A cells can be observed in seawater by flow cytometry, evidence suggests that this organism in fact lives in symbiotic (or epiphytic) association with a protist (Tripp et al. 2010; Larsson et al. 2011). As for Crocosphaera, the genome of UCYN-A seems to be highly conserved (>97% nucleotide identity) across ocean basins (Tripp et al. 2010). Other associations of cyanobacteria with protists are known in the marine plankton, such as that of the filamentous, heterocystous species Richelia intracellularis with different diatoms, again either as an endosymbiont or as an epiphyte (Gomez et al. 2005), but so far neither a genome nor a metagenome has been reported for this taxon.
Structure and evolution of marine cyanobacterial genomes
Genomes of free-living marine cyanobacteria vary greatly in size from 1.64 Mbp for Prochlorococcus marinus MIT9301 to 8.65 Mbp for Microcoleus chthonoplastes PCC7420 (Table 1). The latter genome size is only slightly smaller than that of the largest nonmarine cyanobacterial genome sequenced so far, i.e. 9.05 Mbp for Nostoc punctiforme PCC73102, a symbiont of cycads (Larsson et al. 2011). Furthermore, no free-living cyanobacteria with genome sizes smaller than 2.5 Mbp have been reported to date from terrestrial or freshwater habitats. Marine cyanobacteria therefore exhibit almost the full range of genome sizes observed in the Cyanobacteria phylum as a whole. It is worth noting that the uncultivated group UCYN-A has an even smaller genome (1.44 Mbp) than Prochlorococcus, but is apparently unable to sustain a free-living lifestyle (Tripp et al. 2010). Metabolic reconstructions suggest that UCYN-A is dependent upon other organisms for essential compounds, such as amino acids and purines. Also, even though UCYN-A is phylogenetically affiliated to cyanobacteria, the absence of essential components of the photosynthetic machinery, including photosystem II, carboxysomes (i.e. the cyanobacterial microcompartments where carbon fixation takes place, thanks to RuBisCo) as well as enzymes of the Calvin-Benson cycle, makes it paradoxically unable to perform oxygenic photosynthesis, a unique trait within this phylum (Zehr et al. 2008). Nonetheless, having kept a complete photosystem I and ATP synthase, UCYN-A is capable of capturing solar energy that is probably used to generate ATP and reducing power (Tripp et al. 2010). From an evolutionary viewpoint, comparison of the UCYN-A genome with the 5.46 Mbp genome of Cyanothece sp. ATCC 51142 suggests that it shares a recent common ancestor with members of this genus, from which UCYN-A seemingly evolved by a drastic genome reduction (Welsh et al. 2008; Tripp et al. 2010).
Most Prochlorococcus lineages also have a streamlined genome (size range: 1.64–1.86 Mbp), associated with a low GC content (31–38 G+C%), the only exception being the LLIV (or eMIT9313) lineage, which is located at the base of the Prochlorococcus radiation and that has retained a genome size (2.41–2.68 Mbp) and GC content (50–51 G+C%) similar to those found in the closely related marine Synechococcus group (2.22–2.62 Mbp; 52–66 G+C%; Table 1). The progressive decrease in genome size and GC content that occurred during the evolution of these lineages was seemingly associated with an acceleration of the mutation rate of protein-coding genes (Dufresne et al. 2005). Comparative genomics studies have shown that different clades possess distinct gene complements and that one of the functional categories that was most differentiated between ecotypes was DNA replication, recombination and repair (Kettler et al. 2007; Partensky & Garczarek 2010). It has been suggested that the loss of genes involved in the repair of GC to AT transversions in some genotypes may have caused them to become ‘mutators’, i.e. cells with an increased mutation rate (Marais et al. 2008). This type of transversion appears to have occurred at least twice during evolution, once before the differentiation of LLI-III ecotypes and a second time before the differentiation of HL ecotypes (Partensky & Garczarek 2010). Each time, this event must have been followed by the restoration of a normal mutation rate, as recently verified in the HLI strain MED4 (Osburne et al. 2011). A differential genome streamlining process among lineages, only partially compensated for by acquisition of novel genes by horizontal transfer, has created genetically distinct ecotypes, each with a minimalist genome optimized for life in a specific ecological niche (Johnson et al. 2006; Kettler et al. 2007; Partensky & Garczarek 2010). 
However, these niches, particularly the upper nutrient-poor layer of oceanic waters occupied by HL ecotypes, are among the most stable and the largest ecosystems on Earth reached by solar radiation.
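The G+C% figures used above to contrast streamlined and non-streamlined lineages are simple base-composition ratios. As a minimal Python sketch of that calculation (the function name and the two toy sequences are hypothetical and are not taken from any genome in Table 1):

```python
def gc_percent(sequence: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Only unambiguous A, C, G and T bases are counted; anything else
    (e.g. N or gap characters) is ignored.
    """
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous A/C/G/T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy sequences invented for illustration only.
at_rich = "ATTATAGCATTAATTTAGCA"
gc_rich = "GCCGGCATGCGCCGGATCGC"

print(gc_percent(at_rich), gc_percent(gc_rich))
```

Applied to a complete genome sequence rather than a 20-base toy string, the same ratio yields values of the kind listed in the G+C% column of Table 1.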
At the opposite end of the genome size range, an inverse trend, i.e. genome expansion, is thought to have caused a progressive increase in metabolic complexity in some cyanobacterial lineages (Larsson et al. 2011). This phenomenon was crucial for organisms needing to acquire novel metabolic traits that are required for colonization and survival in variable environments such as microbial mats or intertidal zones. The two main mechanisms for genome size increase are gene duplication and horizontal gene transfer (HGT). Gene duplications may either create gene redundancy, useful for increasing the number of transcripts of highly transcribed genes, or be followed by the genetic divergence of one of the copies, which eventually becomes a paralog. This derived gene copy often encodes a novel protein/enzyme with a slightly different function/location/activity from the original molecule and thus increases the physiological plasticity of the organism. A recent comparative study of 58 cyanobacterial genomes showed that the proportion of paralogs and the total number of paralogous gene copies are correlated with genome size (Larsson et al. 2011). The 8.36 Mbp genome of the marine, unicellular cyanobacterium Acaryochloris marina exhibits the highest numbers in both these categories, but Crocosphaera watsonii and Microcoleus chthonoplastes also have many paralogs, mostly belonging to the functional categories ‘DNA replication, recombination and repair’ and ‘signal transduction’. Members of the former category are mostly transposases (Swingley et al. 2008). These are particularly abundant in the two genomes of C. watsonii sequenced so far (strains WH8501 and WH0003), with e.g. a total of 1,211 in the former strain, including 292 copies of a single transposase sequence (Bench et al. 2011; Larsson et al. 2011). Consequently, in this species, most of the genetic diversity is seemingly generated by transposition of genes or genome fragments (Zehr et al. 2007; Bench et al. 2011). 
Genotypic variations nevertheless exist among C. watsonii strains. For instance, some strains are capable of synthesizing exopolysaccharides while others are not, and this is directly related to the presence or absence of genes involved in this process. These accessory genes are most often isolated or located in small gene islets across genomes (Bench et al. 2011). This contrasts with the large genomic islands observed in Prochlorococcus and Synechococcus, which constitute privileged insertion sites for laterally transferred genes (Coleman et al. 2006; Kettler et al. 2007; Dufresne et al. 2008). These hypervariable genomic regions are generally thought to be critical for adaptation to local niches (see below).
Although this may seem the ultimate degree of sophistication for prokaryotic organisms, the ability of some cyanobacterial species to undergo cellular differentiation, a phenomenon that occurs in response to environmental stress, is surprisingly not restricted to the largest genomes, as it has been observed in strains with genomes as small as 3.2 Mbp for the freshwater, filamentous strain Raphidiopsis brookii D9 or 5.3 Mbp for the brackish, heterocystous strain Nodularia spumigena CCY9414 (Stucken et al. 2010; Larsson et al. 2011). Cell differentiation includes the transformation of vegetative cells into hormogonia (short filaments used for organism dispersal), akinetes (resistant forms) or heterocysts (cells specialized in the fixation of dinitrogen; Herrero et al. 2004). A less sophisticated form of differentiation also occurs in the marine cyanobacterium Trichodesmium, in which about 15% of the cells are specialized in nitrogen fixation. These so-called diazocytes are located in the central part of filaments (or trichomes) and mainly differ from vegetative cells by their less granular aspect and the presence of nitrogenase (El-Shehawy et al. 2003). Examination of the Trichodesmium erythraeum IMS101 genome shows that it indeed lacks genes involved in the synthesis of the thick cell envelope surrounding heterocysts, which is composed of polysaccharides and glycolipids. However, it possesses hetR, the key regulatory gene in heterocyst differentiation and a few early heterocyst differentiation genes that seem to be critical for diazocyte differentiation.
Genome organization varies greatly depending on species and genome size. The smallest genomes, in particular all isolates of Prochlorococcus and marine Synechococcus sequenced thus far, possess only one chromosome and no plasmids. Metagenomic analyses have however suggested that cells from natural populations of Synechococcus from coastal waters of California might in fact possess one or several small plasmids (Palenik et al. 2009). Plasmids are the rule for larger cyanobacterial genomes. For instance, the marine Cyanothece sp. ATCC51142 possesses six separate DNA elements, a 4.93 Mbp circular chromosome, a 0.43 Mbp linear chromosome and four plasmids ranging in size from 10 to 40 kbp (Welsh et al. 2008), while Acaryochloris marina MBIC11017 has one circular 6.50 Mbp chromosome and nine plasmids (2–374 kbp in size; Swingley et al. 2008). By allowing rapid lateral transfer of large DNA chunks via conjugation with organisms belonging to the same or other species, the presence of plasmids, combined with an efficient transposition machinery, likely confers on those strains a much higher capacity to acquire new, sophisticated functions than their picocyanobacterial counterparts, which depend mainly on cyanophages for the acquisition of new genes (Clokie et al. 2003; Lindell et al. 2004; Zeng & Chisholm 2012). One of the most striking examples of this phenomenon is likely the recent discovery of a diazotrophic Acaryochloris strain (HICR111A) that would have acquired the capacity to fix N2 by lateral transfer of a 60-gene cluster, including 22 nif genes necessary for this process (Pfreundt et al. 2012). It is worth noting in this context that genes specialized in a given function (e.g. ATPase biosynthesis, carbon fixation, light harvesting, etc.) are often clustered in operons or gene regions in cyanobacteria, an organization that not only allows efficient control of genes participating in the same process but may also permit lateral transfer of several genes at a time in a single step.
Adaptation of marine cyanobacterial genomes to a dynamic environment
The availability of complete or near complete genomes for a number of marine cyanobacteria makes it possible not only to have a global overview of the genetic potential of these organisms, but also, using postgenomic approaches, to study genome (or metagenome) dynamics in response to natural fluctuations of physico-chemical parameters or to biotic and abiotic stresses.
One of the most regular phenomena that natural populations of cyanobacteria have to cope with is the alternation of day and night. For these photosynthetic cells, this phenomenon implies strong and fast variations of incident visible irradiance, which in the uppermost layer of the ocean are associated with concomitant fluctuations of UV radiation fluxes and, to some extent, water temperature. Several laboratory studies have dealt with the effect of light-dark (L/D) cycles on the transcriptome of marine cyanobacteria, but have most often neglected the concomitant effects of UV and temperature variations. An interesting example of how cyanobacterial cells deal with L/D cycles is provided by the diazotrophs Cyanothece sp. and Crocosphaera watsonii, which perform oxygenic photosynthesis and dinitrogen fixation in the same cell, two processes that are mutually exclusive, given the high sensitivity of nitrogenase to oxygen. To cope with this incompatibility, both organisms fix nitrogen at night, while photosynthesis is restricted to the light period (Toepel et al. 2008; Mohr et al. 2010b; Shi et al. 2010; Aryal et al. 2011; Stockel et al. 2011). This is made possible by a tight synchronization of the whole metabolism triggered by the circadian clock, a molecular mechanism that can maintain a robust diel rhythmicity of the whole transcriptome, even under continuous light conditions (Toepel et al. 2008; Pennebaker et al. 2010). Elvitigala et al. (2009) suggested that the majority of diurnally regulated genes, i.e. those genes that are maximally expressed during the middle of the light or dark periods, are light responsive, while genes that are up-regulated at the beginning of the dark (or subjective dark) period are under circadian control. Another notable example with regard to L/D cycles is the nondiazotroph Prochlorococcus, which lacks kaiA, one of the three genes necessary to synthesize the circadian clock. It was shown in P. marinus PCC 9511 that the remaining two genes (kaiBC) are sufficient to make up a minimalist clock that needs to be reset every morning (Holtzendorff et al. 2008; Axmann et al. 2009). This so-called ‘hourglass’ is seemingly robust enough to trigger fine-tuned orchestration of the whole transcriptome over a 24-h period when cells are grown under an L/D cycle (Zinser et al. 2009), but oscillations of the transcriptome and the whole metabolism disappear within a few hours when cells are shifted to continuous light (Holtzendorff et al. 2008). If cyclic visible light is supplemented with UV radiation, the cell cycle timing is affected, as shown by a 2-h shift of the peak of cells in the DNA synthesis phase into the dark period, a strategy that is likely to decrease the risk of UV-induced mutations to DNA (Kolowrat et al. 2010). A comparable shift was also observed for WH7803, a strain representative of the closely related genus Synechococcus (Mella-Flores et al. 2012). However, growth under a modulated L/D cycle with high photon fluxes of visible light (supplemented or not with UV radiation) induced very different diel expression patterns in WH7803 and PCC 9511 for most genes involved in photosynthesis, responses to UV and oxidative stress and several other metabolic processes, a difference likely due to distinct light-controlled regulation systems. For instance, genes involved in the biosynthesis of antenna complexes, ATP synthase or in CO2 fixation showed a maximal expression during the day in Synechococcus and the opposite pattern in Prochlorococcus (Mella-Flores et al. 2012). All these studies on the effect of L/D cycles on marine cyanobacteria clearly indicate that (i) transcript levels of most protein-coding genes vary significantly over the day and (ii) diel patterns differ from one pathway, and sometimes from one gene, to another. 
Consequently, analyses of field metatranscriptomes comprising only one or two data points per day must be interpreted with great care, especially in the case of cyanobacteria such as Crocosphaera which comprise only a small fraction of the total cell abundance and therefore may additionally have only poor metagenomic coverage (see e.g. Hewson et al. 2009).
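The sampling caveat can be illustrated with a toy calculation. The sketch below assumes two hypothetical genes whose transcript levels follow idealized 24-h sinusoids, one peaking at midday and the other at midnight; the function names, mean level, amplitude and peak hours are illustrative assumptions and are not taken from any of the studies cited above.

```python
import math

def diel_expression(hour, peak_hour, mean=10.0, amplitude=8.0):
    """Toy transcript level following a 24-h sinusoid that peaks at peak_hour."""
    return mean + amplitude * math.cos(2.0 * math.pi * (hour - peak_hour) / 24.0)

def log2_ratio(hour, peak_a, peak_b):
    """log2 ratio of gene A to gene B transcript levels at a single sampling hour."""
    return math.log2(diel_expression(hour, peak_a) / diel_expression(hour, peak_b))

# Hypothetical genes: A peaks at midday (12 h), B peaks at midnight (0 h).
PEAK_A, PEAK_B = 12.0, 0.0

# One sample per day: the apparent A/B ratio depends entirely on the sampling hour.
ratio_at_noon = log2_ratio(12, PEAK_A, PEAK_B)
ratio_at_midnight = log2_ratio(0, PEAK_A, PEAK_B)

# Hourly sampling over a full day: both genes have the same mean level.
daily_mean_a = sum(diel_expression(h, PEAK_A) for h in range(24)) / 24
daily_mean_b = sum(diel_expression(h, PEAK_B) for h in range(24)) / 24

print(f"log2(A/B) at noon:     {ratio_at_noon:+.2f}")
print(f"log2(A/B) at midnight: {ratio_at_midnight:+.2f}")
print(f"daily means: A = {daily_mean_a:.2f}, B = {daily_mean_b:.2f}")
```

In this idealized case, the sign of the apparent expression ratio flips between the noon and midnight samples although the two genes have identical daily means, which is why sparse time points can suggest differential expression where there is only a phase difference.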
Several studies have dealt with the effects of nutrient stresses on the transcriptome of marine cyanobacteria. Although nitrogen is the main limiting factor in most oceanic areas, several regions, such as the Mediterranean Sea or the Sargasso Sea, are known to be limited by the availability of phosphorus (hereafter P; Martiny et al. 2009), while others, such as the equatorial Pacific Ocean, are iron-depleted (Cavender-Bares et al. 1999). Natural populations of marine cyanobacteria therefore face a variety of nutrient stresses. P is necessary for the synthesis of ATP, nucleotides and phospholipids and is also required in all regulatory processes involving phosphorylation. To cope with low P availability in open ocean waters, Prochlorococcus cells, which are often the dominant organisms in these waters, have developed a variety of strategies, including the preferential synthesis of sulfolipids and glycolipids over phospholipids (Van Mooy et al. 2006). Furthermore, both natural Prochlorococcus populations thriving in P-depleted waters and strains isolated from these areas were shown to possess a much larger set of genes involved in P uptake and assimilation than populations/strains from P-replete areas (Martiny et al. 2006, 2009; Coleman & Chisholm 2010). This variability, which is independent of the phylogenetic distance between strains, translates into a wide diversity of P stress responses within the Prochlorococcus genus. For instance, P depletion provoked the upregulation of 30 genes in the HL-adapted MED4 vs. 176 in the LL-adapted MIT9313, but only seven were common to both strains (Martiny et al. 2006). This common set encodes proteins involved in P metabolism, including the response regulator PhoB, a transport system for orthophosphate (PstABCS) and PhoE, which is involved in the transport of orthophosphate across the outer membrane. Surprisingly, MIT9313 lacks an ortholog of the alkaline phosphatase gene phoA, the most highly up-regulated gene in MED4. 
Furthermore, MIT9313 has no functional PtrA (a transcription factor of the cyclic AMP receptor family) nor sensor kinase PhoR, two regulators of P metabolism (Scanlan et al. 2009), although there might be compensation mechanisms (Martiny et al. 2006). Altogether, MIT9313, which was isolated from the Gulf Stream at 135 m, i.e. in the vicinity of the phosphacline, seems less well adapted to P starvation than MED4, isolated from surface waters of the Mediterranean Sea, a P-limited area. The effect of P depletion was also studied in the marine Synechococcus strain WH8102, in which it caused a strong upregulation (>2-fold) of 36 genes and the downregulation of 24 others (Tetu et al. 2009). Several transiently upregulated genes were involved in transport (outer membrane porins, P-specific ABC transporters and solute-binding proteins), P uptake (alkaline phosphatases) or regulation of P metabolism (see also Ostrowski et al. 2010). Interestingly, two upregulated genes (swmA and B) coded for outer membrane proteins potentially involved in swimming motility, suggesting that P stress may result either in a reorganization of the cell envelope, perhaps to accommodate P-specific porins, or in an increased cell motility that may help Synechococcus cells to scavenge P more efficiently. Both Prochlorococcus and Synechococcus possess genes for taking up phosphonates as an alternative P source (Ilikchyan et al. 2009) and at least one Prochlorococcus strain (MIT9301) can also use phosphite (Martinez et al. 2011). Like these two picocyanobacteria, Crocosphaera watsonii possesses a high-affinity phosphate transport system (PstSCAB), but unlike them, it also has the capacity to hydrolyze phosphomonoesters (Dyhrman & Haley 2006). In contrast, no clear homologs of genes for phosphonate uptake and hydrolysis could be identified in the C. watsonii WH8501 genome, as confirmed by the absence of growth of this strain on this P source.
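The limited overlap between the P-stress responses of MED4 and MIT9313 reported by Martiny et al. (2006) can be put into numbers with simple arithmetic. The short sketch below uses only the gene counts quoted above (30, 176 and 7) and assumes that the seven shared genes are included in both strain totals; the variable names are illustrative.

```python
# Counts of genes upregulated under P depletion, as quoted from Martiny et al. (2006).
med4_up = 30      # HL-adapted MED4
mit9313_up = 176  # LL-adapted MIT9313
shared_up = 7     # assumed to be counted within both totals above

# Genes upregulated in at least one of the two strains.
union = med4_up + mit9313_up - shared_up

# Jaccard index: shared genes as a fraction of all responsive genes.
jaccard = shared_up / union

print(f"responsive in either strain: {union}")
print(f"Jaccard overlap: {jaccard:.3f}")
print(f"shared fraction of MED4 response:    {100 * shared_up / med4_up:.1f}%")
print(f"shared fraction of MIT9313 response: {100 * shared_up / mit9313_up:.1f}%")
```

Under this assumption, the common set represents less than 4% of all genes upregulated in either strain, and less than a quarter of the MED4 response.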
Like P stress, iron depletion is thought to have deeply influenced the structure of the Prochlorococcus genome. For instance, the divinyl-chlorophyll a/b-binding antenna complex protein Pcb is closely related to IsiA, a chlorophyll a-binding protein induced during iron deficiency in typical cyanobacteria (La Roche et al. 1996). Both Pcb and IsiA antennae form an 18-molecule ring around photosystem I (Bibby et al. 2003). Furthermore, the only gene encoding this PSI antenna in Prochlorococcus sp. MIT9313 and one of the two such genes in P. marinus SS120 are also iron-induced, while this gene has been lost in MED4. It is most likely that the constitutively expressed gene(s) coding for the 10-molecule Pcb antenna surrounding photosystem II (present in 1–6 copies, depending on strains) arose by duplication then divergence of this isiA-like gene (Garczarek et al. 2001). The effect of low Fe stress on the whole transcriptome was compared between P. marinus MED4 and MIT9313 (Thompson et al. 2011). Surprisingly, the latter strain could grow at 10-fold lower dissolved inorganic Fe concentrations than the former strain, possibly due to a more efficient iron transport system, a better protection mechanism against the deleterious effects of iron depletion, such as oxidative stress, and/or a lower cellular iron requirement than MED4. Again, only a handful of the 1159 orthologs common to MED4 and MIT9313 were found to be differentially expressed in both strains, including the downregulated petF gene (encoding ferredoxin) and the upregulated isiB (encoding flavodoxin), idiA (coding for an iron-deficiency-induced protein) and one of the many hli genes coding for HL-induced proteins (Thompson et al. 2011). In each strain, the expression of over a hundred additional genes also changed in response to Fe stress, but these genes differed between the two strains (Thompson et al. 2011). 
This highlights the tremendous variability of response to iron stress (and, more generally, nutrient stress) that may exist within a single genus, a variability that may again complicate interpretations of metatranscriptomic data.
Despite its fairly recent discovery (<25 years ago; Chisholm et al. 1988), Prochlorococcus is clearly one of the marine cyanobacteria that have most benefited from the advent of genomics both in culture and in the field. Besides the availability of many genomes in public databases (including 14 accessible to date and a hundred more genomes from single wild Prochlorococcus cells currently in progress; Kelly et al. 2012), a wide set of Prochlorococcus-specific phages has been isolated and sequenced, and a number of studies have shed light on the intimate relationships that link them to their hosts (Lindell et al. 2005, 2007; Avrani et al. 2011; Zeng & Chisholm 2012). This, combined with its high natural abundance in tropical and temperate oceanic waters (Partensky et al. 1999), makes Prochlorococcus one of the rare model organisms that can be studied at all scales of organization, by a so-called ‘cross-scale systems biology’ approach (Coleman & Chisholm 2007). Indeed, it is possible with this organism to link gene content and genome dynamics not only to the local biotic or abiotic environment of the cell (or population of cells), but also to the ecosystem and even to the global ocean. Although the lack of a reliable genetic system somewhat hinders the determination of gene function in this genus, it must be noted that Prochlorococcus possesses only a handful of specific genes (Partensky & Garczarek 2010) and that heterologous expression approaches have already been used successfully to characterize some of them (Stickforth et al. 2003; Satoh & Tanaka 2006; Wiethaus et al. 2010). Furthermore, for other genes, inactivation is feasible in the closely related genus Synechococcus (Brahamsha 1996). The latter cyanobacterium also represents a good candidate for cross-scale approaches, given its abundance, ubiquity and complementarity in terms of ecophysiology with regard to Prochlorococcus (Partensky et al. 1999; Scanlan et al. 2009). 
As in Prochlorococcus, cyanophages also play a key role in the genome evolution of Synechococcus spp., and many phages with various degrees of host specificity are available in culture (Sullivan et al. 2003; Millard et al. 2009).