
Genomics of microalgae, fuel for the future?


  • Rob J. W. Brooijmans,

    1. B-Basic, 2628 BC Delft, the Netherlands.
    2. Bioprocess Technology, Delft University of Technology, 2628 BC Delft, the Netherlands.
  • Roland J. Siezen

    Corresponding author
    1. Kluyver Centre for Genomics of Industrial Fermentation, TI Food and Nutrition, 6700AN Wageningen, the Netherlands.
    2. NIZO food research, 6710BA Ede, the Netherlands.
    3. CMBI, Radboud University Nijmegen, 6500HB Nijmegen, the Netherlands.

E-mail r.siezen@cmbi.ru.nl; Tel. (+31) 2436 19559; Fax (+31) 2436 19395.

Rising sea levels, global pollution, economic meltdown and general prophecies of doom and gloom are inspired by the pending depletion of the world's fossil carbon stores. With modern lifestyle on the rise, and an ever increasing world population driven on by economic advantage, we burn up earth's stores of mineralized corpses and vent CO2 as if there is no tomorrow. The billion dollar question for our society is: Where to get the next big, and preferably clean, fuel injection? Fossil carbon compounds, the incompletely biodegraded remains of animals and plants long gone, originate from atmospheric CO2 via photosynthesis. Therefore, photosynthesis as main driver for generation of de novo fuels has gained attention. First- and second-generation biofuels envision the conversion of plant biomass, via the action of microorganisms, into usable organic compounds (alcohols and fats) and hydrogen. However, water and land misuse, deforestation, and rising food prices, for the sake of growing choice biomass-producing crops, have raised major concerns (Frow et al., 2009; Young, 2009). Third-generation biofuels (and chemicals) produced with microalgae have now come forward as an answer to many of these concerns (Tredici, 2010). In general, microalgae do not comprise an evolutionarily related group, and the term may refer to cyanobacteria (blue-green algae) or eukaryotic algae. Even the eukaryotic algae do not form a single evolutionary branch, as the term can indicate ‘plant algae’ such as red algae (Rhodophyta), brown algae (Phaeophyta) and green algae (Chlorophyta), or diatoms (Fig. 1).

Figure 1.

Evolutionary relationships of 20 species with sequenced genomes used for comparative analyses, including cyanobacteria and non-photosynthetic eubacteria, archaea and eukaryotes from the oomycetes, diatoms, rhodophytes, plants, amoebae and opisthokonts (Keeling et al., 2005; Ciccarelli et al., 2006). Endosymbiosis of a cyanobacterium by a eukaryotic protist gave rise to the green (green branches) and red (red branches) plant lineages respectively. The presence of motile or non-motile flagella is indicated at the right of the cladogram. Reprinted from Merchant and colleagues (2007) with permission from AAAS.

Some species of microalgae naturally accumulate vast quantities of oils, some above 80% dry-weight (Fig. 2). In comparison, agricultural oil-producing crops such as oil palm and soybean rarely produce more than 5% dry-weight in oils. Furthermore, algae can be grown wherever there is plenty of water and sun (including lakes or the sea) and thus are not necessarily restricted to (and need not compete for) areas with arable land. Combined with their fast growth rate, this is why microalgae are considered one of the few realistic sources for the production of biofuels and superior to agricultural crop-derived bioethanol (Chisti, 2007; Tredici, 2010). Indeed, in 2009 ExxonMobil pledged $600 million for research in support of photosynthetic algae biofuels programmes, including Craig Venter's Synthetic Genomics. This signals a growing interest in de novo photosynthesis-derived fuels even by established oil-producing companies. Here we present a genomics update of some of Mother Nature's finest microorganisms which can make ‘something from nothing’ with a breath of CO2, a gulp of water, and all the while basking in the sun. Meet the cyanobacteria, green algae and (photosynthetic) diatoms that may lend a helping hand to fuel the human race.

Figure 2.

The green alga Botryococcus braunii lives as a colony of individual cells held together by an extracellular matrix. In this microscopic image, hydrocarbon oils are being released as large droplets from the matrix. Many more smaller oil droplets can be seen as tiny spheres inside each cell. Source: http://newenergyandfuel.com/http:/newenergyandfuel/com/2010/03/16/the-algae-that-makes-petroleum-story/; photo credit: Texas AgriLife Research.

Cyanobacteria

Cyanobacteria, commonly referred to as blue-green algae, are found in diverse habitats (land, salt and fresh water, and extreme environments) and have been intensely studied as model photosynthetic organisms. They are an ancient group whose evolution of photosynthesis led to the initial oxygenation of earth's atmosphere, some 2.5 billion years ago. Even plastids, the essential organelles for photosynthesis in eukaryotic microalgae and higher plants, trace back to the engulfment(s) of cyanobacteria. The phylogenetics of the major groups of cyanobacteria have been established by 16S rRNA and now also by using gene order similarity (Markov and Zakharov, 2009). There is a dedicated genome database called Cyanobase that exclusively contains information on completely sequenced genomes of cyanobacteria. An update of Cyanobase with improved data-mining tools was published recently (Nakao et al., 2010). All in all, 156 genomes of cyanobacteria have been subject to sequencing efforts, of which 36 have been completed (Table 1) (Liolios et al., 2010).

Table 1.  Publicly available complete genomes of cyanobacteria (adapted from the GOLD Database, http://www.genomesonline.org; February 2010).

Goldstamp  Species/strain                          Size (Mb)  Reference or database
Gc00667    Acaryochloris marina MBIC11017          6.5        Swingley et al. (2008)
Gc00299    Anabaena variabilis ATCC29413           6.4        DOE-JGI/NCBI
Gi00176    Crocosphaera watsonii WH 8501           6.2        Shi et al. (2010)
Gc01194    Crocosphaera watsonii UCYN-A            1.4        Hewson et al. (2009b)
Gc00760    Cyanothece sp. ATCC51142                5.5        Welsh et al. (2008)
Gc00904    Cyanothece sp. PCC7424                  5.9        NCBI
Gc00933    Cyanothece sp. PCC7425                  5.4        NCBI
Gc00905    Cyanothece sp. PCC8801                  4.7        NCBI
Gc01095    Cyanothece sp. PCC8802                  4.7        NCBI
Gc00160    Gloeobacter violaceus PCC7421           4.7        Nakamura et al. (2003)
Gc00706    Microcystis aeruginosa NIES-843         5.8        Kaneko et al. (2007)
Gc00777    Nostoc punctiforme ATCC29133            8.2        DOE-JGI/NCBI
Gc00070    Anabaena sp. PCC7120                    6.4        Kaneko et al. (2001)
Gc00679    Prochlorococcus marinus MIT9211         1.7        JCVI/NCBI
Gc00651    Prochlorococcus marinus MIT9215         1.7        Kettler et al. (2007) (a)
Gc00317    Prochlorococcus marinus MIT9312         1.7        DOE-JGI/NCBI
Gc00149    Prochlorococcus marinus SS120           1.8        Dufresne et al. (2003)
Gc00151    Prochlorococcus marinus MIT9313         2.4        Rocap et al. (2003)
Gc00152    Prochlorococcus marinus pastoris MED4   1.7        Rocap et al. (2003)
Gc00566    Prochlorococcus sp. WH 7803             2.4        Genoscope/NCBI
Gc00313    Prochlorococcus sp. CC9605              2.5        DOE-JGI/NCBI
Gc00311    Prochlorococcus sp. CC9902              2.2        DOE-JGI/NCBI
Gc00319    Synechococcus elongatus PCC7942         2.7        DOE-JGI/NCBI
Gc00243    Synechococcus elongatus PCC6301         2.7        Sugita et al. (2007)
Gc00150    Synechococcus sp. WH8102                2.4        Palenik et al. (2003)
Gc00416    Synechococcus sp. CC9311                2.6        Palenik et al. (2006)
Gc00344    Synechococcus sp. JA-2-3B'a(2-13)       3.0        NCBI
Gc00343    Synechococcus sp. JA-3-3Ab              2.9        NCBI
Gc00746    Synechococcus sp. PCC7002               3.0        NCBI
Gc00565    Synechococcus sp. RCC307                2.2        Genoscope/NCBI
Gc00003    Synechocystis sp. PCC6803               3.6        Kaneko et al. (1996)
Gc00096    Thermosynechococcus elongatus BP-1      2.6        Nakamura et al. (2002)
Gc00398    Trichodesmium erythraeum IMS101         7.8        DOE-JGI

  • (a) And six other strains.
  • When no literature reference is available the associated sequence database is provided.
  • DOE-JGI, Department of Energy, Joint Genome Institute; JCVI, J Craig Venter Institute.

Cyanobacteria have proven to be a rich source of (toxic) bioactive metabolites, many of which are lipopeptides that display antibacterial, antifungal, antialgal, antiprotozoan and even antiviral activity (Rastogi and Sinha, 2009). Some produce polyhydroxyalkanoates (PHAs) as carbon storage compound, a potential precursor for biodegradable plastics. Furthermore, N2-fixing cyanobacteria may be useful as natural fertilizers to stimulate (marine) bioremediation efforts (Abed et al., 2009). In terms of biofuels, the production of hydrogen gas with cyanobacteria, which is released in some conditions as metabolic by-product, has been given most scientific attention (Fig. 3). There are at least three enzymes directly involved in hydrogen production: the nitrogenase, the uptake hydrogenase (associated with N2 fixation and encoded by hupSL) and the bidirectional hydrogenase (encoded by hoxEFUYH) involved in fermentation and/or photosynthesis. The latter two hydrogenases (NiFe hydrogenases) and the nitrogenase are not present in all cyanobacterial species (Tamagnini et al., 2007; Allahverdiyeva et al., 2010). Mutation of hupL, normally involved in removal of H2 produced by the nitrogenase, led to increased hydrogen production in Nostoc punctiforme ATCC29133 and Anabaena sp. PCC7120 (Lindberg et al., 2002; Masukawa et al., 2002).

Figure 3.

Production of potential biofuels with photosynthetic cyanobacteria. Schematic representation of the metabolism underlying ‘photofermentation’, based on the introduction of a fermentation pathway or a hydrogen evolution pathway (i.e. a hydrogenase) from a chemotrophic organism into a cyanobacterium. Coupling between the endogenous metabolism of the phototrophic organism and the (heterologously encoded) pathways may occur through central metabolites like glyceraldehyde-3-phosphate or NADPH (and ATP). Reproduced from Angermayr and colleagues (2009) with permission from Elsevier.

Nitrogen fixation is notoriously oxygen sensitive, which is problematic for a cyanobacterium that produces oxygen during the day as a by-product of photosynthesis. In 2008, the first complete genome sequence of the nitrogen-fixing cyanobacterium Cyanothece 51142 was published, revealing a large continuous cluster of nitrogen-fixation genes (Welsh et al., 2008). A recent paper studied the gene expression in response to the circadian rhythm (or how oxygen-producing photosynthesis and oxygen-sensitive nitrogen fixation are temporally regulated) in Crocosphaera watsonii WH 8501 (Shi et al., 2010).

In general, for efficient hydrogen production, metabolic engineering to minimize ‘loss of electrons’ to competing pathways such as respiration and the Calvin cycle is envisaged. In particular, the well-studied, fast-growing and genetically malleable Synechocystis sp. PCC6803 has received attention in this regard (Angermayr et al., 2009). In fact, mutants of Synechocystis with a defective ndhB gene, or of Thermosynechococcus elongatus with a hydrogenase fused to the peripheral PSI subunit Psa, showed that it is feasible to stimulate hydrogen production by redirection of electron flow (Tamagnini et al., 2007). Recently, the proof-of-principle that ethanol can be produced photosynthetically with Synechocystis was shown as well, by introducing the pdc and adh genes (pyruvate decarboxylase and alcohol dehydrogenase respectively) from Zymomonas mobilis. In this study, a genome-scale metabolic model was constructed for Synechocystis PCC6803, which will further catalyse metabolic engineering efforts, including those aimed at other metabolic products (Fu, 2009).

The genome sequences of Cyanothece sp. ATCC51142 and Acaryochloris marina MBIC11017 were also recently published (Table 1). The latter is of particular interest as it uniquely uses chlorophyll d as predominant photosynthetic pigment and produces α-carotene instead of β-carotene. The presence of chlorophyll d enables it to more efficiently absorb light of slightly longer (∼30 nm) wavelengths, giving it a competitive advantage in certain lighting conditions (low visible, high infrared light intensity). Genes associated with chlorophyll d production could not be clearly identified, however. The gene encoding lycopene cyclase (crtL) was proposed to be involved in α-carotene production (Swingley et al., 2008).

Green algae (Chlorophyta)

Most eukaryotic unicellular ‘plant’ microalgae of biotechnological interest belong to the Chlorophyta (green algae). About 31 genomes of chlorophytes have received sequencing efforts and five genomes (four species) have been completely sequenced (Table 2) (Liolios et al., 2010).

Table 2.  Completely sequenced genomes of green algae and diatoms (adapted from the GOLD Database, http://www.genomesonline.org; February 2010).

Goldstamp  Species/strain                          Size (Mb)  Reference

Green algae (Chlorophyta)
Gc01017    Micromonas pusilla CCMP1545             22         Worden et al. (2009)
Gc01017    Micromonas pusilla RCC299               21         Worden et al. (2009)
Gc00664    Chlamydomonas reinhardtii               100        Merchant et al. (2007)
Gc00592    Ostreococcus lucimarinus CCE9901        13         Palenik et al. (2007)
Gc00419    Ostreococcus tauri OTH95                13         Derelle et al. (2006)

Diatoms (Bacillariophyta)
Gi01575    Fragilariopsis cylindrus CCMP1102       81         DOE-JGI
Gc00872    Phaeodactylum tricornutum CCAP1055/1    30         Bowler et al. (2008)
Gc00223    Thalassiosira pseudonana CCMP1335       25         Armbrust et al. (2004)

The metabolism of Chlorophyta is studied in detail as a platform to convert CO2 and water to hydrogen, bioethanol or biodiesel (lipids). Ostreococcus spp. were a logical first choice for sequencing efforts as they are relatively simple (no flagella and a single mitochondrion and chloroplast) and have small genomes. Short intron sequences, gene fusions, chromatin reduction and lack of thiamin and vitamin B12 synthesis all contribute to this genome compaction. Both Ostreococcus genomes encode a relatively large number of putative selenoproteins, containing selenocysteine. These selenoproteins are more catalytically active than their cysteine counterparts, potentially reducing the required protein levels (Palenik et al., 2007). Interestingly, Ostreococcus tauri appears to lack many genes that otherwise encode subunits associated with the light-harvesting complex of photosystem II. Instead, it appears to have paralogues of a prasinophyte-specific antenna (Derelle et al., 2006). Chlamydomonas reinhardtii has been completely sequenced and is genetically accessible with a molecular toolbox. Thus it can be metabolically engineered for optimal biofuel production (Merchant et al., 2007; Beer et al., 2009). Chlamydomonas reinhardtii is of general interest as it has retained components of the animal-plant ancestor, such as flagella and chloroplasts. Furthermore, it has a specialized organelle, called the eyespot, that enables it to sense light and respond accordingly. The number of genetic C. reinhardtii mutants that exhibit interesting phenotypes with regard to hydrogen, biodiesel or bioethanol production is still fairly limited. Disruption of the hydEF genes, encoding a hydrogenase maturase enzyme, disabled hydrogen production but also led to accumulation of succinate (Dubini et al., 2009). Production of hydrogen can be stimulated under sulphur-limiting conditions. Mutants that carried an amino acid substitution in protein D1 of the PSII reaction centre were found to yield substantially higher levels of hydrogen under sulphur-limited conditions (Torzillo et al., 2009).

A good understanding of lipid metabolism is essential to provide handles to modify lipid storage composition and for flux rerouting. A useful review of the genes and enzymes involved in lipid metabolism in C. reinhardtii has been recently published (Moellering et al., 2009). In general, fatty acids are formed in the plastids and converted to di- and triglycerides in the cytosol. The lipids in microalgae are typically triglycerides with unsaturated bonds, which are sensitive to oxidation and thus negatively affect fuel storage. Altering lipid composition may therefore be of interest. Although fatty acid biosynthesis enzymes are typically encoded by single genes, a recent mutant bank of C. reinhardtii provided 80 mutants with altered fatty acid biosynthesis activity, which demonstrates the complexity of the underlying genetic machinery involved (Beer et al., 2009). Nevertheless, initial strides are being made to alter and reroute the hydrocarbon storage pools in green algae.

A recent paper describes the genomes of Micromonas pusilla strains RCC299 and CCMP1545, which were isolated from opposite parts of the globe, one from the Pacific Ocean near Australia (RCC299) and the other from the North Sea area (CCMP1545). Their genome sequences were compared in the light of ecological divergence and evolution. Together with the available Ostreococcus genomes, they shed light on the conserved genome content that was thus likely also present in the ancestral proto-prasinophyte (the ancestor of simple green algae and plants) (Worden et al., 2009).

Diatoms (Bacillariophyta)

Diatoms (Bacillariophyta) are a group of protist microalgae that inhabit oceans, rivers and lakes in a bewildering number of species and exquisite shapes (Fig. 4).

Figure 4.

Various shapes of diatoms. Reproduced from UW-Madison Department of Botany (http://botit.botany.wisc.edu/images/130/Protista_I/Diatom_Images/Grouped_diatoms_MC_.jpg.html).

Blooming when nutrient and light conditions are favourable, mainly in boreal and temperate seas, they are believed to account for one-fifth of the primary production on earth. Via the aquatic food chain, fish stocks ultimately rely on diatom photosynthesis, both at the surface and at greater depths. In fact, the relatively large proportion of diatom biomass that sinks to the bottom is only partially consumed by the deep-sea inhabitants, the remaining part contributing to the formation of petroleum deposits (Armbrust et al., 2004). Diatoms are thought to have acquired their photosynthetic machinery by engulfing a red alga (secondary endosymbiosis) rather than by engulfing a photosynthetic cyanobacterium. However, many genes of green algal origin have also been identified in the genomes of Thalassiosira and Phaeodactylum, indicating more complex gene acquisitions (Moustafa et al., 2009). A specific database, the Diatom EST Database, is devoted to information about diatom expressed-sequence tags and the genome sequences (Maheswari et al., 2009). Eleven diatom genomes have received sequencing attention, but to date only three genomes of diatoms have been completely sequenced (Table 2) (Liolios et al., 2010). Nevertheless, the two major classes of diatoms are represented: the bi/multipolar centrics (Thalassiosira pseudonana) and the pennates (Phaeodactylum tricornutum). About 57% of the genes found in P. tricornutum have homologues in T. pseudonana, and both have acquired a remarkable number of bacterial genes (after secondary endosymbiosis), an order of magnitude higher than found in other free-living eukaryotes (Bowler et al., 2008).

The frustule or silica shell of diatoms represents one of Nature's finest examples of precision architecture. It is composed of hydrated SiO2, encased in a small amount of organic matter. The genomes revealed an array of genes that are putatively involved in silicon biochemistry. These include genes encoding silicic acid transporters, many spermidine and spermine synthase-like enzymes, silaffins and frustulins (casing glycoproteins). Up to fourfold more genes putatively encoding spermidine and spermine synthases can be found in diatom genomes than in other organisms. Spermidine and spermine synthases are thought to be involved in the formation of long-chain polyamines that control silica deposition. Furthermore, at least 22 putative chitinases were identified in T. pseudonana. Chitin fibres, extending from the silica cage, are thought to limit sinking and can account for up to 40% of the biomass (Armbrust et al., 2004). Many diatom-specific cyclins were also found in the genomes of P. tricornutum and T. pseudonana. It is easy to speculate that replication within a glass cage necessitates some unique regulatory controls (Bowler et al., 2008). For all its beauty, the actual function of the silica shell remains somewhat unclear. It may reduce predation by grazers. However, the shell has a greater density than the surrounding seawater, which makes diatoms prone to sinking and, with diminishing access to sunlight, to starvation. Increasing buoyancy to compensate for the density of the silica shell may be one of many reasons why diatoms produce oil stores (Ramachandra et al., 2009). Not much genetic engineering has been done to alter or increase fatty acid production by diatoms, although mutagenesis has been successfully applied to obtain Nannochloropsis oculata mutants with altered unsaturated fatty acid composition (Chaturvedi et al., 2004).
In addition to growing diatoms as algae for biofuels, the discovery of the genes that are associated with the silica cage formation may provide handles for future manipulation of the silica nanostructure to catalyse nanobiotechnological applications. These could include applications in microelectronic devices, biological and chemical sensing and nanofiltration (Bozarth et al., 2009). Indeed, the unique features of diatoms suggest many applications, from simple ones such as using the natural iridescence of diatom shells in cosmetic products to exploiting the silica structures as drug-delivery vehicles (Gordon et al., 2009).


Despite the growing number of completed microalgae genome sequences, only a few examples of genetic engineering of the metabolism for the production of biofuels are reported. The complexity of fatty acid metabolism may be one reason, the fact that many isolates already exist in nature with superior production characteristics another. For example, several new isolates of cyanobacteria from the Baltic Sea and Finnish lakes were found to produce more hydrogen than specifically engineered hydrogen producers [mutants of Anabaena PCC7120 and N. punctiforme ATCC29133 (Allahverdiyeva et al., 2010)]. Furthermore, a recent review suggested Amphora (Bacillariophyta), Ettlia oleoabundans (Chlorophyta), Ankistrodesmus falcatus (Chlorophyta), Chlorella sorokiniana (Chlorophyta) and Tetraselmis suecica (Chlorophyta) as interesting microalgae, at least in terms of lipid productivity (Griffiths and Harrison, 2009). Note that many of the sequenced species were included in this study, but deemed not to be choice commercial lipid producers. Nevertheless, metabolic engineering in model strains of microalgae will likely provide important leads for proof-of-principle studies in the future. Furthermore, the complete genome sequences provide a valuable framework for the huge amounts of marine metagenome sequences which are currently being generated (Bowler et al., 2009; Hewson et al., 2009a).

Biofuel production with microalgae needs to be economically competitive with the time-refined oil industry, which simply ‘pumps oil from the ground’. But even with lipid accumulations reaching as high as 80% dry-weight for some microalgae, economic viability and environmental impact remain a concern (Clarens et al., 2010; Tredici, 2010; van Beilen, 2010). At AlgaePARC of the Wageningen University Research Centre (the Netherlands) a pilot plant centre is being set up which reflects the present development of several reactor concepts at laboratory scale, and will enable a rigorous comparison between systems, selection and, ultimately, the development of new, more competitive and efficient systems and strategies for scaling up (Fig. 5). What further increase in lipid content by metabolic engineering can be expected? The immediate future for metabolic engineering with microalgae may lie in the (over)production of high-value chemicals or biomass components. For example, the unique bioactive compounds of cyanobacteria that keep swimmers from summer lakes may find pharmaceutical interest. In green algae the production of the ‘omega fatty acids’, a popular food supplement with high economic value, could be increased (Pulz and Gross, 2004). With more completed diatom genome sequences, key genes may be identified that dictate the building plans of the myriad of silicon dioxide structures. Indeed, fine tuning of high-precision cheap bioproduction of nanoscale components reads like a ‘patent pending’. Therefore, microalgae have much more to offer us than a cheap hit of diesel. Let us savour them for their unique metabolism and their beauty now, and take comfort in the thought that they will be there to save us when the last barrels of oil are hauled from the earth.

Figure 5.

WUR AlgaePARC: the park will initially comprise four large outdoor pilots (25 m2) that are operated simultaneously with the same strains and feeds, allowing a direct comparison between long-term performance of the systems. The four systems will comprise a horizontal tubular reactor, a vertical tubular reactor, a flat panel and an open pond, which will serve as control since it is the most used system worldwide. The other three systems (photobioreactors) are based on state-of-the-art technology and were chosen to address the most fundamental aspects of photobioreactor design (oxygen accumulation and light intensity). Adapted from http://www.algae.wur.nl/UK/projects/AlgaePARC/.


This project was carried out within the research programme of the Kluyver Centre for Genomics of Industrial Fermentation which is part of the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research.