Decoding algal genomes: tracing back the history of photosynthetic life on Earth




The last decade has witnessed outstanding progress in sequencing the genomes of photosynthetic eukaryotes, from major cereal crops to single celled marine phytoplankton. For the algae, we now have whole genome sequences from green, red, and brown representatives, and multiple efforts based on comparative and functional genomics approaches have provided information about the unicellular origins of higher plants, and about the evolution of photosynthetic life in general. Here we present some of the highlights from such studies, including the endosymbiotic origins of photosynthetic protists and their positioning with respect to plants and animals, the evolution of multicellularity in photosynthetic lineages, the role of sex in unicellular algae, and the potential relevance of epigenetic processes in contributing to the adaptation of algae to their environment.


The term ‘algae’ designates predominantly aquatic photosynthetic eukaryotes that vary from unicells a few microns in diameter to complex multicellular forms such as the giant kelps that can reach more than 30 m in length (Bhattacharya and Medlin, 2004). Most algae are photosynthetic organisms that acquired the trait some 1.8 billion years ago after an unknown non-photosynthetic unicellular eukaryote engulfed or was invaded by a photosynthetic cyanobacterium. This association ultimately resulted in a cell containing a photosynthetic plastid surrounded by two membranes with a highly reduced cyanobacterially derived genome (Gould et al., 2008; Kutschera and Niklas, 2005; Margulis, 1993; Parker et al., 2008). This event, known as the primary endosymbiosis or symbiogenesis (Figure 1), gave rise to the extant Archaeplastida, which include the Glaucophyta, Rhodophyta (red algae) and Chloroplastida (green algae and land plants) (Simon et al., 2009).

Figure 1.

 Schematic diagram summarizing endosymbiotic events and their consequences [adapted from Archibald and Keeling (2002) and Keeling (2010)].
The first photosynthetic eukaryotes evolved from the engulfment or invasion of a photosynthetic prokaryote by an aerobic heterotrophic eukaryote. This primary endosymbiosis has given rise to rhodophytes (red algae), chlorophytes (green algae and land plants), and glaucophytes (a minor group of freshwater algae that has retained a peptidoglycan cell wall around their plastid). Serial secondary endosymbiotic events subsequently occurred involving red and/or green algae that were engulfed by other heterotrophic eukaryotes, ultimately giving rise to a wide range of other algal groups. The events leading to the heterokonts, and perhaps all members of the SAR group, have been proposed to involve serial secondary endosymbioses with a green and subsequently a red alga (Moustafa et al., 2009). Numerous studies have shown that nuclear-encoded proteins that function in mitochondria and plastids are of diverse endosymbiotic evolutionary origins (Reumann et al., 2005; Richards et al., 2006; Moustafa et al., 2009).

It appears that primary endosymbiosis occurred only once, so all plastids are believed to have a monophyletic origin (McFadden and van Dooren, 2004; Rodriguez-Ezpeleta et al., 2005). Notwithstanding, ultrastructural, biochemical and genetic evidence has revealed that additional secondary endosymbioses have occurred multiple times and have given rise to a variety of other photosynthetic eukaryotes, including cryptophytes, haptophytes, heterokonts, and dinoflagellates (Delwiche and Palmer, 1996; Kutschera and Niklas, 2005, 2008). Such events involved a second heterotrophic eukaryote that engulfed not a cyanobacterium but a green or red photosynthetic eukaryote (Figure 1), and resulted in plastids with three or four membranes around them. Both the primary and secondary endosymbioses caused massive losses of genes from the engulfed genome, many of which have been transferred to the host genome (see review by Green in this issue). This process of endosymbiotic gene transfer (EGT) has contributed widely to the evolution of algae.

The fragile protists and their place in the eukaryotic super groups

Historically, the unicellular algae were classified together with other unicellular life forms into a broad kingdom first defined by Ernst Haeckel (1834–1919) as the Protista (Haeckel, 1866). The kingdom conveniently grouped together everything that could not be classified as fungi, animals or plants, but it subsequently became clear that the grouping was highly unsatisfactory because it incorporated organisms with no phylogenetic affiliations, in fact belonging to 30–40 disparate phyla (Simonite, 2005).

Conversely, the grouping of eukaryotes into four supergroups is now gaining some acceptance. Using extensive taxon sampling and sequence comparisons, Burki et al. (2007) proposed the following classification: (i) Unikonts, which comprise Opisthokonts (including animals and fungi) and Amoebozoa; (ii) Archaeplastida; (iii) Stramenopiles, Alveolates, and Rhizaria (SAR); and (iv) Excavates and Discicristates (Figure 2) (Baldauf, 2008). Molecular phylogenetics has revealed that the protists are deeply rooted within many of these groups, so even though the term has no phylogenetic meaning, Haeckel’s view of Protista as ‘a kingdom of primitive life forms’ (Rothschild, 1989) was nonetheless correct. Consequently, although controversies are likely to continue about when a protist stops being a protist and becomes an animal, plant, or fungus, and about how to position macroalgae with respect to them, the concept of a protist as having ancient origins is valuable, with the caveat that modern day protists may not necessarily be good proxies for their ancestors (Figure 3).

Figure 2.

 Eukaryote phylogenetic tree derived from different molecular phylogenetic and ultrastructural studies (adapted from Baldauf, 2008).
Stars indicate algal species for which genome sequences or expressed sequence tags (ESTs) are available, and images are shown for representative species (Volvox carteri, a representative of chlorophytes (from the Joint Genome Institute (JGI) website), Cyanophora paradoxa for glaucophytes, Micromonas for prasinophytes, Chroomonas for cryptophytes, Emiliania huxleyi for haptophytes, Ectocarpus siliculosus for brown algae (all images courtesy of CCMP, the Culture Collection of Marine Phytoplankton at Bigelow Laboratory for Ocean Sciences) and Phaeodactylum tricornutum for diatoms (image courtesy of Atsuko Tanaka)). Numbers 1–4 represent the different super groups of eukaryotes (see text for further information). Red arrows point to the plant, animal and fungal kingdoms.

Figure 3.

 Scheme emphasizing the ancestral nature of protists.
Protists are eukaryotes because they all have a nucleus, but they do not represent a single clade. They display different combinations of diverse trophic modes, motility mechanisms, and life cycles. They are either single celled or multicellular but do not display tissue differentiation.

As we highlight below, analysis of algal genomes has indeed shed light on the early origins of photosynthetic eukaryotes in more striking ways than have studies of plant genomes. The focus of this review is the information that has been derived from analysis of whole genome sequences. Several issues will be discussed such as algal genome and epigenome organisation, the origins of multicellularity, and algal adaptations to their environment.

Update on algal genome sequencing

In spite of their relative simplicity, algae have been the poor relations of higher plants due to the small size of the algal research community and the lack of genetic and molecular tools for assessing gene function. The only alga that has attained the status of model system over the last decades is the green alga Chlamydomonas reinhardtii, studies with which have revealed fundamental insights into the mechanisms of photosynthesis and the evolutionary origins of flagella (Proschold et al., 2005; Merchant et al., 2007). More recently, other algae have become accessible to molecular investigations thanks to the development of tools for reverse genetics. For example, the diatom Phaeodactylum tricornutum and the prasinophyte Ostreococcus tauri can be transformed and are accessible to gene knockdown techniques that exploit RNA interference (RNAi) (Corellou et al., 2009; De Riso et al., 2009).

Why is there this renewed interest in the algae? By generating oxygen through photosynthesis, sequestering large amounts of atmospheric CO2 in the ocean interior (Field et al., 1998), and providing food for other organisms, marine algae have affected the Earth’s geochemistry and climate, and are important buffers against global warming (Bowler et al., 2010). By altering the atmosphere and climate, they permitted the evolution of more complex life forms including ourselves (Finazzi et al., 2010), and because of their primitive origins they are of interest for tracing back the evolution of life on Earth. Algae are also receiving interest as a potential source of biofuel (Chisti, 2008; Radakovits et al., 2010; Scott et al., 2010; Khozin-Goldberg and Cohen, 2011). Furthermore, some algae form inorganic exoskeletons made of amorphous silica or calcium carbonate that display unique combinations of structural, mechanical, chemical and optical features. They are, therefore, of interest for building a variety of devices in large quantities and at low cost with potential applications for drug delivery, biomolecule separation, and computer chip manufacturing (Kroger and Poulsen, 2008; Gordon et al., 2009). Single-celled algae are also valuable model systems for studying cellular functions such as motility and cell division.

As a direct result of this revived interest in the algae, several whole genome sequences have become available over the last decade. The first reported genome sequence of an alga, from the red extremophile Cyanidioschyzon merolae, was published in 2004 (Matsuzaki et al., 2004). During subsequent years, a number of other algal genomes have been reported, from the diatoms Thalassiosira pseudonana (Armbrust et al., 2004) and P. tricornutum (Bowler et al., 2008), from the brown macroalga Ectocarpus siliculosus (Cock et al., 2010), from four different prasinophytes [two strains of Ostreococcus (Derelle et al., 2006; Palenik et al., 2007) and two strains of Micromonas (Worden et al., 2009)] and from three chlorophytes, C. reinhardtii (Merchant et al., 2007), Volvox carteri (Prochnik et al., 2010) and Chlorella variabilis (Blanc et al., 2010) (Table 1). Several other algal genomes have been completed (the polar diatom Fragilariopsis cylindrus, the haptophyte Emiliania huxleyi, and the pelagophyte Aureococcus anophagefferens) or are near completion (e.g., the cryptomonad Guillardia theta and the toxic diatom Pseudo-nitzschia multiseries). A range of expressed sequence tag (EST) projects have also been launched (Table 1). As algal genome sequences have become available, comparative genomics approaches have been used to investigate fundamental questions such as algal metabolisms, evolutionary histories, and the origins of multicellularity.

Table 1.   Update on available algal genome sequences and ongoing and future genome sequencing projects

Unicellular algae, the beginning

Because the symbiogenic events that gave rise to the algae occurred so long ago it is a major challenge to reconstruct the key events. But while it is considered that the primary endosymbiosis occurred only once, evidence for a second more recent primary endosymbiosis has recently been shown for Paulinella chromatophora, an amoeba that contains photosynthetic inclusions of cyanobacterial origin known as chromatophores (Bhattacharya and Archibald, 2006; Theissen and Martin, 2006; Nowack et al., 2010). Although not bona fide chloroplasts, evidence has been found for gene transfer from the chromatophores to the Paulinella nuclear genome and so they have been proposed to be a photosynthetic organelle in an early stage of evolution (Nakayama and Ishida, 2009; Nowack et al., 2010; Reyes-Prieto et al., 2010).

Another kind of association that can assist us in understanding the early events of endosymbiosis is a phenomenon known as kleptoplasty, meaning literally to ‘steal’ chloroplasts. More precisely, algal cells are ingested by heterotrophic eukaryotes and the chloroplasts are retained in a functional state for considerable periods of time, during which the host can benefit from the photosynthetic products generated by the sequestered chloroplast. A quite remarkable case is seen in the dinoflagellate Dinophysis acuminata, which forms a ‘ménage à trois’ with the ciliate Mesodinium rubrum and the cryptomonad Geminigera cryophila. This rather complex association took some time to understand, but it was ultimately found that the dinoflagellate obtains its plastids by ingestion of the ciliate that had previously obtained its plastid by kleptoplasty from the cryptomonad (Gustafson et al., 2000; Park et al., 2006; Johnson et al., 2007; Wisecaver and Hackett, 2010). Remarkably, transcriptome analysis subsequently led to the identification of proteins complete with plastid-targeting peptides in the nuclear genome of D. acuminata, suggesting that it has some functional control of its plastid (Wisecaver and Hackett, 2010).

The above examples represent valuable experimental systems for understanding the process of EGT even in the absence of whole genome sequence information. Following the availability of algal genome sequences, it has been possible to identify genes in the process of EGT (see Green review in this issue). A broader question that can be addressed is whether whole genome sequences can provide additional support for the endosymbiotic origins of the different algal groups, beyond the ultrastructural, biochemical, and genetic evidence on which the theories are based (Margulis, 1993; Kutschera and Niklas, 2004, 2005, 2008). The answer was a resounding yes for the green and red algae (Matsuzaki et al., 2004; Merchant et al., 2007; Worden et al., 2009). For the diatoms, however, a surprise was in store. Whereas the secondary endosymbiotic origins of diatoms and other heterokonts were thought to involve a red alga and a eukaryotic heterotroph, phylogenomic evidence has now emerged for a third partner, a green alga, that most likely preceded the red algal endosymbiont (Moustafa et al., 2009) (Figure 1). In particular, around 1700 of the approximately 10 000 genes present in diatoms appear to have green algal origins, whereas only around 300 have clear red algal affiliations. The diatoms (and perhaps all heterokonts) therefore appear to have arisen from another case of ‘ménage à trois’.
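To put the gene counts above in proportion, the short Python sketch below converts them into shares of the diatom gene complement. The inputs are the rounded values quoted in the text (about 1700 green-affiliated and 300 red-affiliated genes out of roughly 10 000); the function name `share_of_genome` is introduced here purely for illustration, and the results are only as precise as those approximate counts.

```python
# Rounded counts quoted in the text (Moustafa et al., 2009); approximate values.
TOTAL_GENES = 10_000
GREEN_ORIGIN = 1_700
RED_ORIGIN = 300


def share_of_genome(count: int, total: int = TOTAL_GENES) -> float:
    """Return count as a percentage of the total gene complement."""
    return 100.0 * count / total


green_pct = share_of_genome(GREEN_ORIGIN)
red_pct = share_of_genome(RED_ORIGIN)
green_to_red = GREEN_ORIGIN / RED_ORIGIN

print(f"green-affiliated: {green_pct:.0f}% of genes")
print(f"red-affiliated:   {red_pct:.0f}% of genes")
print(f"green:red ratio:  {green_to_red:.1f}")
```

On these rounded figures, genes of apparent green algal origin outnumber those with clear red algal affiliations by more than five to one.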

Genome structure and gene content

In vertebrates, angiosperms, and cereals there is ample evidence for genome duplications prior to major species radiations (Abi-Rached et al., 2002; McLysaght et al., 2002; De Bodt et al., 2005). However, no evidence has been found for such phenomena in secondary endosymbionts. In effect, a secondary endosymbiosis is not unlike a wholesale whole genome duplication event because the process brings together two related but divergent genomes upon which evolution can select beneficial functions prior to genome streamlining. EGT can therefore provide a huge gene pool from which algal diversity can emerge. Additionally, comparison of diatom genomes with other sequenced organisms has revealed the presence of hundreds of genes acquired by horizontal gene transfer (HGT) from bacteria (Bowler et al., 2008). These genes appear to have diverse origins, and some may have been acquired as a result of intricate symbiotic associations between diatoms and bacteria (Bowler et al., 2010). The role of viruses as agents of gene transfer also appears to be significant in aquatic environments and is far more dynamic than expected because some marine viruses have megabase-sized genomes with the potential to carry significant quantities of DNA acquired from their hosts or from elsewhere (Rohwer and Thurber, 2009; Moreau et al., 2010). Pervasive HGTs in aquatic environments are likely to provide additional adaptive value and their impact on genome evolution can be reinforced by continuous duplications of the transferred genes and consequent diversification of the resulting paralogs (Nosenko and Bhattacharya, 2007).

Another aspect revealed by algal genome sequencing is the diversity of transposable elements, some of which form new classes. As a case in point, the P. tricornutum genome was found to be especially enriched in a diatom-specific class of copia retrotransposons, the expression of some of which is responsive to stress events such as nutrient starvation (Maumus et al., 2009). It is, therefore, likely that transposable elements have contributed to the architecture and evolution of algal genomes as well.

As with the acquisition of new genes through EGT and HGT, gene loss, duplication and fusion have also contributed to algal genome size. Genome size varies greatly in eukaryotes and seems to be related not to complexity but rather to genome organisation, such as gene distribution. A typical example is seen in Ostreococcus (Blanc et al., 2010). Ostreococcus tauri has 8166 protein coding genes distributed over 20 chromosomes with an overall gene size of 1.54 kb/gene (Table 2). This number is close to what is observed in the tiny nucleomorph genomes of G. theta, Hemiselmis andersenii and Bigelowiella natans [around 1.1 kb/gene (Archibald and Lane, 2009)]. Micromonas species harbor around 10 000 genes with an average size of approximately 2.2 kb/gene (Worden et al., 2009), while more complex eukaryotes such as Arabidopsis and human have much larger genes (Table 2). Surprisingly, the unicellular green alga C. variabilis has gene sizes similar to C. reinhardtii and Arabidopsis thaliana, even though it has an endosymbiotic lifestyle. This mode of existence has not therefore resulted in altered gene size, in spite of genome size reduction. It would be interesting to do comparative studies between nucleomorphs and other eukaryotes with bigger genomes to understand the evolution of genome size and gene size.
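The kb/gene values quoted above can be read as genome length divided by the number of protein coding genes, which is how the Python sketch below treats them. This reading is an assumption, as is the O. tauri nuclear genome size of 12.56 Mb, which is not stated in the text here and is taken as the figure reported with the genome sequence (Derelle et al., 2006). The function name `kb_per_gene` is illustrative.

```python
def kb_per_gene(genome_size_mb: float, gene_count: int) -> float:
    """Average genome length per gene, in kilobases (1 Mb = 1000 kb)."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return genome_size_mb * 1000.0 / gene_count


# Assumed input: 12.56 Mb genome size (not given in the text above).
# 8166 protein coding genes is the count quoted in the text.
o_tauri = kb_per_gene(12.56, 8166)
print(f"O. tauri: {o_tauri:.2f} kb/gene")
```

Under these assumptions the result rounds to the 1.54 kb/gene quoted for O. tauri, which would be consistent with the figure describing how densely genes are packed along the genome.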

Table 2.   Comparison of genome size and gene size in different species

Whole genome sequencing has also provided information about chromosome structure. The C. merolae genome contains several chromosomes with single A+T-rich regions proposed to be putative centromeres (Matsuzaki et al., 2004). This was indeed confirmed in another study (Maruyama et al., 2008). However, these A+T-rich regions are not associated with repetitive sequences like most centromeric regions in other species, but rather resemble the ‘point’ centromere structures found in Saccharomyces cerevisiae (Sullivan et al., 2001). Ostreococcus tauri seems to have the same distribution of A+T-rich regions although this is not as obvious as in C. merolae (Derelle et al., 2006). Additional studies are clearly required among other algae to throw more light on centromere structure. For example, diatoms appear to have unorthodox mechanisms for attaching the chromosomes to the mitotic spindle during cell division (De Martino et al., 2009), perhaps suggesting a novel centromere structure in these organisms.

A highly unusual chromosome structure characterizes dinoflagellates, a large and diverse group of eukaryotic algae that appears to have common ancient relations with stramenopiles such as diatoms (Figures 1 and 2). They have very high amounts of DNA, often larger than the human genome, and it is packaged in a liquid crystal state (Bouligand and Norris, 2001). They also lack histones but have histone-like proteins, and their chromosomes are constantly condensed with strands of DNA at the periphery which might be the only ones that are transcribed. How these huge permanently condensed chromosomes are packaged within the cell nucleus without histones and nucleosome structures is a fascinating question. Phylogenetic analysis supports the origin of the histone-like protein of the dinoflagellate Crypthecodinium cohnii from bacteria (Wong et al., 2003). Although challenging, the genomics of dinoflagellates may be key to understanding the evolutionary development of histones and nucleosomes, which ultimately led to more efficient packaging of DNA into structured chromosomes in other eukaryotes.

The cocktail of genes derived from endosymbiotic and horizontal gene transfer in diatoms has provided them with a remarkable capacity to perform a range of novel metabolisms, such as a urea cycle fueled by the products of a carbamoyl phosphate synthase gene derived from the eukaryotic host and a carbamate kinase gene derived from bacteria (Allen et al., 2006). Other bacterial genes uniquely interconnect proline and polyamine metabolism with the urea cycle, which may be of importance for construction of the diatom-specific silicified exoskeleton (Bowler et al., 2010).

Adaptation to environmental stresses and nutrient availability has contributed to the divergence of related species and to defining ecological niches. Among red algae, Galdieria sulphuraria and Cyanidioschyzon merolae are characteristic of acid thermal environments, and so require genes for survival in these extreme conditions (Barbier et al., 2005). Notwithstanding, it was shown that more than 30% of the Galdieria sequences did not relate to any of the C. merolae genes. Among them, a large number of membrane transporters and carbohydrate metabolism genes are proposed to underlie the metabolic versatility of G. sulphuraria. Conversely, the presence and expression of genes such as ferritin and LHCX in some but not all diatoms allows survival in particular conditions, such as in chronically limiting iron or in variable light environments (Marchetti et al., 2009; Bailleul et al., 2010). Comparative studies in unicellular algae have also revealed a striking number of proteins with unknown functions. These proteins seem to be associated with certain metabolic regimes but have no clear function (Maheswari et al., 2009). Many are likely to have critical roles as they have been maintained during evolution. Understanding their function is therefore an important priority for gaining insights into algal biology.

Sex in unicellular algae

Although we have a good understanding of the nature and function of sex chromosomes in mammals, little is known about sex determination in other organisms. The plant kingdom includes many taxonomic groups with sex chromosomes, such as mosses (Marchantia polymorpha), gymnosperms (Ginkgo biloba), monocotyledonous and dicotyledonous angiosperms (Asparagus officinalis and Silene latifolia, respectively). As in humans, this latter species possesses large heteromorphic sex chromosomes, denoted XY in males and XX in females (Vyskot and Hobza, 2004). Ostreococcus tauri also displays a heterogeneous genome with two of its chromosomes having divergent G+C content and transposable element distribution with respect to the other 18. For one of them, chromosome 2, a suppression of recombination has also been noted and so it has been hypothesized to be a sex chromosome (Derelle et al., 2006) (Figure 4). Analysis of individual clones isolated from the environment has provided further support for cryptic sexual cycles in Ostreococcus spp. (Grimsley et al., 2010). The C. reinhardtii and V. carteri genomes contain more conventional sex chromosomes, with mating loci that are linked and that segregate in a Mendelian fashion (Merchant et al., 2007; Ferris et al., 2010).

Figure 4.

 Chromosome structure in Ostreococcus tauri.
Two unusual chromosomes of O. tauri, chromosomes 2 and 19, are enriched in transposable elements and bacterial genes and have been hypothesized to represent sex chromosomes. Taken from Derelle et al. (2006). Coloured bars indicate the percentage of G+C content (upper bar) and of transposable elements (lower bar).
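As an illustration of how chromosomes with atypical base composition might be flagged, the Python sketch below computes the G+C fraction of each sequence and reports those that deviate from the median by more than a chosen threshold. The sequences, the chromosome labels, and the 0.10 threshold are toy values invented for this example; they are not O. tauri data, and the approach is a simple stand-in rather than the analysis performed by Derelle et al. (2006).

```python
from statistics import median


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases among the A, C, G, T bases in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


def gc_outliers(chromosomes: dict, threshold: float = 0.10) -> list:
    """Names of sequences whose G+C fraction differs from the median
    by more than threshold (an arbitrary cut-off for illustration)."""
    gc = {name: gc_fraction(seq) for name, seq in chromosomes.items()}
    mid = median(gc.values())
    return [name for name, value in gc.items() if abs(value - mid) > threshold]


# Toy sequences (hypothetical, not real genome data).
toy = {
    "chrA": "GCGCATATGC",  # 6 of 10 bases are G or C
    "chrB": "GGCCATATGC",  # 6 of 10
    "chrC": "GCATGCATGC",  # 6 of 10
    "chrD": "ATATATGCAT",  # 2 of 10
}
print(gc_outliers(toy))
```

With real data, the same comparison would normally be made over whole chromosomes or sliding windows, and the threshold would need to be chosen with the genome-wide spread of G+C content in mind.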

Comparative genomics of single celled algae has also revealed a number of known meiosis-specific proteins and homologues of proteins promoting gametic cell wall disintegration and gamete fusion in species such as Chlorella variabilis that have not been reported to have a sexual cycle (Blanc et al., 2010). This finding is furthermore supported by the presence of several homologues of Chlamydomonas flagellar proteins, even though Chlorella is not motile. Similarly, sexual cycles have not been reported in T. pseudonana or P. tricornutum, and yet they have also retained meiosis-related genes (Armbrust et al., 2004; Palenik et al., 2007; Bowler et al., 2008), as have Ostreococcus spp. (Grimsley et al., 2010).

Also of interest with respect to sexual life histories, it has been found that the haptophyte E. huxleyi can escape infection with giant viruses denoted EhVs when in its haploid form, but is susceptible to infection during the diploid stage of its life cycle (Frada et al., 2008). The mechanisms that confer viral resistance on the haploid phase remain unclear but may be due to membrane composition changes in haploid cells that prevent the virus from attaching to the cell. Indeed, genome analysis of the EhV86 virus revealed an unexpected cluster of sphingolipid biosynthesis genes that seem to have a unique substrate preference for myristoyl coenzyme A rather than palmitoyl coenzyme A (Wilson et al., 2005). When the lipid composition of uninfected and infected host cells was compared (Vardi et al., 2009), the glycosphingolipids in uninfected host cells appeared to be composed predominantly of hydroxyl-sphingolipids derived from palmitoyl-CoA that apparently prevent them from being infected. It should also be noted that the haploid cells lack external coccoliths.

The lack of information about meiotic cycles in algae might be due to sex determination processes being controlled by the environment, which makes them harder to follow and study. If so, the phenomenon of life cycle phase transition to escape virus attack could be broader than thought. By analogy with E. huxleyi, algae with meiosis-related genes but no known sexual cycle might have virus-resistant haploid phases that might only become detectable after viruses have decimated the diploid phase. Therefore, sequencing more algal genomes is necessary to make meaningful comparisons and to better understand sex chromosome origin and evolution.

Evolving into multicellular algae

Early life on Earth was most likely single celled. The passage from unicellular to multicellular life forms appears to have occurred several times during the evolution of eukaryotes (Cavalier-Smith, 2006). Several hypotheses speculate on the origin of multicellularity such as the failure of separation of daughter cells after unicellular organisms divided, followed by the subsequent maintenance as a group of cells or the aggregation of a group of functionally similar cells that pursued their growth as a unit. Fossilized remains of ancient life forms can be a big help for investigating such hypotheses, but because the first multicellular organisms probably lacked hard body parts they have not been well preserved (Kutschera and Niklas, 2004). Today, we can look around and compare existing species, from simple single celled organisms to species with more complex body plans, clumps of cells or more organized body architectures to learn more about this transition.

Comparative genomics also provides an opportunity to look back over geologically relevant time scales to explore how multicellular life might have happened. The transition from simple to highly organized organisms took place independently and repeatedly in the past. Was this transition to cope with environmental changes that contributed to the emergence of features such as cell-cell communication and coordination? Is this the job of newly acquired genes or duplicated gene families? It has been suggested from a comparative genomic study that recruitment of existing genes for new functions and the evolution of novel proteins by combination of existing protein domains contributed to the emergence of organismal complexity (Prochnik et al., 2010). Specifically, Prochnik et al. (2010) used two related green algae, the multicellular V. carteri and the single celled C. reinhardtii, to show that the number of protein coding genes is the same in each, with only a few cases of species-specific proteins. However, V. carteri displays crucial features that distinguish single cells from multicellular forms. Colonies have about 2000 flagellated somatic cells within a glycoprotein-rich extracellular matrix that contains 16 large reproductive cells for sex. But the organism can also propagate by asexual reproduction by cloning itself. These features may be why a higher number of genes responsible for glycoprotein formation and cell differentiation have been found in the V. carteri genome with respect to C. reinhardtii. This striking difference between single and multicellular algae parallels what has been found in animals (King et al., 2008).

One step higher in organismal complexity is the filamentous brown alga E. siliculosus. Many biologists wrongly consider the larger macroalgae such as the seaweed Ectocarpus to be plants, but the absence of highly differentiated cells and lack of true roots, stems, leaves and embryos in fact excludes them from the plant kingdom (Margulis and Schwartz, 1998). Furthermore, many have secondary endosymbiotic origins. Sequencing of the Ectocarpus genome has revealed the presence of membrane bound receptor kinases proposed to be important for cell to cell communication, a feature of multicellularity (Cock et al., 2010). The Ectocarpus genome also encodes auxin indole-3-acetic acid biosynthesis genes. In particular an auxin-inducible gene, EsGRP1, revealed from a small-scale microarray experiment, has an expression profile positively correlated with morphogenetic changes (Le Bail et al., 2010). Genes such as these may have contributed to the establishment of multicellularity, although the Chlorella genome also encodes homologues of Arabidopsis hormone receptors for auxin, cytokinin and abscisic acid, suggesting the establishment of these genes prior to the evolution of land plants (Blanc et al., 2010). It is therefore possible that unicellular ancestors used these molecules to signal with each other, and that this may have facilitated the evolution of multicellularity.

The concept of land plants being descended from ancestral Charophycean algae is supported by a handful of studies (Karol et al., 2001; Steemans et al., 2009). However, details of this phylogeny are unclear, especially because fossil records are scarce and not very conclusive. Analysis of the genomes of two Prasinophyte Micromonas spp. (Worden et al., 2009) has provided evidence confirming previous studies on land plant origins (Guillou et al., 2004; Niklas and Kutschera, 2009, 2010). It was shown, for example, that among several transcription factor and nutrient transporter gene families, many are only found in land plants, placing Micromonas and other Prasinophytes such as Ostreococcus among the closest living relatives of the common ancestor of Chlorophytes and Streptophytes (land plants and Charophytes) (Figure 2). Comparing genome sequences of moss, vascular plants and algae has shown that transition from water to land correlates with the loss of genes related to aquatic environments, a gain in photoperceptory complexity, increased tolerance to abiotic stresses, signalling and transport capacities, and an overall increase in gene family complexity (Rensing et al., 2008).

Evidence for epigenetic regulatory mechanisms

Counter to common thinking, the aquatic habitats where algae are typically found are not homogeneous. Besides pelagic and benthic environments, they are composed of a variety of microenvironments to which life can adapt, such as aggregates (known as marine snow), fecal pellets, and the bodies of other organisms. Algae also have to adapt to changing light intensities, temperature variations, turbulence, grazing pressure, pathogen attack, nutrient limitation, and intoxication with toxic molecules such as heavy metals and organic toxins. Additionally, numerous symbiotic interactions of varying dependencies are known between different organisms. To explain the tremendous diversity observed within microbial communities the Dutch microbiologist Laurens Baas Becking proposed in 1934 that ‘Everything is everywhere but the environment selects.’ (de Wit and Bouvier, 2006). The concept is receiving renewed interest (e.g., Fuhrman, 2009; Bowler et al., 2010), but the extent of standing stock variation within a given species upon which the environment can select, and the extent to which the environment can generate new variation are currently unresolved (Figure 5). Important issues to be explored are whether algae exposed to highly changing environments have higher mutation rates, and if so why? Besides examining genome sequences from clonal isolates, it would also be of interest to examine whether there are differences in their DNA repair machineries. One could also imagine that DNA sequence polymorphisms may not provide sufficient leverage upon which the environment can select, because point mutation-based processes may be too slow to permit adaptation to a dynamic ocean environment. It is, therefore, possible that the ecological success of photosynthetic eukaryotes is also due to the adaptive dynamics conferred by epigenetic regulation mechanisms, such as reversible histone modifications and DNA methylation (Figure 5).

Figure 5.

 Scheme summarizing adaptive dynamics in a rapidly changing ocean environment.
A given plankton ecosystem can be considered as a collection of organisms that are adapted to a particular environment. The variation present within this environment can be considered the standing stock variation. In response to environmental change (e.g., nutrient limitation, irradiance, increase in CO2, pathogens or predators), selection can act on this standing stock DNA sequence variation to favour organisms better suited to the new conditions. The environment could also induce additional variation, which could confer heritable phenotypic plasticity through epigenetic mechanisms such as DNA methylation, histone modifications, or small RNAs. The flexibility provided by epigenetic mechanisms, in contrast to irreversible point mutation-based processes, may be particularly useful in a rapidly changing environment such as the ocean.

Several studies have reported the presence of histone methyltransferases, DNA methylation and small RNAs in algae (Montsant et al., 2007; De Riso et al., 2009; Feng et al., 2010; Zemach et al., 2010). More specifically, DNA methylation has been found in C. reinhardtii, C. variabilis, V. carteri, T. pseudonana and P. tricornutum. It is, however, undetectable in E. siliculosus, despite the presence of small RNAs that map to positions corresponding to transposable elements (Cock et al., 2010). This situation suggests that small RNAs might be involved in the regulation of transposon activity in Ectocarpus. Chlorella genes were found to be consistently methylated in the middle of the gene, away from the 5′ and 3′ ends (known as body methylation), with a sharp decrease in methylation in the promoter regions of genes whose expression levels were negatively correlated with methylation (Feng et al., 2010; Zemach et al., 2010). Chlorella shares some methylation components with Arabidopsis, as it has a cytosine methyltransferase homologue known so far to be present only in Arabidopsis. Body methylation seems to be an ancestral feature, as it is common to all algae investigated for methylation so far, as well as to plants and animals (Feng et al., 2010; Zemach et al., 2010). DNA methylation also appears to be regulated by stress responses in algae such as the diatom P. tricornutum, in which certain transposons become hypomethylated under nitrogen-starved conditions or in response to signals such as toxic aldehydes (Maumus et al., 2009).

Chlamydomonas reinhardtii, P. tricornutum and T. pseudonana have all been found to possess putative RNAi machinery components. In P. tricornutum, gene knockdown using antisense or inverted repeat constructs is feasible, but the nature of the small RNAs and their mode of action remain to be determined (De Riso et al., 2009). Chlamydomonas reinhardtii, on the other hand, has a complex small RNA system with different classes of miRNAs and siRNAs reminiscent of plant small RNAs, suggesting that these pathways might represent ancient mechanisms of gene regulation that evolved prior to the emergence of multicellularity (Zhao et al., 2007). The occurrence of DNA methylation, small RNAs and histone modifications in marine algae suggests that epigenetic regulation is an ancient regulatory mechanism that contributed to genome evolution in the oceans before the advent of land plants. Deciphering algal epigenomes will help us to understand how ancient the epigenetic code is and how it has evolved. For instance, such information has already shown that cytosine methylation is mediated by the same enzymatic superfamily in bacteria, archaea and eukaryotes (Goll and Bestor, 2005).

Algal genes in animals

Evolution of species and the acquisition of new functions are the ultimate result of symbiogenesis, gene or genome duplications, gene shuffling, acquisition of organelles and genes from the environment, and other complex exchanges of entities and genetic information. The contribution of algal genes to evolution may even go beyond the plant kingdom, perhaps driven by kleptoplasty. As a case in point, a sea slug in the opisthobranch order of gastropods, Elysia chlorotica, has evolved a protective mechanism dependent largely upon camouflage provided by symbiont plastids. This organism feeds on algal cells, retains only the chloroplasts, and performs photosynthesis that in some cases can sustain the sea slug for several months in the absence of an algal food source (Rumpho et al., 2000). Although the expression of algal genes in the nucleus of the sea slug has not yet been firmly established (Hehemann et al., 2010), the system provides a striking example of the innovative capacity of evolutionary processes. A more recent study suggests the wide occurrence of plastid transfer among eukaryotes. Over 100 genes of possible algal origin were found in Monosiga, a unicellular choanoflagellate; choanoflagellates are considered to be the closest known relatives of metazoans and to be primitively heterotrophic (Figure 2) (Sun et al., 2010). We should therefore bear such phenomena in mind when performing comparative genomics studies.

Current and future directions and challenges

The genome sequencing projects currently in progress (Table 1) will enrich our knowledge of algal evolution and provide information about specific adaptations in individual organismal groups. The sequences from glaucophytes, chlorophytes, and prasinophytes will provide insights into the early evolutionary events that shaped the green algae and the transition to higher plants. Conversely, sequences from red algae adapted to non-extreme environments, and from a representative of the haptophytes, will provide much-needed information to relate such organisms to the green algal lineages and to better assess the origins of organisms such as stramenopiles and alveolates. The Aureococcus and Pseudo-nitzschia sequences may reveal information relevant to the understanding of harmful algal blooms (HABs), and the Fragilariopsis sequence promises to inform us about adaptations to the extreme conditions of brine channels within polar ice.

Until recently, huge genome sizes prevented the sequencing of more algal genomes that could help to illuminate further key events during the evolution of photosynthetic life on Earth (Blankenship, 2010). However, more sophisticated sequencing technologies and decreasing costs are going to make such genomes accessible to the research community in the near future. Indeed, many other draft genomes are already in the pipeline. For diatoms, widely distributed Skeletonema species will be of interest, as well as Leptocylindrus danicus. The latter occupies an evolutionary position that is ideal for comparative studies in the diatoms, as it lies at the root of the branching tree (Kooistra et al., 2003). It will also be of great interest to sequence a representative of the Charophycean green algae, relatives of the presumed ancestor of land plants, which might bridge the gap in understanding the evolution of higher plants from algae (Niklas and Kutschera, 2009, 2010). Such genomes can be used for comparative analyses to trace why and how new traits and gene networks came together to shape life on land. Dinoflagellate genomes should also become tractable, and a genome sequence from the Rhizaria should provide much-needed clarification of the origins of this group and their relationship with stramenopiles and alveolates.

Until recently, sequencing a unicellular organism required the availability of cells in culture. But now a new sequencing technology may open the way to whole genome sequencing of single cells from unculturable organisms. Single cell genomics techniques combine high-speed fluorescence-activated single cell sorting or robot-assisted single cell isolation with whole genome amplification, by a process called multiple displacement amplification, and subsequent sequencing. Single cell genomics is now possible for cells from bacterial communities (Woyke et al., 2010) and is being adapted to human microbial flora. What might hinder single cell genomics in single celled photosynthetic eukaryotes is contamination with chloroplast and mitochondrial genomes. It may therefore be necessary to isolate a single nucleus for whole genome amplification and sequencing. Such a challenge is now tractable. The coming years will undoubtedly see a more rapid accumulation of sequence data and the generation of new sophisticated technologies to answer complex questions at the cellular and organelle level and to help understand the evolutionary trajectories of algal species.


The authors thank the three anonymous reviewers for their valuable comments and suggestions to improve the manuscript. The authors are also grateful to Hervé Moreau, Angela Falciatore, Maurizio Ribera D’Alcalà and David Claessen for their critical reading of the manuscript. Research in the authors’ laboratory is supported by the Agence Nationale de la Recherche.