- I. Introduction
- II. Small interfering RNAs
- III. MicroRNAs
- IV. Trans-acting siRNAs
- V. Conclusion and perspectives
RNA has many functions in addition to being a simple messenger between the genome and the proteome. Over two decades, several classes of small noncoding RNAs c. 21 nucleotides (nt) long have been uncovered in eukaryotic genomes, which appear to play a central role in diverse and fundamental processes. In plants, small RNA-based mechanisms are involved in genome stability, gene expression and defense. Many of the discoveries in this new ‘small RNA world’ were made by plant biologists. Here, we discuss the three major classes of small RNAs that are found in the plant kingdom, namely small interfering RNAs, microRNAs, and the recently discovered trans-acting small interfering RNAs. Recent results shed light on the identification, integration and specialization of the different components (Dicer-like, Argonaute, and others) involved in the biogenesis of the different classes of small RNAs in plants. Owing to the development of better experimental and computational methods, an ever increasing number of small noncoding RNAs are uncovered in different plant genomes. In particular the well-studied microRNAs seem to act as key regulators in several different developmental pathways, with a marked preference for transcription factors as targets. In addition, an increasing amount of data suggest that they also play an important role in other mechanisms, such as response to stress or environmental changes.
For many years, noncoding RNA (ncRNA) genes were regarded as relics of an RNA-based origin of life (Gilbert, 1986; Gesteland et al., 1999). However, as more and more ncRNAs were uncovered, owing to novel experimental and computational approaches (Olivas et al., 1997; Argaman et al., 2001; Huttenhofer et al., 2001; Rivas et al., 2001; Wassarman et al., 2001), it became clear that many of them had highly specialized biological roles and were not mere ‘molecular fossils’. In this ‘modern RNA world’ vision (Eddy, 2001), many ncRNAs are involved in functions requiring sequence-specific recognition of another nucleic acid sequence. Such a task is easily performed by RNA molecules through sequence complementarity, which fuelled the early idea that ncRNAs would be well suited as regulatory molecules. Indeed, in 1961, François Jacob and Jacques Monod put forward the hypothesis that regulatory genes could produce RNA molecules that would interact with operators by base pairing (Fig. 1), either at the transcriptional level (model I), or the post-transcriptional level (model II). A similar proposal was made a few years later by Britten & Davidson (1969) to explain eukaryotic gene regulation. These views were quickly abandoned after the discovery that protein complexes were involved in the control of almost every step of gene expression. However, the ‘RNA breakthrough’ at the beginning of this century (Couzin, 2002), with crucial and pioneering contributions from the field of plant biology, was in a way a revival of some of the early ideas.
In 1990, two groups published the same unexpected experimental result, in which overexpression of a gene coding for a chalcone synthase to produce deep purple petunia flowers gave white flowers instead (Napoli et al., 1990; Smith et al., 1990; van der Krol et al., 1990). At that time, this phenomenon, named ‘cosuppression’, had no plausible explanation. Although in plants part of the mystery was solved during the 1990s (Palauqui et al., 1997; Voinnet & Baulcombe, 1997), the molecular mechanism was first discovered in the worm Caenorhabditis elegans by studying RNA interference (RNAi; Guo & Kemphues, 1995; Fire et al., 1998; Montgomery et al., 1998; Elbashir et al., 2002). RNAi is present in a broad spectrum of eukaryotes under different names, such as post-transcriptional gene silencing in plants (PTGS; Hamilton & Baulcombe, 1999), and quelling in fungi (Cogoni et al., 1996) and algae (Wu-Scharf et al., 2000). The RNAi pathway is thought to act as an immune system against invading nucleic acids coming from viruses, transposons or transgenes (Plasterk, 2002). The silencing is triggered by the presence of long double-stranded (ds) RNA molecules in the cell. These can be synthetic RNAs, replicating viruses or even the result of the transcription of nuclear genes. The dsRNA molecules are chopped into very small pieces of RNA of c. 21 nt, referred to as small interfering RNAs (siRNAs), by a specific enzyme named DICER (Bernstein et al., 2001). These siRNAs are incorporated into an RNA-induced silencing complex (RISC) that recognizes, binds and induces cleavage of perfectly complementary mRNAs (Hamilton & Baulcombe, 1999; Hammond et al., 2000; Hannon, 2002; Zamore et al., 2002). The core component of RISC is a member of the Argonaute protein family, which has RNA-binding ability (Hammond et al., 2000).
This fundamental discovery makes the artificial silencing of virtually any gene possible with artificially engineered siRNAs, without even the need to know the complete gene sequence, a technique now routinely and widely used in functional genomics. Medical treatments using RNA interference are beginning to be developed (Soutschek et al., 2004).
Another type of small noncoding RNA is the microRNA (miRNA), of which the first one, lin-4, was discovered by Victor Ambros and co-workers (Lee et al., 1993). Experiments showed that mutations in the lin-4 locus disrupted the developmental timing (i.e. normal temporal progression of developmental events) in C. elegans. The authors isolated a 693-nt long DNA fragment by positional cloning that could rescue the phenotype of mutant animals. Ambros and colleagues gradually realized (see Lee et al., 2004a; Ruvkun et al., 2004 for an accurate and very lively coverage of events) that they were not dealing with a classical protein-coding gene but with a gene encoding a tiny ncRNA 22 nt long. It was noticed later that the miRNA lin-4 had antisense complementarity to the transcript of the gene lin-14, at several places in the 3′ UTR (Wightman et al., 1993). Many classical aspects of miRNA biogenesis were described in those first papers, such as the processing of the small RNA from longer precursor molecules that could form hairpin-like secondary structures, which is now considered the hallmark of miRNAs. Lee et al. (1993) also anticipated that lin-4 might represent a class of regulatory genes that encode small antisense RNA products. Nevertheless, this astonishing discovery remained unnoticed for some years, and was considered to be an exotic worm-specific process. No evidence for lin-4-like miRNAs was found in other organisms and no similar small ncRNAs were detected in nematodes until, in 2000, the miRNA gene let-7 was found to also act in the developmental timing of C. elegans, more specifically in the transition from the late larval stage to the adult stage (Reinhart et al., 2000; Slack et al., 2000).
Interestingly, homologs of the let-7 gene could be identified in the fly and human genomes (Pasquinelli et al., 2000) and only 1 yr later, dozens of novel miRNA genes were identified in flies, human and worms by three different groups (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee & Ambros, 2001).
In conclusion, different small RNA silencing mechanisms have been observed in animals, plants and fungi and are therefore assumed to have evolved from a unique ancestral pathway. However, plants are somewhat unusual in that they have highly diversified small RNA-based pathways (siRNAs, miRNAs and ta-siRNAs) where other organisms have evolved (or retained) only one. For example, in animals, all the known examples of natural silencing involve miRNAs only. Finally, the budding yeast has apparently lost even the ancestral pathway (Baulcombe, 2004).
In this review we discuss the three major classes of plant small RNA pathways and their functions: small interfering RNAs, microRNAs and the recently discovered trans-acting small interfering RNAs.
Small interfering RNAs are generally defined as small RNAs that silence transcripts from which they originate (Bartel, 2004). They were first described in plants, where it was shown that the silencing of three transgenes involved a small antisense RNA c. 25 nt long complementary to each targeted mRNA (Hamilton & Baulcombe, 1999; Hamilton et al., 2002; Tang et al., 2003). In plants, siRNAs have a variety of functions that can be grouped in at least two broad categories: those that trigger changes in the chromatin state of elements from which they derive and those that derive from and defend against exogenous RNA sequences such as viruses or sense transgene transcripts (for review see Baulcombe, 2004).
The siRNAs and miRNA sequences are targeted to a complex called RNA-induced silencing complex (RISC; Fig. 2). Argonaute (AGO) proteins are a core component of this complex (Hammond et al., 2000; Nykanen et al., 2001; Schwarz et al., 2003; Pham et al., 2004; Tomari et al., 2004). In Arabidopsis, AGO1 (Bohmert et al., 1998) is associated with miRNAs, trans-acting siRNAs and transgene-derived siRNAs but not with virus-derived siRNAs and siRNAs involved in chromatin silencing (Fagard et al., 2000; Boutet et al., 2003; Vaucheret et al., 2004; Kidner & Martienssen, 2005). Mutants of AGO1 were shown to be hypersensitive to virus infections (Morel et al., 2002). Some transposons were shown to be upregulated in ago1 mutants (Lippman et al., 2003). It was also shown that at least in vitro, AGO1 does not seem to have other partners in RISC and would be solely interacting with small RNAs, unlike what is observed in animals (Baumberger & Baulcombe, 2005).
The RNA-dependent RNA polymerases (RDRs) RDR1 and RDR6 are required in the siRNA pathway that silences viruses and transgenes (Dalmay et al., 2000; Mourrain et al., 2000; Xie et al., 2001). Those proteins turn single-stranded (ss) RNA into dsRNA, with or without a siRNA as a primer (Baulcombe, 2004). As a result of the action of RDRs, a single RNA or primary siRNA molecule can generate many dsRNA, thus amplifying the response (Fig. 2b). Unexpectedly, it was shown that RDR6 might also repress the expression of a miRNA, miR165/166 (Li et al., 2005).
The dsRNA structures are further processed by a member of the Dicer family that generates small RNAs from double-stranded RNA sequences with 2-nt overhangs at the 3′ ends (Bernstein et al., 2001). In Arabidopsis, four Dicer-like (DCL) proteins are known, each one having a distinct function in different small RNA pathways (Schauer et al., 2002; Xie et al., 2004; Dunoyer et al., 2005; Gasciolli et al., 2005; Xie et al., 2005b). DCL1 produces miRNAs, DCL2 produces siRNAs involved in the silencing of at least some viral sequences (Xie et al., 2004) and DCL3 produces siRNAs involved in DNA methylation and heterochromatin formation (Xie et al., 2004). Recent work suggests that DCL4 produces siRNAs triggered by inverted-repeat transgenes in plants (Dunoyer et al., 2005) and is also associated with the ta-siRNA pathway (Gasciolli et al., 2005; Xie et al., 2005b; Yoshikawa et al., 2005). Some results have also shown that a partial functional redundancy amongst the different Dicer-like proteins in Arabidopsis is possible (Gasciolli et al., 2005; Xie et al., 2005b).
Virus dsRNA sequences are recognized by the RNA silencing machinery, which produces siRNAs that silence viral genes and prevent the accumulation of the pathogen (for review see Dunoyer & Voinnet, 2005). Virus defense via siRNA is likely to be an ancient mechanism and therefore viruses have evolved various ways to bypass this barrier (Baulcombe, 2004; Dunoyer & Voinnet, 2005; Simon-Mateo & Garcia, 2006). The best-known mechanism involves the production of a protein by the virus that blocks the silencing pathway of the host (Voinnet et al., 1999; Mallory et al., 2002; Kasschau et al., 2003; Ye et al., 2003; Chapman et al., 2004; Dunoyer et al., 2004; Lakatos et al., 2004). But other mechanisms might also exist. In plants, the expression of siRNAs from inverted-repeat transgenes mimics the symptoms observed during viroid infections, suggesting RNA silencing of the host genes (Wang et al., 2004b). Conversely, many Arabidopsis siRNAs do not show a high degree of similarity to any Arabidopsis mRNA. Therefore, one interesting hypothesis is that they could constitute a reservoir of defense molecules because of their complementarity to viral sequences (Dunoyer & Voinnet, 2005). This RNA silencing defense might not be limited to viruses: it was also shown recently that bacterial infection by a virulent Agrobacterium tumefaciens triggered a rather complex siRNA-mediated silencing response (Dunoyer et al., 2006).
Another striking feature of siRNA-mediated silencing in plants and in some animals is its systemic nature: the effect of silencing can extend beyond the site of initiation and spread through the organism (for a review see Voinnet, 2005). Recent work suggests that DCL4 is responsible for the production of the 21-nt long siRNAs involved in the cell-to-cell silencing signal (Dunoyer et al., 2005). The movement of siRNAs or miRNAs could be important for the regulation of endogenous genes. For example, it is known that the distribution of miR165/166 in the leaf, where they act as repressors of genes affecting leaf polarity, resembles that of a mobile signal (Emery et al., 2003; Juarez et al., 2004; Kidner & Martienssen, 2004). Many siRNAs and miRNAs were detected in the phloem sap of pumpkin, where a protein has been characterized that binds specifically to small RNAs (Yoo et al., 2004).
Recently, a study revealed that an antisense overlapping gene pair generated two types of siRNAs involved in salt-stress tolerance (Borsani et al., 2005). Those genes are P5CDH, a stress-related gene, and SRO5, a gene of unknown function. When both transcripts are present, a 24-nt siRNA is formed by a biogenesis pathway dependent on DCL2, RDR6, SGS3 and NRPD1A. The cleavage of the P5CDH transcript by the 24-nt siRNA then sets the phase for the generation of 21-nt siRNAs by DCL1 that will further cleave the P5CDH transcript. The expression of SRO5 is induced by salt, a step thus necessary for the initial siRNA formation. This elegant study shows that endogenous siRNAs (dubbed nat-siRNAs), derived from a pair of natural cis-antisense transcripts, regulate salt tolerance. Given that overlapping genes are not rare in many eukaryotic genomes, nat-siRNA-based regulation might also occur in many other processes (Borsani et al., 2005).
The link between siRNAs and chromatin modifications was mostly explored in the fission yeast Schizosaccharomyces pombe, where it appears that siRNA-mediated heterochromatin modification is a general mechanism for regulating gene expression (for a review see Lippman & Martienssen, 2004; Gendrel & Colot, 2005). The existence of a mechanism of de novo methylation of genes that can be induced and targeted in a sequence-specific manner was first shown in plants (Wassenegger et al., 1994). A link between locus-specific siRNAs and histone modifications (deacetylation) or histone H3 lysine 9 methylation was later shown in plants (Aufsatz et al., 2002; Jackson et al., 2002; Zilberman et al., 2003; Xie et al., 2004).
Experiments have shown that expressed siRNAs match transposable element sequences that could form imperfect RNA duplexes (Mette et al., 2002). More recently, high-throughput sequencing of expressed small RNAs in Arabidopsis (Lu et al., 2005a) showed that many siRNAs are associated with transposons silenced by methylation. Maintaining this silenced state involves a low level of transcription, which is a paradox because silencing inhibits transcription by known DNA-dependent RNA polymerases (Lippman & Martienssen, 2004). However, the recent finding of a new RNA polymerase might help to solve this issue. Several studies have described a new RNA polymerase named POL-IV that directs heterochromatic silencing, although the mechanism is not yet clear (Herr et al., 2005; Kanno et al., 2005; Onodera et al., 2005; Pontier et al., 2005; Vaughn & Martienssen, 2005). The RNA-dependent RNA polymerase RDR2 was shown to be required for heterochromatin formation, as were the Dicer-like protein DCL3 and the Argonaute protein AGO4 (Zilberman et al., 2003; Xie et al., 2004; Zilberman et al., 2004).
Unexpectedly, like siRNAs, miRNAs might also contribute to DNA methylation in Arabidopsis. There is experimental evidence that miR165/166, which targets the PHAVOLUTA (PHV)- and PHABULOSA (PHB)-encoding mRNAs, induces methylation of the PHV and PHB genes downstream of the miRNA target sites (Bao et al., 2004). MicroRNA pairing very likely takes place with the nascent but already spliced transcript, implying that miRNAs may also be active in the nucleus, at least in plants.
Plant miRNAs were first identified in early 2002 (Llave et al., 2002a; Park et al., 2002; Reinhart et al., 2002). Like their animal counterparts, they are short sequences c. 21 nt long, processed from longer precursor sequences. While animal precursor sequences usually have a length of 70–80 nt, plant miRNA precursor sequences are much more variable and range from 50 nt to more than 350 nt. First discovered in Arabidopsis thaliana, some plant miRNAs were found to be conserved in many plant genomes such as those of Oryza sativa, Zea mays and those of more ancient vascular plant genera such as ferns or even nonvascular plants such as mosses (Floyd & Bowman, 2004; Axtell & Bartel, 2005).
Among the main differences between plant siRNAs and miRNAs is that the latter are processed from their own loci (Fig. 2a). In plants, primary miRNA transcripts (pri-miRNA) are produced by RNA polymerase II (POL-II) and are capped and polyadenylated (Aukerman & Sakai, 2003; Xie et al., 2005a). A TATA regulatory binding motif was found in the upstream region of at least some Arabidopsis pri-miRNAs (Kurihara & Watanabe, 2004; Xie et al., 2005a). Since pri-miRNAs are polyadenylated, some of them can be found in Expressed Sequence Tag (EST) databases (Jones-Rhoades & Bartel, 2004). The mature miRNA sequence can be found on either the 5′ or the 3′ arm of the precursor sequence (Reinhart et al., 2002; see, for example, ath-miR156a and ath-miR319a in Fig. 3). One mature miRNA is encoded by one or more miRNA genes, and sequences that differ by only a few nucleotides are usually grouped together to form families (Reinhart et al., 2002; Griffiths-Jones et al., 2006). Transgenic experiments have shown that it is possible to replace the miRNA:miRNA* duplex by an artificial hairpin structure without altering miRNA processing, thus showing that structure is more important than the sequence itself in this process (Parizotto et al., 2004; Vaucheret et al., 2004). Statistical analyses have shown that pre-miRNA secondary structures tend to have free energy values that are significantly lower than those of random sequences, contrary to structures of other classes of ncRNAs, such as transfer RNAs or ribosomal RNAs (Bonnet et al., 2004b; Clote et al., 2005). This indicates that miRNA precursor sequences have highly stable secondary structures, a property likely necessary to avoid premature degradation and to allow correct processing by Dicer enzymes.
In Arabidopsis, miRNA biogenesis in the nucleus is performed in several steps and requires both DCL1 and HYL1 (Fig. 2a; Papp et al., 2003; Kurihara & Watanabe, 2004; Kurihara et al., 2006). Another protein, HEN1, is required for miRNA biogenesis. This enzyme has two dsRNA-binding domains and a nuclear localization signal. It is conserved in fungi and is required for miRNA and siRNA processing (Park et al., 2002; Boutet et al., 2003; Xie et al., 2003). It was shown recently that HEN1 is responsible for the 3′ end methylation of Arabidopsis miRNA:miRNA* duplexes and that this modification could be essential for their biogenesis and function in the RNA silencing pathway. All known classes of endogenous small RNAs in Arabidopsis require HEN1 (Yu et al., 2005).
HYPONASTIC LEAVES 1 (HYL1) is required to process miRNAs and has a nuclear localization signal (Lu & Fedoroff, 2000; Han et al., 2004; Vazquez et al., 2004a). It was shown that HYL1 interacts with DCL1 in vitro (Hiraguri et al., 2005). HASTY (HST, Bollman et al., 2003) may be involved in export of miRNA to the cytoplasm, although convincing evidence for it is lacking, but HST-independent nucleocytoplasmic pathways do exist since miRNA export is not totally blocked in hst mutants (Park et al., 2005).
Arabidopsis miRNA genes were first identified using cloning experiments, in which small RNAs (size between 16 nt and 30 nt) are isolated from whole plants and then cloned and sequenced (Llave et al., 2002a; Park et al., 2002; Reinhart et al., 2002; Sunkar & Zhu, 2004; Gustafson et al., 2005). As a consequence, different types of small RNAs such as siRNAs, miRNAs and RNA degradation products are selected with this procedure. In order to select for miRNAs, the secondary structure of the miRNA precursor sequence is checked for compliance with known miRNA features (Ambros et al., 2003). To obtain the potential precursor sequence, small RNAs are mapped back onto genomic sequences and the flanking regions are extracted. The secondary structure is then predicted using dedicated software tools, such as the mfold or RNAfold packages (Zuker & Stiegler, 1981; Hofacker et al., 1994). Using cloning experiments, miRNA sequences were identified in Arabidopsis (Reinhart et al., 2002; Sunkar & Zhu, 2004), rice (Sunkar et al., 2005), poplar (Lu et al., 2005b), moss (Arazi et al., 2005) and tobacco (Billoud et al., 2005). However, this kind of approach implies long and tedious bench work, and the detection is restricted to the most abundant molecules. MicroRNAs expressed under special conditions (stress, etc.) or at specific points in time will remain undetected.
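The mapping step described above can be illustrated with a minimal Python sketch. It is not the procedure used in any of the cited studies: it assumes exact matching only, a genome held in memory as a plain string, and hypothetical function names (`find_exact_hits`, `extract_window`) and flank sizes chosen for illustration. The extracted window would then be passed to a folding program such as mfold or RNAfold.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_exact_hits(genome, small_rna):
    """Return (start, strand) for every exact match of a cloned small RNA.

    The small RNA is converted to DNA letters (U -> T) and searched on
    both strands; 'start' is always a 0-based plus-strand coordinate.
    """
    genome = genome.upper()
    query = small_rna.upper().replace("U", "T")
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        pos = genome.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = genome.find(pattern, pos + 1)  # allow overlapping matches
    return hits


def extract_window(genome, start, length, flank):
    """Return the plus-strand sequence of a hit plus 'flank' nt on each side.

    The window is clipped at the sequence ends. For a minus-strand hit the
    caller would reverse-complement the result before folding it.
    """
    left = max(0, start - flank)
    right = min(len(genome), start + length + flank)
    return genome[left:right]
```

In practice several flank sizes would probably be tried for each hit, since plant precursors vary widely in length and the mature sequence can lie on either arm of the hairpin.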
Complementary to cloning experiments are computational approaches for the prediction of miRNAs. Here, the main problem is in discriminating between real miRNAs and so-called false positives. Many algorithms start by predicting all possible hairpin structures for a given genomic sequence (Lim et al., 2003). However, the number of such structures is usually very high, with many false positives, for example, because of the repeats present in most genomic sequences. Therefore, most approaches add to this first step several filters based on the properties of experimentally documented miRNAs. One of the properties used by most algorithms is evolutionary conservation. It has indeed been shown that many miRNAs are conserved in different organisms (Bartel, 2004). In plants, the conservation is, in most cases, limited to the mature miRNA (c. 21 nt) while the rest of the precursor sequence is far less well conserved (Fig. 3). For example, even between distantly related plants such as Arabidopsis and rice, which diverged more than 130 million years ago (Friis et al., 2004), the sequence of miR162 is completely conserved (Fig. 3; Reinhart et al., 2002). Parameters describing the secondary structure such as free energy, the number of paired residues within the miRNA, or the number and the size of bulges are used to select valid structures, with cut-off values based on experimentally proven miRNAs. Compositional characteristics such as GC content or the low-complexity content of the miRNA sequences can also help to get rid of irrelevant repeat sequences (Bonnet et al., 2004a; Jones-Rhoades & Bartel, 2004; Wang et al., 2004a; Adai et al., 2005).
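As an illustration of how such filters might be combined, the sketch below checks two of the properties mentioned above: GC content of the candidate and the number of paired residues within the putative mature miRNA. It assumes that the secondary structure is supplied in dot-bracket notation (the notation produced by RNAfold). The function names and all cut-off values are hypothetical placeholders, not the thresholds used by any published pipeline.

```python
def gc_content(seq):
    """Fraction of G and C residues in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def paired_in_region(dot_bracket, start, end):
    """Count paired positions, '(' or ')', in dot_bracket[start:end]."""
    return sum(1 for symbol in dot_bracket[start:end] if symbol in "()")


def passes_filters(seq, dot_bracket, mir_start, mir_len,
                   min_paired=16, gc_range=(0.3, 0.7)):
    """Apply two simple filters to a candidate precursor.

    min_paired and gc_range are illustrative assumptions; real cut-offs
    would be derived from experimentally documented miRNAs.
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    paired = paired_in_region(dot_bracket, mir_start, mir_start + mir_len)
    low, high = gc_range
    return paired >= min_paired and low <= gc_content(seq) <= high
```

Further criteria named in the text (free energy, bulge number and size, conservation in a second genome) would be added as extra conditions in the same way.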
Initially, target prediction for plant miRNAs was quite straightforward because it was assumed that most of them match their targets with near-perfect complementarity (Rhoades et al., 2002). In that case, transcripts are searched for a sequence pattern complementary to a given miRNA sequence, allowing a few mismatches (usually two or three). For example, Jones-Rhoades & Bartel (2004) used a scoring system taking into account mismatches, gaps and G:U base pairs. However, more recent experimental work on miRNA targets in Arabidopsis showed that some miRNAs match their targets with even less complementarity and that the binding pattern is not random (Mallory et al., 2004b; Allen et al., 2005; Schwab et al., 2005). As in animals, there are fewer mismatches in the 5′ part of the miRNA in the miRNA:mRNA duplex, while the free energy of the duplex is constrained by a maximum value. Many miRNA computational prediction tools integrate target detection to support the result of miRNA prediction. In some cases, the target prediction is restricted to one genome (Bonnet et al., 2004a; Wang et al., 2004a; Adai et al., 2005). In other cases, the prediction is further constrained through conservation of targets in several genomes (Jones-Rhoades & Bartel, 2004). So far, computational pipelines have been applied successfully for the discovery of new conserved miRNAs using the complete genomes of A. thaliana and O. sativa (Bonnet et al., 2004a; Jones-Rhoades et al., 2004; Wang et al., 2004a; Adai et al., 2005).
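A complementarity search of this kind can be sketched in a few lines of Python. The example below slides a window along a transcript and scores the ungapped miRNA:mRNA duplex, penalizing mismatches and, more lightly, G:U pairs. The penalty values, the threshold and the function names are assumptions made for illustration; gaps, positional weighting and duplex free energy are deliberately left out, so this is not the scoring scheme of any of the cited tools.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def to_rna(seq):
    """Upper-case a sequence and convert DNA letters to RNA (T -> U)."""
    return seq.upper().replace("T", "U")


def duplex_penalty(mirna, site, mismatch=1.0, wobble=0.5):
    """Penalty for pairing a miRNA (5'->3') with a target site (5'->3').

    The duplex is antiparallel, so miRNA position i faces the site read
    backwards. Penalty values are illustrative assumptions.
    """
    mirna, site = to_rna(mirna), to_rna(site)
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    penalty = 0.0
    for pair in zip(mirna, reversed(site)):
        if pair in WATSON_CRICK:
            continue
        penalty += wobble if pair in WOBBLE else mismatch
    return penalty


def find_targets(mirna, transcript, max_penalty=3.0):
    """Return (start, penalty) for every window scoring at most max_penalty."""
    transcript = to_rna(transcript)
    n = len(mirna)
    hits = []
    for start in range(len(transcript) - n + 1):
        penalty = duplex_penalty(mirna, transcript[start:start + n])
        if penalty <= max_penalty:
            hits.append((start, penalty))
    return hits
```

Requiring stricter pairing in the 5′ part of the miRNA, as suggested by the experimental work cited above, could be approximated by giving mismatches at those positions a higher penalty.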
Many of the predicted targets have been verified experimentally. A modified version of 5′ RACE is usually applied to look for the degradation product resulting from cleavage of the targeted mRNA (Llave et al., 2002b; Jones-Rhoades et al., 2004; Lu et al., 2005b). Some studies remapped known miRNAs on newly available plant genomic sequences such as Sorghum bicolor (Bedell et al., 2005). The availability of new plant genomes, like the poplar (http://genome.jgi-psf.org/Poptr1/Poptr1.home.html; Tuskan et al., 2004), should provide new material to which new computational searches for conserved miRNAs can be applied.
Most of the experimentally documented miRNA sequences from plant, animal and viral genomes are deposited in the miRNA registry (Griffiths-Jones et al., 2006). Release 7.1 (October 2005) contains 731 plant miRNA genes, representing eight plant genomes (Table 1). The largest number of miRNA families is found in Arabidopsis and rice, with 46 families encoded, respectively, by 117 and 178 miRNA genes. Next is poplar with 33 miRNA families encoded by 213 miRNA genes. Those three genomes are the only complete plant genomes currently available, so it is no surprise that they have the largest collection of miRNA genes. A typical feature of plant miRNAs is that they can be divided into two broad categories: miRNAs that are conserved in different plant genomes and miRNAs that are specific to one organism (Reinhart et al., 2002; Jones-Rhoades & Bartel, 2004; Sunkar & Zhu, 2004; Sunkar et al., 2005). The existence of organism-specific miRNAs seems to be specific to plants, although a recent study suggested the existence of a pool of primate-specific miRNAs (Bentwich et al., 2005).
Table 1. Columns: miRNA families; miRNA genes.
Most plant miRNAs play a role in developmental processes, and a majority of their targets are transcription factors (for a review see Jones-Rhoades et al., 2006). It now appears that at least some miRNAs are involved in other processes as well, such as response to environmental conditions (Jones-Rhoades & Bartel, 2004; Fujii et al., 2005; Lu et al., 2005b; Sunkar et al., 2005; Chiou et al., 2005). Plant miRNAs seem to be much more specific than their animal counterparts (Schwab et al., 2005). It is estimated that in humans, miRNAs could regulate up to one-third of the protein-coding genes (Lim et al., 2005). Contrary to their animal counterparts, plant miRNA targets are more often located in coding sequences and only occasionally in UTRs (Bartel, 2004). There are several ways to decipher the biological role of a given miRNA. The simplest way consists of finding the target gene(s) of the miRNA, experimentally and/or through in silico search. More sophisticated approaches involve the use of mutants, knock-outs or overexpression of the miRNA precursor genes.
Experimental validation of predicted targets was facilitated by the fact that most plant miRNA targets are regulated by cleavage, which allows degradation products to be detected with dedicated experiments (modified 5′ RACE, see Llave et al., 2002b; Jones-Rhoades & Bartel, 2004; Lu et al., 2005b). Several groups went a step further and built artificial constructs in order to overexpress a given miRNA or to make predicted targets resistant to miRNA matching by the introduction of mutations. Here we try to summarize those results and to group them according to the biological process in which the miRNA is involved (see also Table 2).
|Function||miRNA families||Target gene(s)||miRNA conservation||References|
|Floral timing and leaf development||156||SPL transcription factors||A G O P Sb So||26|
|Floral development||171||SCL-like transcription factors||A O P S||24, 25|
|Floral development and vegetative phase change||172||AP2-like transcription factors (AP2, TOE1, TOE2, GL15)||A G O P Sb Z||1, 6, 20, 26|
|Expression of auxin response genes, developmental defects||160||160: Auxin response factors (ARF10, ARF16, ARF17)||160: A G M O P Sb Z||8, 19, 21, 23, 26, 27|
| ||164||164: NAC1||164: A O P Sb Z|| |
| ||167||167: ARF6, ARF8||167: A G O P So Sb Z|| |
| ||393||393: TIR1||393: A M O P Sb Z|| |
| ||390||390: ARF3, ARF4 via TAS3 (ta-siRNA)||390: A O P|| |
|Organ separation and number||164||CUC transcription factors||A O P Sb Z||11, 12, 18|
|Organ polarity, vascular and meristem development||165/166||HD-ZIP III transcription factors (PHB, PHV, leaf 1)||A G M O P Sb Z||2, 7, 9, 10, 13, 14, 16, 17|
|Floral and leaf patterning||159/319||319: TCP transcription factors||A G M O P S S Z||3, 5, 22|
| || ||159: MYB transcription factors|| || |
|Regulation of the miRNA pathway||162||162: DCL1||162: A M O P Z||4, 15, 27|
| ||168||168: AGO1||168: A G O P So Sb Z|| |
| ||403||403: AGO2||403: A P|| |
|Sulfate assimilation||395||ATP sulfurylases (APS1, APS3, APS4)||A O P S Z||8|
|Lignin formation?||397||Laccases||397: A O P||8, 25, 32|
|Oxidative stress||398||Copper superoxide dismutases CSD1, CSD2||A G O P||8|
|Phosphate homeostasis||399||Phosphate transporter, E2 ubiquitin-conjugating enzyme||A M O P Sb Z||8, 36, 37|
|Unknown||161||161: PPR||161: A||8, 27, 28, 29, 30, 31, 33, 34, 35|
| ||163||163: SAM-dependent methyl transferase||163: A|| |
| ||173||173: PPR via TAS1 & TAS2 (ta-siRNA)||173: A|| |
| ||390||390: Receptor-like kinase||390: A O P|| |
| ||394||394: F-box||394: A O P Sb Z|| |
| ||396||396: Growth response factors (GRF1, GRF2, GRF3, GRF7, GRF8, GRF9), rhodanese||396: A G O P So Sb Z|| |
Leaf, floral and shoot development
Overexpression of miR172 in Arabidopsis was demonstrated to cause early flowering and defects in floral identity such as absence of petals and transformation of sepals into carpels (Aukerman & Sakai, 2003). The predicted targets of miR172 are members of the APETALA2 family of transcription factors, including AP2 itself, but also TARGET OF EAT (TOE1 and TOE2) or GLOSSY15 (GL15). Loss-of-function analyses for these genes indicate that they normally act as floral repressors. The downregulation of AP2-like genes by miR172 during the early stages of development relieves floral repression and promotes flowering. Surprisingly, in Arabidopsis miR172 appears to downregulate its targets through a mechanism of translational repression rather than cleavage. Products corresponding to the cleavage of AP2-like targets by miR172 were indeed found by different groups, but they may represent only a small fraction of the total AP2 transcript population (Aukerman & Sakai, 2003; Kasschau et al., 2003; Chen, 2004; Schwab et al., 2005). Analysis of transgenic maize lines overexpressing GL15 showed that this gene controls the transition from juvenile to adult leaves (Lauter et al., 2005). This transition is the result of opposite effects of GL15 and miR172, the latter promoting the transition to the adult phase by downregulating GL15. Data suggest that this could be a general mechanism for the regulation of vegetative phase change in higher plants (Lauter et al., 2005).
The miRNA miR171 is perfectly complementary to three members of the SCARECROW-like family of transcription factors (SCL6-II, SCL6-III and SCL6-IV). This gene family controls a wide range of developmental processes, including radial patterning in roots and hormone signaling. The cleavage of SCL6-III and SCL6-IV by miR171 was shown by 5′ RACE experiments. The fact that SCL6-III and SCL6-IV are predominantly found in inflorescence tissues, just like miR171, might suggest a role for this miRNA in flowering processes but this hypothesis needs to be confirmed by specific experiments (Llave et al., 2002b).
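What 'perfectly complementary' means for a miRNA and its target site can be illustrated with a minimal Python sketch. The 21-nt sequence below is an arbitrary placeholder, not the real miR171 or SCL6 sequence, and the function names are hypothetical. Only Watson-Crick pairs are scored, so G:U wobble pairs are counted as mismatches, which is a simplifying assumption.

```python
# Watson-Crick partners for RNA bases (G:U wobble deliberately ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count positions at which target_site (5'->3') fails to pair with mirna (5'->3')."""
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    perfect_site = reverse_complement(mirna)
    return sum(1 for a, b in zip(perfect_site, target_site.upper()) if a != b)


# Placeholder 21-nt sequence, invented for illustration only.
TOY_MIRNA = "AUGGCUAGCUAGGAUCCGAUA"

# A perfectly complementary site is simply the reverse complement of the miRNA.
PERFECT_SITE = reverse_complement(TOY_MIRNA)

# Introduce one substitution in the middle of the site.
middle = 10
new_base = "A" if PERFECT_SITE[middle] != "A" else "C"
MUTATED_SITE = PERFECT_SITE[:middle] + new_base + PERFECT_SITE[middle + 1:]

print(count_mismatches(TOY_MIRNA, PERFECT_SITE))  # perfect complementarity
print(count_mismatches(TOY_MIRNA, MUTATED_SITE))  # one mismatch
```

In this toy setting a 'miRNA-resistant' version of a target, as used in several of the experiments described in this section, corresponds to a site carrying enough substitutions to disrupt pairing.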
The SPL genes encode a class of plant-specific transcription factors (SQUAMOSA PROMOTER BINDING PROTEIN LIKE) that were predicted to be the targets of miR156 (Rhoades et al., 2002). The overexpression of this miRNA has been shown to cause a moderate delay in flowering and a faster initiation of rosette leaves compared with wild-type. A severe decrease of apical dominance is also observed and the first flowers tend to arise from side shoots. The combination of those traits leads to a phenotype with a substantial increase (up to 10 times) in total leaf number on the side of the shoots (Schwab et al., 2005).
The CUP-SHAPED COTYLEDON (CUC) transcription factor genes are predicted to be the targets of the miR164 family. Expression of miR164-resistant versions of CUC1 caused alterations in Arabidopsis embryonic, vegetative and floral development, affecting cotyledon orientation, rosette leaf shape, and petal and sepal number (Mallory et al., 2004a). Overexpression of miR164 reproduced the phenotype of cuc1 cuc2 double mutants by downregulating the levels of CUC1 and CUC2, but not CUC3, mRNAs (Laufs et al., 2004; Mallory et al., 2004a). Disruption of the regulation of CUC2 by miR164 caused enlarged sepal boundary domains, indicating that miR164 constrains the expansion of the boundary domains by degrading CUC1 and CUC2 mRNAs (Laufs et al., 2004). Analysis of the early extra petals 1 (eep1) mutant, whose locus was found to encode an additional member of the miR164 family (miR164c), revealed that this miRNA controls petal number by regulating CUC1 and CUC2 transcript accumulation (Baker et al., 2005).
MiR319 (also known as miR-JAW) was identified through a genetic screen and guides the cleavage of several TCP transcription factor genes controlling leaf development (Palatnik et al., 2003). Mutants at the miR319 locus exhibited crinkled leaves and overexpressed miR319. Constitutive expression of TCP2 or TCP4 partly rescued the miR319 mutant, with leaves less affected but still different from the wild type. MiR159 is a close homolog of miR319 that differs by only three residues and guides the cleavage of two transcripts encoding MYB transcription factors (MYB33 and MYB65). Arabidopsis plants transformed with cleavage-resistant MYB33 exhibit pleiotropic developmental defects (Palatnik et al., 2003; Millar & Gubler, 2005). A feedback regulation of miR159 levels by MYB genes was revealed in Arabidopsis, as part of the gibberellin–DELLA pathway controlling flower development (Achard et al., 2004).
Leaf polarity, vascular and meristem development
Members of the class III HD-ZIP transcription factor gene family, such as PHABULOSA (PHB) and PHAVOLUTA (PHV), govern vascular patterning and leaf polarity. These genes carry sites complementary to miR165/166. Regulation by those miRNAs was indeed shown to be necessary for proper organ axis specification, vascular development and meristem function (Emery et al., 2003; Mallory et al., 2004b; McHale & Koning, 2004; Zhong & Ye, 2004; Kim et al., 2005; Williams et al., 2005b). In maize, it was demonstrated that miR166 is a conserved polarizing signal whose expression pattern spatially defines the expression of the HD-ZIP III member ROLLED LEAF 1, determining the adaxial (upper) and abaxial (lower) asymmetry of the leaf (Juarez et al., 2004). Moreover, the cleavage of HD-ZIP III genes by miR165/166 was found to be extremely well conserved amongst land plants, including ferns and mosses (Floyd & Bowman, 2004).
Auxin response
Auxin is a phytohormone implicated in virtually every aspect of plant growth and development. Most of its effects are mediated by auxin response factor (ARF) genes. In Arabidopsis, this family consists of 23 genes. Among those, three were predicted to be targets of miR160, namely ARF10, ARF16 and ARF17 (Rhoades et al., 2002). It was shown that plants expressing ARF17 genes resistant to miR160 cleavage have increased levels of ARF17 transcripts and altered levels of GH3-like mRNAs. Those plants also exhibited dramatic pleiotropic developmental defects, such as leaf shape defects, premature inflorescence development and reduced petal size. Such phenotypes were also observed in plants expressing suppressors of RNA silencing or in plants with mutations affecting the miRNA pathway (Mallory et al., 2005). Mutant plants with miR160-resistant ARF16 genes showed that miR160 and auxin independently regulate the activity of those genes, which are responsible for root cap development (Wang et al., 2005). The transcription factor NAC1 is known to transduce auxin signals for lateral root emergence while also being a target of miR164. When measuring the levels of miR164 after auxin treatment, Guo et al. (2005) detected a slight but consistent increase in the miRNA levels (c. 1.5-fold) some hours after treatment. This suggests that miR164 levels are regulated by auxin, and also that the induction of miR164 by auxin may create a homeostatic mechanism that mediates clearance of NAC1 mRNA after its initial induction by auxin. Overexpression of miR164 reduces lateral root formation, but overexpression of miR164-resistant NAC1 only slightly increases the number of lateral roots (Guo et al., 2005). Furthermore, other groups reported that NAC1 is a miR164 target but did not report a root phenotype when overexpressing miR164, or even failed to find evidence that miR164 targets NAC1 in vivo (Laufs et al., 2004; Mallory et al., 2004a). The role of miR164 in lateral root formation therefore remains to be clarified.
MiR393 is predicted to target several F-box transcripts involved in the ubiquitination pathway (Bonnet et al., 2004a; Jones-Rhoades & Bartel, 2004). Among those targets, cleavage products were found in Arabidopsis for TRANSPORT INHIBITOR RESPONSE1 (TIR1; Jones-Rhoades & Bartel, 2004). This gene plays a central role in the auxin response pathway. TIR1 binds to AUX/IAA proteins, leading to an increased ubiquitination of the TIR1/AUX/IAA complex. In turn, this process enhances the degradation of this complex and releases ARF proteins from repression, allowing auxin-responsive transcription (for a review see Woodward & Bartel, 2005). It was shown recently that TIR1 is an auxin receptor that mediates AUX/IAA degradation and auxin-regulated transcription (Dharmasiri et al., 2005; Kepinski & Leyser, 2005). The discovery of an auxin receptor is a true landmark in the search for the mechanism of auxin action, and the fact that this receptor is also a miRNA target highlights the importance of miRNAs as key regulators (Napier, 2005).
Regulation of the miRNA pathway
Elevated levels of DCL1 mRNAs were found in dcl1 mutants, miRNA-defective hen1 mutants, and in plants expressing a virus-encoded suppressor of RNA silencing (P1/HC-PRO). Cleavage products corresponding to the activity of miR162 on DCL1 transcripts were found, revealing a negative feedback regulation of this enzyme by a miRNA (Xie et al., 2003). Transgenic plants expressing a mutated (but functional) AGO1 mRNA with impaired complementarity to miR168 accumulated AGO1 transcripts and showed developmental defects similar to those encountered in plants carrying mutations in crucial miRNA pathway genes (dcl1, hen1 or hyl1). Those defects could be rescued by the introduction of an artificial miRNA complementary to the mutated AGO1 mRNA. These results demonstrate the existence of another feedback regulatory loop in the miRNA pathway (Vaucheret et al., 2004).
Environmental and stress-related responses
Despite an overwhelming propensity to target transcription factors, as previously mentioned, plant miRNAs are also predicted to match several other classes of targets. Some are linked to environmental changes or stress responses. For example, several ATP sulfurylase mRNAs (APS1, APS3 and APS4) have a complementary site for miR395 (Bonnet et al., 2004a; Jones-Rhoades & Bartel, 2004). ATP sulfurylases catalyse the first step of inorganic sulfate assimilation. Degradation products corresponding to the cleavage of APS4 by miR395 were detected in Arabidopsis. It has been shown that miR395 is expressed upon sulfate starvation and that its level is inversely correlated with APS1 expression (Jones-Rhoades & Bartel, 2004). A very similar observation has recently been made regarding phosphate homeostasis: miR399 is induced by phosphate starvation and in turn downregulates its target transcript, which encodes a ubiquitin-conjugating E2 enzyme, through interaction with the 5′UTR (Chiou et al., 2005; Fujii et al., 2005). Accumulation of the E2 transcripts was suppressed in transgenic plants overexpressing miR399. These transgenic plants accumulate inorganic phosphate and exhibit phosphate toxicity symptoms that phenocopy a loss-of-function E2 mutant. This provides evidence that miR399 controls phosphate homeostasis by regulating a component of the proteolysis machinery in plants.
In poplar, a recent study revealed that the levels of many miRNAs cloned from woody tissues were either upregulated or downregulated in stem tissues subjected to mechanical stress. This strongly suggests a role of miRNAs in tree defense systems against mechanical stress (Lu et al., 2005b). The miRNA miR397 is predicted to target laccases, a widespread family of enzymes conserved in bacteria, insects, plants and fungi (Bonnet et al., 2004a; Jones-Rhoades & Bartel, 2004). Cleavage products for laccases were also found in Arabidopsis (Jones-Rhoades & Bartel, 2004). A homolog of miR397 was found in the recently released poplar genome (http://genome.jgi-psf.org/Poptr1/Poptr1.home.html) and is predicted to target 21 laccase homologs in this genome (E. Bonnet et al., unpublished). This could be an interesting finding, as laccases are suspected to be involved in lignin (wood) formation (Mayer & Staples, 2002; Ranocha et al., 2002).
Those results highlight what seems to be a typical feature of the plant miRNAs described so far: a high degree of specificity, contrary to what is observed for animal miRNAs. A plant miRNA will typically regulate one or several members of a given protein family, usually closely related. This specificity was confirmed by expression experiments in which the effect of overexpressing a given plant miRNA was quantified using microarrays. Such experiments for five different miRNAs found only a limited number of mRNAs differentially expressed when compared with controls in which miRNAs were not overexpressed (Schwab et al., 2005). Similar experiments in animals typically found a few hundred mRNAs differentially expressed per miRNA (Lim et al., 2005). However, there is not always a one-to-one relationship between a miRNA family and a target family. For example, the miRNAs miR319 and miR159 are grouped into one miRNA family, as they differ by only a few nucleotides. One regulates members of the TCP transcription factor family (miR319) and the other MYB transcription factors (miR159) (Palatnik et al., 2003). Another interesting aspect of miRNA regulation might be the involvement of several miRNAs in the same major regulatory network. For example, several miRNAs (and also ta-siRNAs: see next section) seem to play a key role in the auxin-signaling pathway (miR160, miR164, miR167 and miR393) and target different genes in this pathway (ARFs, TIR1 and NAC1; see also Table 2). This points to a combinatorial role for miRNAs, in conjunction with other factors.
In a screen for mutants impaired in the juvenile-to-adult phase transition, Peragine et al. (2004) identified several genes that were upregulated in sgs3, rdr6 and ago7 mutants, including some AUXIN RESPONSE FACTORs (ARF3 and ARF4). Among those they identified one locus that was silenced post-transcriptionally in trans by endogenous siRNAs derived from a nonprotein-coding transcript. They also found that the process was SGS3-, RDR6- and DCL1-dependent, suggesting a relationship with the miRNA pathway.
An independent study by Vazquez et al. (2004b) was performed to identify the molecular basis of the rdr6 and sgs3 mutant phenotypes. A nonprotein-coding RNA transcript (now called TAS1a) was identified that accumulated in rdr6 mutants. Small interfering RNAs were also identified that did not accumulate in ago1, dcl1, hen1, hyl1, rdr6 and sgs3 mutants. Vazquez et al. (2004b) showed that those siRNAs were processed from the TAS1a locus and that they guide the cleavage of several endogenous mRNAs.
Those two independent studies thus clearly established that siRNAs generated from noncoding transcripts were able to silence target mRNAs that have little overall resemblance to the gene from which they originate, demonstrating the existence of a third RNA silencing pathway, in addition to miRNAs and siRNAs, and providing yet another dimension to post-transcriptional mRNA regulation in plants (for a review see Vaucheret, 2005).
Later, it was shown that the cleavage of the noncoding transcript by a miRNA is a necessary step before the production of 21-nt siRNAs from the cleavage fragments (Allen et al., 2005; Gasciolli et al., 2005; Yoshikawa et al., 2005). There is also now evidence that DCL4 is the protein that processes double-stranded noncoding TAS transcripts into 21-nt-long ta-siRNAs (Gasciolli et al., 2005; Xie et al., 2005b; Yoshikawa et al., 2005).
A model for the processing of ta-siRNAs was proposed by some groups (Xie et al., 2005b; Yoshikawa et al., 2005): the miRNA cleaves a capped and polyadenylated transcript. Cleavage fragments (either 5′ or 3′) are bound by SGS3 or by proteins associated with SGS3, thus protecting them from degradation by enzymes acting on ssRNA. RDR6 then converts the fragment into double-stranded RNA that is cleaved into 21-nt siRNAs by DCL4 (Fig. 2c).
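The processing model above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the transcript is an invented sequence, the function names are hypothetical, only the 3′ cleavage fragment is followed (the model allows either fragment), and dicing is assumed to proceed in consecutive 21-nt steps starting at the miRNA cleavage site. The RDR6-synthesized second strand, the duplex overhangs and strand selection are all ignored.

```python
from typing import List


def dice_in_phase(fragment: str, size: int = 21) -> List[str]:
    """Cut a fragment into consecutive, non-overlapping windows of `size` nt.

    Windows start at the 5' end of the fragment; a trailing piece shorter
    than `size` is discarded.
    """
    return [fragment[i:i + size] for i in range(0, len(fragment) - size + 1, size)]


def ta_sirnas_from_transcript(transcript: str, cleavage_index: int, size: int = 21) -> List[str]:
    """Toy version of the model: miRNA-guided cleavage, then phased dicing.

    `cleavage_index` is the 0-based position of the first nucleotide of the
    3' cleavage fragment (an assumption made for this sketch).
    """
    if not 0 < cleavage_index < len(transcript):
        raise ValueError("cleavage_index must fall inside the transcript")
    three_prime_fragment = transcript[cleavage_index:]  # fragment assumed to be stabilised by SGS3
    return dice_in_phase(three_prime_fragment, size)    # stands in for RDR6 copying + DCL4 dicing


# Invented 58-nt transcript: 10 nt upstream of the cleavage site, 48 nt downstream.
TOY_TRANSCRIPT = "G" * 10 + "ACGU" * 12

sirnas = ta_sirnas_from_transcript(TOY_TRANSCRIPT, cleavage_index=10)
for number, sirna in enumerate(sirnas, start=1):
    print(number, sirna, len(sirna))
```

The sketch makes one consequence of the model explicit: because the miRNA cleavage site sets the starting point, the identity of each 21-nt product depends on where the transcript is cut.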
The miRNA miR173 was found in wild-type Arabidopsis, but no experimental validation was done for its predicted target, a protein of unknown function (Park et al., 2002). miR390 was identified both by cloning and computational approaches (Bonnet et al., 2004a; Sunkar & Zhu, 2004; Adai et al., 2005) but predicted targets failed to be validated by 5′ RACE experiments (Axtell & Bartel, 2005). More recently, using more complex target prediction algorithms and/or experimental approaches, the targets of miR173 and miR390 were identified as three noncoding transcript families encoding ta-siRNAs, designated TAS1, TAS2 and TAS3 (Peragine et al., 2004; Vazquez et al., 2004a; Allen et al., 2005; Yoshikawa et al., 2005).
The TAS1 family is composed of three genes encoding a set of closely related ta-siRNAs that target four mRNAs of unknown function (Peragine et al., 2004; Vazquez et al., 2004a; Allen et al., 2005). TAS2-derived ta-siRNAs target mRNAs encoding pentatricopeptide repeat (PPR) proteins (Allen et al., 2005; Yoshikawa et al., 2005). The TAS3 locus specifies two ta-siRNAs that target a set of mRNAs corresponding to AUXIN RESPONSE FACTORs, including ARF3 and ARF4 (Allen et al., 2005; Williams et al., 2005a). This locus is particularly interesting, as Allen et al. (2005) found that miR390 genes, miR390 target sites, ta-siRNAs in TAS3 primary transcripts and TAS3 ta-siRNA target sites in ARF3 and ARF4 are conserved between several monocots and dicots, suggesting that this ta-siRNA pathway is at least c. 150 million years old.
Furthermore, other ARF genes are known to be regulated by miRNAs (Jones-Rhoades et al., 2006; see also the above paragraph on miRNA targets), meaning that up to one third of the known ARF genes are regulated by miRNAs or ta-siRNAs. The association between auxin and small RNA regulation might suggest a need for rapid clearance of auxin effector mRNAs after signaling events (Bartel, 2004).
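The 'up to one third' figure can be checked with a short tally. The sketch below is only a rough count under stated assumptions: the family size (23) and the miR160 and TAS3 targets are those named in the text, whereas the miR167 targets (ARF6 and ARF8) are not listed in this section and are taken from the wider literature. Because the TAS3 targets are given as 'including ARF3 and ARF4', the result is a lower bound.

```python
# Size of the Arabidopsis ARF family as stated in the text.
TOTAL_ARF_GENES = 23

# ARF genes regulated by each small RNA.
SMALL_RNA_TARGETS = {
    "miR160": {"ARF10", "ARF16", "ARF17"},  # named in the text
    "miR167": {"ARF6", "ARF8"},             # assumption: not listed in this section
    "TAS3 ta-siRNAs": {"ARF3", "ARF4"},     # named in the text; possibly incomplete
}

# Union of all targets, so a gene hit by two small RNAs is counted once.
regulated = set().union(*SMALL_RNA_TARGETS.values())
fraction = len(regulated) / TOTAL_ARF_GENES

print(f"{len(regulated)} of {TOTAL_ARF_GENES} ARF genes ({fraction:.0%})")
```

With these inputs the tally is 7 of 23 genes, roughly 30%; each additional ta-siRNA target would bring it closer to one third.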
It is clear that small RNAs hold many key functions in plants, particularly in genome stability, regulation of gene expression and defense. Some authors even compare them to the ‘dark matter’ of the universe because of their relatively recent discovery and their ubiquity. However, despite the impressive amount of knowledge acquired in the few years since their discovery, many aspects of plant small RNA biogenesis and function remain unclear (Baulcombe, 2005; Carrington, 2005). Unlike many animals, plants encode multiple Dicer-like and RDR proteins. It was shown that this diversification contributed to the specialization of small RNA-directed pathways. Nonetheless, the function of several key enzymes in those pathways remains unclear or unknown. For example, in Arabidopsis, seven out of 10 Argonaute family members (Fagard et al., 2000; Carmell et al., 2002) show the characteristics of the proteins identified as part of the miRNA (AGO1) or siRNA (AGO4) pathways (Liu et al., 2004; Song et al., 2004) and thus have the potential to form alternative RISC complexes. In humans, four Argonaute proteins are equally competent to bind small RNAs, but only AGO2 is able to mediate cleavage. In C. elegans, the Argonaute family counts 23 members. Do they participate in some alternative small RNA pathways?
Very little is known about the ancient evolution of miRNAs. Allen et al. (2004) proposed an elegant model where miRNAs arise from inverted duplications of protein-coding genes. Here, the inverted duplications create a perfect hairpin secondary structure that is processed by Dicer enzymes into small RNAs, forming a set of siRNAs that will silence the gene from which they originate. Mutations will then occur in different parts of the hairpin-like structure, progressively transforming the siRNA into a miRNA, with its own locus and the possibility of silencing different targets. Alternative models are likely to exist to give rise to new microRNAs, but they remain to be discovered (Voinnet, 2004).
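The inverted-duplication model can be illustrated with a toy Python sketch. The sequences are invented, the function names are hypothetical, and the model is reduced to its simplest form: a gene segment and its reverse complement, separated by a spacer, form two arms that pair perfectly, and each substitution in one arm removes one pair. Real hairpins, folding energies and Dicer processing are not modelled.

```python
# Watson-Crick partners for DNA bases.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def inverted_duplication(segment: str, spacer: str) -> str:
    """Build a locus made of a segment, a spacer and the segment's inverted copy."""
    return segment + spacer + reverse_complement(segment)


def arm_mismatches(locus: str, arm_length: int) -> int:
    """Count unpaired positions between the two arms of a fold-back locus."""
    left_arm = locus[:arm_length]
    right_arm = locus[-arm_length:]
    return sum(1 for a, b in zip(left_arm, reverse_complement(right_arm)) if a != b)


def point_mutation(sequence: str, index: int, base: str) -> str:
    """Return a copy of the sequence with one position replaced."""
    return sequence[:index] + base + sequence[index + 1:]


# Invented 21-nt gene segment and 9-nt spacer (loop).
GENE_SEGMENT = "ATGGCCATTGTAATGGGCCGC"
SPACER = "TTTTCTTTT"

young_locus = inverted_duplication(GENE_SEGMENT, SPACER)
print(arm_mismatches(young_locus, len(GENE_SEGMENT)))  # perfect hairpin, siRNA-like

# Two substitutions in the left arm mimic later divergence of the locus.
older_locus = point_mutation(young_locus, 3, "T")
older_locus = point_mutation(older_locus, 12, "C")
print(arm_mismatches(older_locus, len(GENE_SEGMENT)))  # imperfect hairpin, miRNA-like
```

The freshly duplicated locus pairs at every position, while the mutated locus carries bulges, which is the direction of change the model proposes for the transition from siRNA-producing to miRNA-producing loci.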
The inventory of miRNA genes in different organisms is far from complete at the moment. One may expect new members to be found in a range of species with specific environmental habits. Cloning and expression experiments together with computational analysis provide a complementary framework for the further identification of miRNAs. In this respect, the availability of newly annotated plant genome sequences will be an important resource for both experimental and in silico approaches. New and faster techniques for the deep sequencing of small RNAs in various organisms, alongside the analysis of their expression on specific arrays, will also help to obtain a complete picture of the small RNome.
The general picture of the autonomous function of miRNAs and of their interplay with other small RNAs within the plant cell is still fuzzy, despite some interesting hypotheses. For example, Bartel & Chen (2004) proposed that miRNAs could act as rheostats of gene expression. In plants, miRNAs could thus, for example, control redundant dose-sensitive genes following polyploidy events. In this way, they could prevent the duplication of transcription factors from causing a hugely amplified response (Kidner & Martienssen, 2005). Another interesting hypothesis is that they could act as integrators of other genetic regulatory circuits rather than as simple on-off switches. As more and more expression data become available, particularly from small RNA microarrays, it might be possible in the longer term to combine and analyse those data in order to obtain an integrated view of the role of the different classes of small RNAs within the different cellular processes.
The question about the universality of the small RNAs and small RNA-driven processes among the eukaryotes and, as far as plants are concerned, among other members of the green lineage, needs further investigation. The same holds true for possible links with the transition of unicellular to multicellular organisms. Which small RNA-driven processes are absent from unicellular and colonial organisms? Did these never exist, or did they get lost?
The future obviously looks bright for biologists interested in plant small RNAs, with plenty of mysteries still to be unravelled.
We thank the anonymous referees for their constructive comments and apologize to those whose work was not included because of space constraints.