Small RNAs – secrets and surprises of the genome




Small RNAs associated with post-transcriptional gene silencing were first discovered in plants in 1999. Although this discovery marked the beginning of small RNA biology in plants, the sequence of the Arabidopsis genome and related genomic resources that were soon to become available to the Arabidopsis community launched the research on small RNAs at a remarkable pace. In 2000, when the genetic blueprint of the first plant species was revealed, the tens of thousands of endogenous small RNA species that we know today remained hidden features of the genome. However, the subsequent 10 years have witnessed an explosion of our knowledge of endogenous small RNAs: their widespread existence, diversity, biogenesis, mode of action and biological functions. As key sequence-specific regulators of gene expression in the nucleus and the cytoplasm, small RNAs influence almost all aspects of plant biology. Because of the extensive conservation of mechanisms concerning the biogenesis and molecular actions of small RNAs, research in the model plant Arabidopsis has contributed vital knowledge to the small RNA field in general. Our knowledge of small RNAs gained primarily from Arabidopsis has also led to the invention of effective gene knock-down technologies that are applicable to diverse plant species, including crop plants. Here, I attempt to recount the developments of the small RNA field in the pre- and post-genomic eras, in celebration of the 10th anniversary of the completion of the first plant genome.

The pre-genomic era of ‘small RNA’ biology

Small RNAs were unknown for the major part of these ‘dark ages’, but a rich body of work starting from the 1980s, primarily conducted by those interested in transgene technologies and plant–virus interactions, established the concept of post-transcriptional gene silencing (PTGS), and culminated in the discovery of small RNAs in 1999. By the mid 1990s it was clear that a mechanism whereby RNA accumulation is suppressed in a sequence-specific manner, by homologous sequences from transgenes or RNA viruses, could not be explained by the paradigms of gene expression at the time. Models to explain the new phenomenon included an RNA intermediate: either an antisense RNA or even small RNAs. Moreover, work with transgenes and viruses also led to the concept of RNA-directed DNA methylation (RdDM), which we now recognize to play an essential role in genome stability. Therefore, the major concepts of RNA-directed RNA degradation and epigenetic modifications were established in the pre-genomic era, although almost none of the protein players in these processes were known at that time.


The desire to manipulate the expression of specific endogenous plant genes led to the discovery of PTGS. In the late 1980s, the phenomenon or technology known as ‘antisense RNA inhibition’, whereby an antisense RNA inhibits the expression/activities of homologous endogenous genes, first shown in animals, was established in plants. By transiently expressing sense and antisense constructs of the chloramphenicol acetyltransferase gene in protoplasts, Ecker and Davis (1986) showed that the antisense construct effectively suppressed the expression of the sense construct. In stably transformed plants, an antisense transgene was able to suppress the expression of a homologous sense transgene (Rothstein et al., 1987). The antisense technology was later found to be effective in the suppression of endogenous genes in stably transformed plants (van der Krol et al., 1988; Smith et al., 1988). Constitutive expression of an antisense chalcone synthase (CHS) gene led to reduced expression of the endogenous CHS gene, and altered floral pigmentation in both Petunia and tobacco (Nicotiana tabacum) (van der Krol et al., 1988). Similarly, expression of an antisense polygalacturonase gene in tomato (Solanum lycopersicum) suppressed the developmentally regulated endogenous gene (Smith et al., 1988). The levels of endogenous CHS and polygalacturonase mRNAs were drastically reduced.

Next, sense transgenes were unexpectedly found to also lead to the suppression of homologous endogenous genes (van der Krol et al., 1990; Napoli et al., 1990; Smith et al., 1990). With the aim of enhancing the colouration of Petunia petals, two groups sought to overexpress flavonoid biosynthesis genes, dihydroflavonol-4-reductase (DFR) or CHS, with either the strong cauliflower mosaic virus 35S promoter or the CHS endogenous promoter (van der Krol et al., 1990; Napoli et al., 1990). At a certain frequency, transgenic plants had reduced floral colouration, which could be attributed to reduced expression of both the endogenous genes and the transgenes (co-suppression). Similarly, constitutive expression of a truncated polygalacturonase gene in tomato resulted in a strong reduction in the levels of the endogenous polygalacturonase gene during fruit ripening (Smith et al., 1990). In the ensuing years, co-suppression was described for many transgene/endogenous gene pairs in a number of plant species (reviewed in Baulcombe, 1996). Moreover, nuclear run-on analyses revealed that this phenomenon occurred at the level of RNA degradation (de Carvalho et al., 1992; Ingelbrecht et al., 1994; Van Blokland et al., 1994; Elmayan and Vaucheret, 1996). The silenced transgenes and endogenous genes were being actively transcribed in the nucleus, but their RNAs failed to accumulate in the cytoplasm. Importantly, it was shown that a single copy 35S::uidA transgene with no sequence homology with any sequence in the tobacco genome underwent PTGS, ruling out models involving DNA–DNA interactions (Elmayan and Vaucheret, 1996).

Virus-induced gene silencing (VIGS)

A rich collection of research on plant RNA viruses was instrumental in revealing the role of PTGS as an anti-viral defense mechanism. Perhaps the earliest observation relating to VIGS as we know it today was cross-protection, whereby inoculation of plant hosts with a mild viral strain protected the plants from subsequent infections by related, more severe viruses. These observations, as well as concepts of parasite-derived resistance in bacteria (Sanford and Johnston, 1985), led to a similar idea of pathogen-derived resistance whereby the introduction of pieces of the viral genome into plants may lead to resistance to homologous viruses. This idea was realized in 1986 when it was shown that the coat protein gene of tobacco mosaic virus (TMV), when introduced into tobacco, resulted in resistance against TMV (Abel et al., 1986). The subsequent years witnessed a flurry of similar transgenic efforts that effectively engineered resistance to many viruses in many plant species. However, the underlying molecular mechanisms were unknown. The first indication that at least some aspects of pathogen-derived resistance were RNA-based was from studies in which a non-translatable form of a viral coat protein gene was able to confer virus resistance when introduced into tobacco (van der Vlugt et al., 1992). Studies with pathogen-derived resistance with tobacco etch virus (TEV) finally revealed a link between viral resistance and PTGS (Lindbo et al., 1993). Transgenic tobacco lines expressing the full-length coat protein (CP) gene from TEV took time to develop anti-viral resistance, such that they were initially susceptible to TEV, but became resistant to TEV later in newly emerged leaves. The onset of viral resistance correlated temporally and spatially with the silencing of the CP gene, which was shown to be post-transcriptional by nuclear run-on experiments.
The researchers predicted that a cytoplasmic RNA regulatory system of the host, which we now recognize as PTGS, was being triggered by the virus to lead to the simultaneous suppression of both viral and transgene RNAs. Subsequent studies with untranslatable versions of viral sequences corroborated the existence of such an RNA- and homology-based cytoplasmic regulatory system (Smith et al., 1994; Swaney et al., 1995). Since then, the concept that viruses initiate, and are targets of, PTGS was further established by findings that non-viral transgenes could silence viruses if the viruses were engineered to contain the transgenes, and that viruses were able to silence endogenous genes if they carried pieces of host genes (Kumagai et al., 1995; English et al., 1996; Angell and Baulcombe, 1997).

The mobile silencing signal

One intriguing aspect of PTGS in plants is its systemic nature. By grafting a non-silenced scion to a silenced rootstock containing the same transgene, it was shown that a silencing signal moved from the rootstock to the scion, leading to systemic silencing (Palauqui et al., 1997). The sequence-specific nature of the signal was demonstrated by combining a non-silenced scion with a rootstock containing a silenced transgene different from that in the scion. The systemic nature of PTGS was also revealed by an independent study where a GFP-expressing tobacco plant was infiltrated with Agrobacteria containing GFP within the T-DNA (Voinnet and Baulcombe, 1997). Systemic leaves that had not been infiltrated and from which no T-DNA could be detected showed silencing of GFP. Systemic silencing of GFP also occurred when GFP DNA was introduced into lower leaves by particle bombardment (Voinnet et al., 1998). By observing the patterns and timing of the initiation and spread of systemic silencing, it was concluded that the mobile signal travelled from cell to cell through the plasmodesmata, and systemically through the phloem. The sequence specificity of the silencing signal indicates that it is a nucleic acid, but its identity remained enigmatic at the time.

PTGS and RNA interference (RNAi)

In 1998, work in Caenorhabditis elegans revealed that injection of double-stranded RNA (dsRNA) into worms caused potent and specific repression of endogenous gene expression, a phenomenon termed RNAi (Fire et al., 1998). RNAi and PTGS were noted to share a number of common features, such as silencing at the level of mRNA degradation, sequence specificity and their systemic nature. Shortly after, dsRNA-containing transgenes were found to be the most effective trigger of PTGS in plants (Waterhouse et al., 1998; Chuang and Meyerowitz, 2000).


A report in 1989 (Matzke et al., 1989) first documented the phenomenon of homology-dependent de novo methylation. DNA methylation and inactivation of a transgene in tobacco occurred upon the introduction of a second T-DNA. Dependence on the second T-DNA for the methylation and inactivation of the first transgene was clearly established, and it was noted that the two T-DNAs shared regions of sequence identity, including identical promoters. A landmark study in 1994 demonstrated sequence-specific DNA methylation directed by RNA (Wassenegger et al., 1994). In this study, replication-competent and -incompetent versions of the potato spindle tuber viroid (PSTVd) cDNA were introduced into the tobacco genome. It was found that the transgene containing the replication-competent but not the replication-incompetent form was methylated. Infection of the transgenic plants containing the replication-incompetent transgene with the viroid induced methylation of the homologous transgene, but not other non-homologous sequences. This demonstrated that the replicating viroid RNAs specifically resulted in the methylation of the homologous DNA. Finally, it was shown that the silencing of a 35S promoter-driven transgene that conferred hygromycin resistance by another 35S promoter-containing transgene was accompanied by DNA methylation of the 35S promoter of the silenced transgene, and occurred at the level of transcription (Park et al., 1996). Thereby, homology-dependent promoter methylation was linked to transcriptional gene silencing (TGS).

Many studies also noted the methylation of coding sequences in PTGS (Ingelbrecht et al., 1994; Sijen et al., 1996; Van Houdt et al., 1997; Jones et al., 1998). The common theme in the TGS- and PTGS-associated methylation phenomena was that the methylation was largely restricted to regions of sequence homology between the trigger and the silenced loci. DNA–DNA and DNA–RNA interactions were proposed to explain the sequence specificity before dsRNA came to be known as the trigger of RNA silencing.

A major advance in the field of RNA silencing was the demonstration that TGS and PTGS are mechanistically similar, in that dsRNA was the most effective trigger in both processes. Two early studies found that when promoter sequences were arranged in inverted repeats to produce dsRNAs in vivo, homologous promoters elsewhere in the genome became methylated and transcriptionally silenced (Mette et al., 2000; Sijen et al., 2001).

Identification of small RNAs in plants

A major breakthrough in our understanding of TGS and PTGS in plants as well as RNAi in animals came in 1999, when Hamilton and Baulcombe (1999) reported small RNAs associated with PTGS. By fractionation of RNAs by gel electrophoresis followed by hybridization with probes corresponding to various transgenes undergoing PTGS by different triggers, they uncovered the presence of small RNAs matching both strands of the transgenes in various plants (tomato, tobacco and Nicotiana benthamiana). The small RNAs were only present in plants undergoing PTGS of the transgenes. Small RNAs corresponding to viral sequences were also detected from plants infected with potato virus X. Consistent with the systemic nature of PTGS, small RNAs were detected in systemic tissues exhibiting silencing, triggered by Agroinfiltration of a single, basal leaf.

The discovery of small RNAs immediately shed light on the mechanism of PTGS or RNAi, which by 1998 was known to be triggered by dsRNA (Fire et al., 1998). The fact that the small RNAs associated with PTGS were of both sense and antisense orientations immediately suggested that they are processed from long dsRNA triggers by an endonuclease. It was also natural to speculate that the small RNAs served as the sequence determinants in guiding target RNA degradation. Indeed, biochemical studies performed with Drosophila in vitro RNAi systems uncovered RNA-guided target mRNA degradation at 21–23-nt intervals (Hammond et al., 2000; Zamore et al., 2000). These and other studies were to eventually establish a framework of RNA silencing, thereby revealing a new paradigm of gene regulation.

Plant development

Although the persistent pursuit by a small number of research groups to unravel the mystery behind homology-dependent gene silencing eventually established the concept of RNA-mediated gene silencing, research conducted by developmental biologists unknowingly laid the foundation for the molecular framework of gene silencing by a class of endogenous small RNAs: microRNAs (miRNAs). Many plant miRNAs target transcription factor mRNAs, and play essential roles in plant development. Consequently, mutations in genes that participate in miRNA biogenesis or mediate miRNA functions are expected to cause pleiotropic developmental defects. A number of genes that we now recognize as playing essential roles in miRNA biogenesis or function were identified in the pre-genomic era through the developmental defects of the corresponding mutants, although the molecular functions of the genes were unknown. Mutants in DCL1, the major miRNA-generating Dicer, were isolated in at least three genetic screens aimed at identifying genes acting in different developmental processes. Severe dcl1 alleles were isolated as embryo-lethal mutants, and were found to be defective in embryo and suspensor development (Schwartz et al., 1994; McElver et al., 2001). Less severe dcl1 alleles were found to have short integuments (Robinson-Beers et al., 1992; Ray et al., 1996), or to enhance other floral mutants to lead to the over-proliferation of the floral meristem (Jacobsen et al., 1999). A mutation in the miRNA biogenesis gene SERRATE (SE) was found to cause abnormal embryogenesis and accelerated phase changes (Clarke et al., 1999). A mutant in another miRNA biogenesis gene, HYPONASTIC LEAVES 1 (HYL1), exhibited leaf morphological defects and altered sensitivity to various plant hormones (Lu and Fedoroff, 2000). The developmental functions of two argonaute genes, AGO1 and AGO10 (also known as ZWILLE or PINHEAD), were also studied.
ago1 mutants displayed severe defects in overall plant architecture, including radialized leaves, abnormal and sterile flowers, and lack of inflorescence branching (Bohmert et al., 1998). The defects of ago10 mutants were more restricted and less severe (Moussian et al., 1998; Lynn et al., 1999). These mutants failed to faithfully maintain the stem cells of the shoot apical meristem, and exhibited mild phenotypes in floral organs and ovules. When first cloned, the plant proteins were found to be similar to a rabbit protein eIF2C, which was so named for its ability to stimulate translation initiation in vitro (Bohmert et al., 1998; Moussian et al., 1998; Zou et al., 1998; Lynn et al., 1999). Since then, eIF2C proteins, now known as argonaute proteins, were found to be broadly conserved among diverse lineages of life, and their in vivo silencing functions were first revealed through genetic analysis in Drosophila and C. elegans (Schmidt et al., 1999; Tabara et al., 1999). Another gene, HUA ENHANCER 1 (HEN1), was also identified in a genetic screen for genes acting in flower development (Chen et al., 2002), and was later shown to be an integral player in various small RNA pathways.

The explosion of the small RNA field in the post-genomic era

Although earlier work conducted primarily with tobacco and N. benthamiana established the phenomena of PTGS and RdDM, the underlying molecular mechanisms remained unknown. The adoption of Arabidopsis as the experimental system to study gene silencing allowed the use of genetics to dissect the underlying molecular framework, as well as the functions of small RNAs. The genome sequence expedited the cloning of genes whose mutants are defective in gene silencing, and together with the sequence-indexed T-DNA knock-out collections, allowed rapid interrogation of gene function through reverse genetics. The discovery of endogenous small RNAs as crucial regulators in various aspects of plant biology further fueled interest, and brought together diverse groups of researchers into the small RNA field. The adaptation of high-throughput sequencing technologies to the discovery of small RNAs revealed a small RNAome of unexpected richness in sequence diversity. The 10 years since the completion of the Arabidopsis genome have been a marvelous journey in our quest to understand and manipulate small RNAs. Looking back at these 10 years, one cannot help being in awe of some of the surprises and wonders along this journey.

Uncovering the genetic requirements of transgene silencing

In the late 1990s, two laboratories successfully engineered PTGS systems in Arabidopsis, and employed them to search for mutants that were impaired in PTGS. One system, also known as L1, was an Arabidopsis line carrying the 35S::uidA transgene, which underwent PTGS such that GUS activity was below the levels of detection (Elmayan et al., 1998). The silencing of the transgene was faithfully initiated at the seedling stage in each generation, and was completed by the adult stage. By ethyl methanesulfonate (EMS) mutagenesis in this line, mutants in several complementation groups that resulted in high GUS activity were recovered and were named sgs (suppressor of gene silencing). SGS2 and SGS3, both of which were required for L1 silencing, were found to encode RNA-DEPENDENT RNA POLYMERASE 6 (RDR6) and a novel protein, respectively (Mourrain et al., 2000). Another gene identified through this screen was AGO1 (Fagard et al., 2000), which we now recognize as the main effector of RNA silencing. HEN1, which was previously identified in a genetic screen for floral patterning genes, and found to be a general factor in miRNA biogenesis (Chen et al., 2002; Park et al., 2002), was also recovered from the screen and shown to be required for the accumulation of transgene small interfering RNAs (siRNAs; Boutet et al., 2003).

Another PTGS system in Arabidopsis exploited the ability of plant viruses to trigger gene silencing. This system, called GxA, combined a 35S::GFP transgene with a potato virus X viral amplicon containing the GFP sequence driven by the 35S promoter (Dalmay et al., 2000a). The viral amplicon resulted in the stable silencing of the 35S::GFP transgene. The GxA line was mutagenized and sde (silencing defective) mutants were isolated. SDE1 and SDE3 were found to encode RDR6 (Dalmay et al., 2000b) and an RNA helicase (Dalmay et al., 2001), respectively. SDE2 was identical to SGS3 recovered from the L1 screen (Hamilton et al., 2002). Because of the heavy DNA methylation associated with both transgenes (Dalmay et al., 2000a), the screen also identified SDE4, which was later found to act in the biogenesis of 24-nt siRNAs that trigger DNA methylation (Hamilton et al., 2002; Herr et al., 2005). SDE5 was found to encode a protein with homology to a human mRNA export factor (Hernandez-Pinzon et al., 2007).

The finding that PTGS in plants requires an RNA-dependent RNA polymerase (RdRP) and an argonaute protein solidified the notion that PTGS in plants is analogous to quelling in fungi and RNAi in C. elegans, as quelling and RNAi also require these types of proteins (Cogoni and Macino, 1999; Tabara et al., 1999; Catalanotto et al., 2000; Smardon et al., 2000). PTGS in plants, quelling in fungi and RNAi in animals are now collectively referred to as RNA silencing. Biochemical studies in animal systems established the major molecular framework of RNA silencing (Figure 1). dsRNAs, either exogenously introduced as in RNAi or generated from single-stranded RNAs by RdRPs, are diced into small RNA duplexes. One strand of the small RNA duplex is loaded into an argonaute protein, which cleaves a complementary mRNA in the middle of the mRNA/small RNA duplex, to initiate the degradation of this mRNA. The RNase-III enzyme Dicer that is responsible for siRNA production from long dsRNAs, first uncovered in Drosophila (Bernstein et al., 2001), was not identified from the aforementioned genetic screens in Arabidopsis, probably because of functional overlap among the four DICER-LIKE (DCL) genes in Arabidopsis.

Figure 1. The molecular framework of RNA silencing.
Double-stranded RNAs (dsRNAs) exogenously introduced, or derived from sense or antisense transcripts by an RNA-dependent RNA polymerase (RdRP), are diced into duplexes of small RNAs by Dicer. One strand of a small RNA duplex is bound by an argonaute protein (AGO), and guides the cleavage of a complementary mRNA in the middle of the small RNA/mRNA complementary region. The mRNA fragments are further degraded by cellular mechanisms of RNA decay. Note that antisense transcript-mediated silencing in plants has not yet been demonstrated to be mediated by small RNAs, although this is highly likely.
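
The dicing and slicing steps of this framework can be illustrated with a deliberately simplified sketch. It is a toy model, not a description of any published tool: the function names (`dice`, `slice_mrna`) are hypothetical, the dsRNA is assumed to be perfectly paired, Dicer cuts are assumed to be phased and non-overlapping, and cleavage is assumed to fall opposite guide positions 10 and 11, which is one concrete reading of "in the middle" of the small RNA/mRNA duplex.

```python
from typing import List, Optional, Tuple

# Toy model of dicing and slicing; all names here are illustrative assumptions.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return rna.translate(_COMPLEMENT)[::-1]


def dice(sense: str, size: int = 21) -> List[str]:
    """Cut the antisense strand of a perfect dsRNA into consecutive guides.

    Simplification: phased, non-overlapping cuts; 3' overhangs are ignored.
    """
    antisense = reverse_complement(sense)
    return [antisense[i:i + size]
            for i in range(0, len(antisense) - size + 1, size)]


def slice_mrna(mrna: str, guide: str) -> Optional[Tuple[str, str]]:
    """Cleave mrna at the site paired with guide; return (5', 3') fragments.

    Assumes perfect complementarity and a cut opposite guide positions
    10 and 11 (counted from the guide 5' end). Returns None if no site exists.
    """
    target = reverse_complement(guide)   # mRNA stretch that pairs with the guide
    start = mrna.find(target)            # first perfectly matching site
    if start == -1:
        return None
    cut = start + len(guide) - 10        # index of the first nt of the 3' fragment
    return mrna[:cut], mrna[cut:]
```

In this sketch, every guide produced by `dice` from a sense transcript finds its site on that same transcript, which mirrors how siRNAs derived from a dsRNA trigger direct cleavage of the homologous mRNA.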

Probing the molecular basis of systemic silencing

Studies with the GFP-expressing N. benthamiana line 16c showed that systemic silencing had at least two mobile components: cell-to-cell movement and long-distance movement of unknown silencing signals (Voinnet et al., 1998). Infiltration of a leaf of the 16c plant with Agrobacteria containing a T-DNA with a GFP gene first led to local silencing of the transgenic GFP in the infiltrated area. Subsequently, systemic leaves showed silencing near veins, and the silencing later spread to cover the entire systemic leaves. The vein-associated appearance of silencing in systemic leaves suggested the movement of a signal through the phloem. The later spread of silencing from veins to the rest of the leaves probably entailed cell-to-cell movement of a silencing signal through the plasmodesmata.

Cell-to-cell spread and perception of silencing.

Two groups devised elegant genetic screens to probe the mechanisms underlying the cell-to-cell aspect of silencing movement. Both groups employed the SUC2 promoter, the activity of which is restricted to phloem companion cells (Imlau et al., 1999), to drive the expression of an inverted repeat construct aimed at the silencing of an endogenous gene. The SUC2:SUL (Dunoyer et al., 2005, 2007) and PSuc2:PSDIR (Smith et al., 2007) constructs were used by the Voinnet and Baulcombe groups to target the endogenous SULFUR and PHYTOENE DESATURASE genes, respectively. Both constructs resulted in chlorosis surrounding the veins caused by the movement of silencing from the veins into surrounding leaf mesophyll cells. The strains were mutagenized, and mutants with reduced leaf chlorosis were recovered. The strains were also crossed with known mutants in various RNA silencing pathways, to evaluate the genetic requirements of the cell-to-cell movement of RNA silencing. It should be noted that, as a consequence of the experimental set-up, mutants may arise in genes required not only for the cell-to-cell movement of RNA silencing per se, but also for intracellular RNA silencing in both the phloem companion cells and in the recipient leaf mesophyll cells.

A series of dcl4 mutants was recovered in which the production of 21-nt siRNAs from the silencing trigger SUC2:SUL was abolished, whereas the levels of 24-nt siRNAs were unaffected; in these mutants, the area of leaf tissue showing silencing was severely reduced (Dunoyer et al., 2005). One reasonable interpretation is that DCL4 serves as the major factor in the biogenesis of 21-nt siRNAs in the phloem companion cells, and that the 21-nt, but not 24-nt, siRNAs are the likely agents that move between cells. Testing the effects of known miRNA biogenesis mutants in the SUC2:SUL line also identified AGO1, HEN1 and DCL1 as required for the degree of silencing around the veins (Dunoyer et al., 2004). AGO1 probably binds the SUL siRNAs to mediate their silencing effects, and HEN1 probably methylates the siRNAs to promote their stability (see the section on miRNAs). DCL1 was thought to promote the DCL4-dependent processing of SUL siRNAs by excising the loop structure in the stem loop RNA precursor.

Both mutagenesis screens also recovered mutations in genes known to act in the heterochromatic siRNA pathway (see below), including NRPD1 and NRPD2 (which encode two subunits of RNA polymerase IV, Pol IV), RDR2, and CLASSY1 (Dunoyer et al., 2004; Smith et al., 2007). Mutations in other known genes in the heterochromatic siRNA pathway, such as DCL3, AGO4, DRD1 and NRPE1, did not result in reduced silencing in one or both of the systems (Dunoyer et al., 2004; Smith et al., 2007). It is thought that Pol IV/RDR2/CLASSY1 genes act in the recipient cells to amplify the silencing signal. How these nuclear silencing factors promote the cell-to-cell movement of silencing is unknown.

Long-distance spread and perception of silencing.

Grafting experiments were performed with various mutants, or with an RDR6 PTGS line, as the scion or the rootstock, to evaluate the genetic requirements underlying the long-distance spread or perception of silencing. Using a PTGS line targeting the N. benthamiana RDR6 gene as either the rootstock or the scion, grafting experiments showed that RDR6 is not required in the rootstock to generate the long-distance signal, but is required in the scion for the perception or the effect of the systemic signal (Schwach et al., 2005). To evaluate the effects of known genes in various silencing pathways on systemic silencing, a more extensive analysis was carried out in an Arabidopsis system in which a GFP transgene in the scion was silenced by another transgene containing a GFP inverted repeat in the rootstock. Intriguingly, mutants in the heterochromatic siRNA pathway genes such as NRPD1, DCL3, RDR2 and AGO4 were found to be deficient or compromised in systemic silencing when used as the scion. This suggests that the nuclear heterochromatic silencing pathway acts in the perception or amplification of silencing in the scion. Unpublished studies combining similar grafting efforts and the detection of small RNAs in systemic tissues by the sensitive high-throughput sequencing technology revealed the 22-, 23-, and 24-nt siRNAs as the systemic signal (M. Melnyk and D. Baulcombe, personal communications).

Understanding the arms race between plants and RNA viruses

As the concept that plant viruses are both the trigger and the target of RNA silencing was developing, another exciting revelation was that some viral proteins served as suppressors of RNA silencing. The potyvirus HcPro protein and the cucumber mosaic virus (CMV) 2b protein were the first viral RNA silencing suppressors identified (Anandalakshmi et al., 1998; Beclin et al., 1998; Brigneti et al., 1998; Kasschau and Carrington, 1998). Since then, more than 30 viral RNA silencing suppressors have been identified from single-stranded RNA viruses, double-stranded RNA viruses and double-stranded DNA viruses (reviewed in Li and Ding, 2006). Plant viruses usually have small genomes that encode only a few proteins. The fact that almost all plant viruses encode RNA silencing suppressors reinforces the concept that RNA silencing evolved as an antiviral defense mechanism, and highlights the arms race between plants and plant viruses.

Viral RNA silencing suppressors from different families of viruses tend to have little sequence or structural similarities, and can interfere with the RNA silencing pathway with different strategies (reviewed in Li and Ding, 2006). The potyvirus HcPro leads to a reduced accumulation of 21-nt siRNAs by either inhibiting the dicing of long dsRNAs (Mallory et al., 2001; Dunoyer et al., 2004) or destabilizing small RNAs by preventing their methylation (Ebhardt et al., 2005; Yu et al., 2006). The CMV 2b protein inhibits the slicer activity of AGO1 (Zhang et al., 2006). p19 of tombusviruses binds small RNA duplexes, the products of Dicer, and presumably sequesters the small RNAs from downstream events in small RNA biogenesis, such as methylation and the assembly of the small RNA-AGO1 complex known as the RNA-induced silencing complex (RISC; Dunoyer et al., 2004; Lakatos et al., 2004; Silhavy et al., 2002; Yu et al., 2006). P0 of the poleroviruses targets AGO1 for proteasome-mediated degradation (Baumberger et al., 2007; Bortolamiol et al., 2007). Because siRNA and miRNA pathways share key molecular components, some RNA silencing suppressors also affect the biogenesis or activities of endogenous small RNAs, especially miRNAs (Kasschau et al., 2003; Chapman et al., 2004; Chen et al., 2004; Dunoyer et al., 2004; Zhang et al., 2006). It has been proposed that the symptoms caused by viral infection are in part the result of inhibitory effects on host miRNAs by viral RNA silencing suppressors (Kasschau et al., 2003; Silhavy and Burgyan, 2004). This idea has recently been called into question because the phenotypic effects of viral silencing suppressors have been observed when the genes are expressed in plants from the nearly ubiquitous 35S promoter, whereas viral infection may not achieve as broad a spatial distribution of viruses (Li and Ding, 2006).

Revealing miRNAs as endogenous regulators of gene expression

Discovery.  The discovery of miRNAs as endogenous regulators of gene expression was one of the major advances in biology. The first miRNA lin-4 was identified in 1993 in C. elegans as a regulator of developmental timing, and was found to repress the translation of its target mRNA lin-14, to which it base-paired with partial sequence complementarity (Lee et al., 1993; Wightman et al., 1993). lin-4 was perhaps initially thought of as an oddity of the worm, as no clear homologs of lin-4 were found in insects or mammals. The discovery of a second miRNA, let-7, that has homologs in all animal lineages with bilateral symmetry, led to the realization that miRNAs are common regulators of gene expression in animals (Pasquinelli et al., 2000; Reinhart et al., 2000). In 2001, three groups reported the cloning of many miRNAs from C. elegans, Drosophila and humans (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee and Ambros, 2001). In 2002, efforts to clone small RNAs from Arabidopsis by several groups led to the first discovery of miRNAs from plants (Llave et al., 2002a; Park et al., 2002; Reinhart et al., 2002). By now, miRNAs have been found either by homology or by cloning from plants representing many major lineages, such as green algae (Molnar et al., 2007; Zhao et al., 2007), mosses (Axtell and Bartel, 2005; Talmor-Neiman et al., 2006a,b; Axtell et al., 2007), ferns (Axtell and Bartel, 2005), gymnosperms (Lu et al., 2007; Morin et al., 2008), monocots (Sunkar et al., 2005; Yao et al., 2007) and dicots (Barakat et al., 2007a,b; Sunkar and Jagadeeswaran, 2008). The discovery of miRNAs relied mainly on size fractionation to isolate small RNAs from total RNAs, ligation of the small RNAs to 5′ and 3′ adaptors, and RT-PCR of the ligation products to generate a small RNA library, which is then sequenced. Initial efforts towards miRNA discovery relied on traditional Sanger sequencing, and resulted in the identification of mostly abundant miRNA species. 
The adaptation of high-throughput sequencing technologies to small RNA discovery, first performed in plants (Lu et al., 2005), greatly enriched the known repertoire of small RNAs, including miRNAs, not only in plants but also in animals. The combination of high-throughput sequencing with mutants defective in endogenous siRNA biogenesis further allowed the detection of rare miRNAs (Lu et al., 2006). A few years after the Arabidopsis genome was sequenced, a new class of regulatory genes hidden in the genome was finally unveiled. Eight years since their first discovery, plant miRNAs are now widely recognized as major players in gene regulation that impact almost all aspects of plant biology.

Biogenesis.  Molecular genetic studies in Arabidopsis established the major framework of miRNA biogenesis (Figure 2). A mature miRNA lies in one arm of the stem of a hairpin RNA precursor, which is named a pre-miRNA. The hairpin RNA tends to reside in a larger RNA (as revealed by expressed sequence tags, ESTs), termed a pri-miRNA. The fact that the RNase III enzyme Dicer generates the lin-4 and let-7 miRNAs in C. elegans (Grishok et al., 2001; Hutvagner et al., 2001; Ketting et al., 2001) prompted two groups to investigate the role of DCL1, one of the four Arabidopsis homologs of Dicer, in miRNA biogenesis. The weak dcl1-9 allele was shown to have reduced accumulation of most tested miRNAs, indicating that DCL1 is a major miRNA Dicer in plants (Park et al., 2002; Reinhart et al., 2002). Whereas the C. elegans dcr-1 mutant resulted in reduced miRNA accumulation, accompanied by over-accumulation of pre-miRNAs, pre-miRNAs were not readily detectable by northern blotting in the Arabidopsis dcl1-9 mutant. In fact, dcl1 mutants had reduced levels of pre-miRNAs and increased levels of pri-miRNAs (Kurihara and Watanabe, 2004; Kurihara et al., 2006), consistent with a role of DCL1 in the processing of pri-miRNAs to pre-miRNAs, as well as pre-miRNAs to miRNA/miRNA* duplexes. Although DCL1 is the major DICER-LIKE protein that produces miRNAs in Arabidopsis, DCL4 also generates a few miRNAs that are derived from precursors with long double-stranded regions (Rajagopalan et al., 2006).

Figure 2.

 Three distinct endogenous small RNA pathways.
(a) microRNA (miRNA) metabolism. A primary miRNA (pri-miRNA) is processed into the pre-miRNA, which is further processed into the miRNA/miRNA* duplex. This duplex undergoes 2′-O-methylation by the small RNA methyltransferase HEN1. One strand is bound by AGO1. The mature miRNAs are turned over by the SDN1 family of small RNA exonucleases.
(b) The biogenesis of trans-acting small interfering RNAs (ta-siRNAs). A non-coding ta-siRNA-generating region (TAS) is transcribed into a single-stranded RNA, which is targeted by an miRNA for cleavage. After cleavage, one of the fragments is copied into double-stranded RNA (dsRNA) by RDR6. The dsRNA is processed into phased siRNAs by DCL4. Some of the siRNAs are bound by AGO1 and regulate other mRNAs in trans, as do miRNAs.
(c) Biogenesis and function of heterochromatic siRNAs. A heterochromatic locus is presumably transcribed by Pol IV into a single-stranded RNA, which is turned into dsRNA by RDR2. The dsRNA is processed into 24-nt siRNAs, which are bound by AGO4. The AGO4/siRNA complex is attracted to the homologous DNA locus by transcripts generated by Pol V, and recruits DNA and histone modification machinery to result in heterochromatin formation. CLASSY1 and DRD1 are SNF2-like, putative chromatin remodeling proteins that act together with Pol IV and Pol V, respectively. DMS3 and KTF1 are also required for recruiting AGO4 to genomic loci.

Other genes in miRNA biogenesis were soon identified primarily based on the similar developmental defects exhibited by mutants in these genes and dcl1 mutants. HYL1, a protein with dsRNA binding domains, and SE, a zinc-finger protein, were found to be required for miRNA biogenesis, because mutants in these genes had lower levels of most miRNAs (Han et al., 2004; Vazquez et al., 2004a; Lobbes et al., 2006; Yang et al., 2006a). The reduced accumulation of pre-miRNAs and/or over-accumulation of pri-miRNAs in these mutants placed the two genes in the pri-miRNA-to-pre-miRNA processing step, together with DCL1 (Kurihara et al., 2006; Lobbes et al., 2006; Yang et al., 2006a). The two proteins were then shown to interact with each other and with DCL1 in vitro, and to co-localize with DCL1 in vivo in so-called nuclear D-bodies, presumably sites of miRNA biogenesis (Hiraguri et al., 2005; Kurihara et al., 2006; Lobbes et al., 2006; Yang et al., 2006a; Song et al., 2007). In vitro processing assays showed that the two proteins determine the precision of excision of miRNAs by DCL1 (Dong et al., 2008).

Which polymerase generates the pri-miRNAs? ESTs from a few MIR genes indicated that pri-miRNAs were spliced and polyadenylated, implicating Pol II in the transcription of MIR genes (Aukerman and Sakai, 2003; Kurihara and Watanabe, 2004). A comprehensive analysis of transcription start sites of numerous MIR genes revealed the presence of TATA boxes in their promoters, and indicated that Pol II transcribes MIR genes (Xie et al., 2005a). This implies that MIR genes can be subjected to transcriptional regulation, as are protein-coding genes. Indeed, many miRNAs exhibit temporally or spatially regulated patterns of expression, or accumulate in response to environmental stimuli. Mutants in the two nuclear cap binding complex (CBC) genes, CBP20 and CBP80 (also known as ABH1; Hugouvieux et al., 2001), exhibit phenotypes similar to weak se alleles in terms of leaf serration, and have reduced levels of some but not all miRNAs (Gregory et al., 2008; Laubinger et al., 2008). It was not until recently that ARS2, the SE homolog in animals, was shown to act in miRNA biogenesis in Drosophila and mice (Gruber et al., 2009; Sabin et al., 2009). The studies also revealed the physical interaction between ARS2 and CBC. Therefore, the CBPs may link pri-miRNA transcription to its processing.

After DCL1-mediated processing of pre-miRNAs to generate miRNA/miRNA* duplexes, the next step in miRNA biogenesis is methylation, a step that was completely unexpected because of a lack of a parallel process in animal miRNA biogenesis. In the course of studying floral morphogenesis, my group isolated mutants in the HEN1 gene that promotes the development of reproductive organs in the flower (Chen et al., 2002). The phenotypic similarities between hen1 and dcl1-9 plants prompted us to test whether HEN1 played a role in miRNA biogenesis. Indeed, we found that miRNAs were reduced in abundance in hen1 mutants (Park et al., 2002). Next, the presence of a methyltransferase domain in the HEN1 protein prompted us to test whether HEN1 is an miRNA methyltransferase. In vitro methyltransferase reactions with various known intermediates in miRNA biogenesis as substrates revealed that HEN1 was a methyltransferase acting only on miRNA/miRNA* duplexes, in a sequence-independent but structure-dependent manner (Yu et al., 2005). HEN1 requires the presence of both the 2′ and 3′ hydroxyls on the 3′ terminal ribose and a 2-nt 3′ overhang (features of Dicer products), and prefers 21–24-nt duplexes (sizes of products of the four Arabidopsis DCL proteins) for its activities (Yang et al., 2006b). Biochemical studies pinpointed the 2′ OH on the 3′ terminal ribose as the site of methylation (Yang et al., 2006b). A recent study solved the structure of the HEN1 protein and shed light on the structural basis for substrate recognition by HEN1 (Huang et al., 2009b). The structural studies also revealed a novel Mg2+-dependent enzymatic mechanism, not found in known RNA methyltransferases, that could account for the 2′ OH-specific methylation.

The presence of a methyl group on plant miRNAs was confirmed by β-elimination reactions that require the presence of both the 2′ and 3′ hydroxyl groups on the 3′ terminal ribose, and by mass spectrometry analysis of a purified plant miRNA species (Yu et al., 2005). It is now established that plant miRNAs carry a 2′-O-methyl group on the 3′ terminal ribose. Most animal miRNAs are not methylated, but siRNAs and piwi-interacting RNAs are 2′-O-methylated in animals by HEN1 homologs (Kim et al., 2009).

The 2′-O-methyl group appears to protect miRNAs from 3′ exonucleolytic degradation and 3′ uridylation, a process in which a short U-rich tail is added to unmethylated miRNAs (Li et al., 2005). The function of uridylation is unknown, but is likely to cause instability of miRNAs. Tailing of miRNAs or other types of small RNAs with Us or As has also been found in animals (Kim et al., 2009; van Wolfswinkel et al., 2009). We suspect that tailing is a general strategy in the regulation of miRNA stability. Our recent study identified the SDN family of exonucleases that preferentially degrades single-stranded small RNAs in vitro, and limits the abundance of miRNAs in vivo (Ramachandran and Chen, 2008). It is not yet known whether the SDN proteins preferentially degrade U-tailed or non-tailed miRNAs.

The nuclear localization of DCL1 suggests that the two processing events leading to the production of miRNA/miRNA* duplexes occur in the nucleus (Papp et al., 2003). The nuclear processing and the cytoplasmic actions of miRNAs imply an export step in miRNA biogenesis. HASTY, a homolog of the mammalian pre-miRNA export factor exportin 5, is likely to export miRNA/miRNA* duplexes or miRISCs to the cytoplasm (Park et al., 2005).

The final step in miRNA biogenesis is the incorporation of the miRNA strand into AGO1, one of 10 Arabidopsis argonaute proteins. AGO1 was first characterized for its roles in plant development, especially leaf polarity specification (Bohmert et al., 1998). Mechanistic insights into the molecular functions of argonaute proteins were provided by structural studies, which revealed that the conserved piwi domain adopted an RNase H-fold, and biochemical studies, which demonstrated that mammalian AGO2 exhibited small RNA-guided endonucleolytic activity against target mRNAs (Liu et al., 2004; Song et al., 2004). AGO1 from Arabidopsis was shown to bind miRNAs and execute the cleavage of target mRNAs (Baumberger and Baulcombe, 2005; Qi et al., 2005). In the 10-member argonaute family in Arabidopsis, AGO10 is the most closely related to AGO1, and has been implicated in the translational repression of miRNA targets (Brodersen et al., 2008). It has not been shown yet whether AGO10 associates with miRNAs. Two other genes, VCS, a member of the mRNA decapping complex, and KTN1, a subunit of the microtubule-severing enzyme KATANIN, were found to be required for miRNA-mediated translational repression of target mRNAs (Brodersen et al., 2008).

Function.  Studies with the first miRNA lin-4 suggested that it inhibits the translation of its target mRNA lin-14 (Wightman et al., 1993; Olsen and Ambros, 1999). This conclusion was based on the disproportionate effects on the lin-14 protein versus mRNA levels caused by mutations in the miRNA. Animal miRNAs were later found to also cause the degradation of their target mRNAs, via decapping and/or deadenylation (Bagga et al., 2005; Behm-Ansmant et al., 2006; Eulalio et al., 2009). Although the mechanisms of action of animal miRNAs are still controversial, it is clear that the great majority of animal miRNAs pair with their target mRNAs with a central bulge, which would prevent the cleavage of the mRNAs.

When plant miRNAs were first identified, it was noted that plant miRNAs had extensive sequence complementarity to their potential targets (Llave et al., 2002a; Park et al., 2002; Reinhart et al., 2002). Soon afterwards, it was demonstrated that miR171 could lead to the cleavage of its target mRNA (Llave et al., 2002b). The 5′ end of the 3′ cleavage product of the miR171 target mRNA mapped to the nucleotide opposite the 10th nucleotide from the 5′ end of miR171 in the miR171/mRNA duplex, as determined by 5′ RACE PCR, a method that has since been widely adopted for the confirmation of miRNA targets in plants. That cleavage is not the only mode of action of plant miRNAs was realized when the effects of miR172, which targets AP2 and related genes, were analyzed (Aukerman and Sakai, 2003; Chen, 2004). Over-expression of miR172 led to phenotypes resembling ap2 loss-of-function mutants. Consistent with the phenotypic outcome, AP2 protein levels were reduced; however, AP2 mRNA levels were unaffected. Similarly, transgenic lines expressing AP2 or a miR172-resistant version of AP2 could have similar levels of the AP2 mRNA but drastically different levels of the AP2 protein, as well as different phenotypic outcomes. These observations led to the conclusion that miR172 could inhibit the translation of AP2 mRNA. miR156/157, which targets the SPL family of transcription factor genes, and whose target sites in some of the genes reside in the 3′ untranslated regions, was also found to disproportionately affect the mRNA and protein levels of its target genes (Gandikota et al., 2007). A more extensive survey of many miRNA/target pairs found widespread translational repression by plant miRNAs (Brodersen et al., 2008). Therefore, plant miRNAs, as well as animal miRNAs, regulate their target mRNAs through both mRNA degradation and translational inhibition. 
Intriguingly, a study conducted on miR398 and its two targets encoding copper superoxide dismutase (CSD) suggested that the miRNA/target pairing requirements differ between the two modes of action (cleavage vs. translational repression) (Dugas and Bartel, 2008). Note that the mechanisms of translational inhibition are still highly controversial, and proposed mechanisms include inhibition of translation initiation, inhibition of translation elongation, post-translational or co-translational protein degradation and sequestration of mRNA targets in subcellular structures, such as P bodies or stress granules (reviewed in Valencia-Sanchez et al., 2006).
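
The cleavage-site mapping described above for miR171 can be illustrated with a short sketch. It assumes a perfectly complementary, bulge-free target site, which is a simplification, and it uses a made-up 21-nt sequence rather than any real miRNA. The function names are hypothetical and are not taken from any published tool.

```python
# Toy illustration: locate the 5' end of the 3' cleavage product, i.e. the
# target nucleotide opposite the 10th nucleotide (from the 5' end) of the miRNA.
# Assumption: the target site is perfectly complementary, with no bulges.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_site(mirna, mrna):
    """Return the 0-based start of a perfectly complementary site, or -1."""
    return mrna.find(reverse_complement(mirna))


def three_prime_product_start(site_start, mirna_len, cleavage_pos=10):
    """0-based mRNA index of the nucleotide paired to miRNA position
    `cleavage_pos` (1-based, counted from the miRNA 5' end).

    The miRNA pairs antiparallel to the mRNA, so miRNA position i pairs
    with mRNA index site_start + mirna_len - i.
    """
    return site_start + mirna_len - cleavage_pos


# Hypothetical 21-nt miRNA and a toy mRNA carrying its complementary site.
toy_mirna = "AUGCCGUAAGCUUACGGAUCA"
toy_mrna = "GGGAAA" + reverse_complement(toy_mirna) + "CCCUUU"

site_start = find_target_site(toy_mirna, toy_mrna)
product_start = three_prime_product_start(site_start, len(toy_mirna))
three_prime_product = toy_mrna[product_start:]
```

In a 5′ RACE experiment, the sequenced 5′ end of the 3′ fragment would be compared with `product_start`; agreement supports miRNA-guided cleavage at the predicted site.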

A major biological function of miRNAs is in controlling development. Many plant miRNAs target transcription factor genes (Rhoades et al., 2002), which tend to play important roles in developmental regulation. In fact, a number of miRNAs, such as miR172, miR319 and miR164, were discovered in genetic screens because either their loss-of-function or over-expression led to developmental defects (Aukerman and Sakai, 2003; Palatnik et al., 2003; Baker et al., 2005). Some miRNA-target modules are conserved within or beyond angiosperms. For example, miR165/166 helps restrict the homeodomain leucine zipper genes PHABULOSA, PHAVOLUTA and REVOLUTA, which specify adaxial identity in leaves, to the adaxial domain in both Arabidopsis and maize (Zea mays) (McConnell et al., 2001; Emery et al., 2003; Juarez et al., 2004; Mallory et al., 2004). Like many miRNA-target regulatory modules, the regulation of AP2-domain protein genes by miR172 is conserved in diverse plant species. However, the biological functions of the regulatory modules may vary in different species. In Arabidopsis, miR172 promotes flowering and the determinacy of floral meristems (Aukerman and Sakai, 2003; Chen, 2004). In maize, miR172 regulates sex determination and meristem activities, and is also implicated in vegetative phase transition (Lauter et al., 2005; Chuck et al., 2007). In potato (Solanum tuberosum), miR172 promotes flowering and induces tuber formation (Martin et al., 2009). It is interesting to note that in species as divergent as Arabidopsis and C. elegans, miRNAs are employed as regulators of developmental transitions. In C. elegans, the first two miRNAs discovered, lin-4 and let-7, promote the transition from L1 to L2 larval stages, and from the L4 larval stage to adulthood, respectively (Lee et al., 1993; Reinhart et al., 2000).
In Arabidopsis, a gradual decrease in miR156 levels as the plant ages leads to a gradual increase in miR172 levels to ensure proper vegetative phase transitions and the vegetative-to-reproductive transition (Wang et al., 2009; Wu et al., 2009).

Uncovering a novel class of small RNA regulators – the trans-acting siRNAs (ta-siRNAs)

Discovery and biogenesis.  Ta-siRNAs are phased 21-nt small RNAs that are produced from non-coding RNA precursors, and that target other mRNAs in trans, like miRNAs. The initial discovery of this intriguing class of small RNAs was from efforts to identify endogenous targets of RDR6 and SGS3, two genes with a role in transgene-mediated RNA silencing. Comparative transcript profiling between the wild type and rdr6 and/or sgs3 mutants identified either the loci giving rise to the ta-siRNAs (Vazquez et al., 2004b) or the targets of the ta-siRNAs, which then led to the discovery of the ta-siRNAs (Peragine et al., 2004). These two early studies also defined the biogenesis requirements of ta-siRNAs by examination of their accumulation in various mutants known to affect miRNA or siRNA biogenesis. The unique biogenesis requirements, involving both the miRNA biogenesis machinery and non-miRNA biogenesis genes, such as RDR6 and SGS3, suggested that the ta-siRNAs constituted a new class of endogenous small RNAs. The presence of clusters of small RNAs with both sense and antisense polarities suggested that these were siRNAs.

Further examination of the non-coding transcripts from the TAS loci (loci that give rise to ta-siRNAs) identified potential binding sites for known miRNAs, miR173 and miR390, the targets of which had been previously unknown (Allen et al., 2005; Yoshikawa et al., 2005). Indeed, it was shown that cleavage of the TAS non-coding transcripts guided by the miRNAs triggered the production of phased siRNAs from the end defined by the cleavage. Mutants in DCL4, a gene not required for the biogenesis of miR173 or miR390, were defective in ta-siRNA production, suggesting that DCL4 performed the phased processing of ta-siRNA precursors (Gasciolli et al., 2005; Xie et al., 2005b; Yoshikawa et al., 2005). Given these known players, a pathway of ta-siRNA biogenesis emerged: miRNA-triggered cleavage of a TAS non-coding RNA results in two cleavage products, one of which is copied into dsRNA by RDR6. The dsRNA is processed by DCL4 from the end defined by miRNA-mediated cleavage, in a phased manner. Only some of the phased small RNAs are stable, and are bound by AGO1 (Baumberger and Baulcombe, 2005).
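
The notion of phasing in this pathway can be made concrete with a small sketch: starting from the end defined by miRNA-guided cleavage, DCL4 is taken to cut consecutive 21-nt windows. The code below is a toy model under stated assumptions: it considers only one strand, processes downstream of the cleavage position, and ignores the 2-nt offset of the complementary strand. The function names and coordinates are hypothetical.

```python
# Toy model of phased 21-nt processing from a miRNA-defined cleavage end.
# Assumptions: single strand only, processing proceeds downstream of the
# cleavage position, and every register is exactly PHASE_LEN nucleotides.

PHASE_LEN = 21


def phased_registers(cleavage_index, transcript_len, phase_len=PHASE_LEN):
    """Return half-open (start, end) intervals of consecutive full-length
    windows, beginning at the cleavage-defined end."""
    registers = []
    start = cleavage_index
    while start + phase_len <= transcript_len:
        registers.append((start, start + phase_len))
        start += phase_len
    return registers


def in_phase(read_start, cleavage_index, phase_len=PHASE_LEN):
    """True if a small RNA starting at read_start falls in register with
    the cleavage-defined end."""
    offset = read_start - cleavage_index
    return offset >= 0 and offset % phase_len == 0


# Hypothetical transcript of 100 nt cleaved so that the processed fragment
# begins at index 17.
toy_registers = phased_registers(cleavage_index=17, transcript_len=100)
```

A locus whose sequenced small RNAs start predominantly at positions for which `in_phase` is true would be a candidate phased locus; real analyses also account for both strands and for sequencing noise.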

Further dissection of the biogenesis of the ta-siRNAs revealed that not all miRNA-mediated cleavage triggered the production of ta-siRNAs. Substitution of the miR173 binding site, or one of the two miR390 binding sites, respectively, in the TAS1 or TAS3 loci with other miRNA binding sites abolished ta-siRNA production (Montgomery et al., 2008a,b). miR390 is unique in that it is bound specifically by AGO7 but not by AGO1 (Montgomery et al., 2008a). The function of AGO7 at TAS3 cannot be substituted by AGO1.

Function.  ta-siRNAs regulate target genes in trans, as do miRNAs and siRNAs in PTGS. They cause the cleavage of their target mRNAs in the same manner as miRNAs (Peragine et al., 2004; Vazquez et al., 2004b). Among the four TAS loci identified so far in Arabidopsis, biological functions are only known for a TAS3 ta-siRNA, which targets AUXIN RESPONSE FACTOR 3 (ARF3) and ARF4, and is thus known as the tasiR-ARF. The tasiR-ARF is conserved in diverse flowering plants, but its biological functions appear to vary in different organisms, probably depending on the presence or absence of compensatory mechanisms. In Arabidopsis the tasiR-ARF regulates the timing of vegetative phase transitions, such that rdr6, sgs3 and ago7 mutants display phenotypes indicative of precocious vegetative transitions (Hunter et al., 2003, 2006; Peragine et al., 2004; Adenot et al., 2006; Fahlgren et al., 2006). tasiR-ARF also contributes to leaf adaxial polarity specification by restricting the expression of ARF3 and ARF4 to the abaxial side of leaves (Garcia et al., 2006; Chitwood et al., 2009). However, rdr6, sgs3 or ago7 single mutants do not show obvious leaf polarity defects, probably because of the presence of genes with overlapping functions, such as AS1 (Garcia et al., 2006). Overall, Arabidopsis rdr6, sgs3, ago7 and dcl4 mutants display only mild morphological defects. In contrast, mutants in the rice (Oryza sativa) or maize homologs of Arabidopsis RDR6, SGS3 and DCL4 have much stronger developmental defects. Severe alleles in rice and maize have radialized leaves, and the rice alleles also have abnormal meristematic activities and are seedling lethal (Liu et al., 2007; Nagasaki et al., 2007; Nogueira et al., 2007).

A great mystery concerning ta-siRNAs is how they evolved, and why such a complicated system is used to regulate target genes. One may speculate, and there is some supporting evidence (Ronemus et al., 2006), that miRNA-mediated cleavage of target genes naturally triggers the production of secondary siRNAs at a low efficiency. Perhaps the TAS loci began as normal miRNA target loci, but were selected in evolution because of the regulatory potential of their secondary siRNAs. The efficiency of secondary siRNA production from these loci was then enhanced by unknown mechanisms, which are manifested by the ‘special’ properties of miR173 and miR390, in that they cannot be replaced by other miRNAs for triggering ta-siRNA production.

Appreciation of endogenous siRNAs as guardians of the genome

Discovery.  The amazing diversity of endogenous 24-nt siRNAs was one of the surprises in our appreciation of the transcriptome. Initial low-throughput small RNA profiling experiments first indicated the presence of many endogenous small RNAs that were not miRNAs (Llave et al., 2002a). High-throughput sequencing of small RNAs revealed a surprising landscape of endogenous small RNAs, with tens of thousands of distinct 24-nt siRNA species that accounted for more than 80% of the total small RNA reads (Lu et al., 2005, 2006; Henderson et al., 2006; Rajagopalan et al., 2006; Kasschau et al., 2007; Zhang et al., 2007; Mosher et al., 2008). These small RNAs are often referred to as heterochromatic siRNAs, as they tend to be derived from repeats and transposable elements, and act to silence the loci in cis.
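
The kind of size-class tally behind statements such as "24-nt siRNAs account for most of the reads" can be sketched as follows. This is a minimal illustration that assumes adaptor-trimmed reads are already available as plain strings; the reads below are invented toy data, not results from any of the cited studies.

```python
# Toy size-class profile of adaptor-trimmed small RNA reads.
# Assumption: reads are already trimmed; the sequences here are made up.
from collections import Counter


def size_class_fractions(reads):
    """Return {length: fraction of reads} for a collection of read sequences."""
    lengths = Counter(len(read) for read in reads)
    total = sum(lengths.values())
    return {n: count / total for n, count in sorted(lengths.items())}


# Two hypothetical 21-nt reads and eight hypothetical 24-nt reads.
toy_reads = ["A" * 21] * 2 + ["C" * 24] * 8
toy_profile = size_class_fractions(toy_reads)
```

Counting reads versus counting distinct sequences gives different pictures of abundance and diversity, so a real analysis would normally report both.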

Biogenesis and function.  As in the case of miRNAs and ta-siRNAs, the elucidation of the biogenesis requirements of endogenous 24-nt siRNAs benefited from the rich genetic resources in Arabidopsis. The first study in 2002 examined the known PTGS mutants such as rdr6, sgs3, sde3 and sde4, and found that endogenous siRNAs corresponding to the transposable element AtSN1 were absent in sde4, but unaffected in the other mutants (Hamilton et al., 2002). This study also correlated the absence of AtSN1 siRNAs with the lack of DNA methylation at this locus, providing the first piece of evidence that 24-nt siRNAs trigger DNA methylation of homologous loci. Forward genetic studies identified AGO4 as required for siRNA-triggered DNA methylation, as well as siRNA accumulation at certain loci (Zilberman et al., 2003). Reverse genetic studies using mutants in the RDR and DCL families of genes identified RDR2 and DCL3 as being required for endogenous siRNA biogenesis (Xie et al., 2004). The reduced DNA methylation and histone H3 lysine 9 dimethylation at endogenous siRNA loci in rdr2, dcl3 and ago4 mutants further cemented the role of siRNAs in heterochromatin formation and TGS. By 2004, an RDR2/DCL3/AGO4 pathway was recognized to distinguish endogenous 24-nt siRNAs from other types of small RNAs, such as miRNAs and ta-siRNAs. A forward genetic study also revealed a similar role of AGO6 to that of AGO4 in siRNA-mediated DNA methylation and TGS (Zheng et al., 2007).

Another major advance regarding heterochromatic siRNAs was the finding that two DNA-dependent RNA polymerases, Pol IV and Pol V, specialize in siRNA production and siRNA-mediated TGS, respectively. Pol IV and Pol V are plant-specific polymerases that appear to have evolved from Pol II, because many of their subunits are shared with Pol II or have paralogous counterparts in Pol II (He et al., 2009a; Huang et al., 2009a; Lahmy et al., 2009; Ream et al., 2009). They differ from Pol II and from each other by their largest subunit (NRPD1 and NRPE1 for Pol IV and Pol V, respectively) as well as other subunits. They share the same second-largest subunit (NRPD2 or NRPE2), which differs from the second-largest subunit of Pol II. SDE4, which was demonstrated to act in siRNA biogenesis (Hamilton et al., 2002), was found to encode the largest subunit of Pol IV (NRPD1) (Herr et al., 2005). Mutants in NRPD2 were also found to lack 24-nt siRNAs (Onodera et al., 2005). It can thus be concluded that Pol IV generates 24-nt siRNAs – it is thought that Pol IV transcribes heterochromatic loci to generate siRNA precursors, which are made double-stranded by RDR2. However, transcripts of Pol IV have not been detected so far. CLASSY1, a protein similar to the SNF2 chromatin remodeling protein, was also found to act in siRNA biogenesis (Smith et al., 2007). It is thought to assemble the Pol IV transcription apparatus at heterochromatic loci.

Studies with mutants in the gene encoding the largest subunit of Pol V, NRPE1, established a role of Pol V in siRNA-mediated TGS. Mutants in NRPE1 were found to be defective in hairpin RNA-triggered promoter DNA methylation, despite normal levels of siRNA accumulation from the hairpin transgene (Kanno et al., 2005). At some endogenous siRNA loci, siRNA accumulation was not affected in the nrpe1 mutants, but at other loci, siRNA accumulation was largely diminished (Kanno et al., 2005; Pontier et al., 2005). It is thought that the primary function of Pol V is to act downstream of siRNAs in heterochromatin formation, and that the heterochromatic marks help promote siRNA biogenesis in a feed-forward loop. As such, the role of Pol V in siRNA biogenesis at some loci is an indirect effect of its role in siRNA-mediated TGS. Genetic screens using reporter genes that undergo siRNA-mediated TGS identified more genes that appear to act downstream of siRNAs in RdDM. These include DRD1, an SNF2-like protein, DMS3, a chromatin associated protein, and KTF1, a protein with similarity to the transcription elongation factor SPT5 (Kanno et al., 2004, 2008; Bies-Etheve et al., 2009; He et al., 2009b; Huang et al., 2009a). In maize, paramutation, a phenomenon in which one allele leads to meiotically heritable silencing of a homologous allele, was also found to require RDR2 and Pol IV, thus linking endogenous siRNAs to this genetic phenomenon (Alleman et al., 2006; Erhard et al., 2009).

Pol V has recently been shown to generate non-coding transcripts from heterochromatic loci (Wierzbicki et al., 2008). These transcripts originate outside of, and perhaps traverse, the siRNA-generating loci, and require Pol V, DRD1 and DMS3 for their accumulation (Wierzbicki et al., 2008, 2009). It is assumed that DRD1 and DMS3 program the Pol V transcription apparatus at siRNA loci. The transcripts are bound by AGO4 and are required for the occupancy of AGO4 at heterochromatic loci, suggesting that the transcripts recruit AGO4/siRNAs to the loci (Wierzbicki et al., 2009). At a subset of heterochromatic loci, Pol II (instead of Pol V) generates non-coding transcripts that recruit AGO4/siRNAs (Zheng et al., 2010). The current model of siRNA-mediated TGS is that siRNAs are recruited to homologous genomic loci by base-pairing interactions with the Pol II or Pol V-dependent transcripts, and the siRNA/AGO4 RISCs then recruit DNA methyltransferases and histone H3K9 methyltransferases to effect TGS. Both Pol II and Pol V can physically interact with AGO4 (Li et al., 2006; El-Shami et al., 2007; Zheng et al., 2009).

Within a short time after siRNAs were first found, in 2002, to cause DNA methylation of homologous loci, resources in Arabidopsis, including forward genetics, reverse genetics and proteomics, allowed the rapid dissection of the molecular framework underlying the biogenesis and function of this class of siRNAs.

Finding other types of small RNAs

Natural antisense siRNAs (nat-siRNAs) are derived from two mRNAs that are partially complementary to each other (Borsani et al., 2005; Katiyar-Agarwal et al., 2006; Held et al., 2008). Cis-antisense transcripts are found where two nearby genes have overlapping transcription units with opposing polarities. Genomic and bioinformatic analyses show that endogenous small RNAs tend to be enriched in regions of overlap between two cis-antisense transcripts (Henz et al., 2007; Jin et al., 2008). Although the mRNAs of pairs of cis-antisense transcripts are expected to be in the cytoplasm, the biogenesis of nat-siRNAs from two loci that have been studied so far involves Pol IV, which is expected to act in the nucleus (Borsani et al., 2005; Katiyar-Agarwal et al., 2006). This suggests that there may be a nuclear step in the biogenesis of nat-siRNAs. The requirement for an RDR in nat-siRNA biogenesis (Borsani et al., 2005; Katiyar-Agarwal et al., 2006) implies that the siRNAs are not simply produced from dsRNAs formed between the pairing of the two partially complementary mRNAs.

Small RNAs that are longer than miRNAs and siRNAs have also been found in Arabidopsis (Katiyar-Agarwal et al., 2007). It remains to be determined whether these small RNAs are widely used in Arabidopsis and other plants as regulators of gene expression.

Development of new technologies based on small RNAs

While the discovery of small RNAs could be traced back to attempts to develop transgenic technologies to manipulate gene expression for the benefit of agriculture, we have come full circle in translating our knowledge of small RNAs into more effective and highly specific knock-down technologies. Based on the knowledge that dsRNAs are the trigger of RNA silencing, strategies to knock down gene expression in vivo shifted from using antisense constructs to ones that generate hairpin RNAs (Figure 3; Waterhouse et al., 1998). The dsRNA strategy greatly improved the efficiency of gene knock-down. However, one potential problem with the dsRNA technology is that numerous small RNAs are derived from a hairpin RNA, such that some of the small RNAs may fortuitously regulate the expression of other genes not intended to be manipulated. The knowledge of miRNA biogenesis requirements and targeting specificities has allowed the development of the artificial miRNA technology that affords more specificity (Alvarez et al., 2006; Schwab et al., 2006). In this technology, an artificial miRNA is designed to target one or a few genes, and is expressed in vivo from the backbone of a known miRNA (Figure 4). When combined with tissue-specific or inducible promoters, the technology allows the controlled knock down of specific genes. Another potential advantage of the artificial miRNA technology over the hairpin RNA technology is that the silencing trigger itself is less prone to silencing by small RNAs as a result of mismatches in the two arms of the artificial miRNA precursor. VIGS is another knock-down technology based on small RNAs and RNA silencing (Figure 5; Baulcombe, 1999; Kumagai et al., 1995; Ruiz et al., 1998). By infiltration of a viral vector containing a piece of sequence homologous with an endogenous gene into a basal leaf, transient silencing of the target gene occurs in the systemic tissues. 
This technology allows the interrogation of gene function in species that are not amenable to genetic analysis or are difficult to transform, and offers the opportunity for large-scale reverse genetic studies.

Figure 3.

 Post-transcriptional gene silencing (PTGS) based on inverted repeat (IR) transgenes as an effective gene knock-down strategy.
(a) A diagram of IR-PTGS. Sense and antisense sequences homologous with an endogenous gene to be silenced are separated by an intron and placed behind a promoter. In vivo, the transgene is transcribed and spliced to result in a hairpin RNA, which is processed into small RNAs by Dicer. Single-stranded small RNAs bound by AGO1 guide the silencing of an endogenous gene.
(b) An example of IR-PTGS in Arabidopsis. A wild-type flower consists of four types of floral organs: sepal, petal, stamen and carpel. A null mutant (ag-1) in the floral homeotic gene AGAMOUS (AG) has only sepals and petals, and exhibits prolonged proliferation of the floral meristem. A transgenic line containing an IR-PTGS construct targeting AG in the wild-type background phenocopies the ag-1 mutant. The images in (b) were taken from Figure 2 of Chuang and Meyerowitz (2000) PNAS, 97, 4985–4990; copyright (2000) National Academy of Sciences, USA; courtesy of Dr Elliot Meyerowitz.

Figure 4. Artificial microRNAs (amiRNAs) as efficient and specific agents in gene knock-down.
(a) A diagram of the amiRNA approach. A known MIR gene is manipulated such that the sequences of the mature miRNA and its antisense strand miRNA* are replaced by those of an amiRNA and its antisense strand. In vivo, the modified MIR gene is transcribed into a primary miRNA (pri-miRNA), which is processed into a precursor miRNA (pre-miRNA) that is further processed into the amiRNA. The amiRNA is bound by AGO1, and guides the repression of an endogenous gene that is complementary to the amiRNA.
(b) An example of amiRNA-based gene knock-down in Arabidopsis. A wild-type plant (left) generates flowers after making a certain number of leaves, whereas a mutant in the LEAFY gene (middle) shows a partial conversion of flowers into leaves. A wild-type plant that harbors an amiRNA targeting LEAFY (right) phenocopies the leafy mutant. The images in (b) were part of Figure 3A of Schwab et al. (2006) Plant Cell, 18, 1121–1133, and are reproduced with permission from the American Society of Plant Biologists.

Figure 5. Virus-induced gene silencing as an effective gene knock-down approach.
(a) A diagram of the technology based on tobacco rattle virus (TRV). The cDNAs from the two RNAs of TRV are cloned between the 35S promoter and a terminator (NOS). The DEFICIENS gene from Nicotiana benthamiana (NbDEF) is cloned into RNA2. Agrobacteria carrying both constructs are infiltrated into N. benthamiana leaves.
(b) Floral phenotypes of VIGS against NbDEF. Plants infiltrated with TRV cDNA without NbDEF have flowers (middle) identical to those of the wild type (left), whereas plants infiltrated with TRV vectors containing NbDEF have flowers (right) with petal-to-sepal transformation, a phenotype indicative of loss of NbDEF function. The diagrams of the constructs and the images of flowers were extracted from Figures 1 and 2 of Liu et al. (2004) Plant Molecular Biology, 54, 701–711, with kind permission from the corresponding author Dr Dinesh-Kumar and the publisher Springer (License Number: 2255450723695).

Concluding remarks

Since the discovery in plants of transgene siRNAs in 1999 and of endogenous small RNAs in 2002, small RNAs have transformed our views of the transcriptome landscape of plant genomes and of the paradigms of gene expression regulation. In one decade, we have come a long way from knowing nothing about small RNAs to a comprehensive appreciation of their biogenesis, modes of action and biological functions. This rapid progress would not have been possible without the tools and resources of the model plant Arabidopsis, especially the genetics, the genome sequence and the collection of mutants in most genes. The knowledge of small RNAs gained primarily in Arabidopsis has led to new gene knock-down technologies that are broadly applicable across plant species. These technologies are being employed in crop species to interrogate gene function and to improve agricultural properties. They also have the potential to revolutionize biological research in species not amenable to genetic analyses.


I thank Dr Julien Curaba for assistance with an illustration and Dr Kathy Barton for helpful discussions. Research in the Chen laboratory is supported by grants from the National Institutes of Health (GM61146) and National Science Foundation (MCB-0718029).