Comprehensive analysis of gene function requires the detailed examination of mutant alleles. In Arabidopsis thaliana, large collections of sequence-indexed insertion and chemical mutants provide potential loss-of-function alleles for most annotated genes. However, limitations for phenotypic analysis include gametophytic or early sporophytic lethality, and the difficulty of recombining mutant alleles in closely linked genes, especially those present as tandem duplications. Transgene-mediated gene silencing can overcome some of these shortcomings through tissue-specific, inducible and partial gene inactivation, or simultaneous targeting of several sequence-related genes. In addition, gene silencing is a convenient approach in species or varieties for which exhaustive mutant collections are not yet available. Typically, gene function is reduced post-transcriptionally, effected by small RNAs that act in a sequence-specific manner by base pairing to complementary mRNA molecules. A recently introduced approach is the use of artificial microRNAs (amiRNAs). Here, we review various strategies for small RNA-based gene silencing, and describe in detail the design and application of amiRNAs in many plant species.
Both prokaryotes and eukaryotes employ various classes of RNAs to establish and maintain basic cellular functions and identities. Their classically studied roles center around protein biosynthesis, where they serve as mobile shuttles of DNA-encoded sequence information (messenger RNAs, mRNAs) and structural elements in ribosomes (ribosomal RNAs, rRNAs), as well as amino acid adapters during translation (transfer RNAs, tRNAs). Many eukaryotes, including unicellular organisms such as the green alga Chlamydomonas reinhardtii, share an additional, highly abundant class of single-stranded small RNAs that range in size from just under 20 bases to over 30 bases (Hannon et al., 2006; Molnar et al., 2007; Zhao et al., 2007). They participate in gene silencing through RNA–RNA and possibly also RNA–DNA interactions, and mediate a wide range of phenomena such as transcriptional silencing of heterochromatin and post-transcriptional regulation of mRNA stability, as well as destruction and silencing of invading viral genomes or transgenes (Brodersen and Voinnet, 2006; Lippman and Martienssen, 2004; Vaucheret, 2006). Because of their involvement in gene silencing, these small RNAs are commonly referred to as silencing RNAs (sRNAs).
Silencing RNAs derive from longer RNA molecules that have at least partially double-stranded character because of either intra- or inter-molecular interaction. The precursors are processed by a specialized class of RNases, the Dicer family, into sRNAs. Major classes of sRNAs include microRNAs (miRNAs) and short interfering RNAs (siRNAs), which differ in their biosynthesis. miRNAs originate from longer, single-stranded transcripts that include imperfect foldbacks, with Dicer-mediated processing leading to the preferential accumulation of one distinct small RNA, the miRNA. Precursors of siRNAs, in contrast, form perfectly complementary double-stranded RNA (dsRNA) molecules. They originate, for example, as intermediates of viral replication or through the action of RNA-dependent RNA polymerases on single-stranded plant RNAs. Unlike miRNAs, the diced siRNA products derived from the long complementary precursors are not uniform in sequence, but correspond to many portions, and both strands, of the precursor. Whereas miRNAs mainly mediate post-transcriptional control of endogenous transcripts, siRNAs have been implicated in both transcriptional silencing of transposable elements in heterochromatin and post-transcriptional regulation of endogenous and exogenous long RNAs, including viral RNAs.
The importance of dsRNA in the generation of silencing signals has been systematically investigated; while testing various combinations of sense and antisense RNAs, it was discovered that silencing of endogenous loci or transgenes was most efficient when complementary long RNAs that could form a dsRNA were simultaneously introduced into either the worm Caenorhabditis elegans or tobacco (Fire et al., 1998; Waterhouse et al., 1998). Shortly thereafter, sRNAs were found to accumulate in tobacco cells during transgene or viral RNA mediated gene silencing (Hamilton and Baulcombe, 1999). Biochemical studies in Drosophila demonstrated that dsRNA as a silencing trigger is processed into RNAs that are approximately 21 nucleotides long (Zamore et al., 2000). The enzyme that processes dsRNA into sRNAs is Dicer, and the immediate Dicer products are short, 5′-phosphorylated dsRNAs with two-nucleotide 3′ overhangs (Bernstein et al., 2001; Elbashir et al., 2001).
In animals, siRNAs were initially implicated in post-transcriptional silencing of mRNAs (Montgomery et al., 1998), and this process has been termed RNA interference (RNAi). Similar observations had been made previously in plants, but the term post-transcriptional gene silencing (PTGS) never caught on in animals (Depicker and Van Montagu, 1997). In plants, siRNAs have also been shown to trigger transcriptional gene silencing (TGS; Jones et al., 1999; Mette et al., 1999). TGS is mediated by siRNAs of approximately 24–26 nucleotides, whereas siRNAs of approximately 21–22 nucleotides trigger mostly PTGS (Hamilton et al., 2002). Both siRNA classes, which can originate from the same transgene trigger, are also found in wild-type plants, where the 24–26 nucleotide class is mainly involved in silencing of centromeric and peri-centromeric heterochromatin (Kasschau et al., 2007; Xie et al., 2004). This process requires the activity of an RNA-dependent RNA polymerase, RDR2 in Arabidopsis, which mediates unprimed dsRNA formation from low-abundance transcripts that originate from repeats and various classes of transposons (Xie et al., 2004). Long dsRNAs are subsequently processed into siRNAs, which interact with their locus of origin. They recruit several DNA- and histone-modifying proteins including the cytosine methyltransferase CHROMOMETHYLASE3 (CMT3; Lindroth et al., 2001), which together mediate the formation of a silent chromatin state with minimal transcriptional activity. It is not known whether these repeat siRNAs bind nascent transcripts from the target locus or interact directly with genomic DNA.
The various classes of plant siRNAs are generated by distinct Dicer enzymes. In Arabidopsis thaliana, DICER-LIKE1 (DCL1) and DCL4 produce siRNAs that are around 21 nucleotides in length, DCL2 produces 22 nucleotide siRNAs, and DCL3 produces 24–26 nucleotide siRNAs (Xie et al., 2004). Despite these distinct roles, there is some redundancy, as DCL2, for example, can functionally compensate for DCL4 in virus defense (Deleris et al., 2006).
The discovery of miRNAs dates back to 1993, when cloning of the lin-4 gene, defined by a heterochronic mutant in C. elegans, revealed a mutation in a gene encoding a small RNA, with partial sequence complementarity to the 3′ UTR of a heterochronic gene with opposite activity, lin-14 (Lee et al., 1993; Wightman et al., 1993). A series of further experiments indicated that the lin-4 small RNA normally inhibits translation of lin-14 mRNA through RNA–RNA interaction. The widespread impact of such small RNAs on both animal and plant development was only recognized quite a few years later, starting with the cloning of a second small RNA-encoding locus in C. elegans, let-7, and the discovery of let-7 homologs in the genomes of many other animals including humans (Pasquinelli et al., 2000; Reinhart et al., 2000). Since then, miRNAs have been identified in a wide range of eukaryotes, primarily through large-scale sequencing, but occasionally also through forward genetics, just like lin-4 and let-7 (Berezikov et al., 2006).
The primary transcripts giving rise to miRNAs are mostly generated by RNA polymerase II (Lee et al., 2004). The precursor transcripts, which may be spliced, harbor one, or occasionally several, imperfect foldbacks. These are approximately 70–80 nucleotides long in animals, but more variable in length, from approximately 80 to 250 nucleotides, in plants. In both animals and plants, the primary transcripts can be much longer than the foldback, up to several kilobases in length (Xie et al., 2005). The importance of these additional sequences is unclear, as overexpression of just the foldback is generally as efficient for generation of miRNAs as overexpression of the entire primary transcript. The mature miRNA can be derived from either the 5′ or 3′ arm of the foldback. Plant precursors are processed in the nucleus by DCL1 in a two-step cleavage event, which releases a duplex of miRNA and miRNA* (Kurihara and Watanabe, 2004). Plant sRNA duplexes, including miRNA–miRNA* duplexes, are modified at the 3′ terminal ribose position by the methyltransferase HUA ENHANCER 1 (HEN1) (Yu et al., 2005), which prevents uridylation and thus stabilizes the sRNAs (Li et al., 2005).
Mechanisms of sRNA-mediated gene silencing
Silencing RNAs serve as specificity components for protein machines known as RNA-induced silencing complexes (RISCs), which contain as catalytic subunits Argonaute (Ago) proteins, the mediators of gene silencing (Hammond et al., 2000). The immediate Dicer products are sRNA duplexes, but normally one of the two constituent sRNAs preferentially associates with Argonautes. This strand has been termed the siRNA guide strand, and, in the case of miRNAs, corresponds to the mature miRNA. An important feature with regard to which strand is selected as the guide strand is the thermodynamic stability of the 5′ ends in the double-stranded Dicer product. 5′ instability because of higher AU content or mismatches, compared with the 3′ end, generally characterizes the guide strand (Khvorova et al., 2003; Schwarz et al., 2003). Similar characteristics are also observed in most plant miRNAs, which often start with a U and have C at position 19, which is the last pairing nucleotide in a 21 nucleotide miRNA–miRNA* duplex. Effective siRNAs and miRNAs share additional features, such as an over-represented A at position 10, immediately preceding the cleavage site. This is consistent with endonucleases preferring to slice after U, the complementary base to A (Huesken et al., 2005; Reynolds et al., 2004).
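The sequence features described above can be checked mechanically for any candidate guide strand. The following is a minimal sketch, not a published scoring scheme: the function name is hypothetical, and the comparison of AU content in two five-nucleotide windows is only a crude, assumed proxy for the relative thermodynamic stability of the duplex ends.

```python
def guide_strand_features(srna: str) -> dict:
    """Report sequence features that the text associates with effective guide strands.

    Positions are 1-based in the comments and refer to a 21-nucleotide small RNA.
    """
    s = srna.upper().replace("T", "U")
    if len(s) != 21 or set(s) - set("ACGU"):
        raise ValueError("expected a 21-nucleotide RNA (or DNA) sequence")

    def au_count(segment: str) -> int:
        return sum(base in "AU" for base in segment)

    # Assumption: positions 1-5 versus positions 15-19 stand in for the
    # 5' and 3' ends of the paired region of the duplex.
    five_prime_window = s[0:5]
    three_prime_window = s[14:19]

    return {
        "starts_with_U": s[0] == "U",       # position 1
        "A_at_position_10": s[9] == "A",    # immediately precedes the cleavage site
        "C_at_position_19": s[18] == "C",   # last paired nucleotide in the duplex
        "five_prime_AU_richer": au_count(five_prime_window) > au_count(three_prime_window),
    }


if __name__ == "__main__":
    # Made-up candidate sequence, used only for illustration.
    print(guide_strand_features("UUAAUGCGCACGGCGGCGCGG"))
```

A candidate that satisfies all four checks is not guaranteed to be effective; the checks only mirror the over-represented features listed in the text.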
Argonaute proteins bind both small and longer target RNAs and bring them in close proximity. They direct inhibition of target mRNA translation, which is typical for most animal miRNAs, or cleavage of target transcripts opposite position 10/11 of the sRNA, which is typical for siRNAs and plant miRNAs (Lingel et al., 2003; Song et al., 2004). The siRNA producing Dicer-2 of Drosophila directly associates with the RISC complex, and hands over the newly generated siRNA duplex to Argonaute, which slices the passenger strand (which in the case of miRNAs corresponds to miRNA*). This process provides an elegant mechanism for retention of the active strand in the RISC (Matranga et al., 2005). In Arabidopsis, ARGONAUTE1 (AGO1) directs both miRNA- and siRNA-mediated target cleavage without requiring further protein partners (Baumberger and Baulcombe, 2005). Because siRNAs derived from a long RNA trigger are typically heterogeneous in sequence, they will initiate AGO-dependent cleavage at many sites of the target transcript. In contrast, because miRNAs are unique, distinct miRNA-guided cleavage products usually predominate, and their 5′ ends can often be readily identified by RACE-PCR (Kasschau et al., 2003; Llave et al., 2002).
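Because cleavage occurs opposite positions 10/11 of the small RNA, the expected 5′ end of the 3′ cleavage product (the end that RACE-PCR would detect) can be computed from the location of the target site. The sketch below is illustrative only: it assumes a single, perfectly complementary, gapless target site, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def normalize(seq: str) -> str:
    """Upper-case the sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(normalize(rna)))


def predicted_cleavage_index(transcript: str, srna: str):
    """Return the 0-based index of the first nucleotide of the 3' cleavage product.

    The small RNA pairs antiparallel to its site, so small RNA position 10
    pairs with the site nucleotide at offset len(srna) - 10 from the site start.
    Returns None when no perfectly complementary site is present.
    """
    site = reverse_complement(srna)
    start = normalize(transcript).find(site)
    if start == -1:
        return None
    return start + len(srna) - 10


if __name__ == "__main__":
    srna = "UUAAUGCGCACGGCGGCGCGG"  # made-up 21-nucleotide small RNA
    transcript = "GGG" + reverse_complement(srna) + "AAA"
    print(predicted_cleavage_index(transcript, srna))
```

For a 21-nucleotide small RNA, the 3′ product therefore begins 11 nucleotides downstream of the first nucleotide of the target site.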
As endogenous siRNAs mostly silence longer RNAs produced by the locus from which the siRNAs themselves originate, they pair perfectly with target RNAs. Similarly, exogenously supplied siRNAs in mammals are normally designed such that they have perfect-match complementarity with their intended target mRNA. Plant miRNAs, similar to siRNAs in both animals and plants, preferentially cause cleavage of target mRNAs. They do not, however, need to be perfectly complementary to their targets, and up to five mismatches are allowed (Palatnik et al., 2003). A tally of mismatches between all known targets and their miRNAs suggests that pairing to the 5′ and central part of the miRNA is most important, as mismatches are mostly found towards the 3′ end (Mallory et al., 2004). These conclusions are supported by mutational analysis of miRNAs and their targets (Emery et al., 2003; Mallory et al., 2004; Palatnik et al., 2007). An unbiased empirical search for effective miRNA target determinants also defined miRNA positions 2–12 as the region most sensitive to mismatches (Schwab et al., 2005), similar to what has been biochemically determined as the region sufficient for siRNA-mediated target degradation in human cells (Doench and Sharp, 2004). However, pairing to the 3′ end of a plant miRNA is not completely dispensable (Juarez et al., 2004; Palatnik et al., 2007).
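To make the positional argument concrete, mismatches can be tallied in miRNA coordinates and split into those that fall within positions 2–12 and those that fall elsewhere. This is a simplified sketch under stated assumptions: pairing is gapless (no bulges), G:U wobble pairs are counted as mismatches, and the function names are hypothetical rather than taken from any published tool.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def normalize(seq: str) -> str:
    return seq.upper().replace("T", "U")


def mismatch_positions(mirna: str, target_site: str) -> list:
    """Return 1-based miRNA positions (counted from the 5' end) that are unpaired.

    Both sequences are written 5'->3'; the miRNA pairs antiparallel to the site.
    Assumption: G:U wobble pairs are treated as mismatches.
    """
    m, t = normalize(mirna), normalize(target_site)
    if len(m) != len(t):
        raise ValueError("gapless comparison needs sequences of equal length")
    last = len(t) - 1
    return [i + 1 for i, base in enumerate(m) if (base, t[last - i]) not in WATSON_CRICK]


def summarize_mismatches(mirna: str, target_site: str) -> dict:
    """Split mismatches into the sensitive region (positions 2-12) and the rest."""
    positions = mismatch_positions(mirna, target_site)
    sensitive = [p for p in positions if 2 <= p <= 12]
    return {
        "total": len(positions),
        "positions_2_to_12": sensitive,
        "elsewhere": [p for p in positions if p not in sensitive],
    }
```

A target site with several mismatches confined to the 3′ portion of the miRNA would thus be reported with an empty `positions_2_to_12` list, which is the pattern the mismatch tally in the text describes for most known targets.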
At least two Arabidopsis miRNAs, miR172 and miR156, not only cause target RNA cleavage, but also translational inhibition (Aukerman and Sakai, 2003; Chen, 2004; Gandikota et al., 2007; Lauter et al., 2005; Mlotshwa et al., 2006; Schwab et al., 2005; Wu and Poethig, 2006), which is the common mode of animal miRNA action. A caveat of many studies of plant miRNA targets is that, compared with transcript cleavage, it is considerably more difficult to discern the extent to which plant miRNAs cause translational inhibition. Extensive phenotypic comparisons of the effects of miRNA overexpression and knockouts of miRNA targets do, however, suggest that rules for translational inhibition are not substantially different from rules for target cleavage, as miRNA overexpressers in plants often have very similar phenotypes to plants with loss-of-function alleles of the respective targets (Jones-Rhoades et al., 2006).
Application of siRNAs in animals
Silencing RNAs have been used extensively to knock down genes of interest in various organisms. siRNA duplexes can be used directly with in vitro cultures of cell lines, such as human HeLa or Drosophila S2 cells (Tuschl et al., 1999; Zamore et al., 2000). They are applied as in vitro synthesized dsRNAs with two-nucleotide 3′ overhangs, and are normally designed to have perfect complementarity to their target transcripts. Various chemical modifications, such as 2′O-methyl groups, can stabilize the siRNAs. An alternative is stable transformation with expression cassettes producing hairpin precursors. Among the most successful variants are short-hairpin RNAs (shRNAs), which are based on miRNA precursor backbones, the processing of which is understood in great detail (Chang et al., 2006; Paddison et al., 2002; Silva et al., 2005). In C. elegans, application of small RNAs is as simple as feeding the worms bacteria that contain dsRNA-expressing plasmids (Timmons and Fire, 1998).
Methods for engineering siRNA-mediated gene silencing in plants
In plants, perhaps the simplest approach to sRNA-directed gene silencing is via stable transformation, which offers several advantages, but also has a few drawbacks. Major advantages of RNAi include the possibility of using tissue-specific as well as inducible promoters, and the identification of partial loss-of-function alleles based on inherent variation of transgene expression in different transformants. Drawbacks include dominant sterility or lethality, or the possibility that the targets are only partially silenced and the true null phenotype therefore remains unknown. An overview of several strategies and how they exploit endogenous RNAi components is shown in Figure 1.
The first attempts to induce loss of gene function in plants were based on observations made in the 1980s, demonstrating an inhibitory effect of long antisense RNAs on corresponding protein-coding (sense) mRNAs in animal cells (Izant and Weintraub, 1984). Subsequent experiments with transiently or stably expressed antisense RNAs often resulted in successful suppression of accumulation of the corresponding mRNA, and suggested a role for dsRNA as a template for RNA degradation (reviewed by Mol et al., 1988). At the same time, it was shown that strong overexpression of sense transgenes sometimes results in co-suppression, a simultaneous reduction in expression of both the transgene and the homologous endogenous gene (sense PTGS, sPTGS; Napoli et al., 1990). We now know that both silencing phenomena are mediated by sRNAs, produced either from the sense–antisense RNA hybrid or dsRNA generated by an RNA-dependent RNA polymerase, which can somehow recognize aberrant versions of highly abundant transgene RNAs (reviewed by Jorgensen et al., 2006).
A more recently developed and particularly effective way to generate siRNAs in plants is from long hairpin precursors; this approach is known as inverted repeat (ir) PTGS or hairpin RNAi (hpRNAi; reviewed by Watson et al., 2005). In these transcripts, sense and antisense RNAs are brought in very close proximity, such that dsRNA is formed very easily (see Figure 2a). Two studies systematically compared various silencing strategies, including separately transcribed sense and antisense strands, and found that hairpins were the most efficient silencing triggers (Chuang and Meyerowitz, 2000; Wesley et al., 2001). The most potent variation is a hairpin in which the terminal loop is initially formed by a short intron. Reported success rates are variable, but exceeded 90% in one study (Kerschen et al., 2004; Wesley et al., 2001). hpRNAi has been widely adopted for many plant species, especially as convenient generic plasmids for transgene generation are available (http://www.pi.csiro.au/rnai/).
Transcriptional gene silencing (TGS)
Engineering TGS via promoter methylation is less commonly used in plants, even though it can also be very effective (Aufsatz et al., 2002). In this case, the siRNAs originate from hairpin transgenes that typically contain non-coding promoter-proximal sequences, and cause DNA methylation and chromatin modification. The former is now known as RNA-directed DNA methylation (RdDM) (Matzke et al., 2006). There have been few systematic tests of TGS as a general gene-silencing tool, and it is not yet clear what overall success rate can be expected. Because of the distinct possibility that the siRNAs require nascent transcripts for homologous base pairing rather than genomic DNA as template, it is possible that only extragenic regions that happen to be occasionally transcribed will be susceptible to TGS-associated DNA and chromatin modifications.
Virus-induced gene silencing (VIGS)
An alternative to the approaches discussed so far is virus-induced gene silencing (VIGS), which exploits the plant’s ability to target viral RNAs. Perfectly complementary dsRNA molecules that are generated during replication of viral RNA genomes, or foldbacks in single-stranded viral RNA, can serve as templates for DCL proteins to produce siRNAs, thereby attenuating or shutting off virus spread (Molnar et al., 2005). If viral genomes are engineered to include plant sequences, the resulting siRNAs can also effectively silence endogenous plant genes. Virus-derived siRNAs belong either to the approximately 21 nucleotide class, in which case they elicit PTGS by targeting of endogenous transcripts, or to the approximately 24–26 nucleotide class that triggers RdDM and TGS.
For practical purposes, modified cDNAs of viral genomes are placed behind plant promoters in T-DNA vectors (Figure 2b), and the modified genome is transferred into the plant using stable, or, more often, transient, Agrobacterium-mediated transformation. In the host, plant RNA polymerases convert the modified viral cDNA into viral RNA. Not surprisingly, many viruses encode so-called viral suppressor proteins, which counteract the silencing of viral genomes by DCL and Argonaute proteins (Voinnet, 2005b). For efficient silencing of engineered loci, it is thus beneficial to work with viruses that confer only weak silencing suppression (Lu et al., 2003; Watson et al., 2005).
Systemic effects of gene silencing
Virus-derived siRNAs can act at a distance to cause silencing throughout a plant, even if the virus was initially only inoculated locally (Jones et al., 1999; Voinnet and Baulcombe, 1997). A current model (Himber et al., 2003) suggests movement of primary siRNAs generated from the viral genome by DCL4 across approximately 10–15 cells in leaves (Dunoyer et al., 2005). In other tissues, including embryos, siRNAs might be able to move further, but, on the other hand, there might also be symplastic boundaries that limit movement (Kobayashi and Zambryski, 2007). In secondary cells, these primary siRNAs not only cause degradation of complementary target RNAs, but also serve as primers for RNA-dependent RNA polymerases (RDRs), which generate more dsRNA from the initial target. These are new substrates for DCL proteins, which produce secondary siRNAs. These can again move across several cell layers. This process is known as transitivity, and greatly amplifies the action of the silencing trigger (Voinnet, 2005a). RDR-generated secondary siRNAs may contain sequences that were not present in the primary siRNA pool, because the target transcript does not have to be complementary to the initial siRNA pool in its entirety (Himber et al., 2003). Although hpRNAi transcripts are probably also processed by DCL4, the same DCL that is responsible for silencing of viral RNAs (Deleris et al., 2006; Dunoyer et al., 2005), transitive spread of hpRNAi-associated siRNAs appears to be less frequent. Both tissue-specific and transient silencing of an endogene using hpRNAi have been reported (Byzova et al., 2004; Davuluri et al., 2005).
Transitive formation of secondary siRNAs triggered by endogenous miRNAs has been observed in a limited number of cases. Most importantly, miRNAs miR173, miR390 and miR828, which target non-coding transcripts TAS1–TAS4, trigger RDR6-mediated dsRNA formation, which is followed by DCL4-dependent phased processing into approximately 21 nucleotide siRNAs, starting from the initial miRNA-guided cleavage of the TAS transcript (Allen et al., 2005; Rajagopalan et al., 2006). Several of the phased trans-acting siRNAs (ta-siRNAs) are stable and trigger AGO-mediated destruction of protein-coding transcripts. Remarkably, this hierarchy can extend, with siRNAs derived from the protein-coding transcripts in turn acting as siRNAs on another layer of protein-coding transcripts, as observed for some PPR transcripts (Chen et al., 2007; Howell et al., 2007). Transitive siRNAs originating from protein-coding miRNA target transcripts have also been observed (Ronemus et al., 2006), but seem to be generally of low abundance. It remains to be demonstrated whether these miRNA-dependent secondary siRNAs act cell-autonomously, or whether they can move across a limited number of cells.
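Phased processing means that the positions of the resulting siRNAs follow directly from the miRNA-guided cleavage site. The sketch below enumerates consecutive 21-nucleotide registers on one strand; it is an illustration under assumptions, with a hypothetical function name, and it ignores the two-nucleotide offset of the opposite strand as well as any drift in phase.

```python
def phased_windows(transcript_length: int, cleavage_index: int,
                   phase: int = 21, direction: str = "3prime") -> list:
    """List (start, end) half-open intervals of phased siRNA registers.

    cleavage_index is the 0-based index of the first nucleotide of the
    3' cleavage product. With direction="3prime" the registers run downstream
    of the cleavage site; with direction="5prime" they run upstream of it.
    """
    if not 0 <= cleavage_index <= transcript_length:
        raise ValueError("cleavage index lies outside the transcript")
    windows = []
    if direction == "3prime":
        start = cleavage_index
        while start + phase <= transcript_length:
            windows.append((start, start + phase))
            start += phase
    elif direction == "5prime":
        end = cleavage_index
        while end - phase >= 0:
            windows.append((end - phase, end))
            end -= phase
    else:
        raise ValueError("direction must be '3prime' or '5prime'")
    return windows


if __name__ == "__main__":
    # Hypothetical 100-nucleotide transcript cleaved before index 30.
    print(phased_windows(100, 30))
    print(phased_windows(100, 30, direction="5prime"))
```

Small RNA reads whose 5′ ends coincide with the start of these registers would be consistent with phased, cleavage-dependent biogenesis.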
Consistent with the long-distance effects of viral siRNAs being associated with movement through the phloem, viral siRNAs have been detected in the phloem sap of pumpkins (Yoo et al., 2004). In the same experiment, some endogenous miRNAs were detected in phloem sap, but it remains unclear whether this is due to short-distance export from companion cells or associated with long-distance effects.
Specificity of siRNA-mediated gene silencing
In both plants and animals, most initial efforts to improve sRNA-mediated gene silencing focused on maximal effectiveness. However, questions of specificity have received increasing attention. One of the first reports came in 2003, when it was shown in a mammalian cell culture system that transfection with siRNAs not only affected transcripts with perfect complementarity, but also many others with varying degrees of partial complementarity (Jackson et al., 2003). When this phenomenon was investigated in more detail, again in mammalian cells, a strong correlation between unintended targets, generally called ‘off-targets’, and short stretches of complementarity to the siRNA in the 3′ UTRs of the affected transcripts was found (Birmingham et al., 2006). The types of matches were reminiscent of functional pairing between animal miRNAs and their targets, which mostly relies on hexa- or heptamer matches to the seed region (positions 2–8) of the miRNA (Brennecke et al., 2005). Consistent with these observations, in several genome-wide screens with siRNA libraries, the most effective siRNAs were ones that effectively knocked down known components of the genetic pathways analyzed, even though there was only limited complementarity between these siRNAs and the downregulated genes (Lin et al., 2005; Ma et al., 2006). These pathway components would normally be classified as off-targets of the successful siRNAs.
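The kind of seed-based off-target match described above for animal systems can be located with a simple search. The following sketch scans a 3′ UTR for sites complementary to positions 2–8 (heptamer) or 2–7 (hexamer) of an siRNA; the function names and example sequences are hypothetical, and exact Watson-Crick complementarity is assumed.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def normalize(seq: str) -> str:
    return seq.upper().replace("T", "U")


def reverse_complement(rna: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(normalize(rna)))


def seed_match_sites(sirna: str, utr: str, seed_start: int = 2, seed_end: int = 8) -> list:
    """Return 0-based start indices in the UTR of sites complementary to the seed.

    seed_start and seed_end are 1-based, inclusive positions in the siRNA;
    the defaults give the heptamer at positions 2-8.
    """
    seed = normalize(sirna)[seed_start - 1:seed_end]
    motif = reverse_complement(seed)
    u = normalize(utr)
    return [i for i in range(len(u) - len(motif) + 1) if u[i:i + len(motif)] == motif]


if __name__ == "__main__":
    sirna = "UUAAUGCGCACGGCGGCGCGG"            # made-up siRNA guide strand
    utr = "AAACGCAUUAGGGGCGCAUUAC"             # made-up 3' UTR with two seed sites
    print(seed_match_sites(sirna, utr))                 # heptamer matches
    print(seed_match_sites(sirna, utr, seed_end=7))     # hexamer matches
```

Such a search only flags candidate sites; whether a given transcript is actually downregulated has to be determined experimentally.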
To validate siRNA-mediated effects on target genes, two or three independent siRNAs or hairpins generating siRNAs complementary to various regions of a target gene are commonly used. Another approach is to complement siRNA effects with transgenes that carry silent mutations in the target so that they are no longer susceptible to siRNA-triggered silencing (Lens et al., 2003). This is readily possible in mammalian cell cultures, as only individual siRNA duplexes are applied, and RNA-dependent RNA polymerases, which could mediate the formation of secondary siRNAs, are not found in mammals.
Potential off-target effects of hpRNAi in plants could arise from two linked features of the hairpin transcripts. First, the sites at which DCLs process the dsRNA are not known, and a large number of siRNAs with diverse sequences arise. These might target not only the intended transcript, but also others that accidentally share perfect or near-perfect complementarity to any of the siRNAs. Second, the minimal sequence determinants for effectiveness of siRNAs are not known, so mRNAs with one, two or even more mismatches to the siRNAs could be affected. Xu et al. (2006) have computationally identified pairs of protein-coding transcripts from Arabidopsis that share contiguous sequence identity over at least 21 nucleotides, which could lead to unintended silencing by the hairpin trigger. The majority of transcripts had at least one partner that fulfilled this criterion. This appears to be significant at least in some cases, as shown with transgenic plants carrying hpRNAi transgenes (Xu et al., 2006). These findings suggest that sequences used as hpRNAi triggers should be carefully selected. While long hairpins are more likely to generate a diverse set of optimally effective siRNAs, they also have an increased potential to produce siRNAs with off-target effects. Unfortunately, the specificity of plant siRNA action has not been studied systematically at the molecular level, although a few papers suggest that moderately closely related homologs are usually not targeted (Chuang and Meyerowitz, 2000; Li et al., 2004; http://www.pi.csiro.au/rnai/benefits.htm).
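The criterion of contiguous identity over at least 21 nucleotides can be applied to a candidate hairpin trigger before cloning. The sketch below is not the procedure used by Xu et al. (2006); it is a minimal illustration with hypothetical function names that reports every transcript sharing at least one identical 21-mer with the trigger, considering the sense strand only and ignoring near-perfect matches.

```python
def normalize(seq: str) -> str:
    return seq.upper().replace("T", "U")


def kmers(seq: str, k: int = 21) -> set:
    """Return the set of all contiguous k-nucleotide substrings of seq."""
    s = normalize(seq)
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def potential_off_targets(trigger: str, transcripts: dict, k: int = 21) -> dict:
    """Map transcript name -> number of distinct k-mers shared with the trigger.

    Transcripts without any shared k-mer are omitted from the result.
    """
    trigger_kmers = kmers(trigger, k)
    hits = {}
    for name, seq in transcripts.items():
        shared = trigger_kmers & kmers(seq, k)
        if shared:
            hits[name] = len(shared)
    return hits


if __name__ == "__main__":
    # Made-up sequences: geneA shares 22 contiguous nucleotides with the
    # trigger, geneB shares only 20.
    trigger = "ACGTTGCAAGCTTAGGCTAACGTTAGCCTA"
    transcripts = {
        "geneA": "TTTT" + trigger[3:25] + "TTTT",
        "geneB": "CCCC" + trigger[5:25] + "CCCC",
    }
    print(potential_off_targets(trigger, transcripts))
```

In practice the intended target would be excluded from the transcript set, and any remaining hit would argue for choosing a different region of the gene as trigger.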
Artificial microRNAs (amiRNAs) in plants
Principal features of amiRNAs
The artificial microRNA (amiRNA) technology exploits endogenous miRNA precursors to generate sRNAs that direct gene silencing in either plants or animals (Alvarez et al., 2006; Niu et al., 2006; Parizotto et al., 2004; Schwab et al., 2006; Zeng et al., 2002; Figure 1). miRNA precursors preferentially produce one sRNA duplex, the miRNA–miRNA* duplex. When both sequences are altered without changing structural features such as mismatches or bulges, this often leads to high-level accumulation of an miRNA of desired sequence. amiRNAs were first generated and used in human cell lines (Zeng et al., 2002), and later in Arabidopsis (Parizotto et al., 2004), where they were shown to effectively interfere with reporter gene expression. Subsequently, it was demonstrated that not only reporter genes but also endogenous genes can be targeted with amiRNAs (also called synthetic miRNAs), and that these seem to work with similar efficiency in other plant species (Alvarez et al., 2006; Schwab et al., 2006). amiRNAs are effective when expressed from either constitutive or tissue-specific promoters. The experiments with tissue-specific constructs indicated that there are few, if any, non-autonomous effects (Alvarez et al., 2006; Schwab et al., 2006). In addition, genome-wide expression analyses have shown that the specificity of plant amiRNAs is as high as that of endogenous miRNAs (Schwab et al., 2005, 2006), such that their sequences can be easily optimized to silence one or several target transcripts without affecting the expression of other transcripts.
Conceptually, plant amiRNAs are related to the short hairpin RNAs (shRNAs) that have been developed for animal systems (Silva et al., 2005). The main difference is that shRNAs, which are generated from animal miRNA precursors, are generally perfectly complementary to their intended targets, just like siRNAs. The systematic and large-scale synthesis of shRNA libraries has allowed functional screens in which most genes in a genome are silenced by individual hairpins (Chang et al., 2006; Silva et al., 2005).
Efficiency of small RNA-mediated gene silencing
Some gene silencing transgenes work very effectively, while others do not. Three possibilities could account for unsuccessful silencing: (i) insufficient production of siRNAs with favorable intrinsic properties, (ii) inaccessibility of the complementary site in the target mRNA, or (iii) difficulties in reducing steady-state target RNA levels because of negative feedback regulation.
Comprehensive studies of intrinsic siRNA properties conferring highly efficient gene silencing in mammals have identified 5′ instability as a main criterion for small RNA effectiveness (Khvorova et al., 2003; Schwarz et al., 2006), as only these siRNAs are efficiently incorporated into RISCs. Additional determinants include an A at position 10, consistent with endonucleases preferentially cleaving 3′ to a U (the complementary base in the target; Donis-Keller, 1979). Both features are also over-represented in endogenous plant miRNAs, which mechanistically function very similarly to siRNAs. amiRNAs can be easily optimized for the above-described favorable parameters, and should therefore function effectively on an optimal target mRNA. However, as only little is known about processing of plant miRNA precursors, it is possible that some of the products will not be precisely processed as intended, and will therefore not have the predicted sequence (Schwab et al., 2006).
In mammalian cell cultures, there appears to be a high correlation between siRNA effectiveness and accessibility of the binding site (Ameres et al., 2007; Overhoff et al., 2005; Schubert et al., 2005). In particular, pairing to the siRNA 5′ portion depends on single-stranded features in the target RNA, ensuring efficient RNA–RNA hybrid formation and thus maximal cleavage activity of the RISC (Ameres et al., 2007). Surrounding sequences also affect the effectiveness of miRNA target sites in animals (Long et al., 2007), but plant sRNA target sites have not yet been investigated for these features.
Least understood is the observation that transcripts seem to differ in their intrinsic susceptibility to sRNA-mediated gene silencing. While application of PTGS can sometimes produce homogenous populations of greatly affected plants, only minor and variable effects on target transcript accumulation are seen in other cases (Alvarez et al., 2006; Chuang and Meyerowitz, 2000; Kerschen et al., 2004; Schwab et al., 2006). Part of this phenomenon might be explained by target site accessibility, but another explanation could be negative feedback regulation, where reduced transcript levels are compensated for by increased transcription rates. Other more complex scenarios such as alterations in the expression levels of target transcripts probably play a role as well.
Due to the novelty of the amiRNA technology, it has not yet been systematically compared with hpRNAi, although Qu et al. (2007) have reported one case in which amiRNAs were more effective. The limited number of published studies suggests an overall success rate of possibly up to 90% (Alvarez et al., 2006; Choi et al., 2007; Mathieu et al., 2007; Niu et al., 2006; Qu et al., 2007; Schwab et al., 2006), while anecdotal evidence from unpublished studies in our and other laboratories indicates a rate of close to 75% in Arabidopsis, when targeting either single or multiple genes. In addition, in at least two cases, phenotypic effects were seen despite minimal effects on target RNA level, suggesting translational inhibition.
Design of amiRNAs for Arabidopsis and other plants using the WMD platform
We have developed the WMD (Web MicroRNA Designer) platform, which automates amiRNA design, and only requires selection of favorite candidates according to a small set of criteria, which are described below. This tool was initially implemented for Arabidopsis thaliana (Schwab et al., 2006), but has now been extended to >30 additional species for which genome or extensive EST information is available (Table 1; http://wmd2.weigelworld.org). It is designed to optimize both intrinsic small RNA properties and specificity within the given transcriptome.
For fully annotated genomes, such as Arabidopsis thaliana, rice and poplar, it is sufficient to enter gene identifiers for the respective genome release (e.g. At1g23450 or Os01g24680) in the ‘Design’ tool of WMD. amiRNA candidate sequences will be determined from the first annotated splice form. If simultaneous silencing of several related genes is desired, it is necessary to additionally indicate the minimal number of target genes (at least two) to be silenced with one amiRNA. This allows the selection of various subgroups for silencing, if no optimal amiRNA for simultaneous silencing of all targets can be found. If annotated, distinct splice forms can be directly specified as well (e.g. At2g23450.1 or At2g23450.2), but silencing of individual splice forms requires entry of unique sequence regions in FASTA format, headed by the identifier of the splice form from which the sequence is derived. Other targets such as antisense transcripts, sequence variants from accessions other than the reference strain (Columbia in A. thaliana, Nipponbare in rice), or foreign sequences (e.g. GUS or GFP), have to be entered in FASTA format, and it is necessary to mark the corresponding checkbox for ‘not annotated transcripts’ and choose unique headers that distinguish them from annotated transcripts. Sequence variants additionally require that the reference gene be allowed as an acceptable ‘off-target’, unless allele-specific silencing is desired.
As full genome sequences and annotations are not always available, WMD can also exploit information from EST collections. All 34 EST libraries currently found in the Gene Index Project (Quackenbush et al., 2001; http://compbio.dfci.harvard.edu/tgi/plant.html), including those from maize, tobacco and soybean, have been integrated into WMD for amiRNA predictions. Desired targets from species with only partial sequence information require a FASTA sequence input. As a locus may be covered by various ESTs, it is important to determine the names of all ESTs for the desired target, and to specify all as acceptable off-targets (following naming conventions of the respective database). Otherwise, amiRNA candidates could be rejected, because ESTs originating from the same locus are still treated as independent targets. Any of the redundant ESTs can be used as the main target. It is also possible to combine redundant ESTs under a new name and use this sequence as input for WMD. It should be noted that incomplete genome and annotation information might lead to off-targets being missed.
Principles of amiRNA design using WMD
The ‘Design’ tool of WMD implements two main steps: (i) optimization of small RNAs for maximal effectiveness, and (ii) selection of those with highest specificity for the intended target gene(s) (Figure 3). For the second step, whole-genome information is taken into account, and is thus dependent on the species of interest.
Optimization of amiRNA sequences mostly follows criteria that have been developed for mammalian siRNAs, but also apply to many endogenous plant miRNAs (Figure 4). Initially, candidate 21-mer sequences are picked from the whole length of reverse complements of target transcripts (or partial transcripts where specified), such that they share an A at position 10 (A or U for multiple-target amiRNAs), and display 5′ instability (higher AU content at the 5′ end and higher GC content at the 3′ end around position 19). At position 1, a U is introduced in all cases, even when other nucleotides would normally be found at this position.
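The candidate-picking step described above can be sketched in a few lines of Python. This is a minimal illustration and not the WMD implementation: the function names are hypothetical, and the test for 5′ instability (more A/U among positions 1–5 than among positions 17–21) is an assumed stand-in for the criterion actually used by WMD, which is not specified here in numerical terms.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def au_count(seq):
    """Number of A or U bases in seq."""
    return sum(base in "AU" for base in seq)


def has_5prime_instability(amirna):
    # ASSUMPTION: a simple proxy, more A/U in positions 1-5 than in 17-21.
    return au_count(amirna[:5]) > au_count(amirna[16:21])


def pick_candidates(transcript, multi_target=False):
    """List 21-mer amiRNA candidates from the reverse complement of a transcript."""
    rna = transcript.upper().replace("T", "U")
    antisense = reverse_complement(rna)
    # Position 10 must be A (A or U when several targets are intended).
    allowed_pos10 = "AU" if multi_target else "A"
    picked = []
    for start in range(len(antisense) - 20):
        window = antisense[start:start + 21]
        if window[9] not in allowed_pos10:
            continue
        # A U is introduced at position 1 in all cases.
        candidate = "U" + window[1:]
        if has_5prime_instability(candidate):
            picked.append(candidate)
    return picked
```

For a transcript of length n, the loop inspects n − 20 windows, so every possible target site along the transcript is considered once.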
Next, all candidates undergo a series of in silico mutations at positions 13–15 and 17–21. Resulting mutated candidates should hybridize to the specified target gene with no more than two mismatches between positions 13 and 21, and 5′ instability is mandatory as well. As target sequences can differ slightly when designing amiRNAs for multiple genes, additional targets must follow more relaxed, empirically established criteria for miRNA targeting (Schwab et al., 2005), which are used to identify targets in genome-wide searches. These allow for maximally one mismatch from amiRNA positions 2 to 12, none at the cleavage site (positions 10 and 11), and up to four mismatches between positions 13 and 21, with no more than two consecutive mismatches. In addition, acceptable amiRNA–target duplexes must have at least 70% of the free hybridization energy calculated for a perfectly complementary amiRNA, with at least −30 kcal mol−1, as determined by RNAcofold (Bernhart et al., 2006) and mfold (Zuker, 2003).
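The relaxed criteria for additional targets lend themselves to a compact check, sketched below in Python. The function names are hypothetical, and several points are assumptions rather than documented WMD behaviour: only Watson-Crick pairs are counted as matches (G:U pairs are treated as mismatches), position 1 is ignored, and the limit of two consecutive mismatches is applied to positions 13–21. Free energies are passed in as numbers, as if computed beforehand with a tool such as RNAcofold; the sketch does not call any folding program.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_positions(amirna, target_site):
    """1-based amiRNA positions that do not pair with a 21-nt target site.

    Both sequences are written 5' to 3', so the site is read in reverse.
    """
    site_3_to_5 = target_site[::-1]
    return [i + 1
            for i, pair in enumerate(zip(amirna, site_3_to_5))
            if pair not in WATSON_CRICK]


def longest_run(positions):
    """Length of the longest stretch of consecutive integers in a sorted list."""
    best = run = 0
    previous = None
    for p in positions:
        run = run + 1 if previous is not None and p == previous + 1 else 1
        best = max(best, run)
        previous = p
    return best


def is_acceptable_target(amirna, target_site, duplex_dg, perfect_dg):
    """Apply the mismatch and energy rules; energies in kcal/mol (negative)."""
    mm = mismatch_positions(amirna, target_site)
    seed = [p for p in mm if 2 <= p <= 12]
    tail = [p for p in mm if 13 <= p <= 21]
    if len(seed) > 1 or 10 in mm or 11 in mm:
        return False  # at most one mismatch in 2-12, none at the cleavage site
    if len(tail) > 4 or longest_run(tail) > 2:
        return False  # up to four in 13-21, no more than two in a row
    # At least 70% of the perfect-match energy and at least -30 kcal/mol.
    return duplex_dg <= -30.0 and duplex_dg / perfect_dg >= 0.70
```

Because both energies are negative, "at least 70%" translates into a ratio of 0.70 or more, while "at least −30 kcal mol−1" means a value of −30 or lower.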
Selection of candidates
After the reiterative mutation process, all candidates are ranked, taking into account total number as well as positions of mismatches. One or two mismatches between positions 17 and 21 are preferred, which should reduce potential transitive effects due to priming and extension by RNA-dependent RNA polymerases. Additional criteria are absolute and relative hybridization energy, with 80–95% of perfect-match free energy and an absolute value of −35 to −38 kcal mol−1 being preferred; number of non-intended targets with five or fewer mismatches, preferably none; and maximal difference between intended and all non-intended targets with respect to free energy. Additional penalties are imposed for potential off-targets that have maximally three mismatches, of which at least two are between positions 2 and 12. These are normally not considered miRNA targets, but represent the most likely off-targets as they should easily pair with miRNAs. Finally, we try to avoid amiRNAs that are perfectly complementary to their intended targets, because of the afore-mentioned transitivity concerns.
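As an illustration of how these preferences might be evaluated for a single candidate, the following Python sketch reports each criterion as a separate flag. It deliberately does not compute a combined score, because the penalty weights used by WMD are not given here; all names are hypothetical, and mismatches are supplied as lists of 1-based amiRNA positions determined beforehand.

```python
def is_risky_off_target(mismatches):
    """Off-target with at most three mismatches, at least two of them in 2-12."""
    in_seed = sum(2 <= p <= 12 for p in mismatches)
    return len(mismatches) <= 3 and in_seed >= 2


def preference_flags(target_mismatches, duplex_dg, perfect_dg,
                     off_target_mismatches):
    """Summarize the ranking preferences for one amiRNA candidate.

    target_mismatches: mismatch positions against the intended target.
    duplex_dg, perfect_dg: hybridization energies in kcal/mol (negative).
    off_target_mismatches: one list of mismatch positions per non-intended hit.
    """
    tail = [p for p in target_mismatches if 17 <= p <= 21]
    ratio = duplex_dg / perfect_dg
    return {
        # One or two mismatches between positions 17 and 21 are preferred.
        "tail_mismatches_preferred": len(tail) in (1, 2),
        # Perfect complementarity is avoided because of transitivity concerns.
        "not_perfect_match": len(target_mismatches) > 0,
        # 80-95% of the perfect-match free energy.
        "relative_energy_preferred": 0.80 <= ratio <= 0.95,
        # Absolute value between -35 and -38 kcal/mol.
        "absolute_energy_preferred": -38.0 <= duplex_dg <= -35.0,
        # Non-intended hits with five or fewer mismatches, preferably none.
        "close_off_targets": sum(len(m) <= 5 for m in off_target_mismatches),
        # Hits that attract an additional penalty.
        "risky_off_targets": sum(is_risky_off_target(m)
                                 for m in off_target_mismatches),
    }
```

A candidate with mismatches at positions 18 and 20 and a duplex energy of −36 kcal mol−1 against a perfect-match value of −42 kcal mol−1 would satisfy all four preference flags in this sketch.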
As the output of WMD, all potential amiRNA sequences are listed by rank, with the best candidates at the top, underlaid in green color, followed by intermediate ones in yellow and orange, and the poorest, with most penalty points, at the bottom in red. It is important to understand that the red category does not necessarily imply non-functionality of the amiRNA candidates, but rather increased potential for off-targets. Off-targets are not automatically shown for each amiRNA candidate, but can be easily identified as amiRNA sequences are hyperlinked to the ‘Target Search’ tool in WMD.
It is recommended that selection of amiRNAs proceeds from the top to the bottom of this list, taking into consideration additional criteria. Among these might be acceptable off-targets that receive high penalty scores but may not interfere with the experimental design, or the position of the amiRNA target site within the target transcript. As the effects of local sequence context are currently not well understood for plant miRNAs, it might be beneficial to choose at least two amiRNAs directed against different regions of the target transcript. Although a large fraction of endogenous miRNA target sites are found towards the 3′ end of the coding sequence, there does not seem to be a bias for the position of effective amiRNA target sites. When attempting simultaneous silencing of multiple genes, it might be useful to select amiRNAs with similar mismatch patterns and hybridization energies to all target transcripts. We also try to avoid mismatches to positions 2–12 of the amiRNAs. Finally, anecdotal evidence suggests that extreme GC content, over 60%, should preferably be avoided as well. For siRNAs, GC content of 30–50% has been suggested (Reynolds et al., 2004). To facilitate amiRNA selection, amiRNA sequences are hyperlinked to alignments of targets and potential off-targets, with a graphical representation of where the amiRNA target site is found in each transcript.
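The GC content guideline is easy to check by hand or with a short helper such as the Python sketch below. The function names are hypothetical; the 60% limit and the 30–50% range are the values quoted above, and the latter was suggested for siRNAs rather than for amiRNAs.

```python
def gc_fraction(amirna):
    """Fraction of G and C bases in a small RNA sequence."""
    seq = amirna.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def exceeds_gc_limit(amirna, limit=0.60):
    """True if GC content is above the anecdotal 60% limit."""
    return gc_fraction(amirna) > limit


def in_sirna_gc_range(amirna, low=0.30, high=0.50):
    """True if GC content lies in the 30-50% range suggested for siRNAs."""
    return low <= gc_fraction(amirna) <= high
```

A 21-mer with 14 G or C bases, for example, has a GC content of about 67% and would be flagged by `exceeds_gc_limit`.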
After selection of an amiRNA, the 21-mer sequence must be engineered into a miRNA precursor using overlap PCR to replace the endogenous miRNA sequence. WMD includes the ‘Oligo’ tool, which allows automatic generation of oligonucleotide primers that can be used in combination with the MIR319a precursor from A. thaliana. Several other precursors from A. thaliana have been used successfully as well, including MIR164b (Alvarez et al., 2006), MIR159a (Niu et al., 2006), MIR171 (Parizotto et al., 2004) and MIR172a (Schwab et al., 2006). Both miRNA and the partially complementary region on the other arm of the foldback, the miRNA*, are substituted by amiRNA and amiRNA*, respectively. The sequence of the amiRNA* is specified in such a way that mismatch positions to the amiRNA are retained, because structural features are considered to be important for guiding correct DCL1-mediated processing (Figure 2c).
For historical reasons, a 20 bp sequence in MIR319a is replaced by a 21 bp sequence, because it was thought initially that miR319a was only 20 bases long (Palatnik et al., 2003). More recent analyses have, however, revealed that a 21-base form of miR319a predominates, as is typical for plant miRNAs (Fahlgren et al., 2007; Rajagopalan et al., 2006). RNA blot analysis has indicated that the mature miRNAs created from the modified precursors, despite having an extra base, are mostly 21 nucleotides long (Schwab et al., 2006).
To engineer the amiRNA, three fragments containing (i) the 5′ region up to the amiRNA*, (ii) the loop region ranging from amiRNA* to amiRNA, and (iii) the 3′ region starting with the amiRNA are amplified separately from a pBluescript template plasmid that contains the MIR319a precursor (pRS300, available on request from the authors; see also the WMD website and the detailed protocol available in Appendix S1). The three PCR fragments will overlap for 25 bp in the amiRNA and amiRNA* regions, and the final product is generated in a single PCR reaction (see Figure 2c). Sequence-verified amiRNA foldbacks can be transferred into binary plasmids of choice, with different promoters or terminators, using the various restriction enzyme sites present in the pRS300 plasmid. Recently, pRS300 derivatives have been successfully adopted for Gateway®-assisted cloning (W. Busch, S.U. Anderson and J.U. Lohmann, Max Planck Institute for Developmental Biology, Tübingen, Germany, personal communication). An alternative strategy, which becomes very useful when cloning several amiRNAs, uses uracil-excision-based cloning to directly regenerate the amiRNA-containing MIRNA precursor from the individual PCR products (Figure 2d, and see detailed protocol in Appendix S1) (Geu-Flores et al., 2007; Nour-Eldin et al., 2006). Oligonucleotides in which a single uracil replaces a template thymidine 8–9 bases from the 5′ end are used to amplify the three foldback pieces using the proofreading polymerase PfuCx. These are subsequently treated with the USER™ enzyme mix (New England Biolabs; http://www.neb.com), which removes uracils and leaves 8–9-base single-stranded overhangs. Overlapping fragments can be directly placed into a corresponding USER-compatible vector without adding DNA ligase. The advantage of this strategy, which has been successfully implemented for amiRNA vectors (B.G. Hansen, H.H. Nour-Eldin, I.E. Sønderby and B.A. Halkier, University of Copenhagen, Denmark, personal communication), is that the overlapping regions of the individual PCR products can be much shorter and designed such that the loop fragment remains the same for various amiRNA constructs.
Testing the effectiveness of amiRNAs
amiRNA-mediated gene silencing occurs in a quantitative fashion, with stronger promoters often causing higher degrees of gene silencing (Alvarez et al., 2006; Schwab et al., 2006). Expression changes of target genes can be easily monitored by RT-PCR, preferably using oligonucleotide primers spanning the amiRNA-guided cleavage site, as miRNA effects are mostly evident at the transcript level. In addition, the amiRNA-guided cleavage site may be mapped by RACE-PCR (Llave et al., 2002). A great advantage of amiRNAs compared with hpRNAi is the possibility of phenotypic complementation using a target transgene that carries silent mutations in the amiRNA-complementary site, as has been performed for targets of endogenous miRNAs (Palatnik et al., 2003). Additional suggestions for characterization of amiRNA-expressing plants are discussed in the ‘Help’ section of WMD (http://weigelworld.org).
Arabidopsis miRNA precursors have been modified successfully to silence endogenous and exogenous targets in Arabidopsis, tomato and tobacco (Alvarez et al., 2006; Niu et al., 2006; Parizotto et al., 2004; Qu et al., 2007; Schwab et al., 2006). However, Arabidopsis precursors have not been systematically tested for functionality in other plants, and it might therefore be preferable to use autologous miRNA precursors as backbones, preferably ones that are known from experimental studies to be efficiently processed into miRNAs. Proof-of-principle studies for related techniques such as hpRNAi or VIGS have used target genes such as those encoding phytoene desaturase (PDS), inactivation of which confers obvious photobleaching (Liu et al., 2002; Ruiz et al., 1998), and this might be desirable for testing amiRNAs in other species as well. PDS has been successfully used as a target to show amiRNA functionality in rice (N. Warthmann, H. Chen and P.M. Hervé, Max Planck Institute for Developmental Biology, Tübingen, Germany, and IRRI, Los Baños, Laguna, Philippines, personal communication).
Unique applications of amiRNAs
Gene silencing in non-reference strains or non-model systems
Available sequence-indexed knockout collections combined with TILLING lines have endowed reverse genetic approaches with unparalleled power. Unfortunately, these collections are generally restricted to one or few reference strains in model organisms. As with other forms of hpRNAi (e.g. Steppuhn et al., 2004), amiRNAs are well suited for silencing genes in non-standard strains of model species (Bomblies et al., 2007), or in non-model organisms. As described above, several species for which only EST collections are available have been accommodated in the WMD platform. For those interested in natural variation, knocking out the same gene(s) in various wild strains offers an excellent tool to directly compare the activity of different alleles. Akin to the concept of quantitative complementation (Mackay, 2004), this approach might be termed quantitative knockout, and should provide a powerful approach for demonstrating differential allelic effects of genes pinpointed in quantitative trait loci (QTL) cloning studies, for example.
amiRNAs might also have potential advantages for crop plants, as a single species of sRNA is preferentially generated, the actions of which are much more predictable than those of the collection of sRNAs with diverse sequences produced by hpRNAi constructs. This property may also help to alleviate regulatory concerns.
Duplicated genes and gene families
A major limitation in both forward and reverse genetic screens is that loss-of-function alleles that eliminate tandem arrays of closely related genes are very difficult to obtain. In addition, the generation of higher-order mutant combinations, comprising loss-of-function alleles of related genes, is tedious. This is a significant problem, as, in Arabidopsis thaliana for example, about a quarter of all genes are found in tandem arrays or large-scale segmental duplications (Arabidopsis Genome Initiative, 2000). In addition, a large number of plant species, especially many crop species, are polyploid, and thus have two or more very similar copies of each gene in their genome. As discussed above, amiRNAs targeting several sequence-related genes are readily designed. An advantage over conventional hpRNAi is that amiRNA design rules explicitly allow for variable mismatches. Thus, amiRNA complementary sites in different targets do not have to be identical. The successful targeting of multiple genes has been demonstrated (Alvarez et al., 2006; Choi et al., 2007; Schwab et al., 2006). A variation on this theme is the use of polycistronic amiRNA precursors. Even though plant miRNA precursors generally feature only a single foldback, multiple foldbacks are not uncommon in animals (Lee et al., 2004), and amiRNA precursors with two foldbacks have been successfully employed (Niu et al., 2006). This approach, the generation of sRNAs against non-sequence-identical targets using polycistronic precursors, is open to other silencing methods, and has been implemented for VIGS and hpRNAi by insertion of multiple sequences into one vector (Allen et al., 2004; Watson et al., 2005).
Allele- and splice form-specific silencing
Approximately 10% of all genes in Arabidopsis have more than one splice form (Iida et al., 2004). The amiRNA approach should allow silencing of individual isoforms, as amiRNAs generally do not cause transitivity, i.e. generation of secondary sRNAs that might then target portions of the transcript that are shared by multiple splice forms. The feasibility of this approach, however, remains to be tested.
A related application is the targeting of different alleles created either by mutagenesis or by nature. While this has not yet been accomplished for plant amiRNAs, a related experiment has been performed using siRNAs in human cells. It was shown that a single SNP at the cleavage site strongly influenced silencing of a transcript, and could therefore be used to assay the individual activity of two alleles (Schwarz et al., 2006).
Transient and tissue-specific silencing
Transgenes offer the unique possibility of performing tissue-specific and transient gene silencing. sRNAs tend to silence constitutively expressed reporter genes, such as GFP, not only in precursor-expressing cells, but also transitively throughout the plant in a process requiring the RNA-dependent RNA polymerase RDR6 (Parizotto et al., 2004). Endogenous targets of long hpRNAi constructs have been successfully silenced in a tissue-specific fashion (Byzova et al., 2004), as have been targets of amiRNAs (Alvarez et al., 2006; Schwab et al., 2006). A study by Mathieu et al. (2007) used amiRNAs specifically expressed in the shoot apex to demonstrate that the FT protein, rather than its mRNA, functions as a mobile signal inducing reproductive transition at the shoot apex. In addition, silencing of the floral identity gene LFY was efficient when amiRNAs were expressed from the endogenous LFY promoter (Schwab et al., 2006).
Transient gene silencing is very useful when constitutive loss of gene function causes sporophytic or gametophytic lethality. Visible changes in amiRNA expression are readily detectable within 3 days with an ethanol-inducible system, and persist for several days after removal of the silencing trigger (but not on tissue that was initiated after the inductive pulse) (Schwab et al., 2006).
It is not only plant genes but also foreign sequences that can be targeted by RNAi. Because RNAi is used by plants and animals as defense against RNA viruses, it was natural to exploit hpRNAi to engineer virus resistance (Prins, 2003). This strategy has been transferred to amiRNAs. In two published studies, virus resistance was achieved through expression of amiRNAs against viral suppressor proteins (Niu et al., 2006; Qu et al., 2007). There is some evidence that amiRNA-mediated virus resistance is less easily compromised by low temperature compared with hpRNAi-mediated resistance (Niu et al., 2006), but this does not appear to apply universally (Qu et al., 2007).
To adopt amiRNA-mediated gene silencing for high-throughput genetic screens, large-scale production of hairpin precursors has been initiated, similar to shRNA libraries in animals (Chang et al., 2006; Silva et al., 2005) and hairpin libraries in plants (http://www.agrikola.org, Hilson et al., 2004). The goal of the amiRNA effort (http://2010.cshl.edu/), which is supported by the National Science Foundation Arabidopsis 2010 program, is to provide an amiRNA resource that can be immediately exploited for constitutive expression, but can also be easily transferred into other vectors that provide tissue-specific or inducible expression. Several amiRNAs have been designed for each Arabidopsis protein-coding gene. In addition, amiRNAs for tandemly arrayed and segmentally duplicated genes have been included, such that the total number of designed amiRNAs is approximately 80 000. This effort might be extended to non-coding transcripts as well. When completed, it should provide an important new tool for A. thaliana functional genomics.
Although still a relatively new method, gene silencing using amiRNAs appears to be at least as effective and versatile as conventional hpRNAi, while at the same time promising greater specificity. Progress in understanding the biogenesis and action of endogenous miRNAs, a very active field of scientific inquiry, should allow further improvement and increased sophistication of the amiRNA approach.
We thank our colleagues at the Max Planck Institute, the International Rice Research Institute and Cold Spring Harbor Laboratory for comments and unpublished information, and also all those who provided feedback on the success of the amiRNA technology. Our work on small RNAs is supported by grants from DFG-SFB 446 and European Community FP6 IP SIROCCO (contract LSHG-CT-2006-037900) and FP6 IP AGRON-OMICS (contract LSHG-CT-2006-037704) to D.W., by an EMBO Long-Term Fellowship to R.S., and by the Max Planck Society, of which D.W. is a director.