• Open Access

The molecular machines that mediate microRNA maturation

Authors


Correspondence to: Ian J. MACRAE,
Department of Molecular Biology,
The Scripps Research Institute, La Jolla, CA 92037, USA.
Tel.: +(858) 784-2932
Fax: +(858) 784-7579
E-mail: macrae@scripps.edu

  • Abstract
  • Introduction
  • miRNAs originate from primary RNA transcripts
  • The Microprocessor initiates miRNA processing
  • Nuclear proteins regulate activity of the Microprocessor
  • Intronic pre-miRNAs can bypass the Microprocessor
  • The Exportin-5 Complex transports pre-miRNA to the cytoplasm
  • The RISC-Loading Complex completes miRNA maturation
  • Mature miRNAs silence target genes from within RISC
  • Discussion and future directions

Abstract

MicroRNAs (miRNAs) are small RNAs that regulate the translation of thousands of messenger RNAs and play a profound role in mammalian biology. Over the past 5 years, significant advances have been made towards understanding the pathways that generate miRNAs and the mechanisms by which miRNAs exert their regulatory functions. An emerging theme is that miRNAs are both generated by and utilized by large and complex macromolecular assemblies. Here, we review the biology of mammalian miRNAs with a focus on the macromolecular complexes that generate miRNAs and control their biogenesis.

Introduction

MicroRNAs (miRNAs) are single-stranded RNAs of about 22 nucleotides in length that play a critical role in regulating gene expression in multi-cellular organisms [1]. MiRNAs guide gene silencing by base pairing with target mRNAs, leading to translational repression and/or mRNA cleavage. MiRNAs direct diverse regulatory pathways, including developmental timing control, cell differentiation and proliferation, apoptosis and organ development [2]. Currently, over 600 miRNAs have been identified in human cells and it is estimated that thousands of genes in the human genome are regulated by miRNAs [3].

The human genome contains over 3 billion base pairs of DNA, about 15% of which is believed to be transcribed [4]. From within this vast sea of RNA, the miRNA biogenesis machinery must recognize and process the tiny RNA fragments destined to become miRNAs. Here, we review the current understanding of miRNA biogenesis with an emphasis on the molecular machines that mediate this essential cellular process. We also discuss recent findings that have begun to shed light on how miRNA recognition and maturation are controlled.

miRNAs originate from primary RNA transcripts

Most miRNAs arise from long RNA transcripts generated by RNA polymerase II [5, 6], although RNA polymerase III is also reported to generate a subset of miRNAs [7]. These primary transcripts, or pri-miRNAs, often contain multiple miRNA sequences in tandem and can be up to several kilobases long. Pri-miRNAs can be non-coding transcripts, bearing a 5′ 7-methyl guanylate (m7G) cap and a 3′ poly(A) tail, or be encoded within introns or untranslated regions of coding RNAs [8–12]. About half of the known miRNAs in human beings are derived from the introns of messenger RNAs (mRNAs) [13, 14]. Regardless of the origin of the pri-miRNA, recognition and subsequent processing of these molecules requires that the sequence be able to form a stable hairpin structure of at least 30 base-pairs. This extended hairpin structure serves as the signal for entry into the miRNA maturation pathway (Fig. 1).

Figure 1.

The miRNA biogenesis pathway. Primary miRNAs (pri-miRNAs) contain hairpin structures that are recognized by either the Microprocessor or splicing machinery in the nucleus. Pre-miRNAs are then exported to the cytoplasm where they are processed into mature single-stranded miRNAs and silence target genes. Regulatory proteins are generically labelled ‘R’ at known points of regulation in maturation of specific miRNAs.

The Microprocessor initiates miRNA processing

Within the nucleus, the hairpin structures formed in pri-miRNAs are recognized and cropped out of primary transcripts by a ∼650 kD protein complex called the Microprocessor [15, 16]. The catalytic subunit of the Microprocessor is a protein in the RNase III family of enzymes named Drosha (also known as RNASEN) [9, 17, 18]. Drosha cleaves the pri-miRNA near the base of the hairpin to liberate a 65–100 nucleotide RNA product. Recognition of the hairpin structure and cleavage site selection are mediated by another subunit in the Microprocessor named DiGeorge syndrome Critical Region gene 8 (DGCR8), which is also called Pasha (partner of Drosha) [16, 19, 20]. DGCR8 acts by recognizing the junction between single-stranded and duplex RNA at the base of the pri-miRNA hairpin, thereby anchoring the Microprocessor to the bottom of the hairpin structure [21]. Such anchoring of the complex by DGCR8 positions Drosha for cleavage of the pri-miRNA by placing the Drosha active site about 11 base-pairs (one dsRNA helical turn) from the hairpin base. Drosha makes a double-stranded cleavage, generating a pre-miRNA hairpin with the 3′ 2-nucleotide overhang and 5′ terminal phosphate characteristic of RNase III products [9].

Recombinant Drosha in complex with DGCR8 is sufficient to recognize and properly cleave synthetic pri-miRNA substrates in vitro [15]. Therefore, the Drosha-DGCR8 complex is thought to form the functional core of the Microprocessor. However, the endogenous Microprocessor, when isolated from human or Drosophila cells, contains multiple additional protein subunits [15, 22, 23]. Precise roles for these additional factors in Microprocessor function have not yet been established. However, recent biochemical studies have shown that the DGCR8-associated RNA helicases, p68 (also called DDX5) and p72 (also called DDX17), are required for the recognition and processing of some pri-miRNAs in vivo [23]. Steady state levels of a subset of pre-miRNAs (94 out of 266 examined) were diminished in p72−/− mouse embryos [23]. Intriguingly, p72 point mutants lacking only putative ATPase activity do not rescue pri-miRNA processing, suggesting that ATP-dependent alteration of RNA structure or removal of RNA binding proteins may be an important feature in the recognition and cleavage of many pri-miRNAs by the Microprocessor.

Nuclear proteins regulate activity of the Microprocessor

The Microprocessor carries out the first step in the miRNA biogenesis pathway (after transcription) and hence is a potentially powerful point for regulation of miRNA silencing. An emerging theme for the regulation of miRNA biogenesis is the interaction of various RNA-binding proteins with specific pri-miRNA transcripts. For example, two groups recently found that the human protein Lin-28 specifically binds to the terminal loop in the let-7 pri-miRNA hairpin and prevents cleavage by the Microprocessor [24, 25]. The binding site of Lin-28 on pri-let-7 shares considerable overlap with the binding site of the Microprocessor [26]. In this case, it is therefore likely that inhibition of the Microprocessor is the result of direct competitive binding.

RNA-binding proteins can also influence Microprocessor function by altering pri-miRNA structure. A recent study showed that the ubiquitous RNA-binding protein hnRNP A1 binds to the hairpin of pri-miR-18a and confers greater processing efficiency on the pri-miRNA by facilitating substrate recognition by the Microprocessor [27]. Intriguingly, the requirement for hnRNP A1 is context-dependent. That is, moving the pri-miR-18a hairpin structure to other positions in the primary transcript obviated the requirement for hnRNP A1 binding. This shows that sequences outside of the hairpin structure can contribute to pri-miRNA recognition and suggests a mechanism in which hnRNP A1 binding stabilizes the hairpin structure relative to other RNA structures that may be formed with regions flanking the miRNA in the primary transcript.

Proteins that directly interact with the Microprocessor can also influence pri-miRNA recognition and processing. It was recently shown in vascular smooth muscle cells that activation of TGF-β and BMP signalling pathways recruits SMAD signal transducers to the Microprocessor through interactions with its p68 subunit [28]. Association with SMAD proteins in turn stimulates Microprocessor activity on pri-miR-21, which leads to induction of a contractile phenotype in vascular smooth muscle cells. Intriguingly, activation of the Microprocessor by TGF-β and BMP was shown to be specific for only a subset of pri-miRNAs. Understanding the mechanism by which SMAD proteins activate the Microprocessor towards a handful of specific pri-miRNA targets is an important challenge for the future.

The above examples show that the Microprocessor's selectivity for specific pri-miRNA substrates can depend on factors beyond Drosha-DGCR8 alone. Although Drosha-DGCR8 is sufficient to process simple synthetic pri-miRNAs in vitro, pri-miRNAs generated in vivo are likely to have more elaborate structural features, allowing an additional layer of selectivity to be imposed by other RNA-binding factors in the cell. Moreover, recent studies have found that pri-miRNA transcripts are processed co-transcriptionally [29], suggesting that the kinetics of RNA folding may also be a factor in pri-miRNA recognition. Morlando and co-workers also found that in cases where the Drosha target lies within the intron of a gene, Drosha-mediated cleavage occurs prior to splicing, and that this promotes intron degradation by exonucleases [29]. This finding suggests that in some cases miRNA generation and mRNA maturation could be tightly coupled processes. Like splicing and alternative splicing of mRNAs, recognition of pri-miRNAs by the Microprocessor is likely to be highly dependent on the notoriously dynamic property of RNA folding. It is therefore reasonable to predict that pri-miRNA processing is integrated with nuclear RNA metabolism in general and that a host of RNA-interacting factors that regulate biogenesis of specific miRNAs remain to be discovered.

Intronic pre-miRNAs can bypass the Microprocessor

A subset of miRNAs, originally identified in Drosophila and C. elegans, originates from short intronic hairpins termed ‘mirtrons’ [30, 31]. These miRNA precursors are distinct from conventional pri-miRNAs in that they bypass the need for cleavage by Drosha. Instead, mirtrons are generated through the action of the splicing machinery and lariat-debranching enzyme. The mirtrons rejoin the canonical miRNA biosynthesis pathway before cytoplasmic export.

The independence of mirtrons from Drosha/DGCR8 processing means that pre-miRNAs can originate from different pathways. Recently, it was found that both plants and mammals generate mirtrons as well [32]. The presence of mirtrons could be an evolutionary strategy for diversifying miRNA-based gene silencing.

The Exportin-5 Complex transports pre-miRNA to the cytoplasm

Upon being generated by the Microprocessor or through the mirtron pathway, pre-miRNA hairpins are exported to the cytoplasm of the cell [8]. The nuclear export process is mediated by a ∼230 kD protein complex containing the nucleocytoplasmic transport factor Exportin-5 (Exp5) [33–37]. Exp5 is a member of the karyopherin family of transport proteins that interact directly with the small GTPase Ran (RAs-related Nuclear protein). Exp5 recognizes the characteristic 3′-overhang terminal end of the pre-miRNA and part of its duplex structure [38, 39]. Pre-miRNA binding to Exp5-Ran requires Ran to be in the GTP bound state (RanGTP). The pre-miRNA bound complex then moves to the cytoplasm through the nuclear pore complex. In the cytoplasm, the Exp5 complex interacts with Ran GTPase activating protein (RanGAP), which stimulates the GTPase activity of Ran [40]. GTP hydrolysis then induces Exp5 to release its cargo into the cytoplasmic milieu.

Exp5 has also been shown to interact with the protein interleukin enhancer-binding factor (ILF3, also called NF90 and NFAR1) [35]. Intriguingly, ILF3 also interacts with the Microprocessor protein DGCR8 [22]. Because ILF3 binds to both DGCR8 and Exp5, it has been proposed that ILF3 functions as a shuttling factor that delivers pre-miRNAs formed by the Microprocessor to Exp5 for export out of the nucleus. Important biochemical studies for the future include determining the effect of ILF3 binding on Microprocessor activity, particularly with respect to how ILF3 influences release of the pre-miRNA product.

The RISC-loading complex completes miRNA maturation

In the cytoplasm, pre-miRNAs are processed into mature single-stranded miRNAs by a ∼500 kD protein assembly called the RISC-loading complex (RLC) [41–43]. The RLC cleaves the pre-miRNA to remove its hairpin loop, resulting in a miRNA duplex (also called miRNA:miRNA* duplex, where miRNA* denotes the passenger RNA strand that will be discarded in the final step of miRNA maturation, see below). After the duplex is formed, one of the RNA strands is selected and loaded into a member of the Argonaute (Ago) family of proteins. Once loaded with a single-stranded miRNA, Ago proteins form the core subunit of the RNA-induced silencing complex (RISC, also called miRNP when loaded with a miRNA), where all mature and functional miRNAs are thought to reside [44–46].

The RISC-loading process has been best studied in Drosophila embryo extracts. Flies contain two distinct RLCs: RLC-1 contains the proteins Dicer-1 and Loquacious (also called R3D1) [47]; RLC-2 is composed of the proteins Dicer-2 and R2D2 [42, 48]. The function of the Dicer subunit in the RLC is to process dsRNA into small RNA duplexes of uniform length, typically about 20 base pairs [49]. This is accomplished by making a double-stranded cleavage roughly 20 nucleotides from the free open end of the pre-miRNA hairpin or long dsRNA [50, 51]. Like Drosha, Dicer is a member of the RNase III family of enzymes and thus also generates RNA products with 3′ 2-nucleotide overhangs and 5′ terminal phosphates [49, 52]. In Drosophila, Dicer-1 is responsible for miRNA production and is an essential protein that is required for fly development [53]. Dicer-2, on the other hand, is not required for development but is involved in the production of small interfering RNAs (siRNA), which are generated from long dsRNA fragments and mediate silencing of retrotransposons and RNA viruses [54, 55].
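The molecular-ruler behaviour of Dicer described above can be illustrated with a minimal Python sketch. It is only a coordinate exercise under stated assumptions, not a model of the enzyme: the pre-miRNA is treated as an idealized, perfectly paired hairpin written 5′ to 3′ with a 2-nucleotide 3′ overhang at its open end, and the function name `dice` and the default strand length of 22 nucleotides are hypothetical choices made for illustration.

```python
def dice(pre_mirna: str, strand_len: int = 22, overhang: int = 2):
    """Split an idealized pre-miRNA hairpin (written 5'->3') into its
    5p strand, terminal loop and 3p strand.

    Assumptions (illustrative only): the stem is perfectly paired, the open
    end carries an `overhang`-nt 3' overhang, and the cut is measured as
    `strand_len` nucleotides from the open end on both arms.
    """
    n = len(pre_mirna)
    if n < 2 * strand_len:
        raise ValueError("hairpin too short for the requested strand length")

    # Cut on the 5' arm: the first `strand_len` nucleotides.
    five_p = pre_mirna[:strand_len]
    # Staggered cut on the 3' arm. In this idealized geometry position i pairs
    # with position n - 1 - overhang - i, so leaving an `overhang`-nt 3'
    # overhang on the new end places the cut `strand_len` nt from the 3' end.
    three_p = pre_mirna[n - strand_len:]
    loop = pre_mirna[strand_len:n - strand_len]

    # Both strands keep a 3' overhang, so the paired region is shorter.
    duplex_bp = strand_len - overhang
    return five_p, loop, three_p, duplex_bp


if __name__ == "__main__":
    # Arbitrary toy sequence, used only to show the coordinates.
    toy = "A" * 22 + "C" * 10 + "U" * 22
    five_p, loop, three_p, duplex_bp = dice(toy)
    print(len(five_p), len(loop), len(three_p), duplex_bp)
```

With the default values, each strand is 22 nucleotides long and the paired region is 20 base pairs, in line with the duplex length quoted in the text. Real pre-miRNAs contain bulges and mismatches that this sketch ignores.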

After cleavage by Dicer, the small RNA duplex is thought to dissociate from the RLC and then rebind [56]. The release and rebinding of mi- or siRNA duplexes to the RLC has two functions in flies. First, rebinding allows small RNA sorting so that duplexes generated by RLC-1 can be transferred to RLC-2 and vice versa [57, 58]. Second, rebinding allows the duplex to orient itself on the RLC in such a way that the correct ‘guide’ strand of RNA is loaded into Ago.

Selection of the guide strand is one of the final steps in miRNA biogenesis. The rule for choosing which RNA strand in the miRNA duplex is to be retained in Ago and which is to be discarded as the ‘passenger’ is well established: the RNA strand with its 5′ end on the less thermodynamically stable end of the duplex is designated as the guide strand [59, 60]. The mechanism by which thermodynamic asymmetry is measured is best understood for siRNAs in the Drosophila RLC-2. In this complex, the protein R2D2, which contains two double-stranded RNA binding domains (dsRBD), selectively binds to the end of the siRNA duplex that possesses the greatest double-stranded character [42]. This binding orients the siRNA duplex on the RLC for the next step in the process, which is the recruitment and passing of the duplex to the protein Argonaute-2 (Ago2). Once the duplex has been passed to Ago2, one siRNA strand (the passenger) is cleaved and discarded, whereas the other strand (the guide) is retained as a mature siRNA and used in subsequent gene silencing. The orientation of the duplex as it is passed from Dicer-2/R2D2 to Ago2 determines which strand will be retained. MiRNAs are thought to be loaded into Ago1 through RLC-1 using a similar mechanism, except that the passenger strands in miRNA duplexes are not cleaved, but instead removed by an uncharacterized bypass mechanism [61].
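The asymmetry rule stated above can be sketched in a few lines of Python. This is a deliberately crude illustration, not a thermodynamic calculation: end stability is approximated by counting hydrogen bonds over the first four base pairs at each end (G-C = 3, A-U = 2, G-U = 1), where a real analysis would use nearest-neighbour free-energy parameters. The names `PAIR_SCORE`, `end_stability` and `select_guide`, the four-pair window and the 2-nucleotide 3′ overhangs are all assumptions introduced for this example.

```python
# Crude per-pair stability scores (hydrogen-bond counts). This is an
# illustrative stand-in for real thermodynamic parameters, not a
# nearest-neighbour energy model.
PAIR_SCORE = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def end_stability(strand, partner, window=4, overhang=2):
    """Score the duplex end that holds the 5' end of `strand`.

    Both strands are written 5'->3'. The 5'-most base of `strand` is assumed
    to pair with the base just inside the 3' overhang of `partner`.
    """
    last_paired = len(partner) - 1 - overhang
    if window > last_paired + 1 or window > len(strand):
        raise ValueError("window is longer than the paired region")
    return sum(
        PAIR_SCORE.get((strand[i], partner[last_paired - i]), 0)
        for i in range(window)
    )


def select_guide(strand_a, strand_b, window=4, overhang=2):
    """Return (guide, passenger), or None if the two ends score equally.

    The guide is the strand whose 5' end sits at the less stable duplex end.
    """
    a_end = end_stability(strand_a, strand_b, window, overhang)
    b_end = end_stability(strand_b, strand_a, window, overhang)
    if a_end == b_end:
        return None
    return (strand_a, strand_b) if a_end < b_end else (strand_b, strand_a)


if __name__ == "__main__":
    # Toy duplex: one end is A-U rich, the other G-C rich.
    print(select_guide("AAUUGGCCUU", "GGCCAAUUUU"))
```

In the toy duplex, the strand whose 5′ end lies in the A-U rich end is returned as the guide, whichever order the two strands are passed in.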

In human cells, Dicer is associated with Ago2 prior to miRNA duplex binding [41, 43]. The human RLC is believed to be composed of the proteins Dicer, Ago2 and TRBP. TRBP contains two functional dsRBDs similar to Drosophila R2D2 and, although it has not yet been demonstrated, likely plays a similar role in guide strand selection. Human Dicer also interacts with an RNA-binding protein called PACT, which is a paralog of TRBP [62]. The precise mechanistic function of PACT in pre-miRNA processing has not yet been determined. However, recombinant human RLC assembled in vitro can process pre-miRNA substrates and load them into Ago2 in the absence of PACT [63].

There are some distinct biochemical differences between the RLCs found in Drosophila and in mammalian systems. Although Drosophila has two Dicers that exist in distinct molecular complexes, the human genome encodes only a single Dicer gene. The human Dicer is likely the ortholog of Drosophila Dicer-1 because both process pre-miRNA substrates and neither requires ATP hydrolysis for catalysis [64–69]. However, in contrast to the Drosophila system, human Ago2 has been shown to associate with the RLC even prior to pre-miRNA binding [41, 43]. Furthermore, after human Ago2 is loaded with the guide miRNA, it dissociates from the RLC [41, 63]. In contrast, Drosophila Ago2, Dicer-2 and R2D2 stay associated with each other and recruit additional protein components to form an 80S super assembly termed ‘Holo-RISC’ [70].

Mature miRNAs silence target genes from within RISC

Argonaute bound to a single-stranded guide miRNA forms the core subunit of RISC (also called miRISC and miRNP), the effector complex of miRNA-mediated gene silencing [45, 46]. A ‘minimal RISC’, which is sufficient to recognize and cleave a target RNA, can be generated in vitro by simply incubating recombinant human Ago2 with a single-stranded guide RNA [71]. However, in vivo RISC is a much larger and possibly dynamic molecular machine. Over 50 proteins have been identified in association with Ago1/Ago2 [72] and reported sizes of RISC range from 100 kD [73] to over 2.5 MD [70, 74].

RISC uses the mature miRNA bound to Ago to guide the silencing of genes bearing full or partial sequence complementarity to the miRNA. In mammals, most miRNA-mediated gene silencing is thought to function through translational repression of targeted messenger RNAs. However, cases of Ago2-catalysed target cleavage have been reported as well [75]. The mechanism of translational repression by miRNAs is the subject of many reviews and to date remains somewhat controversial [76–82].

Discussion and future directions

The coordinated action of RNA polymerase II, the Microprocessor, Exp5 and the RLC results in the step-wise generation of mature miRNAs in human beings. Although the basic pathways for generating miRNAs are now understood, many mechanistic questions remain. For example, there are four Ago proteins in human cells and thus four possible destinations for any given miRNA. Is there a miRNA sorting mechanism in mammals and if so are all Ago proteins loaded by the same RLC? Recent results have shown that human Ago2 is post-translationally modified [83, 84]. Does modification alter the loading mechanism or contribute to miRNA sorting? Recent results show that cytoplasmic Lin-28 can bind pre-miRNAs and inhibit cleavage by Dicer [85]. Are there other regulators of RISC-loading remaining to be discovered? Is there a cytoplasmic counterpart to ILF3 that shuttles pre-miRNAs from Exp5 to the RLC? And finally, how are miRNAs turned over in the cell? Recent experiments in plants have identified a family of miRNA degrading exonucleases termed Small RNA degrading nuclease (SDN), which belong to the 3′-5′ Exoc family of nucleases [86]. These RNases were shown to degrade single-stranded mature miRNAs down to 7–8 nucleotide fragments in vitro. However, if mature miRNAs are tightly bound to Ago proteins, it is not clear how the SDNs will access their substrates in vivo. Clearly, many exciting discoveries in the field of miRNA biology are on the near horizon.

Acknowledgements

We are grateful to Ashley J. Pratt for stimulating conversations and critical reading of the manuscript. P.W.L. is a pre-doctoral fellow of the American Heart Association. I.J.M. is a Pew Scholar in the Biomedical Sciences.