Although huge amounts of high-throughput sequencing (HTS) data are available, limited systematic analyses have been performed by integrating these valuable resources. Based on small RNA (sRNA), RNA and degradome HTS data, the sRNAs specifically expressed at vegetative and reproductive stages were identified separately in rice.
Two distinct groups of sRNA HTS data, which were prepared during the vegetative and the reproductive stages, were utilized to extract stage-specific sRNAs. Degradome sequencing data were employed for sRNA target validation. RNA sequencing data were used to construct expression-based, sRNA-mediated networks.
As a result, 26 microRNAs and 413 sRNAs were specifically expressed at the vegetative stage, and 79 microRNAs and 539 sRNAs were specifically expressed at the reproductive stage. In addition to the microRNAs, numerous stage-specific sRNAs enriched in ARGONAUTE1 showed great potential to perform cleavage-based repression on the targets. Several stage-specific sRNAs were indicated to result from the wobble effect of Dicer-like 1-mediated processing of microRNA precursors. The expression patterns of the sRNA targets, and the stage-specific cleavage signals strongly indicated the reliability of the constructed networks.
A set of rice stage-specific sRNAs along with the regulatory cascades, which have great potential in regulating specific developmental stages, were provided for further investigation.
The small RNA (sRNA) molecules, currently recognized as important transcriptional/post-transcriptional regulators, play an essential role in plant growth and development (Chen, 2009). One of the well-characterized sRNA species is microRNA (miRNA). In plants, most miRNAs are incorporated into ARGONAUTE1 (AGO1)-associated RNA-induced silencing complexes (RISCs), and exert their repressive regulation on the highly complementary RNA targets mostly through cleavages (Jones-Rhoades et al., 2006). In addition to the miRNAs, natural antisense small interfering RNAs (nat-siRNAs; Borsani et al., 2005), trans-acting small interfering RNAs (ta-siRNAs; Allen et al., 2005; Williams et al., 2005) and mirtrons (Meng & Shao, 2012) have been reported, and the repeat-associated small interfering RNAs (ra-siRNAs) occupy a dominant portion of the plant endogenous sRNAs (Herr et al., 2005; Onodera et al., 2005; Chen et al., 2010). However, to date, only a few cases, such as ta-siRNA-mediated regulation of Auxin Response Factor (ARF) genes (Williams et al., 2005), have been available to address the question as to whether the sRNAs other than miRNAs could guide AGO-associated complexes to cleave the targets. Moreover, the biological roles of the huge sRNA population remain elusive in plants.
The newly developed high-throughput sequencing (HTS) technology has been widely used to interrogate the expression patterns of sRNAs (known as sRNA HTS) and genes (RNA-seq), and also to obtain the signals of sRNA-mediated target cleavages (degradome sequencing; Ozsolak & Milos, 2010; Han et al., 2011). As a result, vast amounts of HTS data are available in the public repositories, such as the Gene Expression Omnibus (GEO; Barrett et al., 2009) and the Next-Gen Sequence databases (Nakano et al., 2006). Although most HTS data sets were generated for certain research purposes, systematic studies to exploit the new value of these resources are lacking. In this study, three kinds of HTS data (i.e. sRNA and RNA HTS data, and degradome sequencing data) were integrated to construct regulatory networks mediated by stage-specifically expressed sRNAs in rice (Oryza sativa). First, the sRNA HTS data sets were divided into two groups: those prepared from young seedlings (i.e. vegetative stage), and those prepared at the reproductive stage (Supporting Information, Table S1). Based on this classification, both vegetative and reproductive stage-specific sRNAs, including miRNAs, were extracted. The sequence features of these stage-specific sRNAs were characterized. Lists of miRNA/sRNA–target pairs were obtained through large-scale prediction and degradome data-based validation. Many stage-specific sRNAs were found to share high sequence similarity with certain miRNAs or miRNA*s of rice, which might result from the wobble effect of Dicer-like 1 (DCL1)-mediated processing of the miRNA precursors. Besides, some sRNAs with a strong potential to exert cleavage-based repression could be mapped to the sequences not coding for the miRNA/miRNA* duplexes on specific precursors, indicating a novel biological role of these regions. The gene regulatory networks mediated by the vegetative and the reproductive stage-specific sRNAs were constructed separately. 
Stage-specific RNA-seq and degradome sequencing data together strengthened the reliability of the established networks. Finally, a number of subnetworks were investigated to gain deeper biological insights. Taken together, highly comprehensive regulatory networks mediated by stage-specifically expressed miRNAs and sRNAs were constructed. Several regulatory cascades may help to maintain specific developmental stages or modulate the vegetative-to-reproductive phase transition in rice, a possibility that needs further experimental validation. Our study also provides a case showing how to exploit the publicly available HTS data resources for new research purposes. All the developmental stage-specific sRNAs, along with their validated target genes, were integrated into the Rice Annotation Project Database (RAP-DB; Itoh et al., 2007; Tanaka et al., 2008).
Materials and Methods
The sRNA HTS data, including those prepared during the vegetative and the reproductive stages, and AGO1-associated sRNA sequencing data, were retrieved from GEO (http://www.ncbi.nlm.nih.gov/geo/; Barrett et al., 2009) and Next-Gen Sequence Databases (http://mpss.udel.edu/rice_sRNA/; Nakano et al., 2006). The rice degradome sequencing data sets were obtained from GEO (see Table S1 for details). The RNA-seq data for expression pattern analysis of the sRNA target genes were downloaded from the rice genome annotation project established by the Institute for Genome Research (TIGR rice, http://rice.plantbiology.msu.edu/expression.shtml; Yuan et al., 2003). The gene annotations and the cDNA sequences were obtained from the ftp sites of TIGR rice (release 6.1). The rice miRNAs and miRNA precursors were downloaded from miRBase (release 18; http://www.mirbase.org/cgi-bin/mirna_summary.pl?org=osa; Griffiths-Jones et al., 2008).
Prediction and validation of the sRNA/miRNA targets
Target prediction was performed using the miRU algorithm (Zhang, 2005; Dai & Zhao, 2011) with default parameters. The degradome sequencing data were utilized to validate the predicted sRNA–target pairs. First, in order to allow cross-library comparison, the normalized read count (in reads per million, RPM) of a short sequence from a specific degradome library was calculated by dividing the raw count of this sequence by the total read count of the library and then multiplying the result by 10⁶. Two-step filtering was then performed to extract the most likely sRNA–target pairs. During the first step, the predicted sRNA binding sites along with the 50 nucleotide (nt) surrounding sequences at both ends were collected in order to reduce the BLAST time. For the BLAST, all the collected degradome data sets were searched together, because an sRNA–target pair was considered to be a candidate once cleavage signal(s) existed in any data set. The predicted targets meeting the following criteria were retained: there must be perfectly matched degradome signatures with their 5′ ends residing within the region 8–14 nt away from the 5′ ends of the target binding sites; and for a specific position within the 8–14 nt region, which could be regarded as the potential cleavage site, there must be two or more distinct degradome signatures with 5′ ends to support this position. These retained transcripts were subjected to a second BLAST, and the degradome signals along each transcript were obtained to provide a global view of the signal noise when compared with the signal intensity within a specific target binding site. Referring to our previous study (Meng et al., 2011a), both global and local target plots (t-plots) were drawn. Finally, exhaustive manual filtering was performed, and only the transcripts with easily recognizable cleavage signals were extracted as the potential sRNA–target pairs.
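The RPM normalization and the first-step filtering criteria described above can be illustrated with a minimal Python sketch. The function names, the input layout (pairs of signature sequence and 5′ position), and the way the 8–14 nt offset is measured from the 5′ end of the binding site are assumptions made for illustration; they are not taken from the original pipeline, which relied on BLAST searches and manual inspection of t-plots.

```python
from collections import defaultdict


def rpm(raw_count, library_total):
    """Normalize a raw read count to reads per million (RPM)."""
    return raw_count * 1_000_000 / library_total


def supported_cleavage_sites(site_start, signatures, window=(8, 14), min_distinct=2):
    """Return candidate cleavage positions within the binding site.

    site_start   -- transcript coordinate of the 5' end of the target
                    binding site (hypothetical coordinate convention)
    signatures   -- iterable of (sequence, five_prime_position) pairs for
                    perfectly matched degradome signatures
    window       -- allowed offset range (nt) from the 5' end of the site
    min_distinct -- minimum number of distinct signatures whose 5' ends
                    must support the same position
    """
    by_position = defaultdict(set)
    for sequence, position in signatures:
        offset = position - site_start
        if window[0] <= offset <= window[1]:
            by_position[position].add(sequence)
    # Keep only positions supported by enough distinct signatures.
    return sorted(
        pos for pos, seqs in by_position.items() if len(seqs) >= min_distinct
    )
```

In this sketch, a position is retained only when at least two different signature sequences share the same 5′ end inside the window, which mirrors the second criterion; the subsequent transcript-wide search and manual filtering are not modeled.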
Results and Discussion
Identification of stage-specific sRNAs in rice
The rice sRNA HTS data sets were classified into two groups (Fig. 1a, Table S1): those prepared from young seedlings (16 data sets, defined as ‘vegetative’ group) and those prepared at the reproductive stage (18 data sets, defined as the ‘reproductive’ group). Although the data sets belonging to the same group might be prepared from different rice tissues in different growth conditions, and might be prepared by using distinct experimental procedures, the distinguishable developmental stages are the common features shared by the same group members. In this regard, we searched for the sRNAs existing exclusively in either the ‘vegetative’ or the ‘reproductive’ group, which were regarded as stage-specific sRNAs for further analysis. After the first filtering, a portion of sRNAs were removed from each data set. However, the major portions were retained for both groups (64.75–87.57% were retained for the ‘vegetative’ group, and 64.93–90.29% for the ‘reproductive’ group). This result indicates that the two groups indeed possess their featured sRNA populations.
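The group-exclusive filtering described above amounts to a set difference between the pooled sequences of the two groups. The following Python sketch shows one way to express it, assuming each data set is represented as a set of distinct sRNA sequences; the function names are hypothetical, and whether the reported retention percentages were computed on distinct sequences or on reads is not stated in the text.

```python
def stage_specific_srnas(vegetative_libs, reproductive_libs):
    """Return (vegetative-only, reproductive-only) sRNA sequence sets.

    Each argument is a list of sets, one set of sequences per data set.
    """
    veg_union = set().union(*vegetative_libs)
    rep_union = set().union(*reproductive_libs)
    return veg_union - rep_union, rep_union - veg_union


def retained_percent(library, other_group_union):
    """Percentage of a library's distinct sequences absent from the other group."""
    if not library:
        return 0.0
    kept = library - other_group_union
    return 100.0 * len(kept) / len(library)
```

A sequence is treated as stage-specific here if it occurs in at least one data set of its own group and in none of the other group, regardless of tissue or library preparation method.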
According to the miRBase registries (release 18), 26 and 79 rice miRNAs were identified from the ‘vegetative’ and the ‘reproductive’ groups, respectively (Notes S1, S2). As already mentioned, most plant miRNAs exert their repressive regulation on the highly complementary targets through AGO-mediated (AGO1 in most cases) cleavages (Jones-Rhoades et al., 2006; Voinnet, 2009). This action mode provides us with a good basis for the sequence complementarity-based target prediction and the subsequent degradome signal-based validation. Thus, all these miRNAs were included for the following analyses. However, for the remaining stage-specific sRNAs, whether these sRNAs can play a repressive role in gene expression through target cleavages is unclear. To extract those with great potential to modulate gene expression in the same way as the miRNAs, the HTS data sets generated from the AGO1-enriched sRNA libraries (Table S1) were utilized for the second filtering. This was based on the enticing hypothesis that the sRNAs enriched in AGO1 are likely to guide the RISCs to perform target cleavages as the miRNAs. In this scenario, the sRNAs were considered to be AGO1-enriched and were included for further analyses if they met either of the following criteria: compared with the control data set, the level of the sRNA in any of the three AGO1-associated sRNA HTS data sets was 10 times higher; the sRNA was not detected in the control set, and its level was 10 RPM or higher in any of the AGO1 sets. As a result, 413 and 539 AGO1-enriched sRNAs were extracted from the ‘vegetative’ and ‘reproductive’ groups, respectively (Fig. 1a, Notes S3, S4).
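The two AGO1-enrichment criteria can be written as a short predicate. This is a sketch under stated assumptions: levels are taken to be RPM values, and "10 times higher" is interpreted as at least a 10-fold level relative to the control, which the text does not specify precisely. The function and parameter names are illustrative.

```python
def is_ago1_enriched(control_rpm, ago1_rpms, fold=10.0, min_rpm=10.0):
    """Decide whether an sRNA counts as AGO1-enriched.

    control_rpm -- normalized level of the sRNA in the control data set
    ago1_rpms   -- normalized levels in the AGO1-associated data sets
    fold        -- required fold change over the control (assumed >=)
    min_rpm     -- required level when the sRNA is absent from the control
    """
    if control_rpm > 0:
        # Criterion 1: enriched at least `fold` times in any AGO1 set.
        return any(level >= fold * control_rpm for level in ago1_rpms)
    # Criterion 2: undetected in the control, but at or above `min_rpm`
    # in any AGO1 set.
    return any(level >= min_rpm for level in ago1_rpms)
```

Because either criterion is sufficient, an sRNA needs to pass the threshold in only one of the AGO1-associated data sets.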
Next, the sequence characteristics of both the miRNAs and the sRNAs were examined. Consistent with the previous report (Mi et al., 2008), 68.52% of the vegetative stage-specific, AGO1-enriched sRNAs, and 81.26% of the reproductive stage-specific, AGO1-enriched sRNAs start with uridine (5′ U; Fig. 1b). This reflects the fact that the criteria for the identification of AGO1-enriched sRNAs are highly efficient. Surprisingly, a large portion of the stage-specific miRNAs do not possess 5′ U. Instead, 31.25 and 35.42% of the vegetative-specific miRNAs begin with 5′ A and 5′ G, respectively; and 28.00% of the reproductive-specific miRNAs possess 5′ A (Fig. 1b). Thus, it is worthwhile investigating further the interesting phenomenon that the stage-specific sRNAs identified in this study share a similar 5′ terminal composition pattern, but the miRNAs do not. Besides, dominant portions of the stage-specific miRNAs and sRNAs are 21 nt in length (Fig. 1c), although relatively small populations with different lengths were observed. For instance, 18.75% of the vegetative-specific miRNAs are 24 nt, and 32.00% of the reproductive-specific miRNAs are 22 nt. Moreover, c. 42% of the vegetative-specific sRNAs are 19–20 nt. Collectively, our sequence characteristic analysis shows that, except for the vegetative-specific miRNAs, most of the other stage-specific miRNAs and sRNAs are 21 nt and begin with 5′ U, indicating their strong potential to be incorporated into AGO1 complexes for guiding target cleavages.
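The 5′ terminal composition and length distribution summarized above are simple frequency tables. A minimal Python sketch follows, assuming the sRNAs are given as a list of RNA sequence strings; the function name is hypothetical and the example sequences used to exercise it are not from the study.

```python
from collections import Counter


def sequence_features(sequences):
    """Return (5' nucleotide percentages, length percentages) for sRNA sequences."""
    total = len(sequences)
    if total == 0:
        return {}, {}
    first_nt = Counter(seq[0] for seq in sequences)
    lengths = Counter(len(seq) for seq in sequences)
    first_pct = {nt: 100.0 * n / total for nt, n in first_nt.items()}
    length_pct = {size: 100.0 * n / total for size, n in lengths.items()}
    return first_pct, length_pct
```

Applying this separately to each of the four sets (vegetative or reproductive, miRNA or sRNA) would give the kind of per-group percentages plotted in Fig. 1b and 1c.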
Identification of the targets of the stage-specific sRNAs
Considering the strong potential of the stage-specific sRNAs to perform sequence complementarity-dependent target cleavages, we performed a large-scale computational prediction of their target genes using the miRU algorithm (Zhang, 2005; Dai & Zhao, 2011). Based on the degradome sequencing data, all the predicted targets were further validated by t-plots (German et al., 2008, 2009; Meng et al., 2011a; see the Materials and Methods section for details). As a result, target lists of the developmental stage-specific miRNAs and sRNAs were obtained (Figs S1–S4, Notes S5, S6). The degradome sequencing data sets were then divided into two groups: the libraries prepared from the vegetative tissues (GSM434596 and GSM455938) and those prepared from the reproductive tissues (GSM455939 and GSM476257). Among hundreds of the miRNA/sRNA–target interactions, various target cleavage patterns were discovered (Fig. 2). For dozens of the miRNA/sRNA–target pairs, the stage-specific patterns of the degradome signals are highly correlated with the expression patterns of the corresponding sRNAs. For example, the sRNAs miR172c and sRNA236 (refer to Notes S3) that were specifically expressed at the vegetative stage recognize LOC_Os04g55560.2 (AP2 domain-containing protein, expressed; RAP ID, Os04g0649100) and LOC_Os01g49219.1 (expressed protein; RAP ID, Os01g0686100) as their targets, respectively. Accordingly, the dominant cleavage signals were detected in the degradome libraries prepared during the vegetative stage of rice (Fig. 2a-1,c-1). Similar agreement between the sRNA expression patterns and the target cleavage signals was also observed for the reproductive-specific sRNAs (Fig. 2b-1,d-1), such as miR2118d/f/m/j and sRNA460 (refer to Notes S4). 
For another portion of the sRNA–target pairs, although cleavage signals with consistent stage-specific patterns could be detected within the target binding sites, more prominent signals were detected in the degradome libraries prepared at the different developmental stages. For instance, miR393b and sRNA177 were specifically expressed at the reproductive stage, and the reproductive-specific cleavage signatures were detectable on their targets LOC_Os05g05800.1 (OsFBL21, F-box domain and leucine-rich repeat (LRR)-containing protein, expressed; RAP ID, Os05g0150500) and LOC_Os02g52180.1 (plastocyanin-like domain-containing protein, putative, expressed; RAP ID, Os02g0758800). However, much stronger signals were detected from the vegetative-specific degradome libraries (Fig. 2b-2,d-2). Similar examples were also observed for the sRNAs specifically expressed at the vegetative stage, such as miR172c–LOC_Os07g13170.1 (AP2 domain-containing protein, expressed; RAP ID, Os07g0235800) and sRNA149–LOC_Os02g44360.1 (scarecrow transcription factor family protein, putative, expressed; RAP ID, Os02g0662700; Fig. 2a-2,c-2). For the remaining portion of the sRNA–target pairs, different stage-specific patterns were observed between the sRNA expression and the corresponding cleavage signals. For example, miR2118f/m/j and sRNA367 were specifically expressed at the reproductive stage, but evident cleavage signals could be detected only in the vegetative-specific degradome libraries (Fig. 2b-3,d-3). Similarly, cleavage signatures for the interactions miR159d–LOC_Os07g05720.1 (TCP family transcription factor, putative, expressed; RAP ID, Os07g0152000) and sRNA181/sRNA321/sRNA374–LOC_Os02g45570.1 (growth regulating factor protein, putative, expressed; RAP ID, Os02g0678800) could not be detected in the vegetative-specific degradome libraries (Fig. 2a-3,c-3).
We also observed that one transcript possessed two target binding sites, both of which were supported by degradome signatures. For certain transcripts (i.e. LOC_Os02g14990.1 (zinc finger, C3HC4 type domain-containing protein, expressed), LOC_Os03g22050.1 (CAMK_KIN1/SNF1/Nim1_like.16–CAMK includes calcium/calmodulin-depedent protein kinases, expressed; RAP ID, Os03g0339900) and LOC_Os08g32500.1 (nucleobase-ascorbate transporter, putative, expressed; RAP ID, Os08g0420600)), the two binding sites are highly overlapped, and two neighboring clusters of cleavage signals were observed accordingly (Fig. 2a-4,c-5,d-4). For the other transcripts, the two binding sites either do not overlap with each other (LOC_Os02g13210.1 (transposable element protein, putative, containing Pfam profile: PF03004, Transposase_24, expressed) and LOC_Os02g43560.1 (WRKY34, expressed; RAP ID, Os02g0652100); Fig. 2c-4,d-5), or there is even a long distance between the two sites (LOC_Os04g58070.1 (aspartic proteinase nepenthesin precursor, putative, expressed; RAP ID, Os04g0677100; Fig. 2d-6)). In all cases, prominent cleavage signals could be detected within the middle regions of the target binding sites.
Another intriguing finding is that, for some target genes, some of their alternative splicing forms were subjected to sRNA-guided cleavages, but the other forms were not. For example, only the alternative splicing forms LOC_Os12g12664.2 (expressed protein; RAP ID, Os12g0228400), LOC_Os02g12654.2 (expressed protein; RAP ID, Os02g0218300), LOC_Os02g52900.2 (glutaredoxin 2, putative, expressed) and LOC_Os03g29540.2 (ATP-dependent protease, putative, expressed; RAP ID, Os03g0409100) were cleaved by the vegetative-specific miR442, sRNA278, sRNA148 and sRNA152, respectively (Figs S1, S3), and only LOC_Os02g12654.2 was subjected to cleavage-based repression by the reproductive-specific sRNA520 (Fig. S4). From this point of view, for some rice genes, not all of their gene products are under the surveillance of the same sRNA regulators. Interestingly, a recent report demonstrated that alternative splicing could increase the complexity of miRNA-mediated gene regulation in Arabidopsis (Yang et al., 2012). Thus, the question of whether alternative splicing widely affects the regulatory activities of miRNAs in plants needs to be investigated further. We also noticed that the sequence differences between the sRNA-regulated alternative splicing forms and the other forms that were not repressed always resided within the untranslated regions (UTRs). Considering the fact that the 3′ UTRs of the animal genes serve as the hot target regions for many miRNAs (Brodersen & Voinnet, 2009), our observation points to the novel role of the UTRs in modulating miRNA/sRNA-mediated regulation on specific gene products in plants, which needs further confirmation.
Distribution of the stage-specific sRNAs on the miRNA precursors
Next we investigated the origin of the stage-specific sRNAs on the rice transcriptome. Thus, all the sRNAs excluding miRNAs were mapped to the TIGR-annotated rice cDNAs (release 6.1) and the miRBase-registered miRNA precursors (release 18). As a result, a total of 29 vegetative-specific sRNAs could be perfectly mapped to 42 miRNA precursors, and 39 reproductive-specific sRNAs were perfectly mapped to 39 miRNA precursors (Notes S7, S8). Several distribution patterns of the sRNAs are quite interesting. Many stage-specific sRNAs were perfectly mapped to the miRNA- or miRNA*-coding regions of the precursors MIR159a, MIR396c/g, MIR1429, MIR2103, MIR1862d, and MIR164f (Fig. 3A1,B-1,C,D). Notably, except for the reproductive-specific sRNA98, which shares exactly the same sequence with miR164f*, one to five nucleotide differences exist between the other sRNAs and the miRNA(*)s on the corresponding precursors. Based on previous reports in plants and animals, whether these sRNAs were highly overlapped with the miRNA(*)s as a result of terminal modifications (Lu et al., 2009; Ameres et al., 2010; Ibrahim et al., 2010; Cerutti & Ibrahim, 2011), or of the dicing errors that occurred during DCL1-mediated processing (Rajagopalan et al., 2006; Fahlgren et al., 2007; Voinnet, 2009) needs further interrogation. Another notable phenomenon is that, for the miRNA precursors listed earlier, none of the perfectly matched sRNAs share the same expression patterns with the miRNA(*)s of the corresponding precursors. For most of these stage-specific sRNAs, the miRNA(*)s were not expressed specifically at any developmental stage investigated here. One exception was observed on the precursor MIR396g (Fig. 3B1). The mature miRNA miR396g was specifically expressed at the reproductive stage, whereas the sRNA sRNA321, which was 1 nt shorter at its 3′ end, was exclusively detected at the vegetative stage. This kind of discrepancy does not only exist between the sRNAs and the corresponding miRNA(*)s. 
sRNA242 and sRNA293, both of which highly overlapped with miR1429-3p, were specifically expressed at the vegetative and reproductive stages, respectively (Fig. 3C). Additionally, sRNA398 and sRNA334, both of which highly overlapped with miR2103, were expressed at the reproductive and vegetative stages, respectively (Fig. 3C). Based on the previous report by Vazquez et al. (2008), Voinnet (2009) proposed another model to interpret the generation of the miRNA(*) variants offset by one or more nucleotides at the 5′/3′ ends: the discrete plant tissues might contribute greatly to distinct miRNA maturation steps. If the stage-specific sRNAs are regarded as the bona fide miRNA(*) variants, the different expression patterns among the miRNA(*) variants on the same precursors could further support the tissue-dependent model of miRNA biogenesis.
The reproductive-specific sRNAs, sRNA181, sRNA433 and sRNA435, were mapped to the sequences not coding for the miRNA/miRNA* duplexes on the precursors MIR441b, MIR808 and MIR809a/d/f/g/h (Fig. 3E1). Similar examples could be found on MIR819a and MIR819e (Fig. 3F1), MIR806g (Fig. 3G1), and MIR445a, MIR445b, MIR445e, MIR445f and MIR445h (Notes S8). According to previous studies (Bonnet et al., 2004; Wang et al., 2006; Fahlgren et al., 2007; Song et al., 2009), the sequences encoding the miRNAs and the miRNA*s are highly conserved, especially for the precursors belonging to the same families. However, the other regions are less well conserved and are considered to be functionless. In this study, the regions with great potential to generate stage-specific sRNAs point to the possibility that the miRNA precursors may retain other conserved regions with novel biological functions. In support of this, all the sRNAs perfectly mapped to these miRNA precursors are enriched in AGO1, and some of them could perform target cleavages (Fig. 3A2, B2, E2, E3, F2, G2).
We also mapped the stage-specific sRNAs to the rice cDNAs. According to the TIGR annotations, a large portion of sRNAs, especially those expressed at the reproductive stage, were perfectly mapped to the transposable element (TE) genes (Notes S9 and S10). Based on previous reports (Kasschau et al., 2007; Li et al., 2008; Lister et al., 2008; Chen et al., 2010), the activities of the TE genes are under strict surveillance of endogenous siRNAs, mainly through a DNA methylation-based pathway. However, the target cleavage activity of the TE-originated sRNAs (Figs S3, S4) points to the possibility that the TE genes are not only the sRNA targets, but also the origin of sRNAs with novel biological roles. Besides, among the genes with perfectly matched sRNAs that were specifically expressed at the vegetative stage, a large portion (nine out of 23 genes) were involved in leaf development or photosynthesis (Notes S9). Three genes, LOC_Os09g14490, LOC_Os11g12340 and LOC_Os11g45970, with perfectly matched reproductive-specific sRNAs encode disease resistance proteins, and one gene, LOC_Os02g34950 (ATP binding protein, putative, expressed), is implicated in rice reproduction based on the GO (Gene Ontology) annotation (Notes S10). Whether these sRNAs were indeed generated from the corresponding gene transcripts through specific biogenesis pathway(s), and whether the biological activities of these stage-specific sRNAs could be linked to the functions of their parental genes need further investigation.
Construction of stage-specific sRNA-mediated regulatory networks, and target expression- and cleavage signal-based analysis
Although the sRNA–target lists obtained earlier (Notes S5, S6) could be used for network construction, we realized that more exquisite and reliable networks could be established based on the stage-specific patterns of the target expression and the corresponding cleavage signals. To this end, we adopted the classification of the degradome sequencing data sets as defined earlier, that is, the libraries prepared from the vegetative tissues (GSM434596 and GSM455938), and those prepared from the reproductive tissues (GSM455939 and GSM476257). In this scenario, the vegetative-specific sRNAs should exert repressive regulation on their targets mainly at the vegetative stage. Thus, the cleavage signals detected for the sRNA–target pair should be found predominantly in the ‘vegetative’ degradome data group. The same is true for the reproductive-specific sRNAs. However, as previously described, although evident cleavage signals could be detected within the target binding sites, not all the extracted sRNA–target pairs possess the dominant degradome signatures with the stage-specific patterns similar to the expression of the sRNA regulators (Figs 2, S1–S4). In this regard, the sRNA–target pairs identified based on the degradome data were classified into two categories: ‘well supported’ by the degradome data, the pairs possessing the dominant cleavage signals with the stage-specific patterns similar to the expression patterns of the corresponding sRNAs; and ‘not well supported’, those without coincidental stage-specific patterns. Both types of sRNA–target pairs were used for network construction, but the ‘well supported’ interactions were represented by solid edges, and the ‘not well supported’ ones were denoted by dashed edges (Figs S5–S8). 
Overall, 12 out of 22 (54.55%), 28 out of 32 (87.50%), 62 out of 117 (52.99%), and 113 out of 191 (59.16%) target genes are ‘well supported’ within the vegetative-specific miRNA-, the reproductive-specific miRNA-, the vegetative-specific sRNA-, and the reproductive-specific sRNA-mediated networks, respectively (Tables S2–S5). Thus, the ‘well supported’ sRNA–target pairs account for more than half of the interactions in each network, which supports the reliability of the networks.
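The two-category classification and the percentages reported for the four networks can be sketched as follows. The rule used here for "dominant" (the stage with the larger summed cleavage signal at the binding site) is an assumption for illustration; in the study the classification also relied on inspection of the t-plots. The function names are hypothetical.

```python
def classify_pair(srna_stage, vegetative_signal, reproductive_signal):
    """Label an sRNA-target pair by agreement between sRNA stage and cleavage signal.

    srna_stage is "vegetative" or "reproductive"; the signals are summed
    normalized cleavage signals from the two degradome groups.
    """
    if vegetative_signal > reproductive_signal:
        dominant = "vegetative"
    elif reproductive_signal > vegetative_signal:
        dominant = "reproductive"
    else:
        dominant = None  # no dominant stage
    return "well supported" if dominant == srna_stage else "not well supported"


def percent(part, total):
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100.0 * part / total, 2)
```

With the counts given above, `percent(12, 22)`, `percent(28, 32)`, `percent(62, 117)` and `percent(113, 191)` reproduce the four reported proportions.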
Although the stage-specific sRNA-mediated cleavages could be supported by the stage-specific patterns of the corresponding cleavage signals, we recognized that the coexpression of the sRNA regulators and their target genes at a specific developmental stage might also be the key factor for successful interactions. To this end, we retrieved the RNA-seq-based expression data from TIGR rice (Yuan et al., 2003). According to the criteria in the footnote to Table 1, the target genes highly expressed at the vegetative and reproductive stages were identified (Tables 1, S2–S5). Different percentages of the target genes highly expressed at a specific stage were observed between the ‘well supported’ and the ‘not well supported’ interactions (Table 1). For the ‘well supported’ interactions, the high expression percentages of the target genes correlate well with the stage-specific expression patterns of the sRNA regulators (41.67% highly expressed at the vegetative stage vs only 25.00% highly expressed at the reproductive stage for the targets of the vegetative-specific miRNAs; 42.86% highly expressed at the reproductive stage vs only 7.14% highly expressed at the vegetative stage for the targets of the reproductive-specific miRNAs; 43.36% highly expressed at the reproductive stage vs only 8.85% highly expressed at the vegetative stage for the targets of the reproductive-specific sRNAs), although one exception was observed for the targets of the vegetative stage-specific sRNAs (22.58% highly expressed at the vegetative stage vs 24.19% highly expressed at the reproductive stage for the targets of the vegetative-specific sRNAs). Moreover, for the ‘not well supported’ interactions, the high expression percentages of the target genes are always contrary to the stage-specific expression patterns of the sRNA regulators. 
For example, 60.00% of the targets of the vegetative-specific miRNAs and 52.73% of the targets of the vegetative-specific sRNAs were highly expressed at the reproductive stage, while 0.00% and 1.82% of the targets, respectively, were highly expressed at the vegetative stage. Based on these results, we concluded that more similar expression patterns between the target genes and their miRNA/sRNA regulators existed within the ‘well supported’ interactions of the networks.
Table 1. Expression level-based investigation of the target genes of the developmental stage-specific microRNAs (miRNAs) and small RNAs (sRNAs) in rice (Oryza sativa)
Targets of the vegetative stage-specific miRNAs/sRNAs
Targets of the reproductive stage-specific miRNAs/sRNAs
The gene expression data (RNA-seq data; see detailed information of these data sets in the footnote of Table S2) were retrieved from TIGR rice (http://rice.plantbiology.msu.edu/expression.shtml). The average expression level of a specific target gene at a specific developmental stage was calculated based on the four RNA sequencing libraries prepared from this stage (SRX016110, SRX016111, SRX016112, and SRX016113 were prepared at the vegetative stage, and OSN_AB, OSN_AC, OSN_AD, and OSN_AE were prepared at the reproductive stage; the average values were designated as ‘Vegetative_average’ and ‘Reproductive_average’).
If the value ‘Vegetative_average’ is ≥ twice the ‘Reproductive_average’, or the value ‘Reproductive_average’ is 0, and the ‘Vegetative_average’ is > 1, the corresponding target gene was recognized to be highly expressed at the vegetative stage.
If the value ‘Reproductive_average’ is ≥ twice the ‘Vegetative_average’, or the value ‘Vegetative_average’ is 0, and the ‘Reproductive_average’ is > 1, the corresponding target gene was recognized to be highly expressed at the reproductive stage.
Interactions well supported by degradome data (%)
Interactions not well supported by degradome data (%)
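The criteria in the footnote of Table 1 can be expressed as a small function. The footnote wording admits more than one reading; this sketch assumes that the twofold rule applies when the other stage's average is nonzero, and that the "> 1" threshold applies only when the other stage's average is 0. The function name and return values are illustrative.

```python
from statistics import mean


def highly_expressed_stage(vegetative_values, reproductive_values):
    """Return the stage at which a target gene is highly expressed, or None.

    Each argument holds the expression values from the four RNA-seq
    libraries of that stage.
    """
    veg = mean(vegetative_values)
    rep = mean(reproductive_values)
    # Assumed reading: ratio rule if the other average is nonzero,
    # absolute threshold (> 1) if the other average is exactly 0.
    if (rep > 0 and veg >= 2 * rep) or (rep == 0 and veg > 1):
        return "vegetative"
    if (veg > 0 and rep >= 2 * veg) or (veg == 0 and rep > 1):
        return "reproductive"
    return None
```

Under this reading a gene cannot be assigned to both stages, and genes with comparable averages at the two stages are assigned to neither.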
Certain subnetworks were extracted for further investigation. miR393b, specifically expressed at the reproductive stage, targets two genes encoding F-box domain- and LRR-containing proteins, both of which are involved in flower development according to their GO annotations (Fig. 4a-1). Both target genes were highly expressed at the reproductive stage. Although the interaction miR393b–LOC_Os05g05800.1 (OsFBL21: F-box domain- and LRR-containing protein, expressed; RAP ID, Os05g0150500) is ‘not well supported’ according to our filtering criteria, an evident cleavage signal was found in the ‘reproductive’ degradome library GSM476257, strongly indicating that this interaction exists in planta. LOC_Os02g47420 (ATROPGEF7/ROPGEF7, putative, expressed), also highly expressed at the reproductive stage, is targeted by miR5156. Interestingly, the orthologous gene of LOC_Os02g47420 in maize encodes a pollen-specific kinase (Fig. 4a-3). The target gene of miR2275d, LOC_Os03g19480 (SET domain-containing protein, expressed), is implicated in reproduction, although it is not highly expressed at the reproductive stage (Fig. 4a-4). We also noted that certain genes highly expressed at the vegetative stage were included in the reproductive-specific miRNA-mediated subnetworks, such as miR2118c/q–LOC_Os07g04040.1 (expressed protein; RAP ID, Os07g0132500) and miR5527–LOC_Os03g32040.1 (phenazine biosynthesis protein, putative, expressed; RAP ID, Os03g0434400; Figs 4a-2, S6).
Many targets within the network mediated by the reproductive-specific sRNAs are involved in reproduction-related biological processes, such as LOC_Os03g22570 (MIF4G domain-containing protein, putative, expressed) and LOC_Os03g44200 (beclin-1, putative, expressed) targeted by sRNA490 (Fig. 4b-1), LOC_Os03g57870 (RFC3: Putative clamp loader of PCNA, replication factor C subunit 3, expressed) targeted by sRNA508 (Fig. 4b-2), LOC_Os07g06970 (HEN1, putative, expressed) targeted by sRNA53 (Fig. 4b-3), and LOC_Os03g29680 (EARLY flowering protein, putative, expressed) simultaneously targeted by sRNA353, sRNA399 and sRNA430 (Fig. 4b-4).
Within the network mediated by the vegetative-specific miRNAs, LOC_Os05g03040, which encodes an AP2 domain-containing protein, is targeted by miR172c (Fig. 4c). Both miR172c and LOC_Os05g03040 were highly expressed at the vegetative stage, and the interaction miR172c–LOC_Os05g03040 was ‘well supported’ by the degradome sequencing data. In Arabidopsis, the miR172–AP2 regulatory cascade plays an important role in flower development, and miR172 regulates AP2 genes through translational repression (Aukerman & Sakai, 2003; Chen, 2004). Intriguingly, the cleavage-based interaction between miR172c and LOC_Os05g03040 identified in rice suggests a novel role of the miR172–AP2 cascade in developmental phase maintenance or transition. This hypothesis requires further confirmation.
Within the network mediated by the vegetative-specific sRNAs, many targets are involved in postembryonic development or leaf senescence, such as LOC_Os03g48970 (nuclear transcription factor Y subunit, putative, expressed) targeted by sRNA37 (Fig. 4d-1), LOC_Os06g12870 (leaf senescence related protein, putative, expressed) targeted by sRNA407 and sRNA46 (Fig. 4d-2), LOC_Os10g04580 (ras-related protein, putative, expressed) targeted by sRNA277 (Fig. 4d-3), LOC_Os05g48870 (auxin response factor 15, putative, expressed) targeted by sRNA395 (Fig. 4d-4), and LOC_Os02g49840 (OsMADS57: MADS-box family gene with MIKCc type-box, expressed) targeted by sRNA35 (Fig. 4d-6).
Across all the networks constructed in this study, the interactions ‘well supported’ by degradome data tended to be sRNA–target pairs in which the sRNA regulator and the target gene shared consistent expression patterns (Fig. 4d-6), rather than pairs with different expression patterns (Fig. 4d-5).
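The comparison between degradome support and expression consistency can be illustrated with a minimal sketch. It assumes that ‘consistent’ means the sRNA and its target are highly expressed at the same stage, and that each interaction carries a yes/no ‘well supported’ flag. The record layout, the function name `support_by_concordance` and all identifiers and values below are hypothetical placeholders, not data or code from this study.

```python
from collections import Counter

# Hypothetical toy records, NOT results from the study:
# (sRNA id, sRNA-specific stage, target id, stage of high target expression,
#  well supported by degradome data?)
interactions = [
    ("sRNA_A", "vegetative", "gene_1", "vegetative", True),
    ("sRNA_B", "vegetative", "gene_2", "reproductive", False),
    ("sRNA_C", "reproductive", "gene_3", "reproductive", True),
    ("sRNA_D", "reproductive", "gene_4", "vegetative", False),
    ("sRNA_E", "reproductive", "gene_5", "reproductive", False),
]


def support_by_concordance(records):
    """Return the fraction of well-supported pairs in each expression group."""
    total = Counter()
    supported = Counter()
    for _srna, srna_stage, _target, target_stage, well_supported in records:
        # Assumption: a pair is "consistent" when both stages match.
        group = "consistent" if srna_stage == target_stage else "different"
        total[group] += 1
        if well_supported:
            supported[group] += 1
    return {group: supported[group] / total[group] for group in total}


fractions = support_by_concordance(interactions)
for group, fraction in sorted(fractions.items()):
    print(f"{group}: {fraction:.2f} of pairs well supported")
```

On real data, the same grouping could be followed by a contingency-table test to judge whether the difference between the two groups is statistically meaningful.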
Taken together, based on the degradome sequencing data and the expression patterns of the target genes, highly reliable and biologically meaningful networks that have great potential in maintaining developmental stages or regulating the vegetative-to-reproductive phase transition in rice have been constructed. These networks provide biologists with valuable hints for further studies on rice phase maintenance and transition.
In this study, we performed a comprehensive search for the miRNAs and other sRNAs specifically expressed at the vegetative and reproductive stages. Through the integrative use of degradome and RNA high-throughput sequencing data, the targets of these stage-specific sRNAs were identified and validated by comparing the stage-specific patterns among the sRNAs/miRNAs, the target genes and the cleavage signals. Although the target cleavages guided by the stage-specific sRNAs need further fine-scale experimental confirmation, several points support our observations: (i) all the stage-specific sRNAs were enriched in AGO1, which has been demonstrated to possess RNA slicing activity; (ii) sequence feature analysis indicates that, like the canonical miRNAs, these sRNAs tend to be incorporated into AGO1-associated RISCs; (iii) the statistical results shown in Table 1 and the subnetwork analysis indicate the high reliability of the constructed networks; and (iv) several reports on target cleavage by ta-siRNAs, nat-siRNAs, mirtrons and miRNA*s (Allen et al., 2005; Borsani et al., 2005; Williams et al., 2005; Meng et al., 2011a,b; Meng & Shao, 2012) strongly suggest that a large portion of endogenous sRNAs other than miRNAs could exert cleavage-based regulatory actions.
All the stage-specific sRNAs, along with their target genes, were deposited in RAP-DB (http://rapdb.dna.affrc.go.jp/; Itoh et al., 2007; Tanaka et al., 2008). Specifically, the sRNAs with unique genomic loci can be viewed in GBrowse by entering sRNA IDs or by defining a specific genomic region on a rice chromosome. The full lists of the stage-specific sRNAs, including those with multiple mapped genomic positions, can be retrieved from the ‘Download’ page of RAP-DB (http://rapdb.dna.affrc.go.jp/download/irgsp1.html). The full lists provide the IDs and the genomic locations of the sRNAs, both of which are useful for GBrowse-based queries, as well as the sRNA sequences and the corresponding targets. The sRNAs presented in our work have not yet been thoroughly characterized, and their origin, biogenesis and classification need further investigation.
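For readers working with the downloaded lists offline, the two query modes described above (by sRNA ID or by genomic region) can be sketched as follows. This is only an illustration: the `SrnaRecord` fields, the 1-based inclusive coordinates, and the example IDs and positions are assumptions, and do not reflect the actual file format of the RAP-DB download.

```python
from typing import List, NamedTuple


class SrnaRecord(NamedTuple):
    """Hypothetical record layout; the real download format may differ."""
    srna_id: str
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


# Placeholder entries, not real sRNA loci.
records = [
    SrnaRecord("sRNA_X", "chr01", 1000, 1020),
    SrnaRecord("sRNA_Y", "chr01", 5000, 5023),
    SrnaRecord("sRNA_Z", "chr03", 1010, 1030),
]


def find_by_id(recs: List[SrnaRecord], srna_id: str) -> List[SrnaRecord]:
    """Return every locus recorded for one sRNA ID (may be more than one)."""
    return [r for r in recs if r.srna_id == srna_id]


def find_in_region(recs: List[SrnaRecord], chrom: str,
                   start: int, end: int) -> List[SrnaRecord]:
    """Return sRNAs on `chrom` whose interval overlaps [start, end]."""
    return [r for r in recs
            if r.chrom == chrom and r.start <= end and r.end >= start]


hits = find_in_region(records, "chr01", 900, 1500)
print([r.srna_id for r in hits])
```

An sRNA with multiple mapped positions would simply appear as several records sharing one ID, which is why `find_by_id` returns a list.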
The authors would like to thank the scientists who made their datasets publicly available. The authors would also like to thank Prof. Steven Strauss, Dr Pankaj Jaiswal and the anonymous referees for their constructive comments, and Dr Takeshi Itoh and his colleague for their kind assistance with integrating the datasets into RAP-DB. This work was funded by the National Natural Science Foundation of China (31100937, 31271380 and 30971743) and by a Starting Grant from Hangzhou Normal University to Y.M. (2011QDL60).