Keywords:

  • deep sequencing;
  • bioinformatics;
  • miRBase;
  • genetics;
  • amphibian

Abstract

  1. Top of page
  2. Abstract
  3. INTRODUCTION
  4. RESULTS
  5. DISCUSSION
  6. METHODS
  7. REFERENCES
  8. Supporting Information

Using a combination of deep sequencing and bioinformatics, we identify for the first time miRNAs and their relative abundance in mature, metaphase II arrested eggs of Xenopus laevis. We characterize 115 miRNAs that have been described in Xenopus tropicalis (85), X. laevis (9), or other vertebrate species (21) and that also map to known Xenopus pre-miRNAs and to the X. tropicalis genome. In addition, 72 new putative candidate miRNAs for X. laevis are identified based on mapping to regions of the X. tropicalis genome that have the propensity to form hairpin loops. These data expand the genetic information available for X. laevis and identify target miRNAs for future functional studies. genesis 50:286–299, 2012. © 2012 Wiley Periodicals, Inc.


INTRODUCTION

In recent years, microRNAs (miRNAs) have emerged as an important class of gene regulators. They mediate complex post-transcriptional regulatory activities by interacting with coding and noncoding regions of messenger RNAs (Abdelmohsen et al.,2008; Tay et al.,2008). Although they recognize their targets with incomplete complementarity, the hallmark of these short sequences is perfect base pairing between the target mRNA and nucleotides 2–7 at the 5' end of the mature miRNA, referred to as the “seed” sequence. This seed-mediated binding is required for modulation of mRNA expression (Doench and Sharp,2004; Grimson et al.,2007). Their role, initially described as post-transcriptional repression, was more recently expanded to include post-transcriptional activation/derepression (Cordes et al.,2009; Mortensen et al.,2011) and mRNA stability (Valencia-Sanchez et al.,2006).

Biogenesis of miRNAs is elaborate. Primary miRNA transcripts (pri-miRNAs) are produced mostly by RNA Pol II as kilobase-long transcripts, either from intronic sequences of protein-coding genes or from their own transcription units that are not associated with protein-coding genes. The pri-miRNAs are processed in the nucleus by the RNase III enzyme Drosha and cleaved into 60–100 nt long pre-miRNAs that form a hairpin secondary structure. The pre-miRNAs are exported into the cytoplasm and processed into double-stranded miRNA duplexes by another RNase III enzyme, Dicer. The passenger strands are removed and degraded, and the remaining guide strands are loaded onto the RNA-induced silencing complex (RISC), where they become single-stranded mature miRNAs (reviewed by Kim et al.,2009). The location of their interaction within the target mRNA, as well as the degree of complementarity, determines their activity as translational repressors, translational activators, or mRNA stabilizers.

MicroRNAs participate in various cellular functions and are increasingly being identified as diagnostic markers in several diseases and cancers (Brase et al.,2010; Chen et al.,2011; Cortez and Calin,2009; Kroh et al.,2010; Lawrie et al.,2008; Mahn et al.,2011; Shah et al.,2009; Weigel and Dowsett,2010; Wittmann and Jäck,2010; Zen and Zhang,2010). The role of miRNAs in modification of chromatin remodeling complexes and their ability to promote trans-differentiation of cell phenotype have been described more recently (Cardinali et al.,2009; Chen et al.,2006; Fineberg et al.,2009; Yoo et al.,2009,2011). MicroRNAs exhibit a high level of conservation across species (Altuvia et al.,2005; Liu et al.,2008). Computational prediction of miRNAs suggests thousands of miRNA molecules for human and other species (Mazière and Enright,2007; Olena and Patton,2010). The most current version of miRBase contains 16,772 entries representing hairpin precursor miRNAs, expressing 19,724 mature miRNA products in 153 species (miRBase Registry, release 17; Kozomara and Griffiths-Jones,2011).

The African clawed frog Xenopus laevis is an important model organism that has been used in developmental biology research for decades. X. laevis egg extracts have been invaluable in studying biological processes such as chromatin remodeling and acquisition of transcriptional competence (Blow and Laskey,1986; Dimitrov and Wolffe,1996; Kikyo et al.,2000; Lohka and Masui,1983a,b), cell cycle (Lohka,1989), and DNA replication (Blow,1993). One of the most compelling properties of X. laevis metaphase II arrested egg extracts is their ability to reprogram differentiated somatic cells into cells that express stem cell genes (Alberio et al.,2005; Byrne et al.,2003; Gurdon et al.,1958,2005; Hansis et al.,2004; Miyamoto et al.,2007). While an expanding database of X. laevis genomic and genetic information is emerging, sequencing of the X. laevis genome has not yet been completed. Similarly, the X. laevis transcriptome, small RNAome, and proteome remain incomplete compared to other species. For example, 1,902 mature miRNAs have been published for human, 207 for X. tropicalis, and only 22 for X. laevis (miRBase, version 17). All the miRNA sequences in X. laevis are derived from a single published study (Watanabe et al.,2005) and from unpublished data from Biasci et al. (2008) (miRBase).

Traditionally, miRNAs have been discovered by cloning of small RNAs (Watanabe et al.,2005) followed by conventional Sanger sequencing. However, this methodology has limited application in discovering low-abundance miRNAs. Another avenue of miRNA discovery is computational analysis based on RNA secondary structure predictions and sequence conservation across species. Even though computational methods are useful in predicting miRNAs, expression of these transcripts still needs to be verified experimentally. Recent advances in high-throughput sequencing technologies are paving the way for discovering new and rare species of expressed small RNAs. Illumina sequencing (formerly Solexa) is an ideal platform for miRNA discovery due to its 35 bp read lengths and the sequencing depth it affords (Szittya et al.,2008). Therefore, a combination of deep sequencing and computational analysis is an attractive proposition for identifying new and rarely expressed, nonconserved, and species-specific miRNAs.

Here, we undertook a deep sequencing approach to identify miRNAs in metaphase II arrested X. laevis eggs. Combined with bioinformatics and interrogation of genomic sequences available for X. tropicalis, we characterize populations of miRNAs in metaphase II egg extracts, describe their likely precursor sequences (pre-miRNAs), identify putative new miRNAs, map their locations to the genomic scaffolds of X. tropicalis, and identify their mRNA targets.

RESULTS

Small RNA Reads in X. laevis Metaphase II Arrested Eggs

A total of 12,526,420 raw reads were obtained from sequencing short RNAs from X. laevis metaphase II arrested eggs. Reads were filtered to 11,302,087 mappable reads using the criteria described in Table 1 and assigned to the groups described in detail in Figure 1. Only reads between 15 and 24 nucleotides, corresponding to the conventionally accepted miRNA length, and mapping perfectly to the available X. tropicalis genome scaffolds were included in the dataset. All identified sequences were able to fold into the hairpin-loop structure characteristic of a pre-miRNA. As X. laevis genome sequence data become available, additional sequences identified (but not presented) in this study may be revisited. The comprehensive dataset is included in Supporting Information Table S1 and is available at http://users.wpi.edu/∼tdominko/XenopusProject/. A total of 115 unique reads in the 15–24 nucleotide size range mapped to mature miRNAs and pre-miRNAs of X. laevis, X. tropicalis, and other vertebrate species in the current version of miRBase, and these pre-miRNAs further mapped to the X. tropicalis genome (Figure 2, Groups 1a, b, c and Group 2). An additional 72 putative candidate (PC) miRNAs were identified that did not map to any known pre-miRNAs but mapped to the X. tropicalis genome within sequences with the propensity to form hairpin structures (Figure 2, Group 4). The length distribution of small reads is presented in Figure 3.
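
The size filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name `collapse_reads`, the toy reads, and the rule that discards reads containing non-ACGT characters are all assumptions, and the genome-mapping and hairpin-folding steps are not shown.

```python
from collections import Counter

MIN_LEN, MAX_LEN = 15, 24  # size window stated in the text


def collapse_reads(reads):
    """Keep 15-24 nt reads made only of A/C/G/T and count identical sequences.

    Dropping reads with ambiguous bases is an assumption of this sketch.
    """
    kept = (
        r.upper()
        for r in reads
        if MIN_LEN <= len(r) <= MAX_LEN and set(r.upper()) <= set("ACGT")
    )
    return Counter(kept)


# Toy input (hypothetical reads, not data from the study)
raw_reads = [
    "TGAGGTAGTAGGTTGTATAGTT",       # 22 nt, kept
    "TGAGGTAGTAGGTTGTATAGTT",       # duplicate, counted a second time
    "TGAGG",                        # 5 nt, too short
    "TGAGGTAGTAGGTTGTATAGTTAAAAA",  # 27 nt, too long
    "TGAGGTAGTAGNTTGTATAGTT",       # contains N, dropped
]
counts = collapse_reads(raw_reads)
```

Collapsing identical reads into counts is what makes per-miRNA copy numbers, such as those reported in the Supporting Information tables, straightforward to tabulate.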

Figure 1. Data analysis flowchart.

Figure 2. Criteria used to group miRNAs identified in X. laevis metaphase II arrested eggs.

Figure 3. Length distribution of sequencing reads between 15 and 25 nucleotides.

Table 1. Criteria Used for miRNA Annotation and Hairpin Structure Determination
miR annotation presented in Supporting Information Tables
miRNA_name is the name of detected miRNA sequence.
The miR_name is composed of the 1st known miR name in a cluster, an underscore, and a matching annotation, such as:
L-n means the miRNA_seq (detected) is n bases less than known rep_miRSeq in the left side
R-n means the miRNA_seq (detected) is n bases less than known rep_miRSeq in the right side
L+n means the miRNA_seq (detected) is n bases more than known rep_miRSeq in the left side
R+n means the miRNA_seq (detected) is n bases more than known rep_miRSeq in the right side
2ss5TC13TA means 2 “sequence substitutions” (ss), which are T>C at position 5 and T>A at position 13 of the representative miRNA.
Hairpin determination in Supporting Information Tables
Definition of MFEI: MFEI = −dG*100/mirLen/CG%. Reference: Cell Mol Life Sci 63 (2006) 246-254.
Definition of #base_in Loop: This is the maximum number of bases appearing in hairpin loop region. This number is only for gp1c and gp2.
Criteria:
1. Number of allowed errors in one bulge in stem: ≤ 12
2. Number of basepairs (bp) in stem region: ≥16
3. Free energy (dG in kcal/mol): ≤−15
4. Length of hairpin (up and down stem + terminal loop): ≥50
5. Length of terminal loop: ≤20
6. Number of allowed errors in one bulge in mature region: ≤8
7. Number of allowed biased errors in one bulge in mature region: ≤4
8. Number of allowed biased bulges in mature region: ≤2
9. Number of basepairs (bp) in mature or mature* region: ≥12
10. Percentage of small RNA in stem region (pm): ≥80%
11. Number of allowed errors in mature region: ≤7
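
The MFEI definition and a few of the numeric thresholds in Table 1 can be written out as a short Python sketch. It assumes that CG% is expressed as a percentage (for example 50, not 0.5) and that mirLen is the length of the folded sequence; the function names are illustrative, and only criteria 2 to 5 are checked here because the bulge-related criteria require a parsed secondary structure.

```python
def gc_percent(seq):
    """GC content as a percentage (0-100)."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def mfei(dg, seq):
    """MFEI = -dG * 100 / length / GC%, following the Table 1 definition.

    dg is the folding free energy in kcal/mol (negative for stable hairpins).
    Treating GC% as a percentage is an assumption of this sketch.
    """
    return -dg * 100.0 / len(seq) / gc_percent(seq)


def passes_basic_criteria(dg, hairpin_len, loop_len, stem_bp):
    """Check Table 1 criteria 2-5 only (stem bp, dG, hairpin length, loop length)."""
    return stem_bp >= 16 and dg <= -15 and hairpin_len >= 50 and loop_len <= 20


# Hypothetical 80-nt hairpin with 50% GC and dG = -30 kcal/mol
toy_hairpin = "GC" * 20 + "AT" * 20
toy_mfei = mfei(-30.0, toy_hairpin)               # 30 * 100 / 80 / 50 = 0.75
toy_ok = passes_basic_criteria(-30.0, 80, 10, 25)
```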

Reads Mapping to Known miRNAs in X. laevis or X. tropicalis (Group1a)

A total of 94 reads mapped to previously described X. laevis (Table 2) or X. tropicalis (Table 3) miRNAs/pre-miRNAs in miRBase (Release 17), and these pre-miRNAs further mapped to the X. tropicalis genome (JGI v4.1, www.jgi.doe.gov).

Table 2. Nine Known X. laevis Mature miRNAs Identified in X. laevis Metaphase II Arrested Eggs (Group 1a)
#  Reference miR/pre-miR_name  miR_seq  Accession #
1  xla-miR-15c  TAGCAGCACATCATGGTTTGTA  JN795018
2  xla-miR-18  TAAGGTGCATCTAGTGCAGTTA  JN795022
3  xla-miR-19b  TGTGCAAATCCATGCAAAACTGA  JN795025
4  xla-miR-20  CAAAGTGCTCATAGTGCAGGTAG  JN795026
5  xla-miR-23a  ATCACATTGCCAGGGATTTCCA  JN795031
6  xla-miR-92a  TATTGCACTTGTCCCGGCCTGT  JN795051
7  xla-miR-205  TCCTTCATTCCACCGGAGTCTGT  JN795102
8  xla-miR-363  CGGGTGGATCACGATGCAATTT  JN795115
9  xla-miR-427  AAAGTGCTTTCTGTTTTGGGCGT  JN795119

Table 3. Eighty-five miRNAs Identified in X. laevis Metaphase II Arrested Eggs that Correspond to Known X. tropicalis Mature miRNAs/pre-miRNAs (Group 1a)
#  Reference miR/pre-miR_name  miR_seq  Suggested name  Accession #
1  xtr-let-7a  TGAGGTAGTAGGTTGTATAGTT  xla-let-7a  JN795004
2  xtr-let-7b  TGAGGTAGTAGTTTGTGTAGTT  xla-let-7b  JN795006
3  xtr-let-7c  TGAGGTAGTAGGTTGTATGGTT  xla-let-7c  JN795007
4  xtr-let-7e  TGAGGTAGTAGGTTGTTTAGTT  xla-let-7e  JN795008
5  xtr-let-7f  TGAGGTAGTAGATTGTATAGTT  xla-let-7f  JN795009
6  xtr-let-7g  TGAGGTAGTAGTTTGTACAGTT  xla-let-7g  JN795010
7  xtr-let-7i  TGAGGTAGTAGTTTGTGCTGTT  xla-let-7i  JN795011
8  xtr-miR-1a  TGGAATGTAAAGAAGTATGTAT  xla-miR-1a  JN795012
9  xtr-miR-7  TGGAAGACTAGTGATTTTGTTGT  xla-miR-7  JN795013
10  xtr-miR-10a  TACCCTGTAGATCCGAATTTGT  xla-miR-10a  JN795014
11  xtr-miR-10b  TACCCTGTAGAACCGAATTTGT  xla-miR-10b  JN795015
12  xtr-miR-15a  TAGCAGCACATAATGGTTTGTGA  xla-miR-15a  JN795016
13  xtr-miR-15b  TAGCAGCACATCATGATTTGCA  xla-miR-15b  JN795017
14  xtr-miR-16b  TAGCAGCACGTAAATATTGGGT  xla-miR-16b  JN795019
15  xtr-miR-16c  TAGCAGCACGTAAATACTGGAG  xla-miR-16c  JN795020
16  xtr-miR-17-5p  CAAAGTGCTTACAGTGCAGGTAG  xla-miR-17-5p  JN795021
17  xtr-miR-18a  TAAGGTGCATCTAGTGCAGATA  xla-miR-18a  JN795023
18  xtr-miR-20a  TAAAGTGCTTATAGTGCAGGTAG  xla-miR-20a  JN795027
19  xtr-miR-22  AAGCTGCCAGTTGAAGAACTGT  xla-miR-22  JN795029
20  xtr-miR-22*  AGTTCTTCAGTGGCAAGCTTT  xla-miR-22*  JN795030
21  xtr-miR-23b  ATCACATTGCCAGGGATT  xla-miR-23b  JN795032
22  xtr-miR-24a  TGGCTCAGTTCAGCAGGAACAG  xla-miR-24a  JN795034
23  xtr-miR-25  CATTGCACTTGTCTCGGTCTGA  xla-miR-25  JN795035
24  xtr-miR-26  TTCAAGTAATCCAGGATAGGCT  xla-miR-26  JN795036
25  xtr-miR-27a  TTCACAGTGGCTAAGTTCCG  xla-miR-27a  JN795038
26  xtr-miR-27b  TTCACAGTGGCTAAGTTCTGC  xla-miR-27b  JN795039
27  xtr-miR-27c  TTCACAGTGGCTAAGTTC  xla-miR-27c  JN795040
28  xtr-miR-29c  TAGCACCATTTGAAATCGGTTA  xla-miR-29c  JN795041
29  xtr-miR-30a-3p  CTTTCAGTCAGATGTTTGCAGC  xla-miR-30a-3p  JN795042
30  xtr-miR-30a-5p  TGTAAACATCCTCGACTGGAAGC  xla-miR-30a-5p  JN795043
31  xtr-miR-30b  TGTAAACATCCTACACTCAGCT  xla-miR-30b  JN795044
32  xtr-miR-30c  TGTAAACATCCTACACTCTCAGCT  xla-miR-30c  JN795045
33  xtr-mir-30c-1  GTGAACATAAGGTGGCTGGGAGAA  xla-mir-30c-1-3p  JN795046
34  xtr-miR-30d  TGTAAACATCCCCGACTGGAAGCT  xla-miR-30d  JN795047
35  xtr-miR-30e  TGTAAACATCCTTGACTGGAAGCT  xla-miR-30e  JN795048
36  xtr-miR-34a  TGGCAGTGTCTTAGCTGGTTGT  xla-miR-34a  JN795049
37  xtr-miR-34b  AGGCAGTGTAGTTAGCTGATTG  xla-miR-34b  JN795050
38  xtr-miR-92b  TATTGCACTCGTCCCGGCCTCC  xla-miR-92b  JN795053
39  xtr-miR-93a  CAAAGTGCTGTTCGTGCAGGTAG  xla-miR-93a  JN795054
40  xtr-miR-99  AACCCGTAGATCCGATCTTGTG  xla-miR-99  JN795057
41  xtr-miR-100  AACCCGTAGATCCGAACTTGTG  xla-miR-100  JN795058
42  xtr-miR-101a  TACAGTACTGTGATAACTGAA  xla-miR-101a  JN795059
43  xtr-miR-106  AAAAGTGCTTATAGTGCAGGTAG  xla-miR-106  JN795060
44  xtr-miR-107  AGCAGCATTGTACAGGGCTAT  xla-miR-107  JN795061
45  xtr-miR-122  TGGAGTGTGACAATGGTGTTT  xla-miR-122  JN795062
46  xtr-miR-124  TAAGGCACGCGGTGAATGCCAA  xla-miR-124  JN795064
47  xtr-miR-125a  TCCCTGAGACCCTTAACCTGTGA  xla-miR-125a  JN795065
48  xtr-miR-125b  TCCCTGAGACCCTAACTTGTGA  xla-miR-125b  JN795066
49  xtr-miR-128  TCACAGTGAACCGGTCTCTTT  xla-miR-128  JN795067
50  xtr-miR-130a  CAGTGCAATGTTAAAAGGGCAT  xla-miR-130a  JN795069
51  xtr-miR-130b  CAGTGCAATGATGAAAGGGCAT  xla-miR-130b  JN795070
52  xtr-miR-130c  CAGTGCAATATTAAAAGGGCAT  xla-miR-130c  JN795071
53  xtr-miR-132  TAACAGTCTACAGCCATGGTC  xla-miR-132  JN795072
54  xtr-miR-135  TATGGCTTTTTATTCCTATGT  xla-miR-135  JN795074
55  xtr-miR-140  CAGTGGTTTTACCCTATGGTA  xla-miR-140  JN795075
56  xtr-miR-143  TGAGATGAAGCACTGTAGCTC  xla-miR-143  JN795077
57  xtr-miR-145  GTCCAGTTTTCCCAGGAATCCCT  xla-miR-145  JN795079
58  xtr-miR-146  TGAGAACTGAATTCC  xla-miR-146  JN795080
59  xtr-miR-146b  TGAGAACTGAATTCCATGGACT  xla-miR-146b  JN795081
60  xtr-miR-148a  TCAGTGCACTACAGAACTTTGT  xla-miR-148a  JN795082
61  xtr-miR-148b  TCAGTGCATCACAGAACTTTGT  xla-miR-148b  JN795084
62  xtr-miR-153  TTGCATAGTCACAAAAGTGATT  xla-miR-153  JN795086
63  xtr-miR-184  TGGACGGAGAACTGATAAGGG  xla-miR-184  JN795089
64  xtr-miR-184  TGGACGGAGAACTGATAAGG  xla-miR-184-2  JN795090
65  xtr-miR-191  CAACGGAATCCCAAAAGCAGCTGT  xla-miR-191  JN795091
66  xtr-miR-192  ATGACCTATGAATTGACAGCCA  xla-miR-192  JN795092
67  xtr-mir-194-2  CCGGTGGAGATGCTGTTATCTT  xla-mir-194-2-3p  JN795094
68  xtr-miR-199a*  ACAGTAGTCTGCACATTGGTT  xla-miR-199a*  JN795095
69  xtr-miR-200a  TAACACTGTCTGGTAACGATGTT  xla-miR-200a  JN795096
70  xtr-miR-200b  AATACTGCCTGGTAATGATGATT  xla-miR-200b  JN795097
71  xtr-miR-202*  TTCCTATGCATATACCTCTTT  xla-miR-202*  JN795098
72  xtr-miR-203  GTGAAATGTTTAGGACCACTTG  xla-miR-203  JN795099
73  xtr-miR-204  TTCCCTTTGTCATCCTATGCCT  xla-miR-204  JN795100
74  xtr-miR-206  TGGAATGTAAGGAAGTGTGTGG  xla-miR-206  JN795103
75  xtr-miR-210  CTGTGCGTGTGACAGCGGCTAA  xla-miR-210  JN795104
76  xtr-miR-215  ATGACCTATGAAATGACAGCCA  xla-miR-215  JN795107
77  xtr-miR-216  TAATCTCAGCTGGCAACTGTGA  xla-miR-216  JN795108
78  xtr-miR-217  ATACTGCATCAGGAACTGATTG  xla-miR-217  JN795109
79  xtr-miR-221  AGCTACATTGTCTGCTGGGTTT  xla-miR-221  JN795110
80  xtr-miR-222  AGCTACATCTGGCTACTGGGTCTC  xla-miR-222  JN795111
81  xtr-miR-338  TCCAGCATCAGTGATTTTGTTG  xla-miR-338  JN795113
82  xtr-miR-363-3p  AATTGCACGGTATCCATCTGTAA  xla-miR-363-3p  JN795116
83  xtr-miR-375  TTTGTTCGTTCGGCTCGCGTTA  xla-miR-375  JN795118
84  xtr-miR-455  TATGTGCCCTTGGACTACATCG  xla-miR-455  JN795120
85  xtr-miR-499  TTAAGACTTGCAGTGATGTTTA  xla-miR-499  JN795124

Thirty-nine of the 94 Xenopus miRNAs identified here (4 of 9 X. laevis and 35 of 85 X. tropicalis) showed a perfect match, while the remaining reads contained changes when compared to Xenopus miRNAs in miRBase. These changes most frequently included single nucleotide additions or deletions at either the 5' or the 3' end of mature reads, base substitutions, or a combination of these changes. The majority of the reads containing changes had a single or double base addition or deletion at their 3' end (5 in X. laevis and 39 in X. tropicalis). Three reads contained a single nucleotide substitution. None of these changes affected the seed sequence, located at the 5' end of the miRNA (nucleotides 2 through 8). Four reads contained base additions or deletions at their 5' ends and four reads contained changes at both ends; these changes consequently altered the seed sequence. Eight reads showed a 5' or 3' location shift on known Xenopus pre-miRNAs (Supporting Information Table S2).
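
To make the link between end changes and the seed concrete, the following Python sketch derives a Table 1 style tag (L±n, R±n) from the coordinates of a read and its reference miRNA on the same precursor, and then compares the two seeds. The mature sequence is xtr-miR-124 from Table 3; the two-base flanks of the toy precursor, the coordinates, and the function names are hypothetical and serve only as illustration.

```python
def end_annotation(ref_start, ref_end, read_start, read_end):
    """Build a tag such as 'L-1R+1' from 0-based, end-exclusive coordinates.

    L+n / R+n: the read extends n bases beyond the reference on that side.
    L-n / R-n: the read is n bases shorter than the reference on that side.
    """
    left = ref_start - read_start
    right = read_end - ref_end
    tag = ""
    if left:
        tag += f"L{left:+d}"
    if right:
        tag += f"R{right:+d}"
    return tag or "exact"


def seed(seq):
    """Nucleotides 2 through 8 of a mature miRNA (1-based positions)."""
    return seq[1:8]


# Toy precursor: hypothetical flanks around the xtr-miR-124 sequence from Table 3
precursor = "GG" + "TAAGGCACGCGGTGAATGCCAA" + "GT"
ref_start, ref_end = 2, 24      # reference miRNA on the toy precursor
read_start, read_end = 3, 25    # read shifted by one base toward the 3' end

reference = precursor[ref_start:ref_end]
read = precursor[read_start:read_end]
tag = end_annotation(ref_start, ref_end, read_start, read_end)
seed_is_shifted = seed(reference) != seed(read)
```

In this toy case the tag is L-1R+1 and the seed changes from AAGGCAC to AGGCACG, whereas a change confined to the 3' end leaves the seed window untouched.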

Reads Mapping to Other Known Vertebrate (Other Than Xenopus) miRNAs and Pre-miRNAs (Group 1b)

Three of 21 reads that mapped to X. tropicalis genome matched sequences described in zebrafish (dre-mir-24) and human (hsa-mir-98 and hsa-miR-129). Eleven reads matched miRNAs in other species and mapped to new regions within previously described pre-miRNAs; and seven reads matched miRNAs in other species but mapped to novel regions in X. tropicalis genome that have not been previously identified as pre-miRNAs (Table 4). Hence, these last seven represent potentially new miRNAs and new pre-miRNAs for X. tropicalis and X. laevis. The genomic sequences flanking the reads could form stable hairpin structures (Supporting Information Table S2).

Table 4. Twenty-one miRNAs Identified in X. laevis Metaphase II Arrested Eggs that Correspond to Known Vertebrate (non-Xenopus) Mature miRNAs/pre-miRNAs and Map to X. tropicalis Genome (Groups 1b and Group 2)
#  Reference miR/pre-miR_name  miR_seq  Suggested name  Accession #
Group 1b
1  cfa-miR-212  ACCTTGGCTCTAGACTGCTTACT  xla-miR-212-5p  JN795106
2  dre-miR-210*  AAGCCACTGACTAACGCACATT  xla-miR-210*  JN795105
3  hsa-let-7a*  ATACAATCTACTGTCTTTCTT  xla-let-7a*  JN795005
4  hsa-miR-20b*  ACTGTAATATGGGCACTTACA  xla-miR-20b*  JN795028
5  hsa-miR-26a-2*  CCTATTCTTGATTACTTGTTTC  xla-miR-26a-2*  JN795037
6  hsa-miR-92a-1*  AGGTTGGGATTGGTTGCAATGCT  xla-miR-92a-1*  JN795052
7  hsa-mir-98-p5  TGAGGTAGTAAGTTGTATTGTT  xla-miR-98-5p  JN795056
8  hsa-miR-129-3p  AAGCCCTTACCCCAAAAAGCA  xla-miR-129-3p  JN795068
9  hsa-miR-132*  CCGTGGCTTTAGATTGTTACT  xla-miR-132*  JN795073
10  hsa-miR-140  TACCACAGGGTAGAACCACGGA  xla-miR-140-3p  JN795076
11  hsa-miR-143*  GGTGCAGTGCTGCATCTCTGG  xla-miR-143*  JN795078
12  hsa-miR-148b*  GAAGTTCTGTTATACACTCCGGCT  xla-miR-148b*  JN795085
13  gga-miR-456  CAGGCTGGTTAGATGGTTGTCA  xla-miR-456  JN795121
14  gga-miR-458  ATAGCTCTTTGAATGGTACTGC  xla-miR-458  JN795122
15  gga-miR-460b-5p  TCCTCATTGTACATGCTGTGTG  xla-miR-460b-5p  JN795123
16  gga-miR-1805-5p  GAGTTGTAGTCTTTCAAACAG  xla-miR-1805-5p  JN795126
17  mmu-miR-204*  GCTGGGAAGGCAAAGGGACGT  xla-miR-204*  JN795101
18  mmu-mir-1983  GCTCCAGTGGCGCAA  xla-mir-1983-5p  JN795127
19  ssc-mir-1285  GTAGTGGGATCGCGC  xla-mir-1285-5p  JN795125
Group 2
20  dre-miR-24  TGGCTCAGTTCAGCAGGAACAGGT  xla-miR-24  JN795033
21  mmu-mir-5102  GGGAGTTTGACTGGGGCG  xla-mir-5102-5p  JN795128

Reads Mapping to X. tropicalis Genomic Regions: “Putative Candidate” miRNAs (Group 4)

A third category of 72 unique reads did not map to any known vertebrate miRNAs or pre-miRNAs in miRBase (Table 5). However, they all mapped to the X. tropicalis genome in regions that showed a propensity to form thermodynamically stable hairpin structures. These are considered putative candidate (PC) miRNAs for X. laevis and X. tropicalis. Importantly, none of these mapped to Xenopus genomic repeats (repbase). Fifty-eight reads mapped to a single locus and 15 mapped to multiple loci on X. tropicalis scaffolds (Supporting Information Table S3).

Table 5. Seventy-two Putative Candidate miRNAs Identified in X. laevis Metaphase II Arrested Eggs that Map to X. tropicalis Genome (Group 4)
#  Read name  miR_seq  Accession #
1  PC-3p-82353  TGGAATGTTAAGAAGTATGTAA  JN934795
2  PC-3p-473580  TGCTCTGAGAGACCTTCCTAACA  JN934797
3  PC-3p-559940  TGAGAGAAGAGTTGAGTAG  JN934798
4  PC-3p-572193  AAAGTGCTTCTCGTTCGGCTGA  JN934799
5  PC-3p-590570  AGGGATGGAATGGAAAGGAATGC  JN934800
6  PC-3p-722125  AACTAGTACATGCAGCAC  JN934801
7  PC-3p-777332  AGCATGATGGGAGTTGTA  JN934802
8  PC-3p-912538  TGTATAGCAAAATAG  JN934803
9  PC-3p-997070  ATGGGGAATGGTTCTGATTGGC  JN934804
10  PC-3p-1026732  TGCTACAGGAGTCCTGTGTCT  JN934805
11  PC-3p-1217232  TGCAGTCTGAGGGATGGAGTGGA  JN934806
12  PC-3p-1416830  TCTACAGTCCGAGAATC  JN934807
13  PC-3p-1429392  TTTCAGGAAAATGACACAAA  JN934808
14  PC-3p-1500718  ATTTTGTTGGTTTTCG  JN934809
15  PC-3p-1558848  CTGTGATTTTGACTGCAGA  JN934810
16  PC-3p-1570955  TAGAAGATCATTGGTTA  JN934811
17  PC-3p-1825226  TGAAGAAACTGTGTTA  JN934812
18  PC-3p-2128722  AGTAGTTTAGAGATGAAAGGAAGC  JN934813
19  PC-3p-2265908  TGTATGGCATTGTGGAAA  JN934814
20  PC-3p-2419584  TATTTGTCTATAATGTCCT  JN934815
21  PC-3p-2500255  TGCTGCTGTTGCTAAGTCTGGT  JN934816
22  PC-3p-2526314  TGAGGGATGGAATGGAAAGGAAT  JN934817
23  PC-3p-2608329  TGACAAGGGATGCTGGGAAA  JN934818
24  PC-3p-2838765  TTTTCTTCTAATCGGAATGGGTG  JN934819
25  PC-3p-2858122  TGCATTCTCTTCACTG  JN934820
26  PC-3p-2866177  CATTAGAGGACATTATAGACA  JN934821
27  PC-3p-3225349  TGGATCAATGTGAGA  JN934822
28  PC-3p-3396719  TGTAGACACAGAGGGATGATTC  JN934823
29  PC-3p-3404189  CACGTAGTCTGAACACTGGGGG  JN934824
30  PC-3p-3462114  TCTGTAGGAAAAAGGTAT  JN934825
31  PC-3p-3927070  GCTGTACTGCGAACT  JN934826
32  PC-5p-183320  CTGGTCAATAGCAGCTGAGCCATG  JN934827
33  PC-5p-184903  TAAGTAATGAAGAAGAATATG  JN934828
34  PC-5p-377209  TGCACACAGGTCTGGGGAA  JN934830
35  PC-5p-429687  CCAGTTGGGTGATAAATGA  JN934831
36  PC-5p-508025  AGAGTAGAGATTGTGCAGGGCA  JN934832
37  PC-5p-524204  TTGAGATGTAGGTTATAAGATT  JN934833
38  PC-5p-605340  TGATCGGATCTTGTACAATTTGA  JN934834
39  PC-5p-661517  GAAAGAAGACCCTGT  JN934835
40  PC-5p-661540  AATGAAGAGATAGCATTGCTGGTT  JN934836
41  PC-5p-705780  GACAGCAGGACGGTGG  JN934837
42  PC-5p-895083  AATTCAAGGAAAAATGTA  JN934838
43  PC-5p-904518  ACCATTGGATTGTGGG  JN934839
44  PC-5p-1057474  TATGACAGGCTCTTTGCAT  JN934840
45  PC-5p-1096123  ACCAGTGTGTAGACTACTGTTC  JN934841
46  PC-5p-1107030  AGGGATAGGAACTAGTCGGTTCA  JN934842
47  PC-5p-1212794  TTAACTGTAACAGAAGTC  JN934843
48  PC-5p-1258591  TGCACACAGGTCTGGGGA  JN934844
49  PC-5p-1564208  CGTAGTGGAAGGAGGCGTGCC  JN934845
50  PC-5p-1623845  GTGGAAAACAGACTGTG  JN934846
51  PC-5p-1703422  AGTTAAATGCGTCATGACTCA  JN934847
52  PC-5p-1750291  TTCATGTAGAATGTTGGTA  JN934848
53  PC-5p-1824198  AGGAAGTGGCTAATTC  JN934849
54  PC-5p-2103624  TAGTTCTACCGTCTTTCGACC  JN934851
55  PC-5p-2169326  AAGCTTCTGAATGTTCCAGTGA  JN934852
56  PC-5p-2291026  TCTGTAATACCCACTGAT  JN934853
57  PC-5p-2416157  TGTTTTGTTGTGTTGGATG  JN934854
58  PC-5p-2567058  TTATCTATTGTTTTTTG  JN934855
59  PC-5p-2589352  TGTAAGATCTTGAGGGTAT  JN934856
60  PC-5p-2703994  GGGAAGTAGAAGTGGGTGG  JN934857
61  5p-2773462  TGAGTTAAATGCGTCATGACT  JN934858
62  PC-5p-3009997  AGCAGGATCAGATGAACAGACAG  JN934859
63  PC-5p-3118828  TAGAGAGGAAGGTGGGGA  JN934860
64  PC-5p-3179154  TTAGTTTGTGGATGATA  JN934861
65  PC-5p-3372430  TTTAGGCTGAGGGCACAC  JN934862
66  PC-5p-3455914  ATCGCGTTAACCGGAAGTCTCA  JN934863
67  PC-5p-3532296  TTGGGGTCGTTGTCT  JN934864
68  PC-5p-3615992  TGCTGCCAGATCTGATACCGT  JN934865
69  PC-5p-3630407  TTTACACTAAAACTTGAA  JN934866
70  PC-5p-3648555  TATTTAGACGGAACCTA  JN934867
71  PC-5p-3770263  CCTGTGGGAATACTGCCAGC  JN934868
72  PC-5p-3828717  GCGGCTAGACGAGGCG  JN934869

Genomic Clustering and Expression of X. laevis miRNAs

The location of the reads was analyzed within X. tropicalis genomic scaffolds. Clustering was examined for multiple copies of the same read or single copies of multiple reads. Any reads that localized within approximately 15 kb of each other within the same genome ID were assigned to a cluster (Table 6 and Supporting Information Table S3). About half of all the known miRNAs localized to one of the clusters described previously (Tang and Maxwell,2008). The size of clusters ranged from 142 bp (cluster #2) to 14,946 bp (cluster #1). The largest cluster (#5) contained eight known miRNAs, three of which have not been previously described for X. laevis (Supporting Information Table S3). As observed previously, some miRNAs localized to multiple genomic loci, and the rest of the miRNAs appeared to be single-copy genes.
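
One way to picture the cluster assignment is the following Python sketch, which groups mapped loci on the same scaffold when the total span from the first start to the last end stays within 15 kb. The exact rule used in the study is not spelled out beyond the approximately 15 kb window, so the span-based criterion, the function name `cluster_loci`, and the toy scaffold names and coordinates are all assumptions.

```python
from collections import defaultdict

MAX_SPAN = 15_000  # "approximately 15 kb" in the text


def cluster_loci(loci, max_span=MAX_SPAN):
    """Group (scaffold, start, end, name) hits whose total span is <= max_span.

    Only groups with at least two hits are reported as clusters.
    """
    by_scaffold = defaultdict(list)
    for scaffold, start, end, name in loci:
        by_scaffold[scaffold].append((start, end, name))

    clusters = []
    for scaffold, hits in by_scaffold.items():
        hits.sort()
        current = [hits[0]]
        for hit in hits[1:]:
            if hit[1] - current[0][0] <= max_span:
                current.append(hit)
            else:
                if len(current) > 1:
                    clusters.append((scaffold, current))
                current = [hit]
        if len(current) > 1:
            clusters.append((scaffold, current))
    return clusters


# Toy loci (hypothetical scaffolds and coordinates)
loci = [
    ("scaffold_A", 1000, 1022, "miR-x"),
    ("scaffold_A", 1200, 1221, "miR-y"),
    ("scaffold_A", 50000, 50022, "miR-z"),
    ("scaffold_B", 1100, 1122, "miR-w"),
]
clusters = cluster_loci(loci)
cluster_length = clusters[0][1][-1][1] - clusters[0][1][0][0]
```

With these toy inputs, miR-x and miR-y form a single 221 bp cluster, while miR-z and miR-w remain unclustered.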

Table 6. Expression of miRNAs in X. laevis Metaphase II Arrested Eggs Localizing to Gene Clusters
#  miRNA gene cluster  Length (nt)
1  xtr-let-7a/xtr-mir100/miR-125a  14946
2  xtr-let-7a/xtr-let-7a*/xtr-let-7e  142
3  xtr-mir-15b/xtr-mir-16b  245
4  xtr-mir-15c/xtr-mir-16c  2088
5  xtr-mir-106/xla-mir-18/xla-mir-19b/xla-mir-20/hsa-mir-20b*/xla-mir-92a/xla-mir-363/xtr-mir-363-3p  724
6  xtr-mir-17-5p/xtr-mir-18a/xtr-mir-19b/xtr-mir-20a/xtr-mir-92a  729
7  xtr-mir-23b/dre-mir-24/xtr-mir-24a/xtr-mir-27b  911
8  xtr-mir-23a/xtr-mir-27a  684
9  xtr-mir-25/xtr-mir-93a  206
10  xtr-mir-30a-3p/xtr-mir-30a-5p/xtr-mir-30c  7491
11  xtr-mir-30b/xtr-mir-30d  4325
12  xtr-mir-30c/xtr-mir-30c-1-p3/xtr-mir-30e  1093
13  xtr-mir-34b/xtr-mir-34b  7678
14  hsa-mir-98-p5/xtr-let-7f  286
15  xtr-mir-99/xtr-let-7c  779
16  xtr-mir-130b/xtr-mir-130c  7810
17  xtr-mir-132/hsa-mir-132*/cfa-mir-212  5818
18  xtr-mir-143/hsa-mir-143*/xtr-mir-145  1048
19  xtr-mir-192/xtr-mir-194-2-p3  1664
20  xtr-mir-200a/xtr-mir-200b  1765
21  xtr-mir-202*/xtr-mir-202*  7814
22  xtr-mir-216/xtr-mir-217  460
23  xtr-mir-221/xtr-mir-222  602
24  xla-mir-427/xla-mir-427  493
#  Putative candidate miRNA clusters  Length (nt)
25  xtr-let-7e/PC-5p-183320  7722
26  PC-3p-572193/PC-3p-572193  380
27  PC-5p-1107030/PC-5p-1107030  2188
28  PC-3p-590570/PC-3p-2526314  2331
29  PC-3p-2589352/PC-3p-2589352  11972

Expression of clustered miRNAs varied significantly between clusters. Some clustered miRNAs exhibited similar expression patterns (e.g., xtr-miR-130b/130c in cluster #16 and miR-23a/27a in cluster #8), consistent with co-transcription of a common pri-miRNA (Baskerville and Bartel,2005; Tang and Maxwell,2008). Examination of other X. laevis clustered miRNAs, however, indicated post-transcriptional regulation. One of the most striking examples was expression of the eight miRNAs from the 724 bp cluster #5 (miR-363/363-3p/92a/19b/20/20b/18/106). The number of copies for individual miRNAs in this cluster ranged from 1 to 461 (Supporting Information Table S3). If these miRNAs are expressed from a single pri-miRNA, then differential processing of the pri-miRNA may be responsible for the observed variation. Alternatively, different miRNAs in the cluster could be expressed under the control of different promoters. We observed other expression imbalances, including miR-98-p5 and let-7f in cluster #14 (2 and 2,162 copies, respectively). Interestingly, both miRNAs are co-expressed from the same pri-miRNA and show a significant quantitative difference in all X. laevis tissues examined, but could not be detected in X. laevis oocytes (Tang and Maxwell,2008). The expression of miRNAs from a specific cluster could reflect developmental stage-specific transcription or differential post-transcriptional processing of pre-miRNAs.

Only five PC miRNAs localized to four clusters, each containing multiple copies of the same PC miRNA (clusters #26, 27, and 29) or two copies of two PC miRNAs (cluster #28). Expression of these PC miRNAs was low, and differential expression from different loci could not be determined by sequencing alone. Cluster #25 contained one known miRNA and one PC miRNA (Table 6).

Some miRNAs Are Expressed from the Same Pre-miRNA Sequence

Mapping of reads to the genome identified sequences of 17 pre-miRNAs that expressed two miRNAs each. Mature and star miRNAs (*) and those with the suffix 5p, 3p and p5, p3 represent miRNAs derived from different regions of the same pre-miRNA (Supporting Information Table S3). We observed differential expression of miRNA pairs derived from the same pre-miRNA. Among the known miRNAs, the expression ratio between the pairs varied widely. For example, xtr-let-7a and hsa-let-7a* were expressed at a ratio of 1,055 to 1 while xtr-miR-132 and newly identified hsa-miR-132* exhibited similar copy numbers (Supporting Information Table S3). Expression of the majority of PC miRNA pairs expressed from the same pre-miRNA showed much less variation, possibly due to their relatively low levels of expression.

Target Predictions for Putative Candidate miRNAs and Seed-Shifted miRNAs

Target prediction was performed using custom TargetScan (Release 5.2, June 2011) based on complementarity between the seed sequences of mature miRNAs (positions 2–8) and frog genes for 72 PC miRNAs identified in the study. Four reads contained seed sequences perfectly matched to seed sequences of known miRNAs (miR-106/93a/302; miR-199/199-5p; miR-139-5p; miR-1/206). Seed sequences of 68 PC miRNAs did not match previously identified miRNA seed sequences but all recognized frog transcript 3' UTR sequences. The number of target genes for individual reads ranged between 1 and 142 (Supporting Information Table S4).
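
The seed-based matching that underlies this kind of prediction can be illustrated with a small Python sketch. It extracts positions 2–8 of a mature miRNA, takes the reverse complement, and searches a 3' UTR for exact occurrences. This is a plain string search and not the TargetScan algorithm; the function names and the toy UTR are hypothetical, and sequences are written in the DNA alphabet used in the tables. The miRNA is xla-miR-427 from Table 2.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def seed(mirna):
    """Positions 2-8 of the mature miRNA (1-based)."""
    return mirna[1:8]


def seed_site(mirna):
    """Sequence on the transcript that pairs with the seed (reverse complement)."""
    return seed(mirna).translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna, utr):
    """Return 0-based start positions of exact seed sites in a 3' UTR."""
    site = seed_site(mirna)
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions


mir_427 = "AAAGTGCTTTCTGTTTTGGGCGT"   # xla-miR-427, Table 2
site = seed_site(mir_427)             # reverse complement of AAGTGCT
toy_utr = "CCGT" + site + "AAGGC" + site + "TG"  # hypothetical UTR with two sites
hits = find_seed_sites(mir_427, toy_utr)
```

Because only the 7-nt seed is compared, two miRNAs that share a seed are expected to recognize the same sites under this simplified rule, which is why PC miRNAs whose seeds match those of known miRNAs are noted separately above.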

Target prediction was also performed for 12 known miRNAs that showed “seed shift” occurring as a result of base changes (addition or deletion) at the 5' end of the reads (Supporting Information Table S5). Shift in the seed sequence for any miRNA resulted in loss of some target genes, in acquisition of some new target genes, and in maintenance of some of the same target genes. For example, comparison of target genes for xtr-miR-124_L-1R+1 (242 targets) and its representative xtr-miR-124 (174 targets) reveals that 117 targets remain the same even though the seed sequence has shifted. As a diametrically opposite example, seed sequence of hsa-miR-132* does not have any reported targets in Xenopus genome, but its seed-shifted variant hsa-miR-132*_L-1, detected in Xenopus eggs, recognizes three gene targets.
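
The bookkeeping behind the xtr-miR-124 example can be made explicit with set operations. The sketch below uses placeholder gene identifiers sized to match the counts quoted above (174 reference targets, 242 targets for the seed-shifted read, 117 shared); the identifiers and the function name are hypothetical, and only the three counts come from the text.

```python
def compare_targets(reference, shifted):
    """Split two target lists into shared, lost, and gained genes."""
    reference, shifted = set(reference), set(shifted)
    return {
        "shared": reference & shifted,
        "lost": reference - shifted,
        "gained": shifted - reference,
    }


# Placeholder gene IDs sized to the counts given in the text
shared = {f"gene{i}" for i in range(117)}
ref_targets = shared | {f"ref_only{i}" for i in range(174 - 117)}
shifted_targets = shared | {f"shift_only{i}" for i in range(242 - 117)}

result = compare_targets(ref_targets, shifted_targets)
summary = {key: len(value) for key, value in result.items()}
```

With these counts, the seed shift retains 117 targets, loses 57 (174 − 117), and gains 125 (242 − 117).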

DISCUSSION

X. laevis is an important model system in cell and developmental biology. In particular, the availability of its large oocytes and eggs has established X. laevis as a unique source of cytoplasmic extracts. While much has been learned about its smaller cousin X. tropicalis, genomic and genetic information for X. laevis remains largely unavailable. Here we report the first complete analysis of miRNA populations in X. laevis metaphase II arrested eggs using deep sequencing and bioinformatics analysis. Because the deep sequencing approach is independent of a known genome sequence, our data will remain suitable for further analysis when the genome information for X. laevis becomes available.

Available bioinformatics tools and approaches for analysis of deep sequencing data in any species rely on comparison of identified sequences to an existing genome sequence database. As sequencing of the X. laevis genome is still in progress, our analysis included only sequences matching the X. tropicalis genome. The genome assembly for X. tropicalis is available (JGI v4.1, www.jgi.doe.gov) with an estimated coverage of 90% on 19,759 scaffolds. While still not assembled into chromosomes, this resource represents the only tool for sequence comparison and analysis for X. laevis. Even though the two species share the same genus, significant differences between their genomes exist (Cannatella and de S,1993). X. laevis has a genome size of 3.1 Gb distributed over 18 pairs of chromosomes, while X. tropicalis is diploid with a genome size of 1.7 Gb distributed over 10 pairs of chromosomes.

Due to the differences in ploidy and chromosome number, it is reasonable to expect that comparison between X. laevis and X. tropicalis will identify sequence variations. The majority of Xenopus miRNA sequences in miRBase have been deposited based on similarity to vertebrate miRNAs. Recent reports on deep sequencing data for other species indicate that variations have been observed compared to miRBase sequences (Fernandez-Valverde et al.,2010; Lee et al.,2010). Such variations are attributed to RNA degradation, to sequencing errors, and to differences in miRNA processing in a cell- and tissue-dependent manner. In an allopolyploid species such as X. laevis, in which genomic sequence information is scant, it is difficult to tease out the exact causes of observed sequence variations. The variations observed in this study possibly represent X. laevis specific variations. It is also possible that factors such as handling of small RNAs, depth of sequencing, and default factors set for specific algorithms used for analysis can impact sequence analysis results. Alternatively, dynamic expression of miRNAs can change during development and the identity of small RNA species captured at a particular developmental stage will vary. Twenty-five miRNAs we found in common with Armisen et al. (2009) contained the same sequence variations reported here. Further probing into Xenopus miRNA sequences in miRBase revealed that a majority of the sequences were deposited purely based on sequence similarity.

We confirmed expression of many miRNAs described previously in adult X. laevis tissues (Michalak and Malone,2008; Tang and Maxwell,2008), and now also describe expression of the majority of these miRNAs in metaphase II arrested X. laevis eggs. Some previous studies failed to detect a number of miRNAs in X. laevis oocytes and eggs by Northern blotting (Tang and Maxwell,2008). This is not surprising, as the method may not be sensitive enough for miRNAs expressed at low levels. Among the 22 known X. laevis miRNAs in miRBase, only 7 have been verified by cloning (Watanabe et al.,2005) and the remaining 15 were presented based on similarity (Biasci et al., 2008, unpublished). Of the 191 X. tropicalis miR/pre-miRNAs in miRBase, only 2 sequences have been verified by sequencing (Armisen et al.,2009), 41 miRNAs were verified by Northern blotting (Tang and Maxwell,2008), and 148 were deposited based on sequence similarity. Therefore, miRBase sequences may not be a true representation of expressed Xenopus miRNAs. The identical sequences observed in the present study and in the study of Armisen et al. (2009), particularly for those reads with sequence changes, point to the importance of verifying miRNAs by sequencing before miRBase submission. Of the 94 reads matching known Xenopus miRNAs, only 9 correspond to X. laevis miRs in miRBase. The remaining 85 are therefore new xla-miRs.

Due to the high conservation of miRNAs across different species, it is not surprising that additional expressed reads in our study matched miRNA sequences described in other vertebrates. Metaphase II arrested eggs represent a unique developmental stage during oogenesis, and these miRNAs may be specific to this developmental stage. Conversely, we identified expression of miRNAs in metaphase II eggs that were believed to be testis specific, e.g., miR-100 (Michalak and Malone,2008), or associated with the central nervous system, e.g., miR-124 and the miR-23b/miR-24a/miR-27b cluster (Walker and Harland,2008).

The 72 putative candidate miRNA reads identified in this study are genuine expressed sequences with strong secondary structure prediction, which makes them putative candidate miRNAs for both X. laevis and X. tropicalis. Because they are unique to Xenopus, they are most likely genus-specific. It is unlikely that these reads belong to piRNAs, the predominant species of small RNAs in oocytes (Armisen et al.,2009): none of the reads map to genomic repeats (Okamura and Lai,2008), they are shorter than 25 nucleotides (Grimson et al.,2007), and their precursor genomic sequences have the propensity to form hairpin structures.

Our data indicate the presence of a repertoire of miRNAs that show a high level of conservation with X. tropicalis and a subset that show sequence differences at the RNA level. Some of the identified miRNAs show differences from X. tropicalis genomic sequences, suggesting that these sequences may be derived from X. laevis-specific loci. Therefore, it is possible that miRNAs are expressed from one or more loci among the multiple paralogs present in the X. laevis genome. Redundant genes, present in allopolyploid X. laevis, could exhibit preferential silencing or differential expression of the duplicate paralogs in a cell-, tissue-, or stage-specific manner, as well as exhibit heterochrony (Adams et al.,2003; Evans,2007; Hellsten et al.,2007; Morin et al.,2006,2008; Sémon and Wolfe,2008; Yanai et al.,2011). Contrary to the gene silencing theory, there is evidence to suggest that duplicated gene functions are preserved rather than silenced (Force et al.,1999; Lynch and Force,2000; Otto and Yong,2002). We also observed many miRNA polymorphisms, indicating that they are expressed from more than two genomic loci. The numbers of miR variants, sequence variations, and copy numbers are indicative of their origin from different loci. Such a high level of variation in a small stretch of genomic DNA is unusual and cannot be explained away as sequencing errors. We hypothesize that the variants with multiple sequence substitutions are derived from X. laevis-specific loci. However, the lack of corresponding X. laevis genomic sequences makes it difficult to confirm our hypothesis. Since we are mapping the sequences only to X. tropicalis, it is also likely that we are missing a number of X. laevis-specific miRNAs simply because reference sequences are not available.

Deep sequencing not only allows the identification of miRNAs, but also reflects the absolute expression levels based on copy number. The copy numbers of miRs that are derived from the same pre-miR differ considerably, which is indicative of the dynamics and fine-tuned modulation of miRNA expression despite their origin from the same genomic region.

One feature of miRs that has not been discussed extensively in the published literature is the significance of miRs mapping to multiple loci. Many miRBase sequences show this trend. We observed that 18 known Xenopus miRNAs identified in this study (and reported in miRBase) map to multiple genomic coordinates in the X. tropicalis genome. Sequences flanking the miRNA at all these loci could form thermodynamically stable hairpin structures. Bioinformatics predictions and sequence information of mature sequences alone do not provide sufficient information regarding the locus from which these miRNAs are expressed. The same miRNA could be expressed from different loci depending on its co-localization with gene clusters expressed in specific cell types or at different stages during development, as the majority of Xenopus miRNAs are intronic in nature (Tang and Maxwell,2008). This may reflect a mechanism that allows coordinated expression of genes and corresponding miRNAs in a cell-, tissue-, or stage-specific manner.
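
The ambiguity described above can be illustrated with a minimal sketch of exact-match mapping on both strands. This is not the mapping procedure used in this study; the function names, the toy scaffold sequences, and the read are hypothetical, and reads are assumed to be written in the DNA alphabet as they come off the sequencer.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_loci(read: str, scaffolds: dict) -> list:
    """Return (scaffold, 0-based start, strand) for every exact match of a read.

    Both strands are searched. A read that is its own reverse complement
    would be reported once per strand at the same position.
    """
    hits = []
    for name, seq in scaffolds.items():
        for strand, query in (("+", read), ("-", reverse_complement(read))):
            start = seq.find(query)
            while start != -1:
                hits.append((name, start, strand))
                start = seq.find(query, start + 1)
    return hits


# Hypothetical toy data: the read occurs twice on one scaffold and once,
# on the opposite strand, on another.
toy_scaffolds = {
    "scaffold_1": "TTACGGATTCGGACGGATTCAA",
    "scaffold_2": "CCGAATCCGTCC",
}
toy_read = "ACGGATTC"
loci = find_loci(toy_read, toy_scaffolds)
```

A mature sequence that returns more than one locus cannot be assigned to a single precursor from the read alone, which is why the flanking sequence at each locus has to be examined separately.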

Some reads differed from their representative miRNAs by a single nucleotide addition or deletion at the 5' end, thereby shifting the seed sequence. As a consequence, the loss of gene targets, acquisition of new targets, and maintenance of some of the same targets can be expected for the shifted miRNA. Depending on the presence of a 3' adenosine preceding the mRNA target sequence (7mer 1A) or true complementarity of the eighth nucleotide in the miRNA seed sequence with the target mRNA (m8), the seed-shifted miRNA can still pair with the same mRNAs (Grimson et al.,2007). These single nucleotide changes in the 5' miRNA sequences may be critically important in the context of developmental regulation.
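
The effect of a 5' shift on the seed can be made concrete with a short sketch. It assumes the seed is nucleotides 2–7 of the mature sequence, as defined in the Introduction; the function names and the example sequence are hypothetical and are not taken from this data set.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mature: str) -> str:
    """Return nucleotides 2-7 (1-based, from the 5' end) of a mature miRNA."""
    return mature[1:7]


def seed_match_site(mature: str) -> str:
    """Return the mRNA site (written 5'->3') that pairs perfectly with the seed."""
    return "".join(COMPLEMENT[base] for base in reversed(seed(mature)))


# Hypothetical mature sequence, used only for illustration.
reference = "UGACCGAUUCGAGGCUAAGCUA"
deleted_5p = reference[1:]        # one nucleotide removed from the 5' end
added_5p = "A" + reference        # one nucleotide added to the 5' end

for label, seq in (("reference", reference),
                   ("5' deletion", deleted_5p),
                   ("5' addition", added_5p)):
    print(f"{label:12s} seed={seed(seq)} site={seed_match_site(seq)}")
```

Each shifted seed shares five of its six nucleotides with the reference seed, offset by one position, which is why some targets can be retained while others are lost or gained.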

The current study adds a significant number of novel miRNA sequences to the currently available database for X. laevis. However, much experimental work remains to be performed to confirm their identity, their expression, and their functional significance during oocyte maturation, fertilization, and early embryonic development.

METHODS

Preparation of Xenopus Egg Extract and RNA Isolation

Five adult X. laevis females were superovulated, and metaphase II arrested eggs were collected, processed, and maintained in MII arrest following published protocols (Blow,1993). Total RNA was isolated using TRIZOL reagent (Invitrogen). Total RNA was resuspended in RNase/DNase-free water and treated with RNase-free DNase (Ambion) to eliminate any traces of contaminating DNA. The mixture was re-extracted with TRIZOL and resuspended in RNase/DNase-free water. RNA quality was assessed by gel analysis and Bioanalyzer (Agilent Technologies).

Small RNA Sequencing

Deep sequencing of small RNAs extracted from metaphase II arrested X. laevis eggs was performed at LC Sciences (Houston, TX) using Illumina's Genome Analyzer II platform. Small RNAs were size-fractionated to isolate RNA in the 15–50 nt range, ligated to the SRA5' adapter, and size-fractionated to isolate the 41–76 nt fraction. This was followed by SRA3' adapter ligation and size fractionation to isolate RNA in the 64–99 nt range. This RNA was reverse transcribed to produce single-stranded cDNA and PCR amplified for 20 cycles using Illumina's primer set. This pool was once again size-fractionated to isolate cDNAs in the 80–115 bp range that represented miRNA. This fraction was eluted, precipitated, and quantified on a Nanodrop (Thermo Scientific). The concentration of the cDNA sample was adjusted to 10 nM, and 10 μl was used for the sequencing reaction. Purified cDNA was used for cluster generation on Illumina's Cluster Station and sequenced using the Illumina GAIIx platform. Raw sequencing reads were obtained using Illumina's Pipeline v1.5 software following sequencing image analysis by the Pipeline Firecrest Module and base-calling by the Pipeline Bustard Module. Filtering of sequences was done as presented in Figure 1.
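
The successive size windows above are internally consistent, which can be checked with a few lines of arithmetic. The length added at each step is inferred here from the stated ranges only; it is not an adapter or primer length reported by the authors, and the variable and function names are hypothetical.

```python
# Size windows (lower, upper) taken from the text, in nt or bp.
size_windows = [
    ("size-selected small RNA", 15, 50),
    ("after SRA5' adapter ligation", 41, 76),
    ("after SRA3' adapter ligation", 64, 99),
    ("after RT-PCR amplification", 80, 115),
]


def added_lengths(windows):
    """Return (step label, length added) for each consecutive pair of windows.

    Raises ValueError if the lower and upper bounds do not shift by the
    same amount, which would mean the insert window was not preserved.
    """
    result = []
    for (_, lo0, hi0), (label, lo1, hi1) in zip(windows, windows[1:]):
        if lo1 - lo0 != hi1 - hi0:
            raise ValueError(f"window width changed at step: {label}")
        result.append((label, lo1 - lo0))
    return result


steps = added_lengths(size_windows)
total_added = sum(length for _, length in steps)
```

Under this reading, each window keeps the same 35 nt width as the original 15–50 nt fraction, so the final 80–115 bp products correspond to the original small RNA inserts plus a fixed amount of flanking sequence.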

Computational Analysis

Raw sequences were analyzed using the ACGT101-miR v4.2 software package (LC Sciences, Houston, TX). Several digital filters were applied to remove sequencing errors and eliminate unmappable reads. Unique reads in the 15–24 nucleotide size range were used for mapping to vertebrate microRNAs/pre-miRNAs (miRBase, Release 17, http://www.mirbase.org/index.shtml), to the X. tropicalis genome (JGI, version 4.1, http://genome.jgi-psf.org), and to Xenopus ESTs (Xenbase 2.7, http://www.xenbase.org). A computational approach was used to predict new miRNAs based on mapping of unique reads to X. tropicalis genomic sequences. Flanking sequences of mapped reads were subjected to secondary structure analysis to predict pre-miRNA sequences. Target prediction for putative candidate miRNAs, and for those miRNAs that showed a seed shift due to additions or deletions of bases at their 5' end, was performed using the custom TargetScan search (http://www.targetscan.org/vert_50/seedmatch.html).
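
The step from raw reads to unique reads with copy numbers can be sketched as follows. This is only an illustration of collapsing identical reads and applying the 15–24 nt size window; it is not the ACGT101-miR implementation, whose filters are not detailed here, and the function name and example reads are hypothetical.

```python
from collections import Counter


def collapse_reads(reads, min_len=15, max_len=24):
    """Collapse identical reads into unique sequences with copy numbers.

    Only sequences whose length falls within [min_len, max_len] are kept.
    Returns a dict mapping each unique sequence to its copy number.
    """
    counts = Counter(read.upper() for read in reads)
    return {
        seq: copies
        for seq, copies in counts.items()
        if min_len <= len(seq) <= max_len
    }


# Hypothetical reads: one 18 nt sequence seen twice, one read that is too
# short, and one that is too long.
example_reads = [
    "ACGTACGTACGTACGTAC",
    "acgtacgtacgtacgtac",
    "ACGT",
    "A" * 30,
]
unique_reads = collapse_reads(example_reads)
```

The copy number attached to each unique sequence is the quantity referred to in the Discussion when expression levels of individual miRs are compared.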

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Filename                     Size   Description
DVG_22010_sm_SuppTab1.doc    551K   Supporting Information Table S1
DVG_22010_sm_SuppTab2.doc    225K   Supporting Information Table S2
DVG_22010_sm_SuppTab3.doc    484K   Supporting Information Table S3
DVG_22010_sm_SuppTab4.doc    156K   Supporting Information Table S4
DVG_22010_sm_SuppTab5.doc    86K    Supporting Information Table S5
