Too Many False Targets for MicroRNAs: Challenges and Pitfalls in Prediction of miRNA Targets and Their Gene Ontology in Model and Non‐model Organisms

Short (“seed”) or extended base pairing between microRNAs (miRNAs) and their target RNAs enables post‐transcriptional silencing in many organisms. These interactions allow the computational prediction of potential targets. In model organisms, predicted targets are frequently validated experimentally; hence meaningful miRNA‐regulated processes are reported. However, in non‐models, these reports mostly rely on computational prediction alone. Many times, further bioinformatic analyses such as Gene Ontology (GO) enrichment are based on these in silico projections. Here such approaches are reviewed, their caveats are highlighted and the ease of picking false targets from predicted lists is demonstrated. Discoveries that shed new light on how miRNAs evolved to regulate targets in various phyletic groups are discussed, in addition to the pitfalls of target identification in non‐model organisms. The goal is to prevent the misuse of bioinformatic tools, as they cannot bypass the biological understanding of miRNA–target regulation.


Introduction
MicroRNAs (miRNAs) are small noncoding RNAs of 20-24 nucleotides (nt) that regulate messenger RNA (mRNA) levels in many eukaryotic lineages. [1][2][3] In bilaterian animals (the vast majority of extant animals) and land plants they play important roles in disease and most aspects of physiology because they are major components in canalization of development, fine tuning of expression, and genetic switching. [3][4][5] Their biogenesis begins with accurate cleavage of longer precursors by the RNase III enzymes Drosha and Dicer in animals, or just Dicer in plants (reviewed by Voinnet [3] and Kim et al. [6]). The product of this process is a duplex of small RNAs (sRNAs), of which one strand is preferentially loaded into an Argonaute (AGO) protein, the core of the RNA-induced silencing complex (RISC). The chosen strand then guides the RISC via base pairing to suppress its targets. [7] In theory, the relative regularity of miRNA biogenesis should allow efficient bioinformatic identification and annotation of miRNAs. However, even this initial stage in the study of miRNA biology raises various challenges that were extensively discussed elsewhere. [8][9][10][11][12][13] Different lineages display various strategies by which miRNAs, together with the RISC, execute target downregulation. In plants the dominant mechanism is recognition through high miRNA-target complementarity, which is required to mediate direct cleavage of the target (slicing) and/or its translational inhibition. [14][15][16][17][18][19][20] The majority of animals, however, utilize partial complementarity as the main mode of target recognition. This weak interaction mediates destabilization by deadenylation and translational inhibition of the targets (reviewed by Fabian et al. [21]).
The different modes of target recognition and repression by miRNAs in plants and bilaterians might influence the number of targets each miRNA has and the magnitude of their downregulation (reviewed by Moran et al. [22]). The fact that each miRNA in bilaterian animals can bind numerous transcripts, together with the fact that transcription factors are among the targets, led to the suggestion that whole genetic networks are regulated by miRNAs. [4] Translational inhibition and transcript destabilization, which are carried out by miRNAs in many animals, are on the one hand very subtle, [23] but on the other hand can enable reversibility, [4,24] hence contributing to complex regulatory networks. Mild reduction of dosage-sensitive genes such as transcription factors or RNA-binding proteins might be sufficient to generate significant downstream effects, as each of these has numerous targets of its own. In plants, slicing is a prominent mechanism that has a stronger and nonreversible effect on targets. [14,17,19] However, the coexistence of milder translational inhibition [15,16,18,20,25] and the tendency of miRNAs to target transcription factors in plants as well [26][27][28] imply that complex target regulation is likely to occur in this kingdom too.
Discoveries of miRNA-regulated processes rely on identification of genes regulated by specific miRNAs. For this purpose, target identification methods were developed. [29][30][31][32] These programs rely on principles of miRNA-target interaction in plants and animals, and have been discussed both in terms of how to improve the prediction of real targets and in terms of their comparative accuracy. [32,33] These software programs, and the discussions regarding their limitations, were always oriented to the knowledge we had from a handful of model organisms. However, the recent expansion of studies of functional miRNA-target interactions to previously uncharacterized organisms has shed new light on alternative modes of target regulation. [34][35][36] For example, cnidarians (sea anemones, corals, hydroids, and jellyfish) are non-bilaterian animals whose miRNAs bind targets with high complementarity, similarly to plants. [34,35] Therefore, it is time to revise the previous concepts of "animal-like" or "plant-like" target regulation, as these might not precisely reflect the real biology of miRNAs. Moreover, miRNAs are being discovered in eukaryotic lineages that are neither plants nor animals, [37][38][39] where targets are being predicted without experimental validation despite the lack of knowledge of their mode of target recognition.
Using recent experimental data about miRNA-target interactions from model and non-model organisms, we re-evaluate here the levels of false positives among predicted targets. We further show that reported conclusions about miRNA-regulated biological processes, when based on prediction alone in non-model organisms, are most likely false. We postulate that an understanding of how the miRNA pathway evolved is essential for understanding the limitations of miRNA-target identification methods, and we highlight previously undiscussed considerations that should be taken into account in non-model organisms to help distinguish real miRNA targets from false ones.

Why Do Researchers Trust Target Prediction Software?
Some regulatory noncoding RNAs such as long noncoding RNAs (lncRNAs) do not share features that enable an accurate prediction of their function; however, the mechanisms by which miRNAs regulate gene expression are well understood. By base pairing, miRNAs guide the RISC to reduce the levels of the targeted mRNA (reviewed by Bartel [2] and Voinnet [3] ). As it was experimentally shown that the miRNA system is involved in crucial biological processes such as development and disease in plants and animals, [2,3,40,41] many studies aim to identify miRNAs, their targets, and the biological process that they regulate. Base pairing enables us to predict targets of miRNAs; the fly in the ointment is that the binding sites are often as short as 7 nt and can be easily found by chance in large datasets of sequences. It has been more than a decade since it was shown that the exclusion of non-conserved binding sites from the lists of predicted targets considerably reduces such noise. [42] Since then, many prediction software programs were designed, some that take evolutionary conservation of target sites into account (for example: TargetScan [32] ) and some that do not (e.g., miRanda [30] or PITA [31] ). The latter are more frequently used when genomic data from closely related species are unavailable, but not always. It seems that some researchers tend to overestimate the prediction accuracy of such software, to a degree that experimentally non-validated miRNA targets are speculated to control various biological processes. [37][38][39][43][44][45][46][47][48][49][50][51] This is probably due to the misconception that the parameters that these software programs take into account (e.g., computation of miRNA-mRNA favorable thermodynamics and secondary structure of mRNA) enable us to efficiently eliminate most false positives.
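To make the "found by chance" argument concrete, the following minimal sketch estimates how many exact matches to a single 7-nt site are expected in random sequence. It assumes equal base frequencies and independent positions, which real 3′-UTRs do not satisfy; the function name and the 10 Mb dataset size are hypothetical and serve only as an illustration.

```python
def expected_chance_sites(total_nt: int, site_len: int = 7) -> float:
    """Expected number of exact matches to ONE fixed site of length site_len
    in total_nt nucleotides of random sequence.

    Assumption (illustrative only): A, C, G, U are equally frequent and
    positions are independent.
    """
    positions = max(total_nt - site_len + 1, 0)  # possible start positions
    return positions / 4 ** site_len             # each start matches with p = 4^-L


# Hypothetical 3'-UTR dataset of 10 Mb, one miRNA with a 7-nt seed site
print(expected_chance_sites(10_000_000))
```

Under these assumptions a single miRNA would be expected to match several hundred sites in such a dataset by chance alone, before any biology is considered.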
We believe that probable, or at least reasonable, speculations and suggestions about putative processes promote science. However, how likely is a computationally predicted miRNA target to be real? Now that high-throughput techniques for miRNA target identification are available, [52][53][54][55] we decided to make this estimation.

What Are the Actual Noise Levels of Target Prediction?
We compared the number of predicted targets to the number of targets that were experimentally validated by cross-linking, ligation, and sequencing of hybrids (CLASH) and high-throughput sequencing of RNA isolated by cross-linking and immunoprecipitation (HITS-CLIP) in a HEK293 cell line [55] and in the sea anemone Exaiptasia pallida (E. pallida) [35] (previously called Aiptasia pallida), respectively (Figure 1A). The first example (HEK293) was chosen as a well-studied reference cell line to highlight the consequences of using prediction tools alone. The second example (Exaiptasia) was chosen to highlight the consequences of applying these tools to a less studied organism, where such prediction approaches are used in spite of not knowing [37][38][39] or ignoring [48] the miRNA-mRNA interaction mode. To maximize the accuracy of the analysis we chose 20 abundant HEK293-specific miRNAs [57] and predicted their targets in a dataset that includes 11 741 transcripts that are expressed in this cell line. This was done to minimize non-relevant biological noise and to provide the best conditions for the software to generate its optimal output. In Exaiptasia, there are no available transcriptomes for specific cell types; therefore, we predicted targets for all 46 known miRNAs [35] in the entire transcriptome, [56] both derived from whole animals. We compared these results to the average level of noise, which is represented as the mean number of targets for 50 lists of shuffled miRNAs (Figure 1B). The fact that the peaks of predicted targets of real miRNAs are higher than the peak of the shuffled list might result from some enrichment for real targets. Even if this explanation is correct, ≈65% and ≈85% of the predicted targets in HEK293 and Exaiptasia, respectively, are noise (calculated by dividing the noise levels [represented by the black and gray bars] by the number of real-miRNA predicted targets [represented by the blue bars]).
This analysis strongly suggests that these prediction tools cannot confidently discern real targets from false ones.
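The shuffling control described above can be sketched as follows. This is a simplified illustration, not the pipeline used for Figure 1: it counts perfect matches to miRNA nucleotides 2-8 instead of running miRanda or PITA, and it assumes that "shuffled" means permuting the nucleotides within each miRNA. All function names are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """Target-site sequence (5'->3') that pairs perfectly with miRNA nt 2-8."""
    seed = mirna[1:8]
    return seed.translate(COMPLEMENT)[::-1]


def count_targets(mirnas, utrs):
    """Count (miRNA, transcript) pairs whose UTR contains a perfect seed match."""
    return sum(1 for m in mirnas for utr in utrs.values() if seed_site(m) in utr)


def shuffle_mirna(mirna, rng):
    """Permute the nucleotides of one miRNA (assumed shuffling scheme)."""
    nts = list(mirna)
    rng.shuffle(nts)
    return "".join(nts)


def noise_fraction(mirnas, utrs, n_shuffles=50, seed=0):
    """Mean target count over shuffled miRNA lists divided by the real count."""
    real = count_targets(mirnas, utrs)
    if real == 0:
        raise ValueError("no targets predicted for the real miRNAs")
    rng = random.Random(seed)
    shuffled_counts = [
        count_targets([shuffle_mirna(m, rng) for m in mirnas], utrs)
        for _ in range(n_shuffles)
    ]
    return (sum(shuffled_counts) / n_shuffles) / real
```

A ratio close to 1 means the real miRNAs yield hardly more predicted targets than sequence-matched controls, which is the situation summarized by the ≈65% and ≈85% estimates above.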
Figure 1. Predicted lists of miRNA seed-matching targets consist mainly of false positives. Target prediction using miRanda and PITA under strict conditions (allowing no mismatches in the seed) was performed on the 3′-untranslated regions (UTRs) of HEK293-specific transcripts obtained from The Human Protein Atlas (https://www.proteinatlas.org), and on transcripts of the sea anemone Exaiptasia pallida obtained from Baumgarten et al. [56] TargetScan results were obtained from http://www.targetscan.org [32] and filtered to exclude non-HEK293 transcripts. HEK293 targets were predicted for 20 HEK293 miRNAs from Panwar et al., [57] and targets of Exaiptasia were predicted for all the reported miRNAs from Baumgarten et al. [35] For each organism the noise levels were calculated as follows (see also A for graphic description): B) the miRNAs in the list were shuffled 50 times, the number of targets was predicted for each of the lists, and the mean number of targets is presented by the black and gray bars. The blue bars represent the number of targets predicted for the real miRNAs. The yellow bars represent the number of experimentally validated targets from CLASH experiments in HEK293 [55] and HITS-CLIP experiments in Exaiptasia. [35] Because the prediction in HEK293 was performed on the 3′-UTRs, only experimentally validated targets in 3′-UTRs were counted in the CLASH data. The data used for these analyses are available at https://figshare.com/s/f858ae4266c680d162be. C) Overlapping targets from the outputs of prediction software with experimentally validated targets that contain a canonical seed match. Diagram was generated using http://www.interactivenn.net. [58]
Another way to estimate the false-positive rate is to compare the number of targets from the experimental data to the number of targets in the predicted lists. This comparison implies that the actual frequency of false positives might be even higher. In HEK293, only 95 of ≈37 000 PITA and miRanda predicted targets, and only 22 of 5 820 TargetScan predicted targets, overlapped with the canonical targets from the experimental data obtained with CLASH (Figure 1C, left). In Exaiptasia only 45 of ≈270 000 PITA and miRanda predicted targets overlapped with the experimental data from HITS-CLIP (Figure 1C, right). The tradeoff between specificity and sensitivity is well known, and high-throughput experimental methods, such as CLIP and CLASH, miss real targets because of their reliance on the efficiency of enzymatic reactions and on stringent purification conditions during which real targets are lost. However, can we confidently treat the extra ≈99% of the predicted targets that are not experimentally supported as a pool that is considerably enriched with real targets? If so, the majority of miRNAs are expected to display enrichment in targets when searched against the mRNA transcripts from the same organism, compared to when searched against a set of transcripts without any relevance to them, such as transcripts of a distantly related organism. Although cross-species interacting miRNAs (known as "xenomiRs") have been previously suggested, it is now known that the majority of these are false. [59,60] Even when real, [61] it is only a subset of miRNAs that act in a cross-species manner, while the vast majority are host miRNAs that regulate host transcripts. We chose two evolutionarily very distant organisms that are not known to interact with one another and do not share common miRNAs, but share a similar number of genes: Arabidopsis thaliana (A. thaliana) (thale cress) and Rattus norvegicus (R. norvegicus) (brown rat). For each of these organisms we chose 20 highly confident miRNAs, [62][63][64] which were used for target prediction in a reciprocal manner (self-miRNAs vs self-transcripts, and self-miRNAs vs non-self-transcripts) (Figure 2A).
The number of predicted targets of each miRNA was normalized to the number of nucleotides in the transcriptome of each dataset. Whether self on self, or self on non-self, the normalized number of predicted targets per list of miRNAs was high and very similar (Figure 2B). As for specific miRNAs, there was no tendency to exhibit a species-specific enrichment in the number of targets (Figure 2C). In some cases (9 of 20 miRNAs) a miRNA had more predicted targets when checked self on self (yellow shaded); in other cases more targets were predicted for self on non-self (7 of 20 miRNAs, blue shaded), and the rest were indistinguishable (white). Altogether, these results (Figure 2) demonstrate that a target picked at random from a predicted list is highly likely to be false unless it has been validated experimentally.
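The length normalization used for this comparison amounts to expressing target counts per megabase of searched sequence. A minimal sketch, with a hypothetical function name and toy numbers that are not taken from Figure 2:

```python
def targets_per_mb(n_targets: int, utr_lengths) -> float:
    """Normalize a predicted-target count to targets per megabase of 3'-UTR."""
    total_nt = sum(utr_lengths)
    if total_nt == 0:
        raise ValueError("the UTR dataset is empty")
    return n_targets / (total_nt / 1_000_000)


# Toy comparison (made-up counts): the same miRNA searched against two datasets
self_on_self = targets_per_mb(120, [400_000, 600_000])        # 1 Mb searched
self_on_non_self = targets_per_mb(230, [1_000_000, 900_000])  # 1.9 Mb searched
print(self_on_self, self_on_non_self)
```

Without this step, a larger 3′-UTR dataset would appear to contain more targets simply because it offers more positions at which a short site can occur.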

Resist the Urge to Tell Stories Based on Prediction Alone
It can be difficult to resist the temptation to report the putative role of a predicted target when the putative binding site falls inside a gene that fits well with the story we desire to tell. However, our analysis shows that even careful usage of a prediction tool (using a cell-type specific transcriptome with the cell-type specific miRNA list as input) (Figure 1B, left), when lacking experimental support, can yield predicted lists consisting mainly of false positives (Figures 1 and 2). It seems that speculations about miRNA-target regulation based on prediction alone are less frequent in highly studied model organisms, as experimental support can be obtained relatively easily. However, when it comes to non-model organisms, recent publications continue to make bold speculations regarding the putative role of miRNAs that regulate targets chosen from a predicted list without any experimental support. [37][38][39][43][44][45][46][47][48][49][50][51] A disturbing fact is that in non-model organisms the search is often made by assuming miRNA-target recognition rules that might not reflect the biology of the studied organisms. [38,39,48] Thus, in the next section we will focus on what biological information needs to be taken into account regarding miRNA-target identification in non-model organisms.
Figure 2. The number of predicted miRNA targets in the species that expresses the miRNAs is not increased compared to cross-species prediction. A) Twenty miRNAs from brown rat (R. norvegicus) or thale cress (A. thaliana) were used to predict the number of targets that each set generates with the 3′-UTRs of each of the two organisms. B) The sum of the predicted targets for each combination. C) The number of putative targets for each individual rat miRNA in rat 3′-UTRs (pink) compared to the number in A. thaliana. To compare the number of targets between these two species, the number of targets was normalized to the total length of each 3′-UTR dataset. The number of targets is presented as targets per megabase (targets per Mb). Rat miRNAs with high confidence were chosen from MirGeneDB2.0. [64] Arabidopsis highly confident miRNAs were chosen according to Axtell et al. [62] and Arribas-Hernandez et al. [63] The data used for these analyses are available at https://figshare.com/s/7db0263cf3e3085443b6.
www.advancedsciencenews.com www.bioessays-journal.com

Target Prediction in Non-model Organisms
The miRNA pathway originated either once or multiple times from an ancient RNA interference (RNAi) system. [22,65] Therefore, the primary miRNA pathways, similar to the small interfering RNA (siRNA) pathways they evolved from, must have recognized targets through high complementarity. This feature enables the sRNA carriers, the AGO proteins, to directly cleave target RNA (reviewed by Huntzinger et al. [66]). Since its appearance, the miRNA pathway has had a long time to evolve, and hence the mode of action and target recognition vary significantly between different organisms. In highly studied model animals, target slicing occurs on a very small minority of targets (often mediating the degradation of the miRNA itself [67]), and the vast majority of targets are recognized through a short seed match and repressed through mild translational inhibition and destabilization carried out by the components of RISC. [68] In contrast, in land plants slicing is a prominent mechanism. [14,17,19] Translational inhibition also occurs in plants; however, it still requires high complementarity throughout the length of the miRNA, and unlike in bilaterian animals, an interaction restricted to the seed sequence would not promote any noticeable target downregulation. [17,69] Such observations from highly studied plants and animals cause many to classify target recognition as "plant-like" or "animal-like", a classification that is factually incorrect. Nematodes, flies, frogs, fish, mice, and other highly studied model animals are all bilaterians. Cnidaria (sea anemones, corals, hydroids, and jellyfish) is the sister group that separated from Bilateria >600 million years ago (MYA) (Figure 3). It turned out that cnidarian miRNAs bind their targets with a high degree of complementarity and frequently mediate their cleavage. [34] The fraction of miRNAs with nearly perfectly matching targets in Cnidaria is even higher than in plants (Figure 4A). [34] The recent AGO-CLIP in Exaiptasia revealed that seed-restricted recognition does not occur in this sea anemone [35] (Figure 4B) and, combined with the previous work on other cnidarians, [34] strongly suggests that matches restricted to the seed cannot mediate silencing in this animal phylum.
Such findings demonstrate that we cannot automatically apply "animal-like" rules, [48,50] as various binding strategies exist in animals. The fact that in cnidarians miRNAs bind to their targets most frequently by 17 nt (Figure 4B), [35] and not by full complementarity suggests that these basally branching animals possibly exhibit an evolutionary transition state, where the miRNAs evolved to not perfectly bind their targets, but still not reaching the extremely partial recognition that evolved in bilaterians ( Figure 4C). Thus, studies aiming to identify targets in non-bilaterian animals should take into account that it is highly probable that their system could not exert any regulatory effects through a seed match alone.
The existence of alternative binding strategies within a lineage is not restricted to Metazoa. A study on miRNAs in the green alga Chlamydomonas suggests that a match restricted to the seed is sufficient to mediate translational repression of a synthetic reporter. [36] The coexistence of such a system with a slicing mechanism in this species [74,80] suggests that variation in target binding strategies might exist within some members of the Viridiplantae (plants and green algae).
Broadly, there are many eukaryotic lineages where the mechanism by which miRNAs regulate their targets is unknown (Figure 3). Some of them, like dinoflagellates, are so distant from both plants and animals that we have no clue about their miRNA mode of action. Yet studies of members of these lineages arbitrarily choose to apply "animal-like" or "plant-like" rules for target identification without justifying their choice. For example, two recent studies predicted targets in different dinoflagellate species (Symbiodinium kawagutii and Prorocentrum donghaiense) applying opposing prediction approaches. [38,39] Notably, faulty target prediction is not restricted to non-bilaterian animals and lineages where the mechanism is unknown, as some studies in bilaterian non-model organisms suffer from the same problems. [43,44,46,47,49,81] These studies searched for seed-matching targets, which is indeed the relevant mechanism for bilaterians; however, their conclusions about the importance of miRNA-related targets are most likely wrong for the reasons discussed in the previous chapter. To summarize, we identify three sources that may cause erroneous conclusions regarding target regulation by miRNAs: a) biological knowledge regarding the mode of action is present, but there is no awareness of the high degree of noise in a list of predicted targets, b) biological knowledge of the miRNA-target regulation mechanism is absent and a prediction strategy is arbitrarily chosen, and c) lack of biological understanding combined with no awareness of the high degree of noise in the predicted lists.

Problems of Functional Annotation and Enrichment of MiRNA Targets
Studies of miRNAs often do not stop at lists of predicted targets. Many perform functional annotation on the predicted targets, followed by Gene Ontology (GO) enrichment analysis to find biological pathways regulated by certain miRNAs. [38,39,43-46,82] When significant enrichment for biological processes is detected, the level of confidence in the "newly discovered" miRNA-regulated pathway is increased. There are several problems with this approach. The first concerns the functional annotation step. The most prevalent way by which annotation algorithms annotate an uncharacterized protein is by a BLAST or PSI-BLAST search for homologs with a known function, assuming that the function is conserved. [83] This assumption might be a source of bias, as sequences conserved between very distant organisms would get the annotation of the better studied organism, despite the fact that their function might have considerably changed in the distant relative.
The next two problems are related to two assumptions made by the researchers regarding enrichment analysis of predicted targets: 1. Predicted targets from the list are affected by the miRNA to a degree that is sufficient to drive changes in biological processes. 2. The predicted list is enriched with real targets.
The problem with the first assumption is that a seed-based interaction in animals reduces the target expression level by up to twofold, [84,85] while the majority of genes are not dosage sensitive. [86] Accordingly, it was shown that such mild reduction in target levels, which often does not exceed the intrinsic variability of their expression levels, might have only negligible biological consequences. [87] The second assumption was refuted in the first chapter, and ideally, under such circumstances we would expect the analysis not to detect any significant enrichment in predicted lists consisting mainly of noise. Previously, it was shown in a cell line that enrichment analysis produced different results when applied to predicted targets or to validated targets. [52] However, one might think that such differences result from miRNA targets that were missed by the experimental approach and are present in the predicted list. If so, it is reasonable to expect that enrichment for biological processes would be detected almost exclusively for predicted targets of real miRNAs.
Figure 3. Phylogenetic topology of the major eukaryotic groups. Groups known to have miRNAs appear in bold based on the following publications: Cnidaria and Porifera, [70] Choanozoa, [71] Slime molds, [72] Excavates, [50] Green algae, [73,74] Brown algae, [75] Dinoflagellates, [37] Plants, [76] Bilateria. [77,78] The current knowledge of the required complementarity levels between the miRNAs and their targets in each group is indicated in colors. The tree topology is based on a previously published phylogeny. [79]
To test this idea we selected 20 sea anemone miRNAs from Exaiptasia that are not conserved with humans; thus predicting their targets in a human transcriptome would reflect noise. Targets of these miRNAs were predicted in a transcriptome of HEK293 cell line using miRanda, and an enrichment analysis for the sets of predicted genes of each miRNA was performed ( Figure 5). Strikingly, significantly enriched biological processes were found in the predicted lists for all of the sea anemone miRNAs ( Figure 5). A non-plausible interpretation of this absurd example is that sea anemones evolved to regulate human transcriptomes. The plausible interpretation from this analysis is that enrichment for predicted targets should not be assumed to be real.
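To see why an enrichment test cannot rescue a noisy list, it helps to look at what such a test computes. The sketch below implements a generic one-sided hypergeometric over-representation test using only the standard library; it is an illustration of the general technique and is not claimed to be the exact statistic applied by PANTHER. Function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(hits: int, list_size: int, term_size: int, background: int) -> float:
    """P(X >= hits) when list_size genes are drawn without replacement from a
    background of `background` genes, `term_size` of which carry the GO term."""
    upper = min(list_size, term_size)
    tail = sum(
        comb(term_size, i) * comb(background - term_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(background, list_size)


# Toy numbers: 12 of 200 listed genes carry a term found in 300 of 11 741 genes
print(hypergeom_pvalue(12, 200, 300, 11_741))
```

The test only asks whether a term is over-represented in the submitted gene list relative to the background. It has no access to whether the genes are genuine miRNA targets, so any compositional bias in a predicted list (for example, genes with long 3′-UTRs) can produce a small p-value.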

What Can Be Done?
In the study of the miRNAs of model organisms many of the problems we presented throughout this paper rarely occur because of the available experimental approaches to study targets. Hence, we would like to focus on what we believe should be done in order to get meaningful insights about miRNA-regulated processes in non-model organisms where the experimental possibilities are much more limited.

Taking into Account the Evolutionary History of the miRNA Pathway
We need to start by asking ourselves whether we have an indication about the target repression mechanism of miRNAs in the examined organism (Figure 3). Phylogenetic relationships might assist at this step. For example, because in all studied members of Bilateria, such as insects, nematodes, and vertebrates, target repression is mediated through a short seed match, it is very likely that non-model bilaterians exhibit the same mode of regulation. In more basally branching animals, such as cnidarians, seed recognition alone is most probably insufficient to mediate repression, as it was experimentally shown that high-complementarity-based target cleavage occurs in both sea anemones and hydroids. [34] As these are distantly related cnidarians, it is most probable that all cnidarians regulate their targets in a similar manner. This notion is also supported by AGO-CLIP experiments from a sea anemone [35] (Figure 4B).
Figure 4. A) [...] Adapted with permission. [34] Copyright 2014, Cold Spring Harbor Laboratory Press. B) AGO-CLIP summary of the miRNA-target complementarity levels that occur in a bilaterian and a cnidarian. Adapted with permission. [35] Copyright 2018, John Wiley & Sons. C) Evolutionary scenario suggesting how miRNAs evolved to regulate targets, from ancestral high-complementarity to seed-based target recognition in bilaterians.

High- and Low-Throughput Methods
Whether dealing with seed-based or high-complementarity-based target regulation, high-throughput methods such as CLIP variants or CLASH can provide a list of highly confident targets. [52][53][54][55] Success in CLIP and CLASH depends on available antibodies capable of efficiently precipitating the AGOs. If the non-model organism is close enough to a model organism, the antibodies of the latter might be suitable. Otherwise, the procedure depends on the generation of custom antibodies, which is far from guaranteed to succeed. In organisms with a high-complementarity-based mechanism, such as cnidarians and plants, degradome sequencing is an additional high-throughput method that enables us to detect mRNAs that are cleaved at the position where they are bound by the miRNA. [34,91-94] However, such high-throughput techniques are challenging, as they are expensive, require time-consuming experimental calibrations and adjustments, and require nontrivial analysis skills. Nevertheless, their big advantage is that they can be used to study organisms lacking genetic manipulation tools.
When high-throughput techniques are not available because of limitations (resources, analysis skills, technical aspects, etc.), there are some low-throughput options. These limit researchers to testing a smaller number of target candidates, and some of these methods are only possible if genetic manipulation techniques are available for the organism. For high-complementarity-based interactions, RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE) can be used to detect sliced targets. This method enables us to assay whether the putative target transcript is cleaved at the conserved position of AGO slicing within the miRNA binding site. [26,28,34,91] This method is also a good supplement to degradome sequencing, as it is highly sensitive and can help test targets missed by the degradome. [34] In both high-complementarity and seed-restricted interactions, if genetic manipulations and/or transfections are available, individual targets can be validated using reporters with miRNA binding sites. [36,95] Alternatively, one can knock down, knock out, or mis-express individual miRNAs and then measure changes in target levels. [96] When such genetic manipulations are not possible, one can test the ability to repress by less direct methods such as co-transfection of a miRNA and a reporter construct harboring a miRNA binding site into a cell line. [97,98] It is important to keep in mind that some of the described methods are performed in artificial contexts that might not reflect the real biological conditions, and caveats of such approaches were previously discussed. [99]
There is a very important point to consider regarding the interpretation of the above-described methods that is equally relevant to model and to non-model organisms. These methods can provide reliable information about miRNA-mRNA interactions and levels of target downregulation. However, they do not guarantee that the consequence of each of these interactions is biologically meaningful. [87] Therefore, we encourage researchers to test the biological relevance of a miRNA target by genetic tests whenever possible.
Figure 5. GO enrichment analysis on HEK293 predicted targets of sea anemone (Exaiptasia) miRNAs. Target prediction was performed on the 3′-UTRs of HEK293 transcripts using miRanda with strict conditions allowing no mismatches in the seed. Enrichment analyses were performed on predicted target lists of each of the individual miRNAs. The most highly enriched GO biological process for each miRNA is presented. GOs were obtained from the GO Consortium [88,89] and enrichment analysis was performed using Protein ANalysis THrough Evolutionary Relationships (PANTHER). [90] The data used for these analyses are available at https://figshare.com/s/f50b76a2ffd0153c813f.

A Computational Approach for Finding High-Complementary Targets
Sometimes evolutionary relations cannot provide any meaningful clues regarding how targets are regulated by miRNAs, because the studied organism is too distant from relatives with a known mechanism of target recognition (e.g., dinoflagellates; see Figure 3). In such cases, we advise taking a previously described computational approach that estimates whether high-complementarity matches are prevalent in the studied organism. [34] Counting the fraction of miRNAs with at least one nearly perfect target gives a measure of enrichment for highly complementary target sites that can be considered reliable. This is because, unlike seed-restricted matches, long nearly perfect matches are far less prone to be noise. [34] Such analyses can provide a good indication of whether it is worth proceeding with experimental approaches to identify targets. However, a negative result of this approach is not proof that binding via seed-restricted matches is a functional mechanism in the tested organism.
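The counting step described above can be illustrated with a minimal Python sketch. It is not the published pipeline of ref. [34]: the mismatch cutoff that defines "nearly perfect" (`max_mismatches`), the restriction to gap-free Watson-Crick pairing (G:U wobbles count as mismatches), and the use of shuffled miRNA sequences as a background control are all assumptions made here for illustration, and every function name is hypothetical.

```python
import random

# Watson-Crick complements for RNA; G:U wobble pairs are deliberately ignored (assumption).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that pairs perfectly with `rna` (the ideal target site, 5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def has_near_perfect_site(mirna, transcript, max_mismatches=2):
    """True if some gap-free window of `transcript` pairs with the whole miRNA
    with at most `max_mismatches` mismatches (hypothetical cutoff)."""
    site = reverse_complement(mirna)
    width = len(site)
    for start in range(len(transcript) - width + 1):
        mismatches = 0
        for expected, observed in zip(site, transcript[start:start + width]):
            if expected != observed:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches in this window, try the next one
        else:
            return True  # inner loop finished without exceeding the cutoff
    return False


def fraction_with_site(mirnas, transcripts, max_mismatches=2):
    """Fraction of miRNAs that have at least one near-perfect site in any transcript."""
    hits = sum(
        any(has_near_perfect_site(m, t, max_mismatches) for t in transcripts)
        for m in mirnas
    )
    return hits / len(mirnas)


def shuffled(seq, rng):
    """Return `seq` with its bases permuted (same composition), for use as a control."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data only: two made-up miRNAs and one made-up transcript.
    mirnas = ["UGAGGUAGUAGGUUGUAUAGUU", "AAAAAAAAAAAAAAAAAAAAAA"]
    transcripts = ["GGGG" + reverse_complement(mirnas[0]) + "CCCC"]
    observed = fraction_with_site(mirnas, transcripts)
    control = fraction_with_site([shuffled(m, rng) for m in mirnas], transcripts)
    print("observed fraction:", observed, "shuffled control:", control)
```

In practice the observed fraction would be compared against many shuffled replicates. A clear excess over the shuffled background would suggest that high-complementarity targeting is worth testing experimentally, whereas the absence of such an excess does not by itself show that seed-restricted targeting is functional.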
In the case of seed-restricted matches, programs such as miRanda, TargetScan, PITA, psRNATarget, [29] and others are useful for obtaining initial lists of predicted targets, from which it is easy to identify which transcripts have a potential binding site for which miRNAs. The lists generated by different programs might differ from each other, and comparisons of their performance have been previously discussed. [32,33] When the genome of the studied organism and the genomes of its closely related species have been sequenced, prediction programs that take evolutionary conservation into account would probably provide lists of more reliable targets. Yet, because of the high noise levels generated by seed-restricted target prediction algorithms (Figures 1 and 2), discussions regarding their biological relevance should preferably be limited to cases where additional experimental support is available. Some studies try to bypass the experimental validation step by discussing only targets that were predicted by several different algorithms. We show here that this approach would probably not increase the credibility of these predicted targets: out of the 4371 targets that overlapped between PITA, miRanda, and TargetScan, only 17 were shared with the experimentally validated targets in HEK293 (Figure 1C).
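To make the overlap argument concrete: 17 validated targets among 4371 consensus predictions corresponds to roughly 0.4% of the consensus list. The short Python sketch below shows how such a comparison can be computed from lists of (miRNA, target) pairs. The function name and the toy identifiers are hypothetical, and the sketch assumes that all tools and the validated set use the same identifiers for miRNAs and transcripts.

```python
def consensus_precision(predictions_by_tool, validated):
    """Intersect the predictions of all tools and report how many are validated.

    predictions_by_tool: dict mapping a tool name to an iterable of (miRNA, target) pairs.
    validated: iterable of experimentally supported (miRNA, target) pairs.
    Returns (consensus set, validated subset of the consensus, fraction validated).
    """
    consensus = set.intersection(*(set(p) for p in predictions_by_tool.values()))
    supported = consensus & set(validated)
    precision = len(supported) / len(consensus) if consensus else 0.0
    return consensus, supported, precision


if __name__ == "__main__":
    # Toy data with made-up identifiers, for illustration only.
    predictions = {
        "toolA": {("miR-1", "geneA"), ("miR-1", "geneB"), ("miR-2", "geneC")},
        "toolB": {("miR-1", "geneA"), ("miR-1", "geneB")},
        "toolC": {("miR-1", "geneA"), ("miR-1", "geneB"), ("miR-9", "geneZ")},
    }
    validated = {("miR-1", "geneA")}
    consensus, supported, precision = consensus_precision(predictions, validated)
    print(len(consensus), len(supported), precision)

    # The figures reported in the text: 17 validated out of 4371 consensus targets.
    print(f"{17 / 4371:.2%}")
```

Requiring agreement between tools shrinks the list, but because the tools share the same underlying seed-matching logic, the fraction of supported targets need not improve.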

Enrichment Analysis on Predicted Targets Requires Caution
When performing enrichment analysis, we strongly advise including only experimentally validated targets. It was previously shown that the results of enrichment analysis on experimentally validated targets differ from the enrichments obtained with predicted ones. [52] In any case, given the ease with which significant but irrelevant enrichments can be obtained (Figure 5), we advise caution in interpreting such results and treating them as no more than a hint of processes that are regulated by miRNAs. To further increase the chances of obtaining meaningful enrichments, the genes included in such analyses should be carefully annotated. However, this last suggestion depends on the community that studies the given non-model organism constantly updating the terms of newly studied genes in the relevant databases.
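It helps to recall what an enrichment p-value actually measures. The sketch below implements a one-sided hypergeometric over-representation test, a common choice for GO enrichment, together with a Bonferroni correction. It is an illustration under stated assumptions and not necessarily the statistic PANTHER applied for Figure 5. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_enrichment_p(background_size, annotated_in_background, list_size, annotated_in_list):
    """One-sided hypergeometric p-value: the probability of drawing at least
    `annotated_in_list` genes carrying a GO term when `list_size` genes are drawn
    at random from a background of `background_size` genes, of which
    `annotated_in_background` carry the term."""
    N, K, n, k = background_size, annotated_in_background, list_size, annotated_in_list
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    # Sum the upper tail P(X = i) for i = k .. min(K, n).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total


def bonferroni(p_value, number_of_tests):
    """Conservative correction for testing many GO terms (and many miRNAs)."""
    return min(1.0, p_value * number_of_tests)


if __name__ == "__main__":
    # Toy numbers, for illustration only: 20000 background genes, 200 annotated
    # with a term, 500 predicted targets, 12 of which carry the term.
    p = hypergeom_enrichment_p(20000, 200, 500, 12)
    print(p, bonferroni(p, 5000))
```

The test only asks whether a term is over-represented in the submitted list relative to the chosen background. It is blind to whether the list consists of real targets, so a small p-value obtained from unvalidated predictions says nothing about regulation by the miRNA.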

Conclusions and Outlook
miRNA-regulated processes are frequently reported in the literature. While in model organisms such reports are often reliable because experimental confirmation is possible, in non-model organisms putative miRNA-based regulation of processes relies mostly on target prediction alone. In the first section, we exploited the recent high-throughput experimental information available from model and non-model organisms and provided analyses demonstrating that the signal-to-noise ratios are insufficient to reliably identify miRNA targets by computational means alone. In the second section, based on findings that shed new light on how the miRNA pathway evolved in animals, we highlighted the considerations that have to be taken into account to enable the identification of real targets. Moreover, we discussed the additional problems of target prediction in non-model organisms for which we lack an understanding of how targets are regulated by miRNAs. We showed that GO enrichment analyses, which are often performed on experimentally nonvalidated targets, are most likely meaningless and should not be coupled to target prediction. To the best of our knowledge, this work is the first to discuss caveats and pitfalls of target prediction in non-model organisms, where conclusions about miRNA-regulated processes are frequently reported without experimental validation of the targets. In addition to raising these glaring issues, we provide advice on how to avoid the common pitfalls and how to analyze miRNAs and their interactions with their targets in a biologically meaningful way, even in little-studied species.