Dynamic Linear Model for the Identification of miRNAs in Next-Generation Sequencing Data
Article first published online: 8 MAR 2011
© 2011, The International Biometric Society
Volume 67, Issue 4, pages 1206–1214, December 2011
How to Cite
Johnson, W. E., Welker, N. C. and Bass, B. L. (2011), Dynamic Linear Model for the Identification of miRNAs in Next-Generation Sequencing Data. Biometrics, 67: 1206–1214. doi: 10.1111/j.1541-0420.2010.01570.x
- Issue published online: 14 DEC 2011
- Received May 2009. Revised September 2010. Accepted December 2010.
Keywords
- Bayesian methods
- Dynamic linear model
- Markov chain Monte Carlo
- miRNA prediction
- Smith–Waterman algorithm
- Solexa/Illumina sequencing
Summary
Next-generation sequencing technologies are poised to revolutionize the field of biomedical research. The increased resolution of these data promises to provide a greater understanding of the molecular processes that control the morphology and behavior of a cell. However, the increased amounts of data require innovative statistical procedures that are powerful while still being computationally feasible. In this article, we present a method for identifying small RNA molecules, called miRNAs, which regulate genes by targeting their mRNAs for degradation or translational repression. In the first step of our modeling procedure, we apply an innovative dynamic linear model that identifies candidate miRNA genes in high-throughput sequencing data. The model is flexible and can accurately identify interesting biological features while accounting for read count, read spacing, and sequencing depth. Additionally, miRNA candidates are processed using a modified Smith–Waterman sequence alignment that scores the regions for potential RNA hairpins, one of the defining features of miRNAs. We illustrate our method on simulated datasets as well as on a small RNA Caenorhabditis elegans dataset from the Illumina sequencing platform. These examples show that our method is highly sensitive for identifying known and novel miRNA genes.
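To make the hairpin-scoring idea concrete, the following is a minimal sketch of how a local alignment can flag a potential RNA hairpin: a candidate region is aligned against its own reverse complement, so a high score suggests two stretches that could base-pair into a stem. This is the textbook Smith–Waterman recurrence with a linear gap penalty, not the modified version used in the article, whose details the summary does not give. The function names, the scoring values (match, mismatch, gap), and the restriction to Watson–Crick pairs are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an RNA sequence (Watson-Crick pairs only; an assumption)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (standard Smith-Waterman).

    The scoring values are illustrative defaults, not taken from the article.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def hairpin_score(seq):
    """Hypothetical helper: score a region by aligning it to its reverse complement."""
    return smith_waterman_score(seq, reverse_complement(seq))


if __name__ == "__main__":
    stem_loop = "GGGGAAAACCCC"   # complementary ends around a short loop
    no_stem = "AAAAAAAAAAAA"     # cannot pair with itself
    print(hairpin_score(stem_loop), hairpin_score(no_stem))
```

Under these assumed scores, a region whose two ends are complementary scores higher than one with no self-complementarity. A realistic scorer would also need to handle G-U wobble pairs and loop length, which this sketch ignores.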