Revisiting bounded context block-sorting transformations
Article first published online: 23 AUG 2011
Copyright © 2011 John Wiley & Sons, Ltd.
Software: Practice and Experience
Volume 42, Issue 8, pages 1037–1054, August 2012
How to Cite
Culpepper, J. S., Petri, M. and Puglisi, S. J. (2012), Revisiting bounded context block-sorting transformations. Softw: Pract. Exper., 42: 1037–1054. doi: 10.1002/spe.1112
- Issue published online: 6 JUL 2012
- Article first published online: 23 AUG 2011
- Manuscript Accepted: 31 JUL 2011
- Manuscript Revised: 5 JUL 2011
- Manuscript Received: 24 FEB 2011
Keywords
- data compression
- Burrows–Wheeler transform
- suffix array
The Burrows–Wheeler Transform (BWT) produces a permutation of a string X, denoted X∗, by sorting the n cyclic rotations of X into full lexicographical order and taking the last column of the resulting n × n matrix to be X∗. The transformation is reversible in O(n) time. In this paper, we consider an alteration to the process, called k-BWT, where rotations are only sorted to a depth k. We propose new approaches to the forward and reverse transform, and show that the methods are efficient in practice. More than a decade ago, two algorithms were independently discovered for reversing k-BWT, both of which run in O(nk) time. Two recent algorithms have lowered the bounds for the reverse transformation to O(n log k) and O(n), respectively. We examine the practical performance of these reversal algorithms. We find that the original O(nk) approach is the most efficient in practice, and we investigate new approaches, aimed at further speeding up reversal, which store precomputed context boundaries in the compressed file. By explicitly encoding the context boundaries, we present an O(n) reversal technique that is both efficient and effective. Finally, our study elucidates an inherently cache-friendly (and hitherto unobserved) behavior in the reverse k-BWT, which could lead to new applications of the k-BWT transform. In contrast to previous empirical studies, we show that the partial transform can be reversed significantly faster than the full transform, without significantly affecting compression effectiveness.
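
To make the definitions above concrete, the following is a minimal Python sketch of the forward transform, assuming a naive sort of rotation start positions. It is not one of the algorithms proposed or evaluated in the paper. The function names `rotation_order`, `bwt`, and `inverse_bwt` are hypothetical, and breaking ties among rotations that agree on their first k symbols by original position (a stable sort) is an assumption about the k-BWT convention. The inverse shown applies to the full transform only, and it uses a comparison sort rather than a counting sort, so it illustrates the last-to-first mapping without achieving the linear bound. Reversing the k-BWT requires the context-boundary handling discussed in the paper and is not sketched here.

```python
from typing import List, Optional, Tuple


def rotation_order(x: str, k: Optional[int] = None) -> List[int]:
    """Start positions of the cyclic rotations of x, sorted to depth k.

    k=None sorts on the full rotation (the ordinary BWT ordering).
    Python's sort is stable, so rotations that agree on their first k
    symbols keep their original relative order (an assumed convention).
    """
    n = len(x)
    depth = n if k is None else min(k, n)
    doubled = x + x  # doubled[i:i + depth] is a prefix of rotation i
    return sorted(range(n), key=lambda i: doubled[i:i + depth])


def bwt(x: str, k: Optional[int] = None) -> Tuple[str, int]:
    """Return (last column, row of the unrotated string) for non-empty x."""
    if not x:
        raise ValueError("input must be non-empty")
    order = rotation_order(x, k)
    # The last symbol of the rotation starting at i is x[i - 1] (cyclically).
    last = "".join(x[i - 1] for i in order)
    return last, order.index(0)


def inverse_bwt(last: str, primary: int) -> str:
    """Invert the full BWT (not the k-BWT) using the last-to-first mapping."""
    n = len(last)
    # A stable sort of the positions by symbol yields the first column; the
    # j-th occurrence of a symbol in the last column maps to its j-th
    # occurrence in the first column.
    by_symbol = sorted(range(n), key=lambda i: last[i])
    lf = [0] * n
    for first_row, last_row in enumerate(by_symbol):
        lf[last_row] = first_row
    out = []
    row = primary
    for _ in range(n):  # recovers x from its final symbol back to its first
        out.append(last[row])
        row = lf[row]
    return "".join(reversed(out))
```

For the input "banana", the full transform yields "nnbaaa", while depths k = 1 and k = 2 yield "bnnaaa" and "nbnaaa", respectively. The outputs differ only in how symbols are ordered within groups of rotations that share the same k-symbol context.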