Keywords:

  • data compression;
  • Burrows–Wheeler transform;
  • block-sorting;
  • suffix array

SUMMARY

The Burrows–Wheeler Transform (BWT) produces a permutation of a string X, denoted X*, by sorting the n cyclic rotations of X into full lexicographical order and taking the last column of the resulting n×n matrix to be X*. The transformation is reversible in O(n) time. In this paper, we consider an alteration to the process, called k-BWT, where rotations are only sorted to a depth k. We propose new approaches to the forward and reverse transform, and show that the methods are efficient in practice. More than a decade ago, two algorithms were independently discovered for reversing k-BWT, both of which run in O(nk) time. Two recent algorithms have lowered the bounds for the reverse transformation to O(n log k) and O(n), respectively. We examine the practical performance of these reversal algorithms. We find that the original O(nk) approach is most efficient in practice, and investigate new approaches, aimed at further speeding reversal, which store precomputed context boundaries in the compressed file. By explicitly encoding the context boundaries, we present an O(n) reversal technique that is both efficient and effective. Finally, our study elucidates an inherently cache-friendly, and hitherto unobserved, behavior in the reverse k-BWT, which could lead to new applications of the k-BWT. In contrast to previous empirical studies, we show that the partial transform can be reversed significantly faster than the full transform, without significantly affecting compression effectiveness. Copyright © 2011 John Wiley & Sons, Ltd.
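
As an illustration of the forward transforms described in the summary, the following is a minimal Python sketch, not the approach proposed in the paper. It builds every rotation explicitly, so it is only suitable for short strings. The function names `rotations`, `bwt` and `k_bwt` are hypothetical, and the sketch assumes that rotations whose first k symbols are equal keep their original text order, which is obtained here from Python's stable sort.

```python
def rotations(x):
    """Return the n cyclic rotations of x, in order of starting position."""
    return [x[i:] + x[:i] for i in range(len(x))]


def bwt(x):
    """Full transform: sort all rotations lexicographically, keep the last column."""
    rows = sorted(rotations(x))
    return "".join(row[-1] for row in rows)


def k_bwt(x, k):
    """Depth-k transform: compare rotations on their first k symbols only.

    Assumption: ties are left in original text order. sorted() is stable,
    so rotations with equal k-prefixes stay ordered by starting position.
    """
    rows = sorted(rotations(x), key=lambda row: row[:k])
    return "".join(row[-1] for row in rows)


if __name__ == "__main__":
    text = "banana"
    print(bwt(text))       # full sort
    print(k_bwt(text, 1))  # sorted on the first symbol only
    print(k_bwt(text, 2))  # sorted on the first two symbols
```

When k is at least the length of the input, the depth-k sort compares whole rotations, so `k_bwt` returns the same string as `bwt`.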