HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances
- Submitted by: Dominik Koeppl
- Last updated: 18 February 2022 - 5:37am
- Document Type: Presentation Slides
We propose a new representation of the offsets of the Lempel-Ziv (LZ) factorization
based on the co-lexicographic order of the text's prefixes.
The encoded size of the selected offsets tends to approach the k-th order empirical entropy of the text.
Our evaluations show that this choice of offsets is superior to
the rightmost LZ parsing and the bit-optimal LZ parsing on datasets with small high-order entropy.
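
The two notions the abstract relies on, the co-lexicographic order of a text's prefixes and the k-th order empirical entropy, can be illustrated with a small Python sketch. It assumes the standard textbook definitions of both notions and does not reproduce the HOLZ offset encoding itself. The function names `colex_prefix_order` and `empirical_entropy` are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict
from math import log2


def colex_prefix_order(text: str) -> list:
    """Return the prefix lengths 1..n sorted co-lexicographically.

    Co-lexicographic order compares strings from their last character
    backwards, i.e. it is the lexicographic order of the reversed strings.
    This naive version takes quadratic space and is only meant for
    small examples.
    """
    return sorted(range(1, len(text) + 1), key=lambda i: text[:i][::-1])


def empirical_entropy(text: str, k: int) -> float:
    """k-th order empirical entropy of `text` in bits per character.

    Assumed definition: H_k(T) = (1/n) * sum over all length-k contexts w
    of |T_w| * H_0(T_w), where T_w collects the characters that follow
    an occurrence of w in T.
    """
    n = len(text)
    if n == 0:
        return 0.0
    # Group each character by the k characters that precede it.
    followers = defaultdict(Counter)
    for i in range(k, n):
        followers[text[i - k:i]][text[i]] += 1
    total_bits = 0.0
    for counts in followers.values():
        m = sum(counts.values())
        # |T_w| * H_0(T_w) = -sum_c count(c) * log2(count(c) / |T_w|)
        total_bits -= sum(c * log2(c / m) for c in counts.values())
    return total_bits / n


if __name__ == "__main__":
    print(colex_prefix_order("abab"))    # [1, 3, 2, 4]
    print(empirical_entropy("abab", 0))  # 1.0
    print(empirical_entropy("abab", 1))  # 0.0
```

For the text `abab`, the prefixes `a` and `aba` both end in `a` and therefore precede `ab` and `abab` in co-lexicographic order. With one character of context, every character of `abab` is fully determined by its predecessor, so the first-order entropy drops to zero while the zeroth-order entropy is one bit per character.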