Sorry, you need to enable JavaScript to visit this website.

facebooktwittermailshare

Approximating Optimal Bidirectional Macro Schemes.

Abstract: 

Lempel-Ziv is an easy-to-compute member of a wide family of so-called macro schemes; it restricts pointers to go in one direction only. Optimal bidirectional macro schemes are NP-complete to find, but they may provide much better compression on highly repetitive sequences. We consider the problem of approximating optimal bidirectional macro schemes. We describe a simulated annealing algorithm that usually converges quickly. Moreover, in some cases, we obtain bidirectional macro schemes that are provably a 2-approximation of the optimal. We test our algorithm on a number of artificial repetitive texts and verify that it is efficient in practice and outperforms Lempel-Ziv, sometimes by a wide margin.

up
0 users have voted:

Comments

Thanks for the video!
I already wondered about how to efficiently compute such an approximation, but with the link/cut tree data structure, it seems feasible.
If I understood correctly, the gist is that in some cases, your compression scheme based on simulated annealing has a size of at most 2γ, where γ is the size of the smallest string attractor.

While watching your video, I've collected some small questions:

With what configuration do you start the simulated annealing? Do you start with the LZ77 phrases (in particular, the LZ77 scheme that introduces a new character at the end of each referencing phrase)?

If I interpreted the figure in Slide 23 correctly, the black-shaded characters are the endings of the phrases. However, are there any explicit characters?

How did you evaluate the 2-approximation? Do you compute the smallest bidirectional macro scheme or the smallest string attractor via brute force?

Some references that could be interesting for future evaluation:
LZRR (https://ieeexplore.ieee.org/document/8712715) and plcpcomp (https://doi.org/10.4230/LIPIcs.ESA.2019.41) are other bidirectional macro schemes with empirically well-received approximations.
In particular, the plcpcomp approach uses ideas similar to the link/cut tree data structure for merging or splitting phrases, but with the difference that there the focus was put on improving the decompression in external memory.

For the future work: If you propose tests with large files, I wonder how fast your algorithm is.
Simulated annealing and the merge operation on the link cut tree seems to be very slow.

Finally, a didactic hint: Instead of introducing LZ78 in detail (Slide 4), it would have been better to introduce string attractors (needed in Slide 24) and a common macro scheme (e.g. LZ77).

Paper Details

Authors:
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco
Submitted On:
31 March 2020 - 4:25pm
Short Link:
Type:
Presentation Slides
Event:
Session:
Session 5

Document Files

DCC-2020-Approximating_Optimal_Bidirectional_Macro_Schemes.pptx

(44)

Keywords

Additional Categories

Subscribe

[1] Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco, "Approximating Optimal Bidirectional Macro Schemes.", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5083. Accessed: Jul. 07, 2020.
@article{5083-20,
url = {http://sigport.org/5083},
author = {Luís M. S. Russo; Ana D. Correia; Gonzalo Navarro; Alexandre P. Francisco },
publisher = {IEEE SigPort},
title = {Approximating Optimal Bidirectional Macro Schemes.},
year = {2020} }
TY - EJOUR
T1 - Approximating Optimal Bidirectional Macro Schemes.
AU - Luís M. S. Russo; Ana D. Correia; Gonzalo Navarro; Alexandre P. Francisco
PY - 2020
PB - IEEE SigPort
UR - http://sigport.org/5083
ER -
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco. (2020). Approximating Optimal Bidirectional Macro Schemes.. IEEE SigPort. http://sigport.org/5083
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco, 2020. Approximating Optimal Bidirectional Macro Schemes.. Available at: http://sigport.org/5083.
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco. (2020). "Approximating Optimal Bidirectional Macro Schemes.." Web.
1. Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco. Approximating Optimal Bidirectional Macro Schemes. [Internet]. IEEE SigPort; 2020. Available from : http://sigport.org/5083