We give memory-friendly algorithms that compute the lexparse compression scheme in linear time.
Do I correctly understand that the decompression of the lexparse (as well as other bidirectional macro schemes) would be very long, e.g. in comparison with LZ77? Is there any practical decompression method?
Thanks for your question. Indeed, decompression can be slow. Consider decoding LZ77, where a factor always refers to the left: we can decode an LZ77-encoded file by a single left-to-right scan of the factor pairs, since when visiting a factor, its reference has already been decoded (or, in case of an overlapping factor, the parts needed for its reconstruction).
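To illustrate, here is a minimal sketch of such a left-to-right LZ77 decoder (the factor representation — literal characters and (position, length) pairs referring to absolute positions in the output — is an assumption for illustration, not a fixed encoding):

```python
def lz77_decode(factors):
    """Decode LZ77 factors in a single left-to-right scan.

    Each factor is either a literal character (a one-char string) or a
    (position, length) pair referring to an earlier absolute position in
    the output. Overlapping factors (where the copied range extends into
    text being written) are handled by copying character by character.
    """
    out = []
    for f in factors:
        if isinstance(f, str):
            out.append(f)
        else:
            pos, length = f
            for i in range(length):
                # may read a character written earlier in this same loop
                out.append(out[pos + i])
    return "".join(out)
```

For example, `lz77_decode(['a', 'b', (0, 6)])` reconstructs `"abababab"`: the factor `(0, 6)` overlaps itself, copying characters as soon as they are produced.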
For lexicographic parsings, although we prohibit cycles when creating factors that refer to characters substituted by other factors, such dependency chains can become very long. Resolving them may require many random accesses on the partially decoded text, which can be slow.
Actually, we tackled this problem in the external memory model, proposing an algorithm that splits up factors into these dependency chains:
Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck: "Bidirectional Text Compression in External Memory". ESA 2019: 41:1-41:16, https://doi.org/10.4230/LIPIcs.ESA.2019.41