We give memory-friendly algorithms that compute the lexparse compression scheme in linear time.
Do I correctly understand that the decompression of the lexparse (as well as other bidirectional macro schemes) would be very long, e.g. in comparison with LZ77? Is there any practical decompression method?
Thanks for your question. Indeed, decompression can be slow. Consider decoding LZ77, where a factor always refers to the left: we can decode an LZ77-encoded file by a single left-to-right scan of the factor pairs, since when visiting a factor, its reference has already been decoded (or, in case of an overlapping factor, the parts needed for its reconstruction).
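To illustrate, here is a minimal sketch of such a left-to-right LZ77 decoder (the factor representation — literal characters and (position, length) pairs referring to absolute positions in the output — is an assumption for illustration, not a fixed encoding):

```python
def lz77_decode(factors):
    """Decode LZ77 factors in a single left-to-right scan.

    Each factor is either a literal character (a one-char string) or a
    (position, length) pair referring to an earlier absolute position in
    the output. Overlapping factors (where the copied range extends into
    text being written) are handled by copying character by character.
    """
    out = []
    for f in factors:
        if isinstance(f, str):
            out.append(f)
        else:
            pos, length = f
            for i in range(length):
                # may read a character written earlier in this same loop
                out.append(out[pos + i])
    return "".join(out)
```

For example, `lz77_decode(['a', 'b', (0, 6)])` reconstructs `"abababab"`: the factor `(0, 6)` overlaps itself, copying characters as soon as they are produced.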
For lexicographic parsings, although we prohibit cycles when creating factors that refer to characters substituted by other factors, such dependency chains can become very long. Resolving them may require many random accesses on the partially decoded text, which can be slow.
Actually, we tackled this problem in the external memory model, proposing an algorithm that splits up factors into these dependency chains:
Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck: "Bidirectional Text Compression in External Memory". ESA 2019: 41:1-41:16, https://doi.org/10.4230/LIPIcs.ESA.2019.41