Sorry, you need to enable JavaScript to visit this website.

Computing Lexicographic Parsings

Primary tabs

Citation Author(s):
Submitted by:
Dominik Koeppl
Last updated:
18 February 2022 - 5:38am
Document Type:
Presentation Slides
Document Year:
2022
Event:
Presenters Name:
Dominik Köppl
Categories:

Abstract 

Abstract: 

We give memory-friendly algorithms computing the compression schemes lexparse in linear time.

up
0 users have voted:

Comments

Hi Dominik,
Do I correctly understand that the decompression of the lexparse (as well as other bidirectional macro schemes) would be very long, e.g. in comparison with LZ77? Is there any practical decompression method?

Hi Igor,
thanks for your question. Indeed, decompression can be slow. When considering decoding LZ77, where a factor always refers to the left, we can easily decode an LZ77-encoded file by scanning the factor pairs from left to right since when visiting a factor pair, we already have decoded its reference (or the necessary parts for reconstruction in case of an overlapping factor).
For lexicographic parsings, although we prohibit cycles during the creation of factors referring to characters substituted by other factors, these kind of dependency chains can be very long. So it could be that we need many random accesses on a partially decoded text, which can be slow.
Actually, we tackled this problem in the external memory model, proposing an algorithm that splits up factors into these dependency chains:

Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck: "Bidirectional Text Compression in External Memory". ESA 2019: 41:1-41:16, https://doi.org/10.4230/LIPIcs.ESA.2019.41

Dataset Files

lexparse_dcc.pdf

(34)