Sorry, you need to enable JavaScript to visit this website.

Compressing and Randomly Accessing Sequences

Primary tabs

Citation Author(s):
Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman
Submitted by:
Rajeev Raman
Last updated:
21 April 2020 - 10:31am
Document Type:
Poster
Document Year:
2020
Event:
Presenters Name:
Rajeev Raman
Paper Code:
DCC-197

Abstract 

Abstract: 

In this paper we consider the problem of storing sequences of symbols in
a compressed format, while supporting random access to the symbols without
decompression. Although this is a well-studied problem when the data is
textual, the kind of sequences we look at are not textual, and we argue
that traditional compression methods used in the text algorithms community
(such as compressors targeting $k$-th order empirical entropy) do not
perform as well on these sequential data, and simpler methods such
as Huffman-coding the deltas between sequence elements give better
compression performance. We discuss data structures that allow
random access to sequence elements that target such measures.

up
0 users have voted:

Comments

The problem dealt in this poster seems to be the same as https://sigport.org/documents/towards-better-compressed-representations,
but here you follow a delta-encoded approach.
Since you said that the theorem shown in the poster is unattractive because $S$ may be very large, I wonder how large $S$ actually is on the used datasets.
We know that $S <= nσ$, but it could be much smaller.

Dataset Files

main.pdf

(226)