Sorry, you need to enable JavaScript to visit this website.

Towards Better Compressed Representations

Citation Author(s):
Michał Gańczorz
Submitted by:
Michal Ganczorz
Last updated:
25 March 2020 - 1:11pm
Document Type:
Poster
Event:
Presenters:
Michał Gańczorz
Paper Code:
152
Categories:
Keywords:
 

We introduce the problem of computing a parsing where each phrase is of length at most m and which minimizes the zeroth order entropy of parsing. Based on the recent theoretical results we devise a heuristic for this problem. The solution has straightforward application in succinct text representations and gives practical improvements. Moreover the proposed heuristic yields structure which size can be bounded both by |S|H_{m-1}(S) and by |S|/m(H_0(S) + ... + H_{m-1}(S)),where H_{k}(S) is the k-th order empirical entropy of S. We also consider a similar problem in which the first-order entropy is minimized.

up
0 users have voted: