Documents
Poster
Poster
Towards Better Compressed Representations
- Citation Author(s):
- Submitted by:
- Michal Ganczorz
- Last updated:
- 25 March 2020 - 1:11pm
- Document Type:
- Poster
- Event:
- Presenters:
- Michał Gańczorz
- Paper Code:
- 152
- Categories:
- Keywords:
- Log in to post comments
We introduce the problem of computing a parsing where each phrase is of length at most m and which minimizes the zeroth order entropy of parsing. Based on the recent theoretical results we devise a heuristic for this problem. The solution has straightforward application in succinct text representations and gives practical improvements. Moreover the proposed heuristic yields structure which size can be bounded both by |S|H_{m-1}(S) and by |S|/m(H_0(S) + ... + H_{m-1}(S)),where H_{k}(S) is the k-th order empirical entropy of S. We also consider a similar problem in which the first-order entropy is minimized.
poster_dcc.pdf
poster_dcc.pdf (307)