Sorry, you need to enable JavaScript to visit this website.

It is known that a context-free grammar (CFG) that produces a single string can be derived from the compact directed acyclic word graph (CDAWG) for the same string. In this work, we show that the CFG derived from a CDAWG is deeply connected to the maximal repeat content of the string it produces and thus has O(m) rules, where m is the number of maximal repeats in the string. We then provide a generic algorithm based on this insight for constructing the CFG from the LCP-intervals of a string in O(n) time, where n is the length of the string.

Categories:
92 Views

Canonical binary AIFV coding contains two trees $T_0$ and $T_1$. We show the method to compress $T_0$, and the method to compress $T_1$ is with a similar way. We provide a new method to store the number of leaves, master nodes and complete internal nodes in each layer and compactly encode the string of numbers according to the specific property between the nodes.

Categories:
19 Views

Given a string S over an alphabet of size σ, we consider practical implementations of extended compressed RAM on S, which supports access, replace, insert, and delete operations on S while maintaining S in compressed form. In this paper, we proposed two implementations where each of them is based on the compressed RAM of Jansson et al. [ICALP 2012], and Grossi et al. [ICALP 2013], respectively. Experimental results show that our implementations support the operations efficiently while keeping the space proportional to the entropy of the input during the updates.

Categories:
47 Views

In this paper, we propose a novel approach to succinct coding of permutations taking advantage of the “divide-and-conquer” strategy. In addition, we provide a theoretical analysis of the proposed approach leading to formulations allowing to calculate precise bounds (minimum, average, maximum) to the length of permutation coding expressed in a number of bits per permutation element for various sizes of permutations n being integer powers of 2.

Categories:
84 Views

The variable-length Reverse Multi-Delimiter (RMD) codes are known to represent sequences of unbounded and unordered integers. When applied to data compression, they combine a good compression ratio with fast decoding. In this paper, we investigate another property of RMD-codes - the ability of direct access to codewords in the encoded bitstream. We present the method allowing us to extract and decode a codeword from an RMD-bitstream in almost constant time with the tiny space overhead, and make experiments on its application to natural language text compression.

Categories:
44 Views

Generative Adversarial Networks (GANs) have shown promise in augmenting datasets and boosting convolutional neural networks' (CNN) performance on image classification tasks. But they introduce more hyperparameters to tune as well as the need for additional time and computational power to train supplementary to the CNN. In this work, we examine the potential for Auxiliary-Classifier GANs (AC-GANs) as a 'one-stop-shop' architecture for image classification, particularly in low data regimes.

Categories:
44 Views

Pages