Sorry, you need to enable JavaScript to visit this website.

We describe methods to support fast rank queries on the Burrows-Wheeler transform (BWT) string S of an input string T on alphabet Σ, in order to support pattern counting queries. Our starting point is an approach previously adopted by several authors, which is to represent S as |Σ| bitvectors, where the bitvector for symbol c has a 1 at position i if and only if S[i] = c, with the bitvec- tors stored in Elias-Fano (EF) encodings, to enable binary rank queries. We first show that the clustering of symbols induced by the BWT makes standard implementations of EF unattractive.

Categories:
97 Views

Deep neural networks (DNNs), despite their performance on a wide variety of tasks, are still out of reach for many applications as they require significant computational resources. In this paper, we present a low-rank based end-to-end deep neural network compression frame- work with the goal of enabling DNNs performance to computationally constrained devices. The proposed framework includes techniques for low-rank based structural approximation, quantization and lossless arithmetic coding.

Categories:
114 Views

Little attention has been given to language support for block-based compression algorithms, despite their high implementation complexity. Current implementations have to deal with both the intricacies of the algorithm itself, as well as the low-level optimizations necessary for generating fast code. However, many block-based compression algorithms share a common structure in terms of their data representations, data partitioning operations, and data traversals.
In this work, we propose a set of high-level language abstractions that can succinctly capture this structure.

Categories:
143 Views

Extending recently suggested methods, a new dynamic compression algorithm is proposed, which assigns larger weights to characters that have just been coded by means of an increasing weight function. Empirical results present its efficient compression performance, which, for input files with locally skewed distributions, can improve beyond the lower bound given by the entropy for static encoding, at the price of slower running times for compression, and comparable time for decompression.

Categories:
61 Views

The topological model for spatial objects identifies common boundaries between regions, explicitly storing adjacency relations, which not only improves the efficiency of topology-related queries, but also provides advantages such as avoiding data duplication and facilitating data consistency. Recently, a compact representation of the topological model based on planar graph embeddings was proposed.

Categories:
34 Views

A binary Write Once Memory (wom) device is a storage mechanism in which a 0-bit can be overwritten much more easily than a 1-bit. A famous example is the flash memory technology, where $0 \rightarrow 1$ transitions are allowed, but $1\rightarrow 0$ transitions require a costly erase procedure and are therefore prohibited. A {\sc wom} code is a coding scheme that permits multiple writes to the {\sc wom} without violating the {\sc wom} rule.
The properties of {\sc wom}
attracted attention even before flash memory was invented.

Categories:
41 Views

A particular form of lossless data compression is known as deduplication, which is often applied in a scenario in which a large data repository is given and we wish to store a new, updated, version of it, in which the changes account only for a tiny fraction of the accumulated information. The idea is then to find duplicated parts and store only one copy P of them; the second and subsequent occurrences of these parts can then be replaced by pointers to P.

Categories:
34 Views

Pages