ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data

Fabian Knorr, Peter Thoman, Thomas Fahringer
Fabian Knorr
3 March 2021 - 1:34am
Presentation Slides



Exchanging large amounts of floating-point data is common in distributed scientific computing applications. Data compression, when fast enough, can speed up such workloads by reducing the time spent waiting for data transfers. We propose ndzip, a high-throughput, lossless compression algorithm for multi-dimensional univariate regular grids of single- and double-precision floating-point data. Tailored towards efficient implementation on modern SIMD-capable multicore processors, it compresses and decompresses data at speeds close to main memory bandwidth, significantly outperforming existing schemes. We evaluate this novel method using a representative set of scientific data, demonstrating a competitive trade-off between compression effectiveness and throughput.
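The "when fast enough" condition can be made concrete with a rough break-even estimate. The sketch below is not taken from the paper: it assumes a simple sequential model (compress, then send, then decompress, with no overlap between stages), and the function names and bandwidth figures are hypothetical and chosen only for illustration.

```python
def raw_transfer_time(size, link_bw):
    """Time to send `size` bytes uncompressed over a link of bandwidth `link_bw`."""
    return size / link_bw


def compressed_transfer_time(size, link_bw, ratio, comp_bw, decomp_bw):
    """Time to compress, send, and decompress `size` bytes.

    Assumes the three stages run one after another (no pipelining).
    `ratio` is uncompressed size divided by compressed size.
    """
    return size / comp_bw + (size / ratio) / link_bw + size / decomp_bw


def break_even_ratio(link_bw, comp_bw, decomp_bw):
    """Compression ratio above which compression shortens the transfer.

    Returns None when the compressor and decompressor are too slow
    relative to the link for any ratio to pay off.
    """
    overhead = link_bw / comp_bw + link_bw / decomp_bw
    if overhead >= 1:
        return None
    return 1 / (1 - overhead)


if __name__ == "__main__":
    # Hypothetical figures in arbitrary units: the codec is 4x faster than the link.
    link_bw, comp_bw, decomp_bw = 1.0, 4.0, 4.0
    print(break_even_ratio(link_bw, comp_bw, decomp_bw))
    print(raw_transfer_time(8.0, link_bw))
    print(compressed_transfer_time(8.0, link_bw, 4.0, comp_bw, decomp_bw))
```

Under this model, a compressor whose throughput is far above the link bandwidth pays off even at modest compression ratios, whereas a compressor no faster than the link cannot pay off at any ratio. Overlapping compression with the transfer would relax the condition further.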

