The successor and predecessor problem consists of obtaining the closest value in a set of integers, greater or smaller than a given value. This problem has interesting applications, like the intersection of inverted lists. It can be easily modeled by using a bitvector of size n and its operations rank and select. However, there is a practical approach [1], which keeps the best theoretical bounds, and allows to solve successor and predecessor more efficiently. Based on that technique, we designed a novel compact data structure for bitvectors with k runs that achieves access, rank, and successor/predecessor in O(1) time by consuming space O(sqrt(kn)) bits. In practice, it obtains a compression ratio of 0.04%-26.33% when the runs are larger than 100, and becomes the fastest technique, which considers compressibility, in successor/predecessor queries. Besides, we present a recursive variant of our structure, which tends to O(k) bits and takes O(log (n/k)) time.

### Paper Details

- Authors:
- Submitted On:
- 20 March 2020 - 3:47pm
- Short Link:
- Type:
- Presentation Slides
- Event:
- Presenter's Name:
- Adrian Gomez Brandon
- Session:
- Session 4

### Subscribe

url = {http://sigport.org/5009},

author = { },

publisher = {IEEE SigPort},

title = {Bitvectors with runs and the successor/predecessor problem},

year = {2020} }

T1 - Bitvectors with runs and the successor/predecessor problem

AU -

PY - 2020

PB - IEEE SigPort

UR - http://sigport.org/5009

ER -

## Comments

## Thank's for your well

Thank's for your well-presented talk! Having watched it, I'd like to ask the following two questions:

Slide 8: $k$ is the number of runs in your bit vector, $m$ the minimum min(#1, #0) of the number of ones or the number of zeros, but what is $b$ (in space for hybrid)? As far as I understood, hybrid selects between the oz-vector and the sd-array representation?

Slide 11: Each block $X_i$ of the partitioning has a fixed length $\beta$.

Does it make sense to maximize the Z and O type blocks and use a (hopefully very sparse and hence compressible) bit vector marking the beginning positions of the $X_i$'s?

## Hi Dominik! Thanks for your

Hi Dominik! Thanks for your questions. I hope you have enjoyed the talk!

With respect to the first question, hybrid splits the bitvector into blocks of a fixed size $b$. The hybrid selects between three kind of representations i) sparse, when the number of $m$ is quite small, ii) runs, there are different areas with the same consecutive values, and iii) a plain representation. Sparse blocks are represented by storing the location of the 1-bit or 0-bit within the block and blocks with runs store the lengths of each run. Different compressions for those parts could be applied, like sd-array or oz-vector, but they can suppose slower response times.

The second question is quite interesting, I think it is another possibility , but I was looking for a way to solve successor/predecessor in O(1) time. By adding that sparse bitmap, I think the constant time is lost because of the rank in sparse bitvectors. However, it is a good option if you look for a better compression ratio.

If you have any doubt or need some additional information, do not hesitate to contact me.