Parallel Processing of Grammar Compression

Masaki Matsushita, Yasushi Inoguchi
Masaki Matsushita
27 February 2021 - 10:09am
Thank you for your presentation!

May I ask whether you can shed more light on the details of your algorithm?
If it is just a parallelization, then the output of the original Re-Pair
algorithm should coincide with your parallel version. Yet, you report
that your compression ratio is slightly worse. My guess is that you
keep a global frequency table, and every CPU processes its own part
of the text (called "Block" in your slides) without synchronization
barriers, so that the bigrams being replaced are not always the most frequent ones.
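
To make the point of comparison concrete, here is a minimal sketch of the sequential Re-Pair idea: repeatedly replace the most frequent bigram with a fresh nonterminal until no bigram occurs twice. It is a naive quadratic-time illustration, not the original linear-time implementation and not the algorithm from the slides. The function names (`count_bigrams`, `repair`, `expand`), the use of integers as nonterminals, and the first-seen tie-breaking rule are my own assumptions.

```python
from collections import Counter


def count_bigrams(seq):
    """Count non-overlapping occurrences of each adjacent pair in seq."""
    counts = Counter()
    last_end = {}  # bigram -> index just past its last counted occurrence
    for i in range(len(seq) - 1):
        bigram = (seq[i], seq[i + 1])
        # Skip an occurrence that overlaps the previous counted one (e.g. "aaa").
        if last_end.get(bigram, -1) <= i:
            counts[bigram] += 1
            last_end[bigram] = i + 2
    return counts


def repair(text):
    """Naive Re-Pair sketch: returns (final sequence, rules).

    Terminals are characters; nonterminals are integers 0, 1, 2, ...
    (an assumption made for this sketch). rules maps nonterminal -> bigram.
    """
    seq = list(text)
    rules = {}
    while True:
        counts = count_bigrams(seq)
        if not counts:
            break
        # Ties are broken by first occurrence (assumed; real tools differ).
        bigram, freq = max(counts.items(), key=lambda kv: kv[1])
        if freq < 2:
            break
        symbol = len(rules)  # fresh nonterminal
        rules[symbol] = bigram
        # Replace occurrences left to right, without overlaps.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == bigram:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Decompress: recursively expand nonterminals back into text."""
    parts = []
    for sym in seq:
        if isinstance(sym, int):
            parts.append(expand(rules[sym], rules))
        else:
            parts.append(sym)
    return "".join(parts)
```

A parallel version that always picks the same globally most frequent bigram (with the same tie-breaking) would produce the same rules as this loop. Any scheme that lets blocks proceed on stale or local frequencies can pick different bigrams, which would explain a different compression ratio.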

Also, what is the Re-Pair software you compare with? (There are multiple
available, all with different trade-offs.)
