Edge minimization in de Bruijn graphs

Citation Author(s):
Thomas Büchler, Enno Ohlebusch, Pascal Weber
Submitted by:
Uwe Baier
Last updated:
24 March 2020 - 5:56am
Document Type:
Presentation Slides
Document Year:
2020
Presenters:
Uwe Baier, Thomas Büchler

This paper introduces the de Bruijn graph edge minimization problem, which is related to the compression of de Bruijn graphs: find the order-k de Bruijn graph with minimum edge count among all orders. We describe an efficient algorithm that solves this problem. Since the edge minimization problem is connected to the BWT compression technique called "tunneling", the paper also describes a way to minimize the length of a tunneled BWT in such a way that useful properties for sequence analysis are preserved. Although this covers only a restricted variant, it is significant progress towards a solution to the open problem of finding optimal disjoint blocks that minimize space, as stated in Alanko et al. (DCC 2019).
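
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the paper's efficient algorithm, and the merging rule is an assumption made for illustration: the text is treated as cyclic, every text position contributes one edge between consecutive k-mers, and parallel edges from u to v are merged into one edge when v is the only successor of u and u is the only predecessor of v. The function names `edge_count_after_merging` and `best_order` are hypothetical.

```python
from collections import Counter, defaultdict


def edge_count_after_merging(s: str, k: int) -> int:
    """Edge count of the order-k de Bruijn multigraph of the cyclic string s.

    Assumption (not taken from the abstract): parallel edges u -> v are merged
    into a single edge when v is the only successor of u and u is the only
    predecessor of v. All other edges keep their multiplicity.
    """
    n = len(s)
    doubled = s + s  # allows cyclic reads of k+1 characters from any start
    # One edge per text position: k-mer starting at i -> k-mer starting at i+1.
    edges = Counter(
        (doubled[i:i + k], doubled[i + 1:i + 1 + k]) for i in range(n)
    )

    succ = defaultdict(set)  # distinct successors of each node
    pred = defaultdict(set)  # distinct predecessors of each node
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)

    total = 0
    for (u, v), multiplicity in edges.items():
        mergeable = len(succ[u]) == 1 and len(pred[v]) == 1
        total += 1 if mergeable else multiplicity
    return total


def best_order(s: str):
    """Try every order 1..len(s) and return (smallest best k, all counts)."""
    counts = {k: edge_count_after_merging(s, k) for k in range(1, len(s) + 1)}
    k_min = min(counts, key=lambda k: (counts[k], k))
    return k_min, counts


if __name__ == "__main__":
    k_min, counts = best_order("aabaab$")
    print(k_min, counts[k_min])  # prints "2 6"
```

For the toy text `aabaab$`, order 1 and all orders from 3 upward leave 7 edges, while order 2 merges the two parallel edges from `aa` to `ab` and ends with 6. Under this assumed rule, the minimum therefore lies at neither the smallest nor the largest order.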


Comments

Thanks for the illustrative video!
The gist is that, while computing the space-optimal tunneling is NP-complete in general, computing the tunneling of non-overlapping prefix intervals can be done in linear time (?). As an application, you draw a connection to computing the $k$ for which the number of edges of a de Bruijn graph is minimal after merging edges, since it turns out that this computation can be carried out by tunneling the non-overlapping prefix intervals.
Regarding the experiments, I was wondering whether this compressed FM-index can be compared with the run-length compressed FM-index, since run-length compression is somewhat similar (it also needs additional bit vectors).