Presentation Slides
Edge minimization in de Bruijn graphs
- Submitted by: Uwe Baier
- Last updated: 24 March 2020 - 5:56am
- Document Type: Presentation Slides
- Document Year: 2020
- Presenters: Uwe Baier, Thomas Büchler
This paper introduces the de Bruijn graph edge minimization problem, which is related to the compression of de Bruijn graphs: among all orders k, find the order-k de Bruijn graph with the minimum edge count. We describe an efficient algorithm that solves this problem. Since the edge minimization problem is connected to the BWT compression technique called "tunneling", the paper also describes a way to minimize the length of a tunneled BWT such that useful properties for sequence analysis are preserved. Although this setting is a restriction, it is significant progress towards a solution to the open problem of finding optimal disjoint blocks that minimize space, as stated in Alanko et al. (DCC 2019).
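To make the terms "order-k de Bruijn graph" and "edge count" concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm and rests on several assumptions: the input is read as a cyclic string, every distinct (k+1)-mer counts as one edge, and no edge merging is applied, so the numbers are those of the plain graph only. The function names `de_bruijn_edges`, `edge_counts_by_order`, and `min_edge_order` are hypothetical.

```python
from collections import Counter


def de_bruijn_edges(text: str, k: int) -> Counter:
    """Return the (k+1)-mers of the cyclic string `text` with their multiplicities.

    Assumption: each (k+1)-mer w is read as an edge from node w[:k] to node w[1:].
    """
    n = len(text)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(text)")
    doubled = text + text  # lets windows wrap around the end of the string
    return Counter(doubled[i:i + k + 1] for i in range(n))


def edge_counts_by_order(text: str, max_k: int) -> dict:
    """Number of distinct edges of the plain (unmerged) graph for k = 1..max_k."""
    return {k: len(de_bruijn_edges(text, k)) for k in range(1, max_k + 1)}


def min_edge_order(text: str, max_k: int) -> int:
    """Smallest order k in 1..max_k with the fewest distinct edges (brute force)."""
    counts = edge_counts_by_order(text, max_k)
    return min(counts, key=lambda k: (counts[k], k))


if __name__ == "__main__":
    print(edge_counts_by_order("banana$", 3))
    print(min_edge_order("banana$", 3))
```

Because this sketch omits edge merging, it only illustrates what is being counted for a given order; it does not reproduce the objective that the paper minimizes, nor its efficient algorithm.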
Comments
Thanks for the illustrative video!
The gist is that, while computing the space-optimal tunneling is NP-complete in general, computing the tunneling of non-overlapping prefix intervals can be done in linear time (?). As an application, you draw a connection to computing the $k$ for which the number of edges of a de Bruijn graph is minimal after merging edges, since this computation can be carried out by tunneling the non-overlapping prefix intervals.
I was wondering whether, in the experiments, this compressed FM-index could be compared with the run-length compressed FM-index, since run-length compression is somewhat similar (it also needs additional bit vectors).