Sorry, you need to enable JavaScript to visit this website.

Constructing the CDAWG CFG using LCP-Intervals

Citation Author(s):
Alan Cleary, Jordan Dood
Submitted by:
Alan Cleary
Last updated:
21 February 2023 - 12:25pm
Document Type:
Presentation Slides
Document Year:
2023
Event:
Presenters:
Jordan Dood
Paper Code:
240
Categories:
Keywords:
 

It is known that a context-free grammar (CFG) that produces a single string can be derived from the compact directed acyclic word graph (CDAWG) for the same string. In this work, we show that the CFG derived from a CDAWG is deeply connected to the maximal repeat content of the string it produces and thus has O(m) rules, where m is the number of maximal repeats in the string. We then provide a generic algorithm based on this insight for constructing the CFG from the LCP-intervals of a string in O(n) time, where n is the length of the string. This includes a novel data-structure to support stabbing queries on LCP-intervals in O(1 + k) time after O(n) preprocessing time, where k is the number of intervals stabbed. These results connect the CFG to properties of the string it produces and relates it to other string data-structures, allowing it to be studied independently of the CDAWG and providing opportunity for innovation of grammar-based compression algorithms.

up
0 users have voted: