Sorry, you need to enable JavaScript to visit this website.

Knowledge and Data Engineering

Compressing and Randomly Accessing Sequences


In this paper we consider the problem of storing sequences of symbols in
a compressed format, while supporting random access to the symbols without
decompression. Although this is a well-studied problem when the data is
textual, the kind of sequences we look at are not textual, and we argue
that traditional compression methods used in the text algorithms community
(such as compressors targeting $k$-th order empirical entropy) do not
perform as well on these sequential data, and simpler methods such

Paper Details

Authors:
Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman
Submitted On:
21 April 2020 - 10:31am
Short Link:
Type:
Event:
Presenter's Name:
Paper Code:
Session:
Document Year:
Cite

Document Files

main.pdf

(60)

Keywords

Additional Categories

Subscribe

[1] Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman, "Compressing and Randomly Accessing Sequences", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5109. Accessed: Sep. 25, 2020.
@article{5109-20,
url = {http://sigport.org/5109},
author = {Laith Ali Abdulsahib; Diego Arroyuelo; Rajeev Raman },
publisher = {IEEE SigPort},
title = {Compressing and Randomly Accessing Sequences},
year = {2020} }
TY - EJOUR
T1 - Compressing and Randomly Accessing Sequences
AU - Laith Ali Abdulsahib; Diego Arroyuelo; Rajeev Raman
PY - 2020
PB - IEEE SigPort
UR - http://sigport.org/5109
ER -
Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman. (2020). Compressing and Randomly Accessing Sequences. IEEE SigPort. http://sigport.org/5109
Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman, 2020. Compressing and Randomly Accessing Sequences. Available at: http://sigport.org/5109.
Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman. (2020). "Compressing and Randomly Accessing Sequences." Web.
1. Laith Ali Abdulsahib, Diego Arroyuelo, Rajeev Raman. Compressing and Randomly Accessing Sequences [Internet]. IEEE SigPort; 2020. Available from : http://sigport.org/5109

Decompressing Lempel-Ziv compressed text


We consider the problem of decompressing the Lempel--Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but requires random access to the whole decompressed text. Another folklore solution is to convert LZ77 into a grammar of size $O(z\log(n/z))$ and then stream $S$ in linear time. In this paper, we show that $O(n)$ time and $O(z)$ working space can be achieved for constant-size alphabets.

Paper Details

Authors:
Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza
Submitted On:
24 April 2020 - 12:40pm
Short Link:
Type:
Event:
Presenter's Name:
Session:
Document Year:
Cite

Document Files

DCC '20 slides for Travis Gagie (embedded audio files)

(89)

Keywords

Additional Categories

Subscribe

[1] Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza, "Decompressing Lempel-Ziv compressed text", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5090. Accessed: Sep. 25, 2020.
@article{5090-20,
url = {http://sigport.org/5090},
author = {Philip Bille; Mikko Berggren Ettienne; Travis Gagie; Inge Li Gortz; Nicola Prezza },
publisher = {IEEE SigPort},
title = {Decompressing Lempel-Ziv compressed text},
year = {2020} }
TY - EJOUR
T1 - Decompressing Lempel-Ziv compressed text
AU - Philip Bille; Mikko Berggren Ettienne; Travis Gagie; Inge Li Gortz; Nicola Prezza
PY - 2020
PB - IEEE SigPort
UR - http://sigport.org/5090
ER -
Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza. (2020). Decompressing Lempel-Ziv compressed text. IEEE SigPort. http://sigport.org/5090
Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza, 2020. Decompressing Lempel-Ziv compressed text. Available at: http://sigport.org/5090.
Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza. (2020). "Decompressing Lempel-Ziv compressed text." Web.
1. Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gortz, Nicola Prezza. Decompressing Lempel-Ziv compressed text [Internet]. IEEE SigPort; 2020. Available from : http://sigport.org/5090

Approximating Optimal Bidirectional Macro Schemes.


Lempel-Ziv is an easy-to-compute member of a wide family of so-called macro schemes; it restricts pointers to go in one direction only. Optimal bidirectional macro schemes are NP-complete to find, but they may provide much better compression on highly repetitive sequences. We consider the problem of approximating optimal bidirectional macro schemes. We describe a simulated annealing algorithm that usually converges quickly. Moreover, in some cases, we obtain bidirectional macro schemes that are provably a 2-approximation of the optimal.

Paper Details

Authors:
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco
Submitted On:
31 March 2020 - 4:25pm
Short Link:
Type:
Event:
Session:

Document Files

DCC-2020-Approximating_Optimal_Bidirectional_Macro_Schemes.pptx

(70)

Keywords

Additional Categories

Subscribe

[1] Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco, "Approximating Optimal Bidirectional Macro Schemes.", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5083. Accessed: Sep. 25, 2020.
@article{5083-20,
url = {http://sigport.org/5083},
author = {Luís M. S. Russo; Ana D. Correia; Gonzalo Navarro; Alexandre P. Francisco },
publisher = {IEEE SigPort},
title = {Approximating Optimal Bidirectional Macro Schemes.},
year = {2020} }
TY - EJOUR
T1 - Approximating Optimal Bidirectional Macro Schemes.
AU - Luís M. S. Russo; Ana D. Correia; Gonzalo Navarro; Alexandre P. Francisco
PY - 2020
PB - IEEE SigPort
UR - http://sigport.org/5083
ER -
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco. (2020). Approximating Optimal Bidirectional Macro Schemes.. IEEE SigPort. http://sigport.org/5083
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco, 2020. Approximating Optimal Bidirectional Macro Schemes.. Available at: http://sigport.org/5083.
Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco. (2020). "Approximating Optimal Bidirectional Macro Schemes.." Web.
1. Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco. Approximating Optimal Bidirectional Macro Schemes. [Internet]. IEEE SigPort; 2020. Available from : http://sigport.org/5083

Grammar compression with probabilistic context-free grammar


We propose a new approach for universal lossless text compression, based on grammar compression. In the literature, a target string T has been compressed as a context-free grammar G in Chomsky normal form satisfying L(G) = T. Such a grammar is often called a straight-line program (SLP). In this work, we consider a probabilistic grammar G that generates T, but not necessarily as a unique element of L(G). In order to recover the original text T unambiguously, we keep both the grammar G and the derivation tree of T from the start symbol in G, in compressed form.

Paper Details

Authors:
Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi
Submitted On:
31 March 2020 - 2:53am
Short Link:
Type:
Event:
Presenter's Name:
Session:
Document Year:
Cite

Document Files

dcc_poster.pdf

(72)

Keywords

Additional Categories

Subscribe

[1] Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi, "Grammar compression with probabilistic context-free grammar", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5074. Accessed: Sep. 25, 2020.
@article{5074-20,
url = {http://sigport.org/5074},
author = {Hiroaki Naganuma; Diptarama Hendrian; Ryo Yoshinaka; Ayumi Shinohara; Naoki Kobayashi },
publisher = {IEEE SigPort},
title = {Grammar compression with probabilistic context-free grammar},
year = {2020} }
TY - EJOUR
T1 - Grammar compression with probabilistic context-free grammar
AU - Hiroaki Naganuma; Diptarama Hendrian; Ryo Yoshinaka; Ayumi Shinohara; Naoki Kobayashi
PY - 2020
PB - IEEE SigPort
UR - http://sigport.org/5074
ER -
Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi. (2020). Grammar compression with probabilistic context-free grammar. IEEE SigPort. http://sigport.org/5074
Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi, 2020. Grammar compression with probabilistic context-free grammar. Available at: http://sigport.org/5074.
Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi. (2020). "Grammar compression with probabilistic context-free grammar." Web.
1. Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi. Grammar compression with probabilistic context-free grammar [Internet]. IEEE SigPort; 2020. Available from : http://sigport.org/5074

Reverse Multi-Delimiter Compression Codes


An enhanced version of a recently introduced family of variable length binary codes with multiple pattern delimiters is presented and discussed. These codes are complete, universal, synchronizable, they have monotonic indexing and allow a standard search in compressed files. Comparing the compression rate on natural language texts demonstrates that introduced codes appear to be much superior to other known codes with similar properties. A fast byte-aligned decoding algorithm is constructed, which operates much faster than the one for Fibonacci codes.

Paper Details

Authors:
Anatoly Anisimov
Submitted On:
29 March 2020 - 4:53am
Short Link:
Type:
Event:
Presenter's Name:
Session:
Document Year:
Cite

Document Files

presentation.pptx

(70)

Keywords

Additional Categories

Subscribe

[1] Anatoly Anisimov, "Reverse Multi-Delimiter Compression Codes", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5039. Accessed: Sep. 25, 2020.
@article{5039-20,
url = {http://sigport.org/5039},
author = {Anatoly Anisimov },
publisher = {IEEE SigPort},
title = {Reverse Multi-Delimiter Compression Codes},
year = {2020} }
TY - EJOUR
T1 - Reverse Multi-Delimiter Compression Codes
AU - Anatoly Anisimov
PY - 2020
PB - IEEE SigPort
UR - http://sigport.org/5039
ER -
Anatoly Anisimov. (2020). Reverse Multi-Delimiter Compression Codes. IEEE SigPort. http://sigport.org/5039
Anatoly Anisimov, 2020. Reverse Multi-Delimiter Compression Codes. Available at: http://sigport.org/5039.
Anatoly Anisimov. (2020). "Reverse Multi-Delimiter Compression Codes." Web.
1. Anatoly Anisimov. Reverse Multi-Delimiter Compression Codes [Internet]. IEEE SigPort; 2020. Available from : http://sigport.org/5039

An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition


In this paper, we examine the problem of modeling overdispersed frequency vectors that are naturally generated by several machine learning and computer vision applications.

Paper Details

Authors:
Nuha Zamzami, and Nizar Bouguila
Submitted On:
9 November 2019 - 7:08am
Short Link:
Type:
Event:
Presenter's Name:
Paper Code:
Document Year:
Cite

Document Files

MSD_MeshPres.pdf

(84)

Subscribe

[1] Nuha Zamzami, and Nizar Bouguila , "An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4940. Accessed: Sep. 25, 2020.
@article{4940-19,
url = {http://sigport.org/4940},
author = {Nuha Zamzami; and Nizar Bouguila },
publisher = {IEEE SigPort},
title = {An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition},
year = {2019} }
TY - EJOUR
T1 - An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition
AU - Nuha Zamzami; and Nizar Bouguila
PY - 2019
PB - IEEE SigPort
UR - http://sigport.org/4940
ER -
Nuha Zamzami, and Nizar Bouguila . (2019). An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition. IEEE SigPort. http://sigport.org/4940
Nuha Zamzami, and Nizar Bouguila , 2019. An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition. Available at: http://sigport.org/4940.
Nuha Zamzami, and Nizar Bouguila . (2019). "An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition." Web.
1. Nuha Zamzami, and Nizar Bouguila . An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition [Internet]. IEEE SigPort; 2019. Available from : http://sigport.org/4940

An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition


In this paper, we examine the problem of modeling overdispersed frequency vectors that are naturally generated by several machine learning and computer vision applications.

Paper Details

Authors:
Nuha Zamzami, and Nizar Bouguila
Submitted On:
9 November 2019 - 7:05am
Short Link:
Type:
Event:
Presenter's Name:
Paper Code:
Document Year:
Cite

Document Files

MSD_Mesh.pdf

(93)

Subscribe

[1] Nuha Zamzami, and Nizar Bouguila , "An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4939. Accessed: Sep. 25, 2020.
@article{4939-19,
url = {http://sigport.org/4939},
author = {Nuha Zamzami; and Nizar Bouguila },
publisher = {IEEE SigPort},
title = {An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition},
year = {2019} }
TY - EJOUR
T1 - An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition
AU - Nuha Zamzami; and Nizar Bouguila
PY - 2019
PB - IEEE SigPort
UR - http://sigport.org/4939
ER -
Nuha Zamzami, and Nizar Bouguila . (2019). An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition. IEEE SigPort. http://sigport.org/4939
Nuha Zamzami, and Nizar Bouguila , 2019. An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition. Available at: http://sigport.org/4939.
Nuha Zamzami, and Nizar Bouguila . (2019). "An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition." Web.
1. Nuha Zamzami, and Nizar Bouguila . An Accurate Evaluation of MSD Log-likelihood and its Application in Human Action Recognition [Internet]. IEEE SigPort; 2019. Available from : http://sigport.org/4939

Fuzzy Personalized Scoring Model for Recommendation System


In this research, we aim to propose a data preprocessing framework particularly for financial sector to generate the rating data as input to the collaborative system. First, clustering technique is applied to cluster all users based on their demographic information which might be able to differentiate the customers’ background. Then, for each customer group, the importance of demographic characteristics which are highly associated with financial products purchasing are analyzed by the proposed fuzzy integral technique.

Paper Details

Authors:
Chao-Lung Yang, Shang-Che Hsu, Kai-Lung Hua, Wen-Huang Cheng
Submitted On:
8 May 2019 - 7:40am
Short Link:
Type:
Event:
Presenter's Name:
Paper Code:
Document Year:
Cite

Document Files

Poster_design_20190508_final.pdf

(129)

Subscribe

[1] Chao-Lung Yang, Shang-Che Hsu, Kai-Lung Hua, Wen-Huang Cheng, "Fuzzy Personalized Scoring Model for Recommendation System", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4077. Accessed: Sep. 25, 2020.
@article{4077-19,
url = {http://sigport.org/4077},
author = {Chao-Lung Yang; Shang-Che Hsu; Kai-Lung Hua; Wen-Huang Cheng },
publisher = {IEEE SigPort},
title = {Fuzzy Personalized Scoring Model for Recommendation System},
year = {2019} }
TY - EJOUR
T1 - Fuzzy Personalized Scoring Model for Recommendation System
AU - Chao-Lung Yang; Shang-Che Hsu; Kai-Lung Hua; Wen-Huang Cheng
PY - 2019
PB - IEEE SigPort
UR - http://sigport.org/4077
ER -
Chao-Lung Yang, Shang-Che Hsu, Kai-Lung Hua, Wen-Huang Cheng. (2019). Fuzzy Personalized Scoring Model for Recommendation System. IEEE SigPort. http://sigport.org/4077
Chao-Lung Yang, Shang-Che Hsu, Kai-Lung Hua, Wen-Huang Cheng, 2019. Fuzzy Personalized Scoring Model for Recommendation System. Available at: http://sigport.org/4077.
Chao-Lung Yang, Shang-Che Hsu, Kai-Lung Hua, Wen-Huang Cheng. (2019). "Fuzzy Personalized Scoring Model for Recommendation System." Web.
1. Chao-Lung Yang, Shang-Che Hsu, Kai-Lung Hua, Wen-Huang Cheng. Fuzzy Personalized Scoring Model for Recommendation System [Internet]. IEEE SigPort; 2019. Available from : http://sigport.org/4077

GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu)


We reveal an interesting link between tensors and multivariate statistics. The rank of a multivariate probability tensor can be interpreted as a nonlinear measure of statistical dependence of the associated random variables. Rank equals one when the random variables are independent, and complete statistical dependence corresponds to full rank; but we show that rank as low as two can already model strong statistical dependence.

Paper Details

Authors:
N.D. Sidiropoulos, N. Kargas, X. Fu
Submitted On:
24 December 2018 - 8:25pm
Short Link:
Type:
Event:
Presenter's Name:
Paper Code:
Document Year:
Cite

Document Files

GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu)

(254)

Subscribe

[1] N.D. Sidiropoulos, N. Kargas, X. Fu, "GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu)", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3842. Accessed: Sep. 25, 2020.
@article{3842-18,
url = {http://sigport.org/3842},
author = {N.D. Sidiropoulos; N. Kargas; X. Fu },
publisher = {IEEE SigPort},
title = {GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu)},
year = {2018} }
TY - EJOUR
T1 - GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu)
AU - N.D. Sidiropoulos; N. Kargas; X. Fu
PY - 2018
PB - IEEE SigPort
UR - http://sigport.org/3842
ER -
N.D. Sidiropoulos, N. Kargas, X. Fu. (2018). GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu). IEEE SigPort. http://sigport.org/3842
N.D. Sidiropoulos, N. Kargas, X. Fu, 2018. GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu). Available at: http://sigport.org/3842.
N.D. Sidiropoulos, N. Kargas, X. Fu. (2018). "GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu)." Web.
1. N.D. Sidiropoulos, N. Kargas, X. Fu. GlobalSIP 2018 Keynote: Tensors and Probability: An Intriguing Union (N. Sidiropoulos, N. Kargas, X. Fu) [Internet]. IEEE SigPort; 2018. Available from : http://sigport.org/3842

Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series


Learning the dynamics of complex systems features a large number of applications in data science. Graph-based modeling and inference underpins the most prominent family of approaches to learn complex dynamics due to their ability to capture the intrinsic sparsity of direct interactions in such systems. They also provide the user with interpretable graphs that unveil behavioral patterns and changes.

Paper Details

Authors:
Daniel Romero, Bakht Zaman, Baltasar Beferull-Lozano
Submitted On:
1 March 2019 - 9:08pm
Short Link:
Type:
Event:
Presenter's Name:
Paper Code:
Document Year:
Cite

Document Files

Dynamic network identification poster

(176)

Keywords

Additional Categories

Subscribe

[1] Daniel Romero, Bakht Zaman, Baltasar Beferull-Lozano, "Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3841. Accessed: Sep. 25, 2020.
@article{3841-18,
url = {http://sigport.org/3841},
author = {Daniel Romero; Bakht Zaman; Baltasar Beferull-Lozano },
publisher = {IEEE SigPort},
title = {Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series},
year = {2018} }
TY - EJOUR
T1 - Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series
AU - Daniel Romero; Bakht Zaman; Baltasar Beferull-Lozano
PY - 2018
PB - IEEE SigPort
UR - http://sigport.org/3841
ER -
Daniel Romero, Bakht Zaman, Baltasar Beferull-Lozano. (2018). Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series. IEEE SigPort. http://sigport.org/3841
Daniel Romero, Bakht Zaman, Baltasar Beferull-Lozano, 2018. Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series. Available at: http://sigport.org/3841.
Daniel Romero, Bakht Zaman, Baltasar Beferull-Lozano. (2018). "Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series." Web.
1. Daniel Romero, Bakht Zaman, Baltasar Beferull-Lozano. Dynamic Network Identification From Non-stationary Vector Autoregressive Time Series [Internet]. IEEE SigPort; 2018. Available from : http://sigport.org/3841

Pages