Deep Tree Models for ‘Big’ Biological Data

Abstract: 

The identification of useful temporal dependence structure in discrete time series data is an important component of algorithms applied to many tasks in statistical inference and machine learning, and used in a wide variety of problems across the spectrum of biological studies. Most of the early statistical approaches were ineffective in practice, because the amount of data required for reliable modelling grew exponentially with memory length. On the other hand, many of the more modern methodological approaches that make use of more flexible and parsimonious models result in algorithms that do not scale well and are computationally ineffective for larger data sets. In this paper we describe a class of novel methodological tools for effective Bayesian inference for general discrete time series, motivated primarily by questions regarding data originating from studies in genetics and neuroscience.

Our starting point is the development of a rich class of Bayesian hierarchical models for variable-memory Markov chains. The particular prior structure we adopt makes it possible to design effective, linear-time algorithms that can compute most of the important features of the relevant posterior and predictive distributions without resorting to Markov chain Monte Carlo simulation. The origin of some of these algorithms can be traced to the family of Context Tree Weighting (CTW) algorithms developed for data compression since the mid-1990s. We have used the resulting methodological tools in numerous application-specific tasks (including prediction, segmentation, classification, anomaly detection, entropy estimation, and causality testing) on data from different areas of application. The results obtained compare quite favourably with those obtained using earlier approaches, such as Probabilistic Suffix Trees (PST), Variable-Length Markov Chains (VLMC), and the class of Mixture Transition Distribution (MTD) models.
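For readers unfamiliar with CTW, the following is a minimal sketch of the classical binary version of the idea, not the hierarchical Bayesian method presented in the slides. It assumes a binary alphabet, the Krichevsky-Trofimov (KT) estimator at each node, and equal weights of 1/2 at every internal node; the function names `kt_log_prob`, `ctw_log_prob` and `predict_next` are hypothetical and chosen only for illustration. The point it illustrates is that the probability of the data, averaged over all context trees up to a given depth, can be computed exactly by one pass over the data plus one recursion over the observed contexts, with no simulation.

```python
from collections import defaultdict
from math import lgamma, log, exp, pi


def kt_log_prob(a, b):
    """Log KT block probability of a binary string with a zeros and b ones."""
    return lgamma(a + 0.5) + lgamma(b + 0.5) - lgamma(a + b + 1) - log(pi)


def log_add(x, y):
    """Numerically stable log(exp(x) + exp(y))."""
    m = max(x, y)
    return m + log(exp(x - m) + exp(y - m))


def ctw_log_prob(bits, depth):
    """Log mixture probability of bits[depth:], given bits[:depth] as initial context.

    Assumption: classical binary CTW with weight 1/2 at each internal node.
    """
    # Count, for every context of length 0..depth, how often a 0 or a 1 followed it.
    counts = defaultdict(lambda: [0, 0])
    for t in range(depth, len(bits)):
        for d in range(depth + 1):
            ctx = tuple(bits[t - d:t])  # the d most recent symbols, oldest first
            counts[ctx][bits[t]] += 1

    def weighted(ctx):
        # A context that never occurred contributes probability 1 (log 0.0).
        if ctx not in counts:
            return 0.0
        a, b = counts[ctx]
        estimate = kt_log_prob(a, b)
        if len(ctx) == depth:  # maximal-depth node: no further splitting
            return estimate
        # Children extend the context one symbol further into the past.
        children = weighted((0,) + ctx) + weighted((1,) + ctx)
        return log(0.5) + log_add(estimate, children)

    return weighted(())


def predict_next(bits, depth):
    """Predictive probability that the next symbol is 1 (recomputed from scratch)."""
    return exp(ctw_log_prob(bits + [1], depth) - ctw_log_prob(bits, depth))


if __name__ == "__main__":
    data = [0, 1] * 20  # a strictly alternating sequence ending in 1
    print(ctw_log_prob(data, depth=3))
    print(predict_next(data, depth=3))  # should be small: a 0 is expected next
```

For a fixed maximum depth, the counting pass and the recursion both take time proportional to the length of the data. `predict_next` recomputes the mixture from scratch for simplicity; a sequential implementation would instead update only the nodes along the current context path.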

Paper Details

Authors:
Lambros Mertzanis, Athina Panotopoulou, Maria Skoularidou
Submitted On:
24 June 2018 - 9:36am
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Ioannis Kontoyiannis
Document Year:
2018
Cite

Document Files

Kontoyianni_slides.pdf

(70 downloads)

[1] Lambros Mertzanis, Athina Panotopoulou, Maria Skoularidou, "Deep Tree Models for ‘Big’ Biological Data", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3320. Accessed: Nov. 16, 2018.
@article{3320-18,
url = {http://sigport.org/3320},
author = {Lambros Mertzanis and Athina Panotopoulou and Maria Skoularidou},
publisher = {IEEE SigPort},
title = {Deep Tree Models for ‘Big’ Biological Data},
year = {2018}
}