Summarization of Human Activity Videos Using a Salient Dictionary

Abstract: 

Video summarization has become more prominent during the last decade, due to the massive amount of available digital video content. A video summarization algorithm is typically fed an input video and expected to extract a set of important key-frames which represent the entire content, convey semantic meaning and are significantly more concise than the original input. The most widespread approach relies on video frame clustering and extraction of the frames closest to the cluster centroids as key-frames. Such a process, although efficient, offloads the burden of semantic scene content modelling exclusively to the employed video frame description/representation scheme, while summarization itself is approached simply as a distance-based data partitioning problem. This work focuses on videos depicting human activities (e.g., from surveillance feeds), which display an attractive property: each video frame can be seen as a linear combination of elementary visual words (i.e., basic activity components). This is exploited so as to identify the video frames containing only the elementary visual building blocks, which ideally form a set of independent basis vectors that can linearly reconstruct the entire video. In this manner, the semantic content of the scene is considered by the video summarization process itself. The above process is modulated by a traditional distance-based video frame saliency estimation, biasing towards more spread content coverage and outlier inclusion, under a joint optimization framework derived from the Column Subset Selection Problem (CSSP). The proposed algorithm results in a final key-frame set which acts as a salient dictionary for the input video. Empirical evaluation conducted on a publicly available dataset suggests that the presented method outperforms both a baseline clustering-based approach and a state-of-the-art sparse dictionary learning-based algorithm.
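To illustrate the general idea, the following is a minimal sketch of a greedy, CSSP-style key-frame selector that jointly scores frames by reconstruction usefulness and a simple distance-based saliency bias. This is not the authors' exact algorithm: the function name `greedy_keyframes`, the weighting parameter `alpha`, and the particular greedy deflation scheme are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def greedy_keyframes(X, k, alpha=0.5):
    """Greedy column-subset-style key-frame selection (illustrative sketch).

    X     : (d, n) matrix whose columns are frame feature vectors.
    k     : number of key-frames to select.
    alpha : hypothetical weight balancing reconstruction gain vs. saliency.
    Returns the indices of the selected frame columns.
    """
    d, n = X.shape
    selected = []
    R = X.copy()  # residual after projecting out already-selected columns
    for _ in range(k):
        # Proxy for reconstruction gain: residual norm of each column.
        norms = np.linalg.norm(R, axis=0)
        if selected:
            # Saliency bias: distance of each frame to its nearest
            # already-selected frame (favors spread coverage / outliers).
            S = X[:, selected]
            dists = np.min(
                np.linalg.norm(X[:, :, None] - S[:, None, :], axis=0),
                axis=1)
        else:
            dists = np.ones(n)
        score = alpha * norms + (1 - alpha) * dists
        score[selected] = -np.inf  # never re-select a frame
        j = int(np.argmax(score))
        selected.append(j)
        # Deflate the residual: remove the component along the chosen column.
        v = R[:, j]
        nv = np.linalg.norm(v)
        if nv > 1e-12:
            v = v / nv
            R = R - np.outer(v, v @ R)
    return selected
```

The selected columns act as the dictionary: every frame is approximated as a linear combination of these key-frames, while the `dists` term keeps the selection from collapsing onto near-duplicate frames.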


Paper Details

Authors:
Anastasios Tefas, Ioannis Pitas
Submitted On:
13 September 2017 - 11:18am
Type:
Presentation Slides
Presenter's Name:
Ioannis Mademlis
Paper Code:
2917
Document Year:
2017
Cite

Document Files

Key-frame extraction from human activity videos via salient dictionary learning-based video summarization



[1] Anastasios Tefas, Ioannis Pitas, "Summarization of Human Activity Videos Using a Salient Dictionary", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1976. Accessed: Sep. 24, 2017.