AUDIO FEATURE GENERATION FOR MISSING MODALITY PROBLEM IN VIDEO ACTION RECOGNITION

Abstract: 

Despite the recent success of multi-modal action recognition in videos, in practice some of the required data are often unavailable, especially when the input is multi-modal. For example, multi-modal action recognition needs both visual and audio data, yet audio tracks are easily lost because of corrupted files or device limitations. To cope with this missing-audio problem, we present an approach that simulates deep audio features from spatio-temporal visual data alone. We demonstrate that adding the simulated audio features can significantly assist the multi-modal action recognition task. Evaluating our method on the Moments in Time (MIT) dataset, we show that it performs favorably against the two-stream architecture, enabling a richer understanding of multi-modal action recognition in video.
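
The idea of filling in a missing audio modality from visual data can be illustrated with a small sketch. The abstract does not describe the generator itself, so the example below swaps in a simple nearest-neighbour lookup as a stand-in and is not the authors' network. The function names (`generate_audio_feature`, `fuse`), the feature sizes, and the toy values are all hypothetical.

```python
import math


def generate_audio_feature(visual, train_pairs):
    """Stand-in generator (assumption, not the paper's model): return the
    audio feature of the training clip whose visual feature is closest."""
    _, nearest_audio = min(train_pairs, key=lambda pair: math.dist(pair[0], visual))
    return list(nearest_audio)


def fuse(visual, audio, train_pairs):
    """Concatenate visual and audio features for a downstream classifier.
    When the audio track is missing (audio is None), use a generated feature."""
    if audio is None:
        audio = generate_audio_feature(visual, train_pairs)
    return list(visual) + list(audio)


# Toy (visual feature, audio feature) pairs from clips that have both modalities.
train_pairs = [
    ([1.0, 0.0], [0.9, 0.1, 0.0]),
    ([0.0, 1.0], [0.0, 0.2, 0.8]),
]

fused_missing = fuse([0.9, 0.2], None, train_pairs)             # audio track lost
fused_present = fuse([0.9, 0.2], [0.5, 0.5, 0.0], train_pairs)  # audio available
print(fused_missing)  # [0.9, 0.2, 0.9, 0.1, 0.0]
```

In either case the fused vector has the same length, so a single multi-modal classifier could consume clips with and without an audio track.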

Paper Details

Authors:
Hu-Cheng Lee, Chih-Yu Lin, Pin-Chun Hsu, Winston H. Hsu
Submitted On:
14 May 2019 - 5:08am
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
HU-CHENG LEE
Paper Code:
ICASSP19005

Document Files

20190516_AUDIO_FEATURE_GENERATION_FOR_MISSING_MODALITY_PROBLEM_IN_VIDEO_ACTION_RECOGNITION.pptx


[1] Hu-Cheng Lee, Chih-Yu Lin, Pin-Chun Hsu, Winston H. Hsu, "AUDIO FEATURE GENERATION FOR MISSING MODALITY PROBLEM IN VIDEO ACTION RECOGNITION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4504. Accessed: Jul. 20, 2019.