Action Recognition In RGB-D Egocentric Videos

Citation Author(s):
Yansong Tang, Yi Tian, Jiwen Lu, Jianjiang Feng, Jie Zhou
Submitted by:
Yansong Tang
Last updated:
6 September 2017 - 9:35am
Presenters Name:
Yansong Tang

In this paper, we investigate the problem of action recognition in RGB-D egocentric videos. These self-generated, embodied videos provide richer semantic cues for action recognition than conventional videos captured from a third-person view. Moreover, they contain both the appearance information of the scene, from the RGB modality, and its 3D structure, from the depth modality. Motivated by these advantages, we first collect a video-based RGB-D egocentric dataset (THU-READ) covering diverse types of daily-life actions. We then evaluate several approaches on THU-READ, including hand-crafted features and deep learning methods. To improve performance, we further develop a tri-stream convolutional network (TCNet), which learns to fuse the RGB and depth modalities for action recognition. Experimental results show that our model achieves competitive performance with state-of-the-art methods.
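The abstract does not detail how TCNet combines its streams. As a rough illustration of the general idea of fusing per-stream predictions, the sketch below implements simple late score fusion over three hypothetical streams (e.g. RGB appearance, depth, and motion); the stream choices, weights, and function names are assumptions for illustration, not the paper's specification.

```python
import math

def softmax(scores):
    # Convert raw per-class scores to probabilities (numerically stable).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_streams(stream_scores, weights=None):
    # Late fusion: weighted average of per-stream class probabilities.
    # stream_scores: one list of per-class scores per stream (e.g. RGB,
    # depth, motion -- illustrative assumptions, not the paper's design).
    n_streams = len(stream_scores)
    if weights is None:
        weights = [1.0 / n_streams] * n_streams  # equal weighting by default
    probs = [softmax(s) for s in stream_scores]
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs))
        for c in range(n_classes)
    ]

def predict(stream_scores, weights=None):
    # Predicted action = argmax of the fused class probabilities.
    fused = fuse_streams(stream_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)
```

For example, if the RGB and depth streams both favor class 2 while a motion stream favors class 0, the fused prediction follows the majority evidence and returns class 2.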

Dataset Files