Action Recognition In RGB-D Egocentric Videos
- Submitted by:
- Yansong Tang
- Last updated:
- 6 September 2017 - 9:35am
- Document Type:
- Poster
- Document Year:
- 2017
- Event:
- 2017 IEEE International Conference on Image Processing (ICIP)
- Presenters:
- Yansong Tang
- Paper Code:
- ICIP-2674
In this paper, we investigate the problem of action recognition in RGB-D egocentric videos. These self-generated, embodied videos provide richer semantic cues for action recognition than conventional videos captured from a third-person view. Moreover, they capture both the appearance and the 3D structure of the scene, through the RGB and depth modalities respectively. Motivated by these advantages, we first collect a video-based RGB-D egocentric dataset (THU-READ) covering diverse daily-life actions. We then evaluate several approaches on THU-READ, including hand-crafted features and deep learning methods. To improve performance, we further develop a tri-stream convolutional network (TCNet), which learns to fuse information from the RGB and depth modalities for action recognition.
Experimental results show that our model achieves performance competitive with state-of-the-art methods.
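The abstract does not spell out TCNet's architecture. As a minimal sketch only, the PyTorch snippet below shows one way a tri-stream network with late score fusion could be organized, assuming one stream each for RGB frames, stacked optical flow, and depth maps; the stream composition, layer sizes, and the 40-class output are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of a tri-stream convolutional network with late fusion.
# All layer sizes and the choice of score averaging are placeholders,
# not the TCNet design from the paper.
import torch
import torch.nn as nn


def make_stream(in_channels: int, num_classes: int) -> nn.Sequential:
    """One convolutional stream: small conv backbone plus a classifier head."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(64, num_classes),
    )


class TriStreamNet(nn.Module):
    """Three parallel streams (RGB, flow, depth) fused by averaging scores."""

    def __init__(self, num_classes: int = 40):
        super().__init__()
        self.rgb_stream = make_stream(3, num_classes)    # RGB frame
        self.flow_stream = make_stream(2, num_classes)   # x/y optical flow
        self.depth_stream = make_stream(1, num_classes)  # depth map

    def forward(self, rgb, flow, depth):
        # Late fusion: average the per-stream class scores.
        return (self.rgb_stream(rgb)
                + self.flow_stream(flow)
                + self.depth_stream(depth)) / 3.0


if __name__ == "__main__":
    net = TriStreamNet(num_classes=40)
    rgb = torch.randn(2, 3, 112, 112)
    flow = torch.randn(2, 2, 112, 112)
    depth = torch.randn(2, 1, 112, 112)
    print(net(rgb, flow, depth).shape)  # torch.Size([2, 40])
```

Averaging class scores is only one fusion choice; fusing intermediate feature maps instead would trade simplicity for the chance to learn cross-modal interactions earlier in the network.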