Action Recognition In RGB-D Egocentric Videos
- Submitted by:
- Yansong Tang
- Last updated:
- 6 September 2017 - 9:35am
- Document Type:
- Poster
- Document Year:
- 2017
- Event:
- 2017 IEEE International Conference on Image Processing (ICIP)
- Presenters:
- Yansong Tang
- Paper Code:
- ICIP-2674
In this paper, we investigate the problem of action recognition in RGB-D egocentric videos. These self-generated, embodied videos provide richer semantic cues for action recognition than conventional videos captured from a third-person view. Moreover, they capture both the appearance and the 3D structure of the scene, through the RGB and depth modalities respectively. Motivated by these advantages, we first collect a video-based RGB-D egocentric dataset (THU-READ) covering diverse daily-life actions. We then evaluate several approaches on THU-READ, including hand-crafted features and deep learning methods. To improve performance, we further develop a tri-stream convolutional network (TCNet), which learns to fuse information from the RGB and depth modalities for action recognition.
Experimental results show that our model achieves performance competitive with state-of-the-art methods.
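The abstract does not spell out TCNet's architecture. As a minimal sketch only, the PyTorch snippet below shows one way a tri-stream network with late score fusion could be organized, assuming one stream each for RGB frames, stacked optical flow, and depth maps; the stream composition, layer sizes, and the 40-class output are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of a tri-stream convolutional network with late fusion.
# All layer sizes and the choice of score averaging are placeholders,
# not the TCNet design from the paper.
import torch
import torch.nn as nn


def make_stream(in_channels: int, num_classes: int) -> nn.Sequential:
    """One convolutional stream: small conv backbone plus a classifier head."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(64, num_classes),
    )


class TriStreamNet(nn.Module):
    """Three parallel streams (RGB, flow, depth) fused by averaging scores."""

    def __init__(self, num_classes: int = 40):
        super().__init__()
        self.rgb_stream = make_stream(3, num_classes)    # RGB frame
        self.flow_stream = make_stream(2, num_classes)   # x/y optical flow
        self.depth_stream = make_stream(1, num_classes)  # depth map

    def forward(self, rgb, flow, depth):
        # Late fusion: average the per-stream class scores.
        return (self.rgb_stream(rgb)
                + self.flow_stream(flow)
                + self.depth_stream(depth)) / 3.0


if __name__ == "__main__":
    net = TriStreamNet(num_classes=40)
    rgb = torch.randn(2, 3, 112, 112)
    flow = torch.randn(2, 2, 112, 112)
    depth = torch.randn(2, 1, 112, 112)
    print(net(rgb, flow, depth).shape)  # torch.Size([2, 40])
```

Averaging class scores is only one fusion choice; fusing intermediate feature maps instead would trade simplicity for the chance to learn cross-modal interactions earlier in the network.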