LEARNING GEOMETRIC FEATURES WITH DUAL-STREAM CNN FOR 3D ACTION RECOGNITION

Citation Author(s):
Cam-Hao Hua, Nguyen Anh Tu, Dong-Seong Kim
Submitted by:
Thien Huynh-The
Last updated:
14 May 2020 - 10:53am
Document Type:
Presentation Slides
Document Year:
2020
Event:
Presenters:
Thien Huynh-The
Paper Code:
IVMSP-P5.5

Recently, owing to the beneficial properties of depth cameras, numerous 3D action recognition frameworks have learned high-level features by exploiting deep learning techniques; nevertheless, they cannot capture the meaningful characteristics of the static human pose and the dynamic motion of a whole action. This paper introduces a deep network configured as two parallel streams of convolutional stacks for fully learning deep intra-frame joint associations and inter-frame joint correlations, wherein the structure of each stream is adapted from Inception-v3. In the experiments, besides verifying compatibility with various backbone networks, the proposed approach achieves state-of-the-art performance against several deep learning-based methods on the updated NTU RGB+D 120 dataset.
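
To make the dual-stream idea concrete, the sketch below is an illustrative PyTorch reconstruction under stated assumptions, not the authors' released implementation: two Inception-v3 backbones process image-like encodings of intra-frame pose and inter-frame motion, and the per-stream class scores are fused by averaging. The class and variable names, the input encoding as 3x299x299 images, and the score-averaging fusion rule are all assumptions made for illustration.

# Illustrative sketch only (assumptions noted above), requires torch and torchvision.
import torch
import torch.nn as nn
from torchvision import models

class DualStreamCNN(nn.Module):
    def __init__(self, num_classes: int = 120):
        super().__init__()
        # Two independent Inception-v3 backbones, one per stream.
        self.pose_stream = models.inception_v3(weights=None, aux_logits=False, init_weights=True)
        self.motion_stream = models.inception_v3(weights=None, aux_logits=False, init_weights=True)
        # Replace the final classifiers so each stream emits class scores.
        self.pose_stream.fc = nn.Linear(self.pose_stream.fc.in_features, num_classes)
        self.motion_stream.fc = nn.Linear(self.motion_stream.fc.in_features, num_classes)

    def forward(self, pose_img: torch.Tensor, motion_img: torch.Tensor) -> torch.Tensor:
        # Each input is assumed to be a 3x299x299 image-like encoding of the skeleton sequence.
        pose_scores = self.pose_stream(pose_img)
        motion_scores = self.motion_stream(motion_img)
        # Late fusion by averaging the per-stream class scores (an assumption).
        return (pose_scores + motion_scores) / 2

if __name__ == "__main__":
    model = DualStreamCNN(num_classes=120)
    pose = torch.randn(2, 3, 299, 299)
    motion = torch.randn(2, 3, 299, 299)
    print(model(pose, motion).shape)  # torch.Size([2, 120])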


Comments

Question: What are the network parameters of the proposed method, compared with the existing methods (in the Method Comparison table in the slides)?
Answer: Thank you for your interesting question. In this paper, we introduce a dual-stream CNN framework for 3D human action recognition, in which each stream can be transfer-learned from a CNN backbone. For example, the table reports the accuracy obtained with different backbones, such as GoogleNet, ResNet, and DenseNet. The network parameters comprise the weights and biases of the kernels in the convolutional layers and of the hidden nodes in the fully connected layers. Among the CNN architectures in the comparison, VGG-19 is the heaviest one. We hope that our answer satisfies your query.
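
As context for the answer above, the following is a minimal sketch of how such parameter counts can be obtained with PyTorch/torchvision; the specific model variants (ResNet-50, DenseNet-121) are illustrative choices, and the exact numbers in the slide depend on the modified classification heads used in the paper.

# Minimal sketch (not from the slides): counting trainable weights and biases of candidate backbones.
import torch.nn as nn
from torchvision import models

def count_parameters(model: nn.Module) -> int:
    # Sum of all trainable parameters: conv kernels, FC weights, and biases.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

backbones = {
    "GoogLeNet": models.googlenet(weights=None, aux_logits=False, init_weights=True),
    "ResNet-50": models.resnet50(weights=None),
    "DenseNet-121": models.densenet121(weights=None),
    "VGG-19": models.vgg19(weights=None),
}

for name, net in backbones.items():
    print(f"{name}: {count_parameters(net) / 1e6:.1f}M parameters")
# VGG-19 is by far the heaviest (roughly 144M parameters), consistent with the answer above.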