3D POSE ESTIMATION FROM MONOCULAR VIDEO WITH CAMERA-BONE ANGLE REGULARIZATION ON THE IMAGE FEATURE

DOI: 10.60864/7zpg-qv60
Citation Author(s):
Submitted by: Asuka Ishii
Last updated: 6 June 2024 - 10:32am
Document Type: Poster
Document Year: 2024
Event: IEEE ICASSP 2024
Paper Code: IVMSP-P17.10

Categories: Image/Video Processing

In this paper, we propose a monocular 3D pose estimation method that explicitly takes into account the angles between the camera optical axis and the bones (camera-bone angles) as well as temporal information. The proposed method combines a 2D-to-3D method, which predicts a 3D pose from a sequence of 2D poses, with a convolutional neural network (CNN), and includes a novel regularization loss that enables the CNN to extract camera-bone-angle information. The camera-bone-angle and temporal information reduce the ambiguity inherent in 2D-to-3D methods, where the same 2D pose can be mapped to multiple 3D poses. Experiments on the Human3.6M and MPI-INF-3DHP datasets showed that the proposed method reduced the mean per joint position error (MPJPE) by 5.1 mm and 2.1 mm, respectively.
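To make the two quantities in the abstract concrete, the following is a minimal Python sketch of MPJPE and of a camera-bone angle. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact angle definition, so the sketch assumes joints are expressed in camera coordinates, that the optical axis is the +z direction, and that a bone is the vector from a parent joint to a child joint. The function names `mpjpe` and `camera_bone_angles` and the two-joint skeleton are hypothetical.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per joint position error: mean Euclidean distance over joints.

    The result is in the same unit as the inputs (mm in the abstract).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def camera_bone_angles(joints, bones, optical_axis=(0.0, 0.0, 1.0)):
    """Angle (radians) between each bone vector and the camera optical axis.

    Assumptions (not taken from the paper): `joints` is a (J, 3) array in
    camera coordinates, `bones` is a list of (parent, child) index pairs,
    and no bone has zero length.
    """
    joints = np.asarray(joints, dtype=float)
    axis = np.asarray(optical_axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    parents = [p for p, _ in bones]
    children = [c for _, c in bones]
    vec = joints[children] - joints[parents]          # (B, 3) bone vectors
    cos = (vec @ axis) / np.linalg.norm(vec, axis=-1)
    return np.arccos(np.clip(cos, -1.0, 1.0))


# Two hypothetical single-bone poses with identical x and y coordinates:
# one bone points away from the camera, the other towards it.
bones = [(0, 1)]
pose_a = np.array([[0.0, 0.0, 3.0], [1.0, 0.0, 4.0]])
pose_b = np.array([[0.0, 0.0, 3.0], [1.0, 0.0, 2.0]])

angles_a = camera_bone_angles(pose_a, bones)
angles_b = camera_bone_angles(pose_b, bones)
error_ab = mpjpe(pose_a, pose_b)
```

In this toy case the two poses have identical x and y coordinates, so they coincide when depth is dropped (an orthographic simplification), yet their camera-bone angles differ (45 versus 135 degrees). This is the kind of 2D-to-3D ambiguity that the abstract says camera-bone-angle information helps to suppress.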

20240402_poster_v2.pdf
