FUSION OF INDEPENDENT AND INTERACTIVE FEATURES FOR HUMAN-OBJECT INTERACTION DETECTION
- Submitted by:
- Zehai Wu
- Last updated:
- 10 November 2024 - 10:29am
- Document Type:
- Presentation Slides
Human-Object Interaction (HOI) detection aims to localize humans and objects that engage in interactive behavior in an image and to predict the interactions between them, a task of great significance for semantic understanding. Existing works primarily focus on the fine-grained semantic features of humans and objects and on the spatial relationships between them, but they do not exploit the contextual information within the interaction area, which can be valuable for predicting interaction behavior. To investigate the impact of such contextual information on behavior prediction, we propose a novel approach that extracts both independent and interactive features and fuses them. Specifically, our method extracts interaction features from the interaction region, merges them with the fine-grained independent features of humans and objects, and uses the fused features to predict the interaction behavior. Moreover, the feature fusion module adds no extra storage or computation cost to our method. Experiments demonstrate the effectiveness of our method, which achieves state-of-the-art performance on two benchmark HOI datasets, HICO-DET and V-COCO.
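To illustrate how a fusion step can avoid extra storage and computation, the minimal sketch below combines fixed-size human, object, and interaction-region feature vectors by parameter-free element-wise addition before a downstream verb classifier. The function name, the choice of addition, and the toy feature dimensions are assumptions for illustration only; the slides' actual fusion module is not reproduced here.

```python
import numpy as np

def fuse_features(human_feat, object_feat, interaction_feat):
    """Parameter-free fusion of independent and interactive features.

    Element-wise addition introduces no learned weights, so it adds
    no storage cost and only negligible computation. This is an
    illustrative assumption, not the paper's exact fusion module.
    """
    assert human_feat.shape == object_feat.shape == interaction_feat.shape
    return human_feat + object_feat + interaction_feat

# Toy 4-d feature vectors for one human-object pair (hypothetical values).
h = np.array([0.2, 0.1, 0.0, 0.5])  # independent human features
o = np.array([0.1, 0.3, 0.2, 0.0])  # independent object features
i = np.array([0.0, 0.1, 0.1, 0.1])  # contextual interaction-region features

fused = fuse_features(h, o, i)  # same shape as the inputs; fed to the
                                # interaction (verb) classifier
```

Because the fused vector keeps the dimensionality of its inputs, any classifier head already sized for the independent features can consume it unchanged.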