Multi-Layer Content Interaction Through Quaternion Product for Visual Question Answering

Abstract: 

Multi-modality fusion technologies have greatly improved the performance of neural network-based Video Description/Captioning, Visual Question Answering (VQA), and Audio Visual Scene-aware Dialog (AVSD) in recent years. Most previous approaches explore only the last layer of multi-layer feature fusion and overlook the importance of intermediate layers. To address this, we propose an efficient Quaternion Block Network (QBN) that learns interactions not only for the last layer but for all intermediate layers simultaneously. In the proposed QBN, holistic text features guide the update of visual features. Meanwhile, Hamilton quaternion products efficiently carry information from higher layers to lower layers for both the visual and text modalities. Evaluation results show that QBN improves performance on VQA 2.0 and surpasses approaches that use large-scale BERT or visual BERT pre-trained models. An extensive ablation study examines the influence of each proposed module.
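
For readers unfamiliar with the Hamilton product mentioned in the abstract, the following is a minimal sketch of the standard quaternion multiplication rule in plain Python. It is not the authors' QBN implementation: the function name `hamilton_product` and the 4-tuple representation `(r, i, j, k)` are assumptions made for illustration, and in a quaternion-based network each component would typically be a feature tensor rather than a scalar.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples.

    Illustrative helper (hypothetical name); implements the textbook rule
    i*i = j*j = k*k = i*j*k = -1, not any specific layer from the paper.
    """
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis quaternions, used to check the defining identities.
i = (0, 1, 0, 0)
j = (0, 0, 1, 0)

print(hamilton_product(i, j))  # (0, 0, 0, 1), i.e. i*j = k
print(hamilton_product(j, i))  # (0, 0, 0, -1), i.e. j*i = -k
```

Every output component mixes all four components of both inputs, and the product is not commutative, which is why the order of the two operands matters.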

Paper Details

Authors:
Lei Shi, Shijie Geng, Kai Shuang, Chiori Hori, Songxiang Liu, Peng Gao, Sen Su
Submitted On:
16 May 2020 - 1:27am
Short Link:
Type:
Poster
Event:
Paper Code:
MMSP-P1.5 [5942]
Document Year:
2020

Document Files

Icassp2020_Multi-Layer_Content_Interaction_Through_Quaternion_Product_for_Visual_Question_Answering


[1] Lei Shi, Shijie Geng, Kai Shuang, Chiori Hori, Songxiang Liu, Peng Gao, Sen Su, "Multi-Layer Content Interaction Through Quaternion Product for Visual Question Answering", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5370. Accessed: Jul. 15, 2020.