S2E: Towards an End-to-End Entity Resolution Solution from Acoustic Signal

Citation Author(s):
Kangrui Ruan, Xin He, Jiyang Wang, Xiaozhou Zhou, Helian Feng, Ali Kebarighotbi
Submitted by:
Ali Kebarighotbi
Last updated:
4 April 2024 - 11:53am
Document Type:
Poster
Document Year:
2024
Event:
Presenters:
Ali Kebarighotbi
Paper Code:
SLP-P33.7
 

Traditional cascading Entity Resolution (ER) pipelines suffer from errors propagated from upstream tasks. We address this issue by formulating a new end-to-end (E2E) ER problem, Signal-to-Entity (S2E), which resolves query entity mentions to actionable entities in textual catalogs directly from audio queries rather than from audio transcriptions in raw or parsed format. Additionally, we extend the E2E Spoken Language Understanding framework by introducing a novel dimension to ER research. We adapt three public datasets for the S2E task and propose a novel solution that aligns the multimodal signals via an effective retrieval co-attention mechanism and refined multimodal objectives. Despite being 42% smaller in total model size, the proposed design outperforms the cascading baseline by 2.6%, 47.0%, and 73.3% on the three datasets, respectively, which differ in acoustic conditions.
