Sorry, you need to enable JavaScript to visit this website.

Most End-to-End (E2E) Spoken Language Understanding (SLU) networks leverage the pre-trained Automatic Speech Recognition (ASR) networks but still lack the capability to understand the semantics of utterances, crucial for the SLU task. To solve this, recently proposed studies use pre-trained Natural Language Understanding (NLU) networks. However, it is not trivial to fully utilize both pre-trained networks; many solutions were proposed, such as Knowledge Distillation (KD), cross-modal shared embedding, and network integration with Interface.

Categories:
5 Views