MultiWay-Adapter: Adapting Multimodal Large Language Models for scalable image-text retrieval

DOI: 10.60864/s4hn-jg87
Citation Author(s): George Killick, Richard McCreadie, Gerardo Aragon Camarasa
Submitted by: ZIJUN LONG
Last updated: 6 June 2024 - 10:27am
Document Type: Poster
Document Year: 2024
Event:
Presenters: Zijun Long
Paper Code: MLSP-P5.1

As Multimodal Large Language Models (MLLMs) grow in size, adapting them to specialized tasks becomes increasingly challenging due to high computational and memory demands. While efficient adaptation methods exist, in practice they suffer from shallow inter-modal alignment, which severely hurts model effectiveness. To tackle these challenges, we introduce the MultiWay-Adapter (MWA), which deepens inter-modal alignment, enabling high transferability with minimal tuning effort.
