MultiWay-Adapter: Adapting Multimodal Large Language Models for scalable image-text retrieval
- DOI: 10.60864/s4hn-jg87
- Citation Author(s):
- Submitted by: ZIJUN LONG
- Last updated: 6 June 2024 - 10:27am
- Document Type: Poster
- Document Year: 2024
- Event:
- Presenters: Zijun Long
- Paper Code: MLSP-P5.1
- Categories:
As Multimodal Large Language Models (MLLMs) grow in size, adapting them to specialized tasks becomes increasingly challenging due to high computational and memory demands. While efficient adaptation methods exist, in practice they suffer from shallow inter-modal alignment, which severely hurts model effectiveness. To tackle these challenges, we introduce the MultiWay-Adapter (MWA), which deepens inter-modal alignment, enabling high transferability with minimal tuning effort.