Automatic Audio Captioning and Retrieval

On Negative Sampling for Contrastive Audio-Text Retrieval


This paper investigates negative sampling for contrastive learning in audio-text retrieval. Negative sampling here means selecting negatives (either audio clips or textual descriptions) from a pool of candidates for a given positive audio-text pair. We explore sampling strategies based on model-estimated within-modality and cross-modality relevance scores for audio and text samples. Keeping the training setting of the retrieval system from [1] fixed, we study eight sampling strategies, including hard and semi-hard negative sampling.
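As a rough illustration of score-based negative selection, the sketch below picks one negative for a single anchor from a pool of candidates. It is not the paper's implementation: the function name `sample_negative`, the strategy labels, and the definitions used here are assumptions made for the example. In particular, "hard" is taken to mean the highest-scoring negative, and "semi-hard" the highest-scoring negative that still scores below the positive, which is one common convention. The `scores` array stands in for whatever relevance estimate is used, within-modality or cross-modality.

```python
import numpy as np


def sample_negative(scores, pos_idx, strategy="random", rng=None):
    """Pick one negative index for a single anchor (hypothetical helper).

    scores   : 1-D array of relevance scores between the anchor and each candidate
    pos_idx  : index of the positive candidate in `scores`
    strategy : "random", "hard", or "semi_hard" (labels assumed for this sketch)
    """
    if rng is None:
        rng = np.random.default_rng(0)
    scores = np.asarray(scores, dtype=float)

    # Every candidate except the positive is a potential negative.
    candidates = np.arange(scores.size)
    negatives = candidates[candidates != pos_idx]

    if strategy == "random":
        return int(rng.choice(negatives))

    if strategy == "hard":
        # Assumed definition: the negative the model scores as most relevant.
        return int(negatives[np.argmax(scores[negatives])])

    if strategy == "semi_hard":
        # Assumed definition: the most relevant negative that still
        # scores below the positive.
        below = negatives[scores[negatives] < scores[pos_idx]]
        if below.size == 0:
            # No negative scores below the positive: fall back to the hardest.
            return int(negatives[np.argmax(scores[negatives])])
        return int(below[np.argmax(scores[below])])

    raise ValueError(f"unknown strategy: {strategy}")


# Toy pool of four candidates; candidate 1 is the positive.
toy_scores = [0.5, 0.7, 0.95, 0.2]
hard_pick = sample_negative(toy_scores, pos_idx=1, strategy="hard")
semi_hard_pick = sample_negative(toy_scores, pos_idx=1, strategy="semi_hard")
random_pick = sample_negative(toy_scores, pos_idx=1, strategy="random")
```

In the toy pool, the hard strategy selects the candidate that outscores the positive, while the semi-hard strategy selects the closest candidate below it. In batch training the same selection would be applied per anchor, in both the audio-to-text and text-to-audio directions.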

ICASSP2023_5376_slides.pdf

ICASSP2023 Presentation Slides (218)

Categories: Other
