Crowdsourced Pairwise-Comparison for Source Separation Evaluation

Citation Author(s):
Mark Cartwright, Bryan Pardo, Gautham Mysore
Submitted by:
Mark Cartwright
Last updated:
14 April 2018 - 5:27pm
Document Type:
Poster
Document Year:
2018
Event:
Presenters:
Mark Cartwright
Paper Code:
AASP-P9.4
 

Automated objective methods of audio source separation evaluation are fast, cheap, and require little effort from the investigator. However, their output often correlates poorly with human quality assessments, and they typically require ground-truth (perfectly separated) signals to evaluate algorithm performance. Subjective multi-stimulus human ratings (e.g., MUSHRA) of audio quality are the gold standard for many tasks, but they are slow and require a great deal of effort to recruit participants and run listening tests. Recent work has shown that a crowdsourced multi-stimulus listening test can produce results comparable to those of lab-based multi-stimulus tests. While these results are encouraging, MUSHRA multi-stimulus tests are limited to evaluating 12 or fewer stimuli, and they require ground-truth stimuli for reference. In this work, we evaluate a web-based pairwise-comparison listening approach that promises to make listening tests faster and easier to conduct, while also addressing some of the shortcomings of multi-stimulus tests. Using audio source separation quality as our evaluation task, we compare our web-based pairwise-comparison listening test to both web-based and lab-based multi-stimulus tests. We find that pairwise-comparison listening tests perform comparably to multi-stimulus tests, but without many of their shortcomings.
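
As an illustration of how pairwise-comparison responses can be turned into per-stimulus quality scores, the sketch below fits a Bradley-Terry model with a simple iterative update. The abstract does not say which aggregation model the authors used, so the choice of Bradley-Terry, the function name `bradley_terry`, and the `(winner, loser)` input format are all assumptions made for this example only.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Hypothetical helper for illustration; not the authors' analysis code.
    Returns a dict mapping each item to a strength, normalized to sum to 1.
    """
    wins = defaultdict(float)         # total wins per item
    pair_counts = defaultdict(float)  # number of comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))
    items = sorted(items)

    # Start from uniform strengths.
    p = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        new_p = {}
        for i in items:
            # Sum over every opponent j that item i was actually compared with.
            denom = sum(
                pair_counts[frozenset((i, j))] / (p[i] + p[j])
                for j in items
                if j != i and frozenset((i, j)) in pair_counts
            )
            new_p[i] = wins[i] / denom if denom > 0 else p[i]
        total = sum(new_p.values())
        new_p = {i: v / total for i, v in new_p.items()}
        converged = max(abs(new_p[i] - p[i]) for i in items) < tol
        p = new_p
        if converged:
            break
    return p


if __name__ == "__main__":
    # Made-up responses: each tuple is (preferred stimulus, other stimulus).
    responses = [("A", "B")] * 3 + [("B", "A")]
    print(bradley_terry(responses))
```

A higher strength means the stimulus was preferred more often relative to the stimuli it was compared against. An item that never wins receives a strength of zero under this plain update, so real analyses often add some form of smoothing.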
