Crowdsourced Multilingual Speech Intelligibility Testing
- DOI: 10.60864/mbd0-q066
- Submitted by: Nerio Moran
- Last updated: 6 June 2024 - 10:23am
- Document Type: Demo
- Document Year: 2024
- Presenters: Nerio Moran, Ginette Prato, Miguel Plaza, Shirley Pestana, Daniel Arismendi, Jose Kordahi, Cyprian Wronka, Laura Lechler, Kamil Wojcicki
Advances in generative algorithms promise new heights in what can be achieved in, for example, the speech enhancement domain. Beyond the ubiquitous noise reduction, destroyed speech components can now be restored, something not previously achievable. These advances create both opportunities and risks, as speech intelligibility can be affected in a multitude of beneficial and detrimental ways. There is therefore a need for methods, materials and tools that enable rapid and effective assessment of speech intelligibility. Yet well-established laboratory-based measures are costly and do not scale well. Furthermore, publicly available multilingual test materials with associated software are lacking.

The 2024 ICASSP paper #9588, “Crowdsourced multilingual speech intelligibility testing”, aims to address some of these challenges. Its contributions include a public release of multilingual recordings and software for test survey creation. While the novelty of our approach rests primarily on the adaptation to crowdsourcing, this by no means limits its applicability to in-laboratory environments.

The proposed “Show and Tell” aims to demonstrate these contributions to the speech research community at the 2024 ICASSP conference. It will include an overview of the test method, the public test data release and the open-source test software, along with a demonstration of a test setup. Attendees will have the opportunity to take a short version of the test using noise-cancelling headphones and hence experience the test first-hand. We note that the demo and associated releases will be useful for evaluating, for example, speech enhancement and neural codec models, as well as other technologies, including those that produce or require their own audio recordings, such as text-to-speech, voice conversion, or accent correction systems. As such, the proposed “Show and Tell” will be of broad interest to the conference audience.