Selective hearing (SH) refers to a listener's ability to focus attention on a specific sound source, or a group of sound sources, in the auditory scene. This in turn implies that attention to sources of no interest is minimized.
This paper describes the current landscape of machine listening research and outlines ways in which these technologies can be leveraged to achieve SH by computational means.

In this paper, we propose novel deep-learning-based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment using multiple microphone arrays. To this end, we use an encoder-decoder architecture and propose two improvements to it to accomplish the task. In addition, we propose two novel localization representations that increase localization accuracy.
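
A minimal PyTorch sketch of the encoder-decoder regression idea is given below. The feature and layer sizes (n_feats, hidden, max_sources) are hypothetical, and the paper's two architectural improvements and its novel localization representations are not reproduced here.

```python
# Sketch: encoder-decoder network regressing 2D Cartesian source coordinates.
# All dimensions are illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class Seq2CoordNet(nn.Module):
    def __init__(self, n_feats=512, hidden=256, max_sources=2):
        super().__init__()
        # Encoder: compress features from the microphone arrays into a latent code.
        self.encoder = nn.Sequential(
            nn.Linear(n_feats, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Decoder: map the latent code to an (x, y) pair per source.
        self.decoder = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * max_sources),
        )
        self.max_sources = max_sources

    def forward(self, feats):
        coords = self.decoder(self.encoder(feats))
        return coords.view(-1, self.max_sources, 2)  # Cartesian (x, y) per source

model = Seq2CoordNet()
feats = torch.randn(8, 512)                 # batch of multi-array feature vectors
pred = model(feats)                         # shape (8, 2, 2): coordinates per source
loss = nn.functional.mse_loss(pred, torch.zeros_like(pred))  # placeholder target
```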

A differential acoustic OFDM technique is presented to embed data imperceptibly in existing music. The method allows music containing the data to be played back over a loudspeaker without listeners noticing the embedded data channel. Using a microphone, the data can be recovered from the recording. Experiments with smartphone microphones show that transmission distances of 24 meters are possible, while achieving bit error ratios of less than 10 percent, depending on the environment.
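
Below is a minimal NumPy sketch of the differential-phase idea behind such a scheme: data bits advance the phase of selected subcarriers from one OFDM symbol to the next, and the receiver recovers them from phase differences between consecutive symbols. N_FFT, data_band, and embed_gain are illustrative placeholders; the paper's subcarrier layout, psychoacoustic shaping, and synchronisation are not reproduced.

```python
# Sketch: differential phase encoding/decoding across consecutive OFDM symbols.
# Parameters are illustrative assumptions, not the paper's design. The check at
# the end is a noiseless loopback, without music or an acoustic channel.
import numpy as np

N_FFT = 1024                     # rfft bins per OFDM symbol (illustrative)
data_band = np.arange(200, 264)  # subcarriers carrying data (illustrative)
embed_gain = 0.01                # keeps the embedded signal quiet (illustrative)

def modulate(bits, n_symbols):
    """Map bits to phase steps (0 or pi) between consecutive OFDM symbols."""
    bits = bits.reshape(n_symbols, len(data_band))
    phases = np.cumsum(np.pi * bits, axis=0)          # differential accumulation
    spectrum = np.zeros((n_symbols, N_FFT), dtype=complex)
    spectrum[:, data_band] = np.exp(1j * phases)
    time = np.fft.irfft(spectrum, n=2 * (N_FFT - 1), axis=1)  # real signal
    return embed_gain * time.ravel()

def demodulate(recording, n_symbols):
    """Recover bits from phase differences between consecutive symbols."""
    frames = recording[: n_symbols * 2 * (N_FFT - 1)].reshape(n_symbols, -1)
    spectrum = np.fft.rfft(frames, axis=1)[:, data_band]
    diff = np.angle(spectrum[1:] * np.conj(spectrum[:-1]))
    steps = np.vstack([np.angle(spectrum[:1]), diff])
    return (np.abs(steps) > np.pi / 2).astype(int).ravel()

bits = np.random.randint(0, 2, size=8 * len(data_band))
tx = modulate(bits, n_symbols=8)
assert np.array_equal(bits, demodulate(tx / embed_gain, n_symbols=8))
```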

The intelligibility of speech in noise can be improved by modifying the speech. With object-based audio, however, there is the possibility of altering the background sound while leaving the speech unaltered. This may prove a less intrusive approach, affording good speech intelligibility without overly compromising the perceived sound quality.

Automobiles have become an essential part of everyday life. In this work, we attempt to make them smarter by introducing the idea of in-car driver authentication using wireless sensing. Our aim is to develop a model that can recognize drivers automatically. First, we address the problem of changing in-car environments, where existing wireless-sensing-based human identification systems fail. To this end, we build the first in-car driver radio biometric dataset to understand the effect of changing environments on human radio biometrics.

This paper introduces wav2letter++, a fast open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for maximum efficiency. Here we explain the architecture and design of the wav2letter++ system and compare it to other major open-source speech recognition systems. In some cases wav2letter++ is more than 2x faster than other optimized frameworks for training end-to-end neural networks for speech recognition.

We propose a novel adversarial speaker adaptation (ASA) scheme, in which adversarial learning is applied to regularize the distribution of deep hidden features in a speaker-dependent (SD) deep neural network (DNN) acoustic model to be close to that of a fixed speaker-independent (SI) DNN acoustic model during adaptation. An additional discriminator network is introduced to distinguish the deep features generated by the SD model from those produced by the SI model.
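
A minimal PyTorch sketch of this adversarial regularization pattern follows. The module names and sizes (sd_model, si_model, senone_head, discriminator) are hypothetical; the sketch only illustrates the alternating discriminator and SD-model updates, not the paper's exact networks, losses, or schedule.

```python
# Sketch: discriminator distinguishes SD deep features from SI deep features;
# the SD model is trained to fool it while still classifying senones.
import torch
import torch.nn as nn

feat_dim, n_senones = 256, 1000

sd_model = nn.Sequential(nn.Linear(40, feat_dim), nn.ReLU())  # adapted, trainable
si_model = nn.Sequential(nn.Linear(40, feat_dim), nn.ReLU())  # fixed SI reference
si_model.load_state_dict(sd_model.state_dict())               # both start from the same weights
for p in si_model.parameters():
    p.requires_grad_(False)

senone_head = nn.Linear(feat_dim, n_senones)
discriminator = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_sd = torch.optim.Adam(list(sd_model.parameters()) + list(senone_head.parameters()), lr=1e-4)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(32, 40)                   # adaptation frames (illustrative features)
y = torch.randint(0, n_senones, (32,))    # senone targets (illustrative)

# Discriminator step: learn to tell SD deep features apart from SI deep features.
f_sd, f_si = sd_model(x), si_model(x)
d_loss = bce(discriminator(f_sd.detach()), torch.ones(32, 1)) + \
         bce(discriminator(f_si), torch.zeros(32, 1))
opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

# SD step: classify senones while fooling the discriminator, which pulls the
# SD feature distribution toward that of the fixed SI model.
f_sd = sd_model(x)
sd_loss = nn.functional.cross_entropy(senone_head(f_sd), y) + \
          bce(discriminator(f_sd), torch.zeros(32, 1))
opt_sd.zero_grad(); sd_loss.backward(); opt_sd.step()
```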

Adversarial domain-invariant training (ADIT) has proven effective in suppressing the effects of domain variability in acoustic modeling and has led to improved performance in automatic speech recognition (ASR). In ADIT, an auxiliary domain classifier takes in equally-weighted deep features from a deep neural network (DNN) acoustic model and is trained to improve their domain-invariance by optimizing an adversarial loss function.
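
The sketch below shows a gradient reversal layer, a standard way to implement this kind of adversarial training in PyTorch. The module names and the weight lam are hypothetical and only illustrate the mechanism, not the paper's configuration.

```python
# Sketch: gradient reversal layer plus a domain classifier head. The classifier
# learns to predict the domain, while the reversed gradient pushes the acoustic
# model to produce deep features the classifier cannot separate by domain.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lam backwards."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainHead(nn.Module):
    def __init__(self, feat_dim=256, n_domains=4, lam=1.0):
        super().__init__()
        self.lam = lam
        self.clf = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_domains))

    def forward(self, deep_feats):
        return self.clf(GradReverse.apply(deep_feats, self.lam))

head = DomainHead()
feats = torch.randn(32, 256, requires_grad=True)   # deep features from the acoustic model
domain_loss = nn.functional.cross_entropy(head(feats), torch.randint(0, 4, (32,)))
domain_loss.backward()                             # feats.grad now carries the reversed gradient
```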

The use of deep networks to extract embeddings for speaker recognition has proven successful. However, such embeddings are susceptible to performance degradation due to mismatches among the training, enrollment, and test conditions. In this work, we propose an adversarial speaker verification (ASV) scheme to learn condition-invariant deep embeddings via adversarial multi-task training. In ASV, a speaker classification network and a condition identification network are jointly optimized to minimize the speaker classification loss and simultaneously mini-maximize the condition loss.
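
A minimal PyTorch sketch of the mini-max objective follows, with hypothetical networks and an illustrative trade-off weight alpha: the condition classifier minimizes its loss on detached embeddings, while the embedding network minimizes the speaker loss and maximizes the condition loss. The paper's actual architectures and optimization details are not reproduced.

```python
# Sketch: adversarial multi-task training of a condition-invariant embedding.
# All sizes and the weight alpha are illustrative assumptions.
import torch
import torch.nn as nn

emb_net = nn.Sequential(nn.Linear(40, 256), nn.ReLU(), nn.Linear(256, 128))
spk_head = nn.Linear(128, 500)   # number of training speakers (illustrative)
cond_head = nn.Linear(128, 3)    # e.g. noise/channel conditions (illustrative)
alpha = 0.5                      # adversarial trade-off weight (illustrative)

opt_emb = torch.optim.Adam(list(emb_net.parameters()) + list(spk_head.parameters()), lr=1e-4)
opt_cond = torch.optim.Adam(cond_head.parameters(), lr=1e-4)

x = torch.randn(32, 40)                   # acoustic frames (illustrative features)
spk = torch.randint(0, 500, (32,))        # speaker labels
cond = torch.randint(0, 3, (32,))         # condition labels

e = emb_net(x)

# Condition classifier: minimize the condition loss on detached embeddings.
cond_loss = nn.functional.cross_entropy(cond_head(e.detach()), cond)
opt_cond.zero_grad(); cond_loss.backward(); opt_cond.step()

# Embedding network: minimize the speaker loss while maximizing the condition
# loss, so the embedding keeps speaker identity but discards condition cues.
emb_loss = nn.functional.cross_entropy(spk_head(e), spk) \
           - alpha * nn.functional.cross_entropy(cond_head(e), cond)
opt_emb.zero_grad(); emb_loss.backward(); opt_emb.step()
```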
