ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

We present pyroomacoustics, a software package aimed at the rapid development and testing of audio array processing algorithms.

A user-implemented privacy-preservation mechanism is proposed for the online gradient descent (OGD) algorithm. Privacy is measured through the information leakage, quantified by the mutual information between the users’ outputs and the learner’s inputs. The proposed input perturbation mechanism can be implemented by individual users with space and time complexity independent of the horizon T. For the proposed mechanism, the information leakage is shown to be bounded by the Gaussian channel capacity in the full information setting.
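As a rough illustration of input perturbation in OGD, the sketch below has each user add noise to a data point locally before the learner takes a gradient step. The abstract does not specify the noise distribution, loss, or step size, so the additive Gaussian noise, the squared loss, the eta/sqrt(t) step size, and all function names here are assumptions made for illustration only, not the authors' mechanism. The capacity helper is the standard Gaussian channel formula, 0.5 * log2(1 + P/N) bits per use.

```python
import math
import random


def perturb(data, sigma, rng):
    """User-side step (assumed): add independent zero-mean Gaussian noise to each entry."""
    return [v + rng.gauss(0.0, sigma) for v in data]


def squared_loss_grad(w, x, y):
    """Gradient of the squared loss (w.x - y)^2 with respect to w."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [2.0 * err * xi for xi in x]


def ogd_step(w, grad, lr):
    """One online gradient descent update: w <- w - lr * grad."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def gaussian_capacity_bits(signal_power, noise_power):
    """Capacity of a Gaussian channel in bits per channel use."""
    return 0.5 * math.log2(1.0 + signal_power / noise_power)


def run_ogd(w_true, rounds, sigma, eta=0.2, seed=0):
    """Simulate OGD where the learner only ever sees perturbed data points."""
    rng = random.Random(seed)
    w = [0.0] * len(w_true)
    for t in range(1, rounds + 1):
        # Hypothetical user data: features in [-1, 1], noiseless linear label.
        x = [rng.uniform(-1.0, 1.0) for _ in w_true]
        y = sum(wi * xi for wi, xi in zip(w_true, x))
        # The user perturbs (x, y) locally; fresh noise is drawn each round,
        # so the user keeps no state that grows with the number of rounds.
        noisy = perturb(x + [y], sigma, rng)
        x_noisy, y_noisy = noisy[:-1], noisy[-1]
        grad = squared_loss_grad(w, x_noisy, y_noisy)
        w = ogd_step(w, grad, eta / math.sqrt(t))
    return w
```

Calling `run_ogd([1.0, -2.0], rounds=5000, sigma=0.0)` recovers the true weights closely, while a larger `sigma` trades accuracy for less informative user outputs; `gaussian_capacity_bits` gives the per-use bound for a chosen signal-to-noise ratio.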

In this paper we introduce a speaker verification system deployed on mobile devices that can be used to personalise a keyword spotter. We describe a baseline DNN system that maps an utterance to a speaker embedding, which is used to measure speaker differences via cosine similarity. We then introduce an architectural modification which uses an LSTM system where the parameters are optimised via a curriculum learning procedure to reduce the detection error and improve its generalisability across various conditions.
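The cosine-similarity scoring step mentioned above can be sketched as follows. The embeddings here are plain lists of floats standing in for the output of the DNN or LSTM model, and the function names and the 0.8 decision threshold are hypothetical; the abstract does not state how the threshold is chosen.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def same_speaker(enrolled, test, threshold=0.8):
    """Accept the test utterance if its embedding is close enough to the enrolled one.

    The threshold is an illustrative placeholder; in practice it would be
    tuned on held-out data to balance false accepts and false rejects.
    """
    return cosine_similarity(enrolled, test) >= threshold
```

Because cosine similarity ignores vector length, two embeddings that point in the same direction score 1.0 regardless of scale, and orthogonal embeddings score 0.0.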

In this paper, we propose a speaker-independent multi-speaker monaural speech separation system (CBLDNN-GAT) based on a convolutional, bidirectional long short-term memory, deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT). Our system aims to improve speech quality rather than only minimize the mean square error (MSE). In the initial phase, we use log-mel filterbank and pitch features to warm up our CBLDNN in a multi-task manner.
