Audio and Acoustic Signal Processing

A-CRNN: A DOMAIN ADAPTATION MODEL FOR SOUND EVENT DETECTION


This paper presents a domain adaptation model for sound event detection. A common challenge for sound event detection is how to deal with the mismatch among different datasets. Typically, a model's performance degrades when it is tested on a dataset that differs from the one it was trained on. To address this problem, we propose an adapted CRNN (A-CRNN), built on convolutional recurrent neural networks (CRNNs), as an unsupervised adversarial domain adaptation model for sound event detection.

Paper Details

Authors:
Wei Wei, Hongning Zhu, Emmanouil Benetos, Ye Wang
Submitted On:
11 May 2020 - 1:21am

Document Files

Presentation slides

(18)

[1] Wei Wei, Hongning Zhu, Emmanouil Benetos, Ye Wang, "A-CRNN: A DOMAIN ADAPTATION MODEL FOR SOUND EVENT DETECTION", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5121. Accessed: Jul. 05, 2020.

A Sequence Matching Network for Polyphonic Sound Event Localization and Detection


Polyphonic sound event detection and direction-of-arrival estimation require different input features from audio signals. While sound event detection mainly relies on time-frequency patterns, direction-of-arrival estimation relies on magnitude or phase differences between microphones. Previous approaches use the same input features for sound event detection and direction-of-arrival estimation, and train the two tasks jointly or in a two-stage transfer-learning manner.

Paper Details

Authors:
T. N. T. Nguyen, D. L. Jones, W. S. Gan
Submitted On:
23 April 2020 - 4:54am

Document Files

ICASSP2020 - Tho Nguyen - NTU - Sequence Matching - Sigpost.pdf

(31)

[1] T. N. T. Nguyen, D. L. Jones, W. S. Gan, "A Sequence Matching Network for Polyphonic Sound Event Localization and Detection", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5096. Accessed: Jul. 05, 2020.

HIGH PERFORMANCE SUPERVISED TIME-DELAY ESTIMATION USING NEURAL NETWORKS


Time-delay estimation is an essential building block of many signal processing applications. This paper follows up on earlier work on acoustic source localization and time-delay estimation using pattern recognition techniques. It presents high-performance results obtained with supervised training of neural networks, which challenge the state of the art, and compares their performance with that of well-known methods such as Generalized Cross-Correlation and Adaptive Eigenvalue Decomposition.
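
For readers unfamiliar with the classical baselines named in the abstract, the following minimal sketch estimates a delay with plain cross-correlation, which corresponds to Generalized Cross-Correlation with unit weighting. It is not the neural-network method of the paper; the function name `estimate_delay`, the `max_lag` parameter, and the toy signals are assumptions made only for illustration.

```python
def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation of x and y.

    A positive result means y is a delayed copy of x.
    Illustrative helper, not taken from the paper.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):  # only sum over the overlapping samples
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy example: y is x delayed by 3 samples.
x = [0.0, 1.0, -2.0, 3.0, 0.5, -1.0, 0.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0] + x[:-3]
print(estimate_delay(x, y, max_lag=5))
```

Practical GCC variants apply a frequency-domain weighting (for example, a phase transform) before searching for the peak, which this time-domain sketch omits.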

Paper Details

Authors:
Pooyan Safari, Climent Nadeu
Submitted On:
20 February 2020 - 1:22pm

Document Files

paperludwighouegnigan_shortversion.pdf

(52)

[1] Pooyan Safari, Climent Nadeu, "HIGH PERFORMANCE SUPERVISED TIME-DELAY ESTIMATION USING NEURAL NETWORKS", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/4992. Accessed: Jul. 05, 2020.

WASPAA 2019 POSTER: MULTIPLE HYPOTHESIS TRACKING FOR OVERLAPPING SPEAKER SEGMENTATION


Speaker segmentation is an essential part of any diarization system. Applications of diarization include tasks such as speaker indexing, improving automatic speech recognition (ASR) performance, and making single-speaker algorithms available for use in multi-speaker environments. This paper proposes a multiple hypothesis tracking (MHT) method that exploits the harmonic structure associated with the pitch in voiced speech in order to segment the onsets and end-points of speech from multiple, overlapping speakers.

Paper Details

Authors:
Aidan O. T. Hogg, Christine Evers, Patrick A. Naylor
Submitted On:
16 October 2019 - 7:03am

Document Files

Poster___WASPAA_2019.pdf

(91)

[1] Aidan O. T. Hogg, Christine Evers, Patrick A. Naylor, "WASPAA 2019 POSTER: MULTIPLE HYPOTHESIS TRACKING FOR OVERLAPPING SPEAKER SEGMENTATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4874. Accessed: Jul. 05, 2020.

Selective Hearing: A Machine Listening Perspective


Selective hearing (SH) refers to the listeners' capability to focus their attention on a specific sound source or a group of sound sources in their auditory scene. This in turn implies that the listeners' focus is minimized for sources that are of no interest.
This paper describes the current landscape of machine listening research, and outlines ways in which these technologies can be leveraged to achieve SH with computational means.

Paper Details

Authors:
Estefanía Cano, Hanna Lukashevich
Submitted On:
26 September 2019 - 8:41pm

Document Files

MMSP_poster.pdf

(95)

[1] Estefanía Cano, Hanna Lukashevich, "Selective Hearing: A Machine Listening Perspective", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4845. Accessed: Jul. 05, 2020.

Learning Multiple Sound Source 2D Localization


In this paper, we propose novel deep-learning-based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment by using multiple microphone arrays. To this end, we use an encoding-decoding architecture and propose two improvements to it. In addition, we propose two novel localization representations that increase accuracy.

Paper Details

Authors:
Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante
Submitted On:
29 September 2019 - 4:58am

Document Files

2019-mmsp-presentation_v6_sigport.pdf

(101)

[1] Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante, "Learning Multiple Sound Source 2D Localization", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4835. Accessed: Jul. 05, 2020.

Poster Imperceptible Audio Communication


A differential acoustic OFDM technique is presented to embed data imperceptibly in existing music. The method allows playing back music containing the data with a speaker without users noticing the embedded data channel. Using a microphone, the data can be recovered from the recording. Experiments with smartphone microphones show that transmission distances of 24 meters are possible, while achieving bit error ratios of less than 10 percent, depending on the environment.

Paper Details

Authors:
Manuel Eichelberger, Simon Tanner, Gabriel Voirol, Roger Wattenhofer
Submitted On:
22 May 2019 - 8:47am

Document Files

Poster ICASSP 2019.pdf

(122)

[1] Manuel Eichelberger, Simon Tanner, Gabriel Voirol, Roger Wattenhofer, "Poster Imperceptible Audio Communication", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4557. Accessed: Jul. 05, 2020.

ENHANCING SOUND TEXTURE IN CNN-BASED ACOUSTIC SCENE CLASSIFICATION

Paper Details

Authors:
Yuzhong Wu, Tan Lee
Submitted On:
21 May 2019 - 12:14pm

Document Files

icassp2019_poster_yzwu_sound_texture.pdf

(192)

[1] Yuzhong Wu, Tan Lee, "ENHANCING SOUND TEXTURE IN CNN-BASED ACOUSTIC SCENE CLASSIFICATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4556. Accessed: Jul. 05, 2020.

BACKGROUND ADAPTATION FOR IMPROVED LISTENING EXPERIENCE IN BROADCASTING


The intelligibility of speech in noise can be improved by modifying the speech. But with object-based audio, there is the possibility of altering the background sound while leaving the speech unaltered. This may prove a less intrusive approach, affording good speech intelligibility without overly compromising the perceived sound quality. In this

Paper Details

Authors:
Yan Tang, Qingju Liu, Bruno Fazenda, Weuwu Wang
Submitted On:
14 May 2019 - 2:49am

Document Files

ICASSP_TJC.pdf

(135)

[1] Yan Tang, Qingju Liu, Bruno Fazenda, Weuwu Wang, "BACKGROUND ADAPTATION FOR IMPROVED LISTENING EXPERIENCE IN BROADCASTING", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4501. Accessed: Jul. 05, 2020.

In-Car Driver Authentication Using Wireless Sensing


Automobiles have become an essential part of everyday life. In this work, we attempt to make them smarter by introducing the idea of in-car driver authentication using wireless sensing. Our aim is to develop a model that can recognize drivers automatically. First, we address the problem of "changing in-car environments", where existing wireless-sensing-based human identification systems fail. To this end, we build the first in-car driver radio biometric dataset to understand the effect of changing environments on human radio biometrics.

Paper Details

Authors:
Beibei Wang
Submitted On:
13 May 2019 - 11:17am

Document Files

ICASSP_conf_poster_driver_authentication.pdf

(121)

[1] Beibei Wang, "In-Car Driver Authentication Using Wireless Sensing", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4487. Accessed: Jul. 05, 2020.
