
Audio and Acoustic Signal Processing

Natural Sound Rendering for Headphones: Integration of signal processing techniques (slides)


With the strong growth of assistive and personal listening devices, natural sound rendering over headphones is becoming a necessity for prolonged listening in multimedia and virtual reality applications. The aim of natural sound rendering is to recreate sound scenes with spatial and timbral quality that is as natural as possible, so as to achieve a truly immersive listening experience. However, rendering natural sound over headphones poses many challenges. This tutorial article presents signal processing techniques that tackle these challenges to assist human listening.

Paper Details

Authors:
Kaushik Sunder, Ee-Leng Tan
Submitted On:
23 February 2016 - 1:43pm

Document Files

SPM15slides_Natural Sound Rendering for Headphones.pdf


[1] Kaushik Sunder, Ee-Leng Tan, "Natural Sound Rendering for Headphones: Integration of signal processing techniques (slides)", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/167. Accessed: Sep. 21, 2019.

Poster Imperceptible Audio Communication


A differential acoustic OFDM technique is presented to embed data imperceptibly in existing music. The method allows playing back music containing the data with a speaker without users noticing the embedded data channel. Using a microphone, the data can be recovered from the recording. Experiments with smartphone microphones show that transmission distances of 24 meters are possible, while achieving bit error ratios of less than 10 percent, depending on the environment.
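
The differential idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes BPSK, a hypothetical all-ones reference symbol, and a channel that stays constant between consecutive OFDM symbols. The names `diff_encode` and `diff_decode` are illustrative, and the sketch works on frequency-domain subcarrier values only (no IFFT and no embedding in music).

```python
import numpy as np

def diff_encode(bits, n_sub):
    """Encode bits as BPSK sign changes between consecutive OFDM symbols (assumed scheme)."""
    bits = np.asarray(bits).reshape(-1, n_sub)   # one row of bits per OFDM symbol
    steps = 1 - 2 * bits                         # bit 0 -> +1 (keep sign), bit 1 -> -1 (flip sign)
    ref = np.ones((1, n_sub))                    # hypothetical known reference symbol
    # Each symbol is the previous symbol multiplied by the step for that subcarrier.
    return np.vstack([ref, ref * np.cumprod(steps, axis=0)])

def diff_decode(symbols):
    """Recover bits by comparing each received symbol with the one before it."""
    prod = symbols[1:] * np.conj(symbols[:-1])   # unknown per-subcarrier channel gain cancels to |h|^2
    return (prod.real < 0).astype(int).ravel()   # negative product means the sign flipped (bit 1)

bits = np.array([0, 1, 1, 0, 1, 0, 0, 1])
tx = diff_encode(bits, n_sub=4)
# Unknown per-subcarrier gain and phase, constant over time (an assumption of this sketch).
h = 0.5 * np.exp(1j * np.array([0.3, 1.2, -2.0, 2.9]))
rx_bits = diff_decode(tx * h)
```

Because each bit is carried by the change between neighbouring symbols, the receiver in this sketch does not need to estimate the channel on each subcarrier.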

Paper Details

Authors:
Manuel Eichelberger, Simon Tanner, Gabriel Voirol, Roger Wattenhofer
Submitted On:
22 May 2019 - 8:47am

Document Files

Poster ICASSP 2019.pdf


[1] Manuel Eichelberger, Simon Tanner, Gabriel Voirol, Roger Wattenhofer, "Poster Imperceptible Audio Communication", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4557. Accessed: Sep. 21, 2019.

ENHANCING SOUND TEXTURE IN CNN-BASED ACOUSTIC SCENE CLASSIFICATION

Paper Details

Authors:
Yuzhong Wu, Tan Lee
Submitted On:
21 May 2019 - 12:14pm

Document Files

icassp2019_poster_yzwu_sound_texture.pdf


[1] Yuzhong Wu, Tan Lee, "ENHANCING SOUND TEXTURE IN CNN-BASED ACOUSTIC SCENE CLASSIFICATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4556. Accessed: Sep. 21, 2019.

BACKGROUND ADAPTATION FOR IMPROVED LISTENING EXPERIENCE IN BROADCASTING


The intelligibility of speech in noise can be improved by modifying the speech. But with object-based audio, there is the possibility of altering the background sound while leaving the speech unaltered. This may prove a less intrusive approach, affording good speech intelligibility without overly compromising the perceived sound quality. In this

Paper Details

Authors:
Yan Tang, Qingju Liu, Bruno Fazenda, Wenwu Wang
Submitted On:
14 May 2019 - 2:49am

Document Files

ICASSP_TJC.pdf


[1] Yan Tang, Qingju Liu, Bruno Fazenda, Wenwu Wang, "BACKGROUND ADAPTATION FOR IMPROVED LISTENING EXPERIENCE IN BROADCASTING", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4501. Accessed: Sep. 21, 2019.

In-Car Driver Authentication Using Wireless Sensing


Automobiles have become an essential part of everyday life. In this work, we attempt to make them smarter by introducing the idea of in-car driver authentication using wireless sensing. Our aim is to develop a model that can recognize drivers automatically. First, we address the problem of "changing in-car environments", where the existing wireless-sensing-based human identification system fails. To this end, we build the first in-car driver radio biometric dataset to understand the effect of changing environments on human radio biometrics.

Paper Details

Authors:
Beibei Wang
Submitted On:
13 May 2019 - 11:17am

Document Files

ICASSP_conf_poster_driver_authentication.pdf


[1] Beibei Wang, "In-Car Driver Authentication Using Wireless Sensing", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4487. Accessed: Sep. 21, 2019.

wav2letter++ : A Fast Open-Source Speech Recognition Framework


This paper introduces wav2letter++, a fast open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for maximum efficiency. Here we explain the architecture and design of the wav2letter++ system and compare it to other major open-source speech recognition systems. In some cases wav2letter++ is more than 2x faster than other optimized frameworks for training end-to-end neural networks for speech recognition.

Paper Details

Authors:
Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert
Submitted On:
13 May 2019 - 8:40am

Document Files

wav2letter++-poster.pdf


[1] Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert, "wav2letter++ : A Fast Open-Source Speech Recognition Framework", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4483. Accessed: Sep. 21, 2019.

Adversarial Speaker Adaptation


We propose a novel adversarial speaker adaptation (ASA) scheme, in which adversarial learning is applied to regularize the distribution of deep hidden features in a speaker-dependent (SD) deep neural network (DNN) acoustic model to be close to that of a fixed speaker-independent (SI) DNN acoustic model during adaptation. An additional discriminator network is introduced to distinguish the deep features generated by the SD model from those produced by the SI model.

Paper Details

Authors:
Zhong Meng, Jinyu Li, Yifan Gong
Submitted On:
12 May 2019 - 9:26pm

Document Files

asa_oral_v3.pptx


[1] Zhong Meng, Jinyu Li, Yifan Gong, "Adversarial Speaker Adaptation", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4475. Accessed: Sep. 21, 2019.

Attentive Adversarial Learning for Domain-Invariant Training


Adversarial domain-invariant training (ADIT) proves to be effective in suppressing the effects of domain variability in acoustic modeling and has led to improved performance in automatic speech recognition (ASR). In ADIT, an auxiliary domain classifier takes in equally-weighted deep features from a deep neural network (DNN) acoustic model and is trained to improve their domain-invariance by optimizing an adversarial loss function.

Paper Details

Authors:
Zhong Meng, Jinyu Li, Yifan Gong
Submitted On:
12 May 2019 - 9:03pm

Document Files

aadit_poster.pptx


[1] Zhong Meng, Jinyu Li, Yifan Gong, "Attentive Adversarial Learning for Domain-Invariant Training", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4474. Accessed: Sep. 21, 2019.

Adversarial Speaker Verification


The use of deep networks to extract embeddings for speaker recognition has proven successful. However, such embeddings are susceptible to performance degradation due to mismatches among the training, enrollment, and test conditions. In this work, we propose an adversarial speaker verification (ASV) scheme to learn a condition-invariant deep embedding via adversarial multi-task training. In ASV, a speaker classification network and a condition identification network are jointly optimized to minimize the speaker classification loss and simultaneously mini-maximize the condition loss.
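
One common way to write an adversarial multi-task objective of this kind is sketched below. The symbols are assumptions introduced for illustration and may differ from the authors' notation: $\theta_f$ parameterizes the embedding extractor, $\theta_y$ the speaker classification network, $\theta_c$ the condition identification network, and $\lambda > 0$ weights the condition loss.

```latex
\begin{aligned}
(\hat{\theta}_f, \hat{\theta}_y) &= \arg\min_{\theta_f,\,\theta_y}\;
  \mathcal{L}_{\text{speaker}}(\theta_f, \theta_y)
  - \lambda\, \mathcal{L}_{\text{condition}}(\theta_f, \hat{\theta}_c), \\
\hat{\theta}_c &= \arg\min_{\theta_c}\;
  \mathcal{L}_{\text{condition}}(\hat{\theta}_f, \theta_c).
\end{aligned}
```

In this formulation the condition network minimizes the condition loss while the extractor maximizes it, which pushes the embedding to carry little information about the condition.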

Paper Details

Authors:
Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong
Submitted On:
12 May 2019 - 9:24pm

Document Files

asv_poster_v3.pptx


[1] Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong, "Adversarial Speaker Verification", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4473. Accessed: Sep. 21, 2019.

Conditional Teacher-Student Learning


Teacher-student (T/S) learning has been shown to be effective for a variety of problems such as domain adaptation and model compression. One shortcoming of T/S learning is that a teacher model, not always perfect, sporadically produces wrong guidance in the form of posterior probabilities that misleads the student model toward suboptimal performance.
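
A minimal sketch of a standard T/S loss, in which the student is trained to match the teacher's posteriors, is given here for orientation. It illustrates plain T/S learning only, not the conditional variant the paper proposes. The function names and the toy values are assumptions.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ts_loss(student_logits, teacher_posteriors):
    """Cross-entropy between teacher posteriors (soft targets) and student posteriors."""
    log_p = np.log(softmax(student_logits))
    return float(-(teacher_posteriors * log_p).sum(axis=-1).mean())

# Toy example: one frame, two classes; the teacher is fairly sure of class 0.
teacher = np.array([[0.9, 0.1]])
loss_agree = ts_loss(np.log(teacher), teacher)              # student reproduces the teacher
loss_disagree = ts_loss(np.log(teacher[:, ::-1]), teacher)  # student prefers class 1
```

The loss is smallest when the student reproduces the teacher's posteriors, so a teacher that assigns high probability to a wrong class pulls the student toward the same error.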

Paper Details

Authors:
Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
Submitted On:
12 May 2019 - 9:23pm

Document Files

cts_poster.pptx


[1] Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong, "Conditional Teacher-Student Learning", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4472. Accessed: Sep. 21, 2019.
