
Audio and Acoustic Signal Processing

Natural Sound Rendering for Headphones: Integration of signal processing techniques (slides)


With the strong growth of assistive and personal listening devices, natural sound rendering over headphones is becoming a necessity for prolonged listening in multimedia and virtual reality applications. The aim of natural sound rendering is to recreate sound scenes with spatial and timbral quality as close to natural as possible, so as to achieve a truly immersive listening experience. However, rendering natural sound over headphones poses many challenges. This tutorial article presents signal processing techniques that address these challenges to assist human listening.

Paper Details

Authors:
Kaushik Sunder, Ee-Leng Tan
Submitted On:
23 February 2016 - 1:43pm

Document Files

SPM15slides_Natural Sound Rendering for Headphones.pdf

(112)

[1] Kaushik Sunder, Ee-Leng Tan, "Natural Sound Rendering for Headphones: Integration of signal processing techniques (slides)", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/167. Accessed: Dec. 02, 2020.

Deliberation Model Based Two-Pass End-to-End Speech Recognition

Paper Details

Authors:
Tara Sainath, Ruoming Pang, Rohit Prabhavalkar
Submitted On:
1 June 2020 - 3:13pm

Document Files

Deliberation Model Based Two-Pass End-to-End Speech Recognition (ICASSP 2020).pdf

(66)

[1] Tara Sainath, Ruoming Pang, Rohit Prabhavalkar, "Deliberation Model Based Two-Pass End-to-End Speech Recognition", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5449. Accessed: Dec. 02, 2020.

Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision

Paper Details

Authors:
Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok Popat, Rif A. Saurous
Submitted On:
28 May 2020 - 1:01am

Document Files

ICASSP 2020_ Coincidence, Categorization, and Consolidation.pdf

(62)

[1] Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok Popat, Rif A. Saurous, "Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5444. Accessed: Dec. 02, 2020.

Time-Frequency Feature Decomposition Based on Sound Duration for Acoustic Scene Classification

Paper Details

Authors:
Yuzhong Wu, Tan Lee
Submitted On:
15 May 2020 - 5:07am

Document Files

icassp2020_ppt_yzwu.pdf

(70)

[1] Yuzhong Wu, Tan Lee, "Time-Frequency Feature Decomposition Based on Sound Duration for Acoustic Scene Classification", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5343. Accessed: Dec. 02, 2020.

AN ATTENTION ENHANCED MULTI-TASK MODEL FOR OBJECTIVE SPEECH ASSESSMENT IN REAL-WORLD ENVIRONMENTS

Paper Details

Submitted On:
14 May 2020 - 11:44am

Document Files

slide_1603.pdf

(62)

[1] "AN ATTENTION ENHANCED MULTI-TASK MODEL FOR OBJECTIVE SPEECH ASSESSMENT IN REAL-WORLD ENVIRONMENTS", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5309. Accessed: Dec. 02, 2020.

Dr.

Paper Details

Submitted On:
14 May 2020 - 8:10am

Document Files

Source separation with weakly labelled data An approach to computational auditory scene analysis_v1.1.pdf

(117)

[1] "Dr.", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5293. Accessed: Dec. 02, 2020.

URTIS: A SMALL 3D IMAGING SONAR SENSOR FOR ROBOTIC APPLICATIONS

Paper Details

Authors:
Thomas Verellen, Robin Kerstens, Dennis Laurijssen, Jan Steckel
Submitted On:
14 May 2020 - 3:26am

Document Files

CoSys_PosterURTIS_2020.pdf

(60)

[1] Thomas Verellen, Robin Kerstens, Dennis Laurijssen, Jan Steckel, "URTIS: A SMALL 3D IMAGING SONAR SENSOR FOR ROBOTIC APPLICATIONS", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5248. Accessed: Dec. 02, 2020.

AUDIO CODEC ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS


Audio codecs are typically transform-domain based and efficiently code stationary audio signals, but they struggle with speech and signals containing dense transient events such as applause. Specifically, with these two classes of signals as examples, we demonstrate a technique for restoring audio from coding noise based on generative adversarial networks (GAN). A primary advantage of the proposed GAN-based coded audio enhancer is that the method operates end-to-end directly on decoded audio samples, eliminating the need to design any manually-crafted frontend.

Paper Details

Authors:
Arijit Biswas, Dai Jia
Submitted On:
14 May 2020 - 3:15am

Document Files

AUDIO CODEC ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS.pdf

(66)

[1] Arijit Biswas, Dai Jia, "AUDIO CODEC ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5244. Accessed: Dec. 02, 2020.

Duration robust weakly supervised sound event detection

Paper Details

Submitted On:
13 May 2020 - 9:53pm

Document Files

ICASSP2020_Duration_robust_Slides (1).pdf

(59)

[1] "Duration robust weakly supervised sound event detection", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5188. Accessed: Dec. 02, 2020.

Learning with Out of Distribution Data for Audio Classification


In supervised machine learning, the assumption that training data is labelled correctly is not always satisfied. In this paper, we investigate an instance of labelling error for classification tasks in which the dataset is corrupted with out-of-distribution (OOD) instances: data that does not belong to any of the target classes, but is labelled as such. We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.

Paper Details

Authors:
Wenwu Wang
Submitted On:
14 May 2020 - 8:06am

Document Files

Presentation.pdf

(106)

[1] Wenwu Wang, "Learning with Out of Distribution Data for Audio Classification", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5137. Accessed: Dec. 02, 2020.