Audio and Acoustic Signal Processing

Natural Sound Rendering for Headphones: Integration of signal processing techniques (slides)


With the strong growth of assistive and personal listening devices, natural sound rendering over headphones is becoming a necessity for prolonged listening in multimedia and virtual reality applications. Natural sound rendering aims to recreate sound scenes with spatial and timbral quality that is as natural as possible, so as to achieve a truly immersive listening experience. However, rendering natural sound over headphones poses many challenges. This tutorial article presents signal processing techniques that tackle these challenges to assist human listening.

Paper Details

Authors:
Kaushik Sunder, Ee-Leng Tan
Submitted On:
23 February 2016 - 1:43pm

Document Files

SPM15slides_Natural Sound Rendering for Headphones.pdf

(594 downloads)

[1] Kaushik Sunder, Ee-Leng Tan, "Natural Sound Rendering for Headphones: Integration of signal processing techniques (slides)", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/167. Accessed: May. 27, 2018.

Unsupervised Learning of Semantic Audio Representations

Paper Details

Submitted On:
24 May 2018 - 8:46pm

Document Files

ICASSP18_ Unsupervised Learning of Semantic Audio Representations.pdf

(8 downloads)

[1] "Unsupervised Learning of Semantic Audio Representations", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3208. Accessed: May. 27, 2018.

Study Of Dense Network Approaches For Speech Emotion Recognition

Paper Details

Authors:
Mohammed Abdelwahab, Carlos Busso
Submitted On:
30 April 2018 - 11:45am

Document Files

Abdelwahab_ICASSP_2018-poster.pdf

(25 downloads)

[1] Mohammed Abdelwahab, Carlos Busso, "Study Of Dense Network Approaches For Speech Emotion Recognition", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3192. Accessed: May. 27, 2018.

Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment

Paper Details

Submitted On:
30 April 2018 - 10:36am

Document Files

main.pdf

(34 downloads)

[1] "Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3189. Accessed: May. 27, 2018.

Compressive Sampling of Sound Fields Using Moving Microphones

Paper Details

Authors:
Fabrice Katzberg, Radoslaw Mazur, Marco Maass, Philipp Koch, Alfred Mertins
Submitted On:
24 April 2018 - 11:24am

Document Files

presICASSP4.pdf

(31 downloads)

[1] Fabrice Katzberg, Radoslaw Mazur, Marco Maass, Philipp Koch, Alfred Mertins, "Compressive Sampling of Sound Fields Using Moving Microphones", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3169. Accessed: May. 27, 2018.

MULTI-DIALECT SPEECH RECOGNITION WITH A SINGLE SEQUENCE-TO-SEQUENCE MODEL

Paper Details

Authors:
Bo Li, Tara Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao
Submitted On:
23 April 2018 - 7:59pm

Document Files

MultiDialect LAS

(31 downloads)

[1] Bo Li, Tara Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao, "MULTI-DIALECT SPEECH RECOGNITION WITH A SINGLE SEQUENCE-TO-SEQUENCE MODEL", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3154. Accessed: May. 27, 2018.

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK

Paper Details

Submitted On:
22 April 2018 - 11:10pm

Document Files

WenjieGuan-3304-2018_ICASSP_POSTER.pdf

(25 downloads)

[1] "MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3142. Accessed: May. 27, 2018.

Whole Sentence Neural Language Model


Recurrent neural networks have become increasingly popular for language modeling, achieving impressive gains in state-of-the-art speech recognition and natural language processing (NLP) tasks. Recurrent models exploit word dependencies over a much longer context window (as retained by the history states) than is feasible with n-gram language models.

Paper Details

Authors:
Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran
Submitted On:
20 April 2018 - 10:30pm

Document Files

whole sentence neural language model

(52 downloads)

[1] Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran, "Whole Sentence Neural Language Model", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3118. Accessed: May. 27, 2018.

Signboard Saliency Detection in Street Videos

Paper Details

Submitted On:
20 April 2018 - 4:33pm

Document Files

ICASSP_onkar.pdf

(26 downloads)

[1] "Signboard Saliency Detection in Street Videos", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3114. Accessed: May. 27, 2018.

Acoustic Reflector Localization and Classification


The process of understanding acoustic properties of environments is important for several applications, such as spatial audio, augmented reality and source separation. In this paper, multichannel room impulse responses are recorded and transformed into their direction of arrival (DOA)-time domain, by employing a superdirective beamformer. This domain can be represented as a 2D image. Hence, a novel image processing method is proposed to analyze the DOA-time domain, and estimate the reflection times of arrival and DOAs. The main acoustically reflective objects are then localized.

Paper Details

Authors:
Luca Remaggi, Hansung Kim, Philip J. B. Jackson, Filippo M. Fazi, Adrian Hilton
Submitted On:
20 April 2018 - 12:07pm

Document Files

Remaggietal_ICASSP2018.pdf

(23 downloads)

[1] Luca Remaggi, Hansung Kim, Philip J. B. Jackson, Filippo M. Fazi, Adrian Hilton, "Acoustic Reflector Localization and Classification", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3109. Accessed: May. 27, 2018.
