LATENT REPRESENTATION LEARNING FOR ARTIFICIAL BANDWIDTH EXTENSION USING A CONDITIONAL VARIATIONAL AUTO-ENCODER

Abstract: 

Artificial bandwidth extension (ABE) algorithms can improve speech quality when wideband devices are used with narrowband devices or infrastructure. Most ABE solutions employ some form of memory, implying high-dimensional feature representations that increase both latency and complexity. Dimensionality reduction techniques have thus been developed to preserve efficiency. These entail the extraction of compact, low-dimensional representations that are then used with a standard regression model to estimate high-band components. Previous work shows that some form of supervision is crucial to the optimisation of dimensionality reduction techniques for ABE. This paper reports the first application of conditional variational auto-encoders (CVAEs) for supervised dimensionality reduction specifically tailored to ABE. CVAEs, a form of directed graphical model, are exploited to model higher-dimensional log-spectral data and to extract latent narrowband representations. When compared to results obtained with alternative dimensionality reduction techniques, objective and subjective assessments show that the probabilistic latent representations learned with CVAEs produce bandwidth-extended speech signals of notably better quality.
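
As a rough illustration of what a probabilistic latent representation involves, the sketch below shows the two generic ingredients of a variational auto-encoder objective: sampling a latent vector with the reparameterisation trick, and the closed-form KL term for a diagonal Gaussian posterior. It is not the authors' implementation. The function names, the standard normal prior N(0, I) and the squared-error reconstruction term are all assumptions made for the example; the abstract does not specify the prior, the likelihood or the network architecture.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats.

    Assumes a standard normal prior; a CVAE may instead condition
    the prior on the input features.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps, with eps ~ N(0, 1) per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def negative_elbo(y, y_hat, mu, log_var):
    """Squared-error reconstruction term plus the KL regulariser.

    The squared error corresponds to a unit-variance Gaussian
    likelihood, up to an additive constant (an assumption here).
    """
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return recon + kl_to_standard_normal(mu, log_var)


# Hypothetical 2-dimensional latent: encoder outputs mean and log-variance.
rng = random.Random(0)
mu, log_var = [0.5, -0.2], [0.0, -1.0]
z = reparameterize(mu, log_var, rng)  # compact latent vector
```

In an ABE setting such as the one described, the low-dimensional vector `z` (or its mean) would stand in for the high-dimensional narrowband features passed to the regression model; how the conditioning variable enters the encoder and decoder is left open here.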

Paper Details

Authors:
Pramod Bachhav, Massimiliano Todisco, Nicholas Evans
Submitted On:
7 May 2019 - 1:32pm
Short Link:
Type:
Poster
Event:
Presenter's Name:
Pramod Bachhav
Paper Code:
4583
Document Year:
2019
Cite

Document Files

ICASSP2019.pdf



[1] Pramod Bachhav, Massimiliano Todisco, Nicholas Evans, "LATENT REPRESENTATION LEARNING FOR ARTIFICIAL BANDWIDTH EXTENSION USING A CONDITIONAL VARIATIONAL AUTO-ENCODER", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/3932. Accessed: Jul. 22, 2019.