wav2letter++ : A Fast Open-Source Speech Recognition Framework

Abstract: 

This paper introduces wav2letter++, a fast open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for maximum efficiency. Here we explain the architecture and design of the wav2letter++ system and compare it to other major open-source speech recognition systems. In some cases wav2letter++ is more than 2x faster than other optimized frameworks for training end-to-end neural networks for speech recognition. We also show that wav2letter++'s training times scale linearly to 64 GPUs, the highest we tested, for models with 100 million parameters. High-performance frameworks enable fast iteration, which is often a crucial factor in successful research and model tuning on new datasets and tasks.

Paper Details

Authors: Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert
Submitted On: 13 May 2019 - 8:40am
Short Link:
Type: Poster
Event:
Presenter's Name: Vineel Pratap
Paper Code: 4733
Document Year: 2019

Document Files

wav2letter++-poster.pdf


[1] Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert, "wav2letter++ : A Fast Open-Source Speech Recognition Framework", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4483. Accessed: Aug. 23, 2019.
@article{4483-19,
url = {http://sigport.org/4483},
author = {Vineel Pratap and Awni Hannun and Qiantong Xu and Jeff Cai and Jacob Kahn and Gabriel Synnaeve and Vitaliy Liptchinsky and Ronan Collobert},
publisher = {IEEE SigPort},
title = {wav2letter++ : A Fast Open-Source Speech Recognition Framework},
year = {2019} }