
A First Attempt at Polyphonic Sound Event Detection Using Connectionist Temporal Classification

Citation Author(s):
Florian Metze
Submitted by:
Yun Wang
Last updated:
27 February 2017 - 5:12pm
Document Type:
Poster
Document Year:
2017
Event:
Presenters:
Yun Wang
Paper Code:
3217
 

Sound event detection is the task of detecting the type, starting time, and ending time of sound events in audio streams. Recently, recurrent neural networks (RNNs) have become the mainstream solution for sound event detection. Because RNNs make a prediction at every frame, it is necessary to provide exact starting and ending times of the sound events in the training data, making data annotation an extremely time-consuming process. Connectionist temporal classification (CTC), as a sequence-to-sequence model, can relax this constraint, because it suffices to provide ordered sequences of sound events without exact starting and ending times.
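To make the difference in annotation effort concrete, here is a minimal sketch that contrasts the two kinds of training targets for a clip with no overlapping events. The event format, the labels, the frame length, and both helper functions are hypothetical illustrations, not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical annotation format: (label, start_sec, end_sec).
Event = Tuple[str, float, float]


def frame_targets(events: List[Event], n_frames: int, frame_sec: float) -> List[str]:
    """Frame-level targets, as a frame-wise RNN needs: exact times are required."""
    targets = ["none"] * n_frames
    for label, start, end in events:
        first = int(round(start / frame_sec))
        last = min(n_frames, int(round(end / frame_sec)))
        for i in range(first, last):
            targets[i] = label
    return targets


def sequence_targets(events: List[Event]) -> List[str]:
    """Sequence-level target, as CTC needs: only the order of the events is used."""
    return [label for label, start, _end in sorted(events, key=lambda e: e[1])]


# Assumed example clip: 4 seconds long, split into 0.5-second frames.
events = [("car", 2.0, 3.0), ("dog", 0.5, 1.5)]
per_frame = frame_targets(events, n_frames=8, frame_sec=0.5)
ordered = sequence_targets(events)
```

The frame-level target changes whenever a start or end time shifts, while the ordered sequence stays the same as long as the events keep their order, which is why the latter is cheaper to annotate.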

This paper presents a first attempt at using CTC for sound event detection. In the polyphonic situation, sound events may overlap with each other, making it hard to define ordered sequences of sound events. We propose to use the boundaries (i.e., starts and ends) of the sound events as tokens for CTC. We show that CTC is able to locate the boundaries of sound events on a very noisy corpus of consumer-generated content with rough hints about their positions. The CTC approach seems to be particularly suited to detecting short and transient sounds, which have traditionally been hardest to detect.
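The boundary-token idea described above can be sketched as follows. This is only an illustration under stated assumptions: the token naming scheme (`<label>_start`, `<label>_end`), the event format, and the tie-breaking rule for simultaneous boundaries are hypothetical and not confirmed by the paper.

```python
from typing import List, Tuple

# Hypothetical annotation format: (label, start_sec, end_sec).
Event = Tuple[str, float, float]


def boundary_tokens(events: List[Event]) -> List[str]:
    """Turn possibly overlapping events into one ordered sequence of boundary tokens."""
    boundaries = []
    for label, start, end in events:
        boundaries.append((start, f"{label}_start"))
        boundaries.append((end, f"{label}_end"))
    # Sort by time only. Python's sort is stable, so boundaries that share a
    # timestamp keep their insertion order (an assumption made for this sketch).
    boundaries.sort(key=lambda b: b[0])
    return [token for _time, token in boundaries]


# Assumed example: a short door slam occurring entirely inside a speech segment.
overlapping = [("speech", 0.0, 4.0), ("door", 1.0, 1.5)]
tokens = boundary_tokens(overlapping)
```

Although the two events overlap, their four boundaries still form a well-defined ordered sequence, which is the kind of target CTC can be trained on. Only the order of the tokens is kept; the timestamps themselves are discarded.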
