This article proposes an approach, based on nonverbal speech features, for automatically discriminating between depressed and non-depressed speakers. The experiments were performed on one of the largest corpora collected for this task in the literature (62 patients diagnosed with depression and 54 healthy control subjects), particularly among corpora in which the depressed speakers were diagnosed by professional psychiatrists.


Sound event detection is the task of detecting the type, starting time, and ending time of sound events in audio streams. Recently, recurrent neural networks (RNNs) have become the mainstream solution for sound event detection. Because RNNs make a prediction at every frame, it is necessary to provide exact starting and ending times of the sound events in the training data, making data annotation an extremely time-consuming process.
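To make the frame-level formulation concrete, the sketch below shows one common way such per-frame predictions are turned into discrete sound events: threshold each frame's probability, then read event boundaries off the rising and falling edges of the resulting activity mask. The threshold and frame hop are illustrative assumptions, not values from the article.

```python
import numpy as np

def frames_to_events(probs, threshold=0.5, frame_hop=0.02):
    """Convert per-frame probabilities for one sound class into
    (start_time, end_time) tuples in seconds.

    probs: 1-D array of frame-wise probabilities.
    threshold, frame_hop: illustrative assumptions (0.5; 20 ms hop).
    """
    active = probs >= threshold
    # Pad with False so events touching either end still produce edges.
    padded = np.concatenate(([False], active, [False]))
    # A rising edge marks an event onset, a falling edge its offset.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    onsets, offsets = edges[0::2], edges[1::2]
    return [(on * frame_hop, off * frame_hop) for on, off in zip(onsets, offsets)]

# Two bursts of activity separated by low-probability frames.
probs = np.array([0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.8, 0.2])
print(frames_to_events(probs))  # two events, at roughly 0.04-0.10 s and 0.14-0.18 s
```

The annotation burden the abstract describes follows directly from this setup: training targets must supply the correct activity mask for every 20 ms frame, which in practice means hand-labeling exact onset and offset times.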


With the strong growth of assistive and personal listening devices, natural sound rendering over headphones is becoming a necessity for prolonged listening in multimedia and virtual reality applications. The aim of natural sound rendering is to recreate sound scenes with spatial and timbral quality as close to natural as possible, so as to achieve a truly immersive listening experience. However, rendering natural sound over headphones encounters many challenges. This tutorial article presents signal processing techniques to tackle these challenges and assist human listening.
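A core building block behind such headphone rendering is binaural synthesis: convolving a mono source with a pair of head-related impulse responses (HRIRs) to produce the left- and right-ear signals. The sketch below uses toy HRIRs (a simple interaural delay and level difference) purely for illustration; a real system would use measured or individualized HRIR sets, which the article's techniques address.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Spatialize a mono signal by convolving it with left/right HRIRs.

    Returns a (2, N) array: row 0 is the left-ear signal, row 1 the right.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=0)

# Toy HRIRs (assumed, not from any measured dataset): the right ear
# receives the sound 3 samples later and attenuated, crudely mimicking
# the interaural time and level differences of a source on the left.
hrir_l = np.array([1.0, 0.0, 0.0, 0.0])
hrir_r = np.array([0.0, 0.0, 0.0, 0.6])

mono = np.sin(2 * np.pi * 440 * np.arange(480) / 48000)  # 10 ms, 440 Hz tone
stereo = render_binaural(mono, hrir_l, hrir_r)
print(stereo.shape)  # (2, 483): input length + HRIR length - 1
```

Because each ear's signal is just a linear convolution, the same pipeline extends to multiple sources by summing the per-source binaural outputs, which is how a full sound scene is rendered.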