ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.
- Measuring the Task Induced Oscillatory Brain Activity Using Tensor Decomposition
- Optimization of Speaker Extraction Neural Network with Magnitude and Temporal Spectrum Approximation Loss
The SpeakerBeam-FE (SBF) method is proposed for speaker extraction. It attempts to overcome the problem of an unknown number of speakers in an audio recording during source separation. The mask approximation loss of SBF is sub-optimal, however: it neither computes the signal reconstruction error directly nor considers the speech context. To address these problems, this paper proposes a magnitude and temporal spectrum approximation loss to estimate a phase-sensitive mask for the target speaker using the speaker characteristics.
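As a rough illustration of the quantities named in this abstract, the sketch below computes an ideal phase-sensitive mask and a loss that compares masked mixture magnitudes with the phase-corrected target magnitude. The temporal term uses plain frame-to-frame differences, and the weights `w_delta` and `w_accel` are hypothetical; both are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def phase_sensitive_mask(target_stft, mixture_stft, eps=1e-8):
    """Ideal phase-sensitive mask: |S| / |Y| * cos(theta_S - theta_Y)."""
    ratio = np.abs(target_stft) / (np.abs(mixture_stft) + eps)
    phase_diff = np.angle(target_stft) - np.angle(mixture_stft)
    return ratio * np.cos(phase_diff)

def frame_delta(x):
    """First-order difference along the time axis (frames on axis 0)."""
    return np.diff(x, axis=0)

def magnitude_temporal_loss(mask_est, target_stft, mixture_stft,
                            w_delta=1.0, w_accel=1.0):
    """Static spectrum error plus errors on first and second differences.

    The difference-based temporal terms and their weights are assumptions.
    """
    phase_diff = np.angle(target_stft) - np.angle(mixture_stft)
    ref = np.abs(target_stft) * np.cos(phase_diff)   # phase-corrected target
    est = mask_est * np.abs(mixture_stft)            # masked mixture magnitude
    static = np.mean((est - ref) ** 2)
    delta = np.mean((frame_delta(est) - frame_delta(ref)) ** 2)
    accel = np.mean((frame_delta(frame_delta(est))
                     - frame_delta(frame_delta(ref))) ** 2)
    return static + w_delta * delta + w_accel * accel
```

Unlike a loss defined on the mask alone, this form measures the error on the reconstructed spectrum itself, and the difference terms penalize mismatches in how the spectrum evolves across frames.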
- ICASSP 2019 Poster for Paper #3198: Privacy-Aware Feature Extraction for Gender Discrimination Versus Speaker Identification
This paper introduces a deep-neural-network-based feature extraction scheme that aims to improve the trade-off between utility and privacy in speaker classification tasks. In the proposed scenario, we develop a feature representation that helps to maximize the performance of a gender classifier while minimizing additional speaker
- Global Energy Efficiency Maximization in Non-Orthogonal Interference Networks
Energy efficient resource allocation in interference networks is a challenging global optimization problem. The main issue is that the computational complexity grows exponentially in the number of variables. In general, resource allocation in interference networks requires optimizing jointly over achievable rates and transmit powers. However, close scrutiny reveals that the non-convexity stems mostly from the powers while the problem is linear in the rates.
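To make the structure of the objective concrete, here is a minimal sketch of a global energy efficiency metric: sum rate divided by total consumed power. It assumes interference is treated as noise, and the names `alpha`, `beta`, `sigma2`, `mu`, and `p_circuit` are hypothetical placeholders; the paper's system model may differ.

```python
import numpy as np

def global_energy_efficiency(p, alpha, beta, sigma2=1.0, mu=1.0, p_circuit=1.0):
    """Sum rate divided by total consumed power (illustrative model only).

    p         : transmit powers, shape (K,)
    alpha     : direct-link gains, shape (K,)
    beta      : cross-link gains, beta[i, j] from transmitter j to receiver i,
                with a zero diagonal, shape (K, K)
    sigma2    : noise power (assumed value)
    mu        : power amplifier inefficiency (assumed value)
    p_circuit : static circuit power (assumed value)
    """
    p = np.asarray(p, dtype=float)
    interference = np.asarray(beta, dtype=float) @ p
    sinr = np.asarray(alpha, dtype=float) * p / (sigma2 + interference)
    rates = np.log2(1.0 + sinr)            # achievable rates in bit/s/Hz
    return rates.sum() / (mu * p.sum() + p_circuit)
```

In this model the numerator is a plain sum of the rates, while the powers enter both the SINR and the denominator, which is where the non-convexity comes from.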
- Content Adaptive Wavelet Lifting for Scalable Lossless Video Coding
Scalable lossless video coding is important for many professional applications. Wavelet-based video coding decomposes an input sequence into a lowpass and a highpass subband by filtering along the temporal axis. The lowpass subband can be used for previewing purposes, while the highpass subband provides the residual content for lossless reconstruction of the original sequence. Recursively applying the wavelet transform to the lowpass subband of the previous stage yields coarser temporal resolutions of the input sequence.
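The temporal decomposition described above can be sketched with an integer Haar lifting step. This is a generic, non-adaptive example that assumes an even number of frames; it does not implement the content-adaptive scheme of the paper, and the function names are illustrative.

```python
import numpy as np

def haar_lift_forward(frames):
    """Split frames (time on axis 0) into lowpass and highpass subbands."""
    frames = np.asarray(frames, dtype=np.int64)
    even, odd = frames[0::2], frames[1::2]
    high = odd - even              # predict step: temporal residual
    low = even + high // 2         # update step: rounded temporal average
    return low, high

def haar_lift_inverse(low, high):
    """Undo the lifting steps in reverse order; exact for integer input."""
    even = low - high // 2
    odd = high + even
    out = np.empty((2 * even.shape[0],) + even.shape[1:], dtype=np.int64)
    out[0::2] = even
    out[1::2] = odd
    return out
```

Because each lifting step adds or subtracts an integer-valued function of the other channel, the inverse recovers the input exactly despite the rounding. Calling `haar_lift_forward` again on `low` gives the next, coarser temporal level.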
- Active Anomaly Detection with Switching Cost
The problem of anomaly detection among multiple processes is considered within the framework of sequential design of experiments. The objective is an active inference strategy consisting of a selection rule governing which process to probe at each time, a stopping rule on when to terminate the detection, and a decision rule on the final detection outcome. The performance measure is the Bayes risk that takes into account not only sample complexity and detection errors, but also costs associated with switching across processes.
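As a toy illustration of the three cost components, the sketch below scores a single run of a probing strategy. The additive form and the weights `sample_cost` and `switch_cost` are assumptions made for illustration; the Bayes risk would be the expectation of such a cost over the observations and the prior, and the paper's exact definition may differ.

```python
def count_switches(probe_sequence):
    """Number of times consecutive probes target different processes."""
    return sum(1 for a, b in zip(probe_sequence, probe_sequence[1:]) if a != b)

def run_cost(probe_sequence, declared, truth, sample_cost=0.01, switch_cost=0.05):
    """Cost of one run: sampling cost + switching cost + detection error.

    probe_sequence : index of the process probed at each time step
    declared       : index declared anomalous when the run stops
    truth          : index of the true anomalous process
    """
    error = 0.0 if declared == truth else 1.0
    return (sample_cost * len(probe_sequence)
            + switch_cost * count_switches(probe_sequence)
            + error)
```

For example, probing processes `[0, 0, 1, 1, 2]` involves five samples and two switches, so a strategy that reaches the same decision with fewer changes of target has a lower cost.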
- Overlap-Add Windows with Maximum Energy Concentration for Speech and Audio Processing
Processing of speech and audio signals with time-frequency representations requires windowing methods that allow perfect reconstruction of the original signal and in which processing artifacts behave predictably. The most common approach for this purpose is overlap-add windowing, where signal segments are windowed before and after processing. Commonly used windows include the half-sine and a Kaiser-Bessel derived window. The latter is an approximation of the discrete prolate spheroidal sequence, and thus a maximum energy concentration window, adapted for overlap-add.
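The perfect-reconstruction property can be checked with a small sketch that applies a half-sine window before and after an identity "processing" step at 50% overlap. The frame length and the assumption that the signal length is a multiple of the hop size are illustrative choices.

```python
import numpy as np

def half_sine_window(n):
    """Half-sine window; its square and a half-frame shifted copy sum to one."""
    return np.sin(np.pi * (np.arange(n) + 0.5) / n)

def overlap_add_identity(signal, frame_len=8):
    """Window each frame before and after (identity) processing, then overlap-add.

    Assumes len(signal) is a multiple of frame_len // 2.
    """
    hop = frame_len // 2
    w = half_sine_window(frame_len)
    signal = np.asarray(signal, dtype=float)
    padded = np.concatenate([np.zeros(hop), signal, np.zeros(hop)])
    out = np.zeros_like(padded)
    for start in range(0, len(padded) - frame_len + 1, hop):
        frame = padded[start:start + frame_len] * w    # analysis window
        # any frame-wise processing would go here
        out[start:start + frame_len] += frame * w      # synthesis window
    return out[hop:hop + len(signal)]
```

Reconstruction works because the window is applied twice, so each sample is weighted by sin² in one frame and cos² in the overlapping frame, and the two weights sum to one.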
- Artificial Bandwidth Extension of Narrowband Speech Using Generative Adversarial Networks
The aim of artificial bandwidth extension is to recreate wideband speech (0–8 kHz) from a narrowband speech signal (0–4 kHz). State-of-the-art approaches use neural networks for this task. As a loss function during training, they employ the mean squared error between true and estimated wideband spectra. This, however, has the drawback of over-smoothing, which manifests as strongly underestimated dynamics in the upper frequency band.
- Dynamic Temporal Alignment of Speech to Lips
- Poster: Turning a vulnerability into an asset: Accelerating Facial Identification with Morphing
In recent years, morphing of facial images has arisen as an important attack vector on biometric systems. Detection of morphed images has proven challenging for automated systems and human experts alike. Likewise, in recent years, the importance of efficient (fast) biometric identification has been emphasised by the rapid rise and growth of large-scale biometric systems around the world.