ICASSP 2018

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

FAST TEXTURE INTRA SIZE CODING BASED ON BIG DATA CLUSTERING FOR 3D-HEVC

Poster_ICASSP_2018 (1).pdf (354)

Categories: Image/Video Coding

11 Views

Speaker Diarization with LSTM

For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications. However, mirroring the rise of deep learning in various domains, neural network based audio embeddings, also known as d-vectors, have consistently demonstrated superior speaker verification performance. In this paper, we build on the success of d-vector based speaker verification systems to develop a new d-vector based approach to speaker diarization.

icassp2018_diarization_poster.pdf

icassp2018_poster_quan_diarization (536)

Categories: Speaker Recognition and Characterization (SPE-SPKR)
Neural network learning (MLR-NNLR)

43 Views

FAST DISTRIBUTED SUBSPACE PROJECTION VIA GRAPH FILTERS

poster_landscape.pdf (598)

Categories: Filter Design

6 Views

GRAPH-BASED TRANSFORMS FOR PREDICTIVE LIGHT FIELD COMPRESSION BASED ON SUPER-PIXELS

In this paper, we explore the use of graph-based transforms to capture correlation in light fields. We consider a scheme in which view synthesis is used as a first step to exploit inter-view correlation. Local graph-based transforms (GT) are then considered for energy compaction of the residue signals. The structure of the local graphs is derived from a coherent super-pixel over-segmentation of the different views. The GT is computed and applied in a separable manner with a first spatial unweighted transform followed by an inter-view GT.
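As a rough illustration of how a local graph can be derived from a super-pixel, the sketch below builds the unweighted Laplacian of one super-pixel from a label map. The function name, the 4-connectivity choice, and the toy label map are assumptions made for this example and are not taken from the paper. A graph transform basis is commonly obtained from the eigenvectors of such a Laplacian; that eigendecomposition step is not shown here.

```python
def superpixel_laplacian(labels, target):
    """Build the unweighted graph Laplacian L = D - A for one super-pixel.

    labels: 2D list of super-pixel labels (hypothetical segmentation output).
    target: the label of the super-pixel to process.
    Pixels are graph nodes; 4-connected neighbours with the same label
    are joined by an edge of weight 1 (an assumption of this sketch).
    """
    rows, cols = len(labels), len(labels[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols)
              if labels[r][c] == target]
    index = {p: k for k, p in enumerate(pixels)}
    n = len(pixels)
    lap = [[0] * n for _ in range(n)]
    for (r, c), k in index.items():
        # Looking only down and right visits each undirected edge once.
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in index:
                j = index[nb]
                lap[k][j] -= 1
                lap[j][k] -= 1
                lap[k][k] += 1
                lap[j][j] += 1
    return pixels, lap


# Toy label map with two super-pixels (labels 0 and 1).
labels = [[0, 0, 1],
          [0, 1, 1]]
pixels, lap = superpixel_laplacian(labels, target=0)
```

Each row of the resulting matrix sums to zero, which is a quick sanity check that the degree and adjacency entries are consistent.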

Rizkallah-Mira-SIROCCO.pdf (548)

Categories: Image/Video Coding

13 Views

PERCEPTUALLY MOTIVATED ANALYSIS OF NUMERICALLY SIMULATED HEAD-RELATED TRANSFER FUNCTIONS GENERATED BY VARIOUS 3D SURFACE SCANNING SYSTEMS

Numerical simulations offer a feasible alternative to the direct acoustic measurement of individual head-related transfer functions (HRTFs). For the acquisition of high quality 3D surface scans, as required for these simulations, several approaches exist. In this paper, we systematically analyze the variations between different approaches and evaluate the influence of the accuracy of 3D scans on the resulting simulated HRTFs. To assess this effect, HRTFs were numerically simulated based on 3D scans of the head and pinna of the FABIAN dummy head generated with 6 different methods.

PosterPresentations_ICASSP2018_Manoj_2107.pdf (593)

Categories: Spatial and Multichannel Audio

38 Views

Hands-on in Signal Processing Education at Technische Universität Darmstadt

This paper shares our experience with hands-on signal processing opportunities within the formal engineering education at Technische Universität Darmstadt. It is our strong belief that undergraduate students should be offered hands-on opportunities from the very beginning of their studies until their graduation. We describe the projects, lectures and seminars through which undergraduate students gain hands-on experience in signal processing, following the timeline of the curriculum.

ICASSP_Presentation.pdf

ICASSP_2018_Presentation (826)

Categories: Signal Processing Education

27 Views

AUTOMATED DETECTION OF HIGH FDG UPTAKE REGIONS IN CT IMAGES

The combined PET-CT scan is an important diagnostic tool in modern medicine, e.g. for staging or treatment planning in oncology. Especially in small structures, such as a tumour, textural variations visible in a PET image are not visually recognizable in a CT scan of the same region. Thus, both modalities are necessary for diagnosis. Since both techniques expose the patient to radiation, it would be desirable to obtain the information about metabolic activity contained in the PET image from a CT scan alone.

poster_icassp2018.pdf (198)

Categories: Medical image analysis

4 Views

ATTENTION-BASED MODELS FOR TEXT-DEPENDENT SPEAKER VERIFICATION

Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning, due to their ability to summarize relevant information that spans the entire length of an input sequence. In this paper, we analyze the application of attention mechanisms to the problem of sequence summarization in our end-to-end text-dependent speaker recognition system. We explore different topologies of the attention layer and their variants, and compare different pooling methods on the attention weights.
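The general idea of summarizing a sequence with attention can be sketched as a softmax-weighted average of per-frame vectors. This is a minimal, generic illustration and not the system described in the paper: the function name is hypothetical, and the per-frame scores are simply passed in here, whereas a real attention layer would compute them with a learned scoring function.

```python
import math


def attention_pool(frames, scores):
    """Summarize a sequence of frame vectors into one vector.

    frames: list of equal-length feature vectors, one per time step.
    scores: one scalar score per frame (assumed given for this sketch).
    Returns (weights, pooled), where weights is the softmax of scores
    and pooled is the weighted average of the frames.
    """
    # Subtract the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    pooled = [sum(w * f[k] for w, f in zip(weights, frames))
              for k in range(dim)]
    return weights, pooled


# Equal scores reduce to plain averaging over the sequence.
weights, pooled = attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

With equal scores the result is the ordinary mean of the frames; a strongly dominant score makes the summary approach the corresponding frame.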

icassp2018_poster_Reza_5.pdf

icassp2018_poster_reza_attention (782)

Categories: Neural network learning (MLR-NNLR)
Speaker Recognition and Characterization (SPE-SPKR)

162 Views

Pattern Localization in Time Series through Signal-To-Model Alignment in Latent Space

In this paper, we study the problem of locating a predefined sequence of patterns in a time series. In particular, the studied scenario assumes a theoretical model is available that contains the expected locations of the patterns. This problem is found in several contexts, and it is commonly solved by first synthesizing a time series from the model, and then aligning it to the true time series through dynamic time warping. We propose a technique that increases the similarity of both time series before aligning them, by mapping them into a latent correlation space.
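The baseline alignment step mentioned above, dynamic time warping between a synthesized series and an observed one, can be sketched as follows. This is only the classic DTW recursion with an absolute-difference cost; it does not include the latent correlation space mapping that the paper proposes, and the function name and toy series are assumptions for illustration.

```python
def dtw_path(a, b):
    """Align sequences a and b with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching a[i] to b[j], from the first to the last sample.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Toy example: the observed series is a time-stretched copy of the model.
synthetic = [0, 1, 2, 1, 0]
observed = [0, 0, 1, 2, 2, 1, 0]
total, path = dtw_path(synthetic, observed)
```

Each pair in the returned path maps a sample of the synthesized series to a sample of the observed one, so the expected pattern locations from the model can be transferred to the observed time axis.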