ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

EXPLORING RETRAINING-FREE SPEECH RECOGNITION FOR INTRA-SENTENTIAL CODE-SWITCHING

Code switching refers to the phenomenon of changing languages within a sentence or discourse, and it poses a challenge for conventional automatic speech recognition systems deployed to handle a single target language. The problem is complicated by the lack of multilingual training data needed to build new, ad hoc multilingual acoustic and language models. In this work, we present a prototype research code-switching speech recognition system that leverages existing monolingual acoustic and language models, i.e., no ad hoc training is needed.

CS_final-3 copy.pdf

Categories: Multilingual Recognition and Identification (SPE-MULT)

92 Views

Directional interference suppression using a spatial relative transfer function feature

ICASSP2019_poster_SpatialSuppressor.pdf

Categories: Source Separation and Signal Enhancement

10 Views

Multiple Linear Regression for High Efficiency Video Intra Coding

In video coding frameworks, the essence of intra coding is to exploit the spatial correlation within a frame to remove redundancy, yielding a more compact representation for transmission. As video acquisition devices improve, high-definition video is becoming increasingly common, which poses a new challenge for high-efficiency video coding. In this paper, we propose a novel intra coding scheme based on Multiple Linear Regression (MLR), named Multiple linear regression Intra Prediction (MIP).

icassp_mlr_for_intra_coding.pdf

Categories: Image/Video Coding

15 Views

Sensor-Assisted Global Motion Estimation for Efficient UAV Video Coding

Poster.pdf

Categories: Image/Video Coding

7 Views

Sensor-Assisted Global Motion Estimation for Efficient UAV Video Coding

Poster.pdf

Categories: Image/Video Coding

4 Views

Single-channel Speech Extraction Using Speaker Inventory and Attention Network

ICASSP2019_SpeakerExtractionWithAttentionAndInventory_v21b.pptx

Categories: Source Separation and Signal Enhancement

125 Views

Toeplitz Matrix Completion for Direction Finding Using a Modified Nested Linear Array

A modified nested linear array (MNLA) has recently been reported to offer greater potential for increasing the degrees of freedom. However, its difference co-array contains some “holes”, which result in missing “lags” and limit the performance of direction-of-arrival (DOA) estimation. To tackle this problem, this paper applies a Toeplitz matrix completion technique to the MNLA and investigates the resulting DOA estimation performance. In particular, a semidefinite program with trace minimization is derived to obtain a covariance matrix with Hermitian Toeplitz structure.
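
To illustrate what a “hole” in a difference co-array is, here is a minimal Python sketch. The sensor positions are hypothetical examples given in units of the base inter-element spacing; they are not the MNLA geometry from the paper, and `coarray_holes` is an illustrative helper name rather than anything defined by the authors.

```python
def coarray_holes(positions):
    """Return the missing non-negative lags of a linear array's difference co-array.

    `positions` are integer sensor locations (assumed units: base spacing).
    A lag m is present if some pair of sensors is exactly m units apart.
    """
    # All pairwise absolute differences, including lag 0 (a sensor with itself).
    lags = {abs(a - b) for a in positions for b in positions}
    max_lag = max(lags)
    # Holes are the integer lags below the aperture that no sensor pair produces.
    return sorted(set(range(max_lag + 1)) - lags)


# Hypothetical sparse array: lags 2, 6 and 7 are never produced.
print(coarray_holes([0, 1, 4, 9]))
# Hypothetical hole-free array: every lag from 0 to 6 is produced.
print(coarray_holes([0, 1, 4, 6]))
```

Each hole corresponds to covariance lags that cannot be estimated directly from the data, i.e., unknown diagonals of the Toeplitz covariance matrix; these are the entries a matrix completion step would have to fill in.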

ICASSP2019_HuipingHuang.pdf

Poster of ICASSP Paper#2290 (421)

Categories: Sensor Array Processing

58 Views

PERCEPTUALLY ENHANCED SINGLE FREQUENCY FILTERING FOR DYSARTHRIC SPEECH DETECTION AND INTELLIGIBILITY ASSESSMENT

This paper proposes a new speech feature representation that improves the intelligibility assessment of dysarthric speech. The formulation of the feature set is motivated by human auditory perception and by the high time-frequency resolution of the single frequency filtering (SFF) technique. The proposed features are named perceptually enhanced single frequency cepstral coefficients (PESFCC). As part of the SFF implementation, the speech signal is passed through a single-pole complex bandpass filter bank to obtain a high-resolution time-frequency distribution.
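
As a rough illustration of a single-pole complex bandpass filter bank, the sketch below runs a generic one-pole complex resonator at each analysis frequency and keeps the magnitude of its output as an envelope. This is an assumption-laden toy, not the paper's formulation: the function names, the pole radius `r = 0.99`, and the test tone are all hypothetical, and the perceptual enhancement and cepstral steps of PESFCC are not modeled.

```python
import cmath
import math


def single_pole_envelope(signal, freq_hz, sample_rate, r=0.99):
    """Envelope of `signal` near `freq_hz` from a one-pole complex resonator.

    Implements y[n] = p * y[n-1] + x[n] with pole p = r * exp(j*2*pi*f/fs).
    The pole radius r (assumed value) trades bandwidth for time resolution.
    """
    pole = r * cmath.exp(2j * math.pi * freq_hz / sample_rate)
    y = 0j
    envelope = []
    for x in signal:
        y = pole * y + x
        envelope.append(abs(y))
    return envelope


def envelope_bank(signal, freqs_hz, sample_rate, r=0.99):
    """One envelope per analysis frequency: a simple time-frequency map."""
    return [single_pole_envelope(signal, f, sample_rate, r) for f in freqs_hz]


# Hypothetical test input: a 1 kHz tone sampled at 8 kHz.
fs = 8000
tone = [math.cos(2 * math.pi * 1000 * n / fs) for n in range(800)]
bank = envelope_bank(tone, [1000, 3000], fs)
print(bank[0][-1] > bank[1][-1])  # the 1 kHz channel responds far more strongly
```

Because each channel is a single recursion per sample, the envelope is available at every sample instant and at any chosen frequency, which is the property the abstract refers to as high time-frequency resolution.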

ICASSP_POSTER.pdf

Categories: Speech Analysis (SPE-ANLS)

66 Views

Cross-Language Speech Dependent Lip-Synchronization

Understanding videos of people speaking across international borders is hard, as audiences from different demographics do not understand the language. Such videos are often supplemented with subtitles. However, subtitles hamper the viewing experience because the viewer must split attention between the video and the text. Simple audio dubbing in a different language makes the video appear unnatural due to unsynchronized lip motion. In this paper, we propose a system for automated cross-language lip synchronization for re-dubbed videos.