Audio and Acoustic Signal Processing

A Supervised Air-Tissue Boundary Segmentation Technique in real-time Magnetic Resonance Imaging Video using a Novel Measure of Contrast and Dynamic Programming

ICASSP_presentation_Advait_apr_14.pdf (433)

Crowdsourcing Emotional Speech

We describe the methodology for the collection and annotation of a large corpus of emotional speech data through crowdsourcing. The corpus offers 187 hours of data from 2,965 subjects. Data includes non-emotional recordings from each subject as well as recordings for five emotions: angry, happy-low-arousal, happy-high-arousal, neutral,

ICASSP_SenSay_Poster_180409.pdf (453)

TOWARDS PREDICTING PHYSIOLOGY FROM SPEECH DURING STRESSFUL CONVERSATIONS: HEART RATE AND RESPIRATORY SINUS ARRHYTHMIA

Mental stress during conversations may directly or indirectly affect both speech acoustics and physiological responses. This paper studies the relationship between these two modalities, speech acoustics and physiology, during stressful conversations between humans. Heart rate and respiratory sinus arrhythmia are the physiological variables considered. Two datasets are analyzed: one from stress induction sessions and one from in-lab discussions of relationship conflicts between couples.

Stress_JatiWilliamsBaucomGeorgiou_final.pptx (513)

Stress_JatiWilliamsBaucomGeorgiou_final_AJEdits.pptx (420)

QUANTISATION EFFECTS IN DISTRIBUTED OPTIMISATION

In this presentation, the effects of quantisation on distributed convex optimisation algorithms are explored via the lens of monotone operator theory. Specifically, by representing transmission quantisation via an additive noise model, we demonstrate how quantisation can be viewed as an instance of an inexact Krasnoselskii-Mann scheme. In the case of two distributed solvers, the Alternating Direction Method of Multipliers and the Primal Dual Method of Multipliers, we further demonstrate how an adaptive quantisation scheme can be constructed to reduce transmission costs between nodes.
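
The idea of treating quantisation as a perturbation of a Krasnoselskii-Mann (KM) iteration can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the presentation: the toy operator `operator_T`, the uniform quantiser, the averaging weight `alpha`, and the geometric shrinking of the quantiser step (one simple way to make the quantisation errors summable). It is not the ADMM or PDMM scheme described above.

```python
def quantise(value, step):
    """Uniform quantiser: round to the nearest multiple of `step`.

    The error is at most step / 2, which plays the role of the
    additive noise term in the inexact iteration.
    """
    return step * round(value / step)


def operator_T(x):
    """Hypothetical nonexpansive operator (an affine contraction).

    Its unique fixed point is (16/7, 18/7).
    """
    return [0.5 * x[1] + 1.0, 0.25 * x[0] + 2.0]


def inexact_km(x0, alpha=0.5, step0=1.0, decay=0.9, iterations=200):
    """Inexact KM iteration: x <- (1 - alpha) * x + alpha * Q_k(T(x)).

    Q_k is the quantiser with step size step0 * decay**k. With
    0 < decay < 1 the quantisation errors shrink geometrically and are
    therefore summable (an assumed adaptive schedule).
    """
    x = list(x0)
    for k in range(iterations):
        step = step0 * decay ** k
        tx = [quantise(v, step) for v in operator_T(x)]
        x = [(1 - alpha) * xi + alpha * ti for xi, ti in zip(x, tx)]
    return x


if __name__ == "__main__":
    # Iterates approach the fixed point (16/7, 18/7) of operator_T.
    print(inexact_km([0.0, 0.0]))
```

In this sketch, early iterations use a coarse quantiser (few distinct levels to transmit) and the resolution is refined only as the iterates settle. Setting `decay=1.0` keeps the step fixed, in which case the quantisation error does not vanish and exact convergence to the fixed point should not be expected.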

Presentation QUANTISATION EFFECTS IN DISTRIBUTED OPTIMISATION.pdf (345)

MULTI-VIEW AUDIO-ARTICULATORY FEATURES FOR PHONETIC RECOGNITION ON RTMRI-TIMIT DATABASE

In this paper, we investigate the use of articulatory information, and more specifically real-time Magnetic Resonance Imaging (rtMRI) data of the vocal tract, to improve speech recognition performance. For our experiments, we use data from the rtMRI-TIMIT database. First, Scale Invariant Feature Transform (SIFT) features are extracted for each video frame. Afterwards, the SIFT descriptors of each frame are transformed into a single histogram per image using the Bag of Visual Words methodology. Since this kind

ICASSP_2018_poster_final.pdf (1129)

A CONVERSATIONAL NEURAL LANGUAGE MODEL FOR SPEECH RECOGNITION IN DIGITAL ASSISTANTS

Speech recognition in digital assistants such as Google Assistant can potentially benefit from the use of conversational context consisting of user queries and responses from the agent. We explore the use of recurrent, Long Short-Term Memory (LSTM), neural language models (LMs) to model the conversations in a digital assistant. Our proposed methods effectively capture the context of previous utterances in a conversation without modifying the underlying LSTM architecture. We demonstrate a 4% relative improvement in recognition performance

conversation.pdf (445)

USING ACCELEROMETRIC AND GYROSCOPIC DATA TO IMPROVE BLOOD PRESSURE PREDICTION FROM PULSE TRANSIT TIME USING RECURRENT NEURAL NETWORK

ICASSP_2018_Poster.pdf (398)

Grid-Free Direction-of-Arrival Estimation with Compressed Sensing and Arbitrary Antenna Arrays

We study the problem of direction-of-arrival (DOA) estimation for arbitrary antenna arrays. We formulate it as a continuous line spectral estimation problem and solve it under a sparsity prior without any gridding assumptions. Moreover, we incorporate the array's beampattern in the form of the Effective Aperture Distribution Function (EADF), which allows the use of arbitrary (synthetic as well as measured) antenna arrays. This generalizes known atomic-norm-based grid-free DOA estimation methods (that