ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

DEVELOPMENT AND EVALUATION OF JAPANESE TEXT-TO-SPEECH MIDDLEWARE FOR 32-BIT MICROCONTROLLERS

icassp2019_nishizawa_poster.pdf (521)

Categories: Speech Processing Applications

14 Views

SIMPLE COOPERATIVE TRANSMISSION SCHEMES FOR UNDERLAY SPECTRUM SHARING USING SYMBOL-LEVEL PRECODING AND LOAD-CONTROLLED ARRAYS

The combination of coordinated multi-point (CoMP) and underlay spectrum sharing promises substantial spectral efficiency (SE) gains for future cellular networks. However, this concept has been largely overlooked in the literature. Moreover, none of the few relevant studies consider the use of “standard” transmission strategies to facilitate the adoption of the aforementioned communication paradigm by 5G networks.

Ntougias_ICASSP2019.pdf (545)

Categories: Audio and Acoustic Signal Processing

6 Views

ICASSP 2019 SLIDES: SPEAKER CHANGE DETECTION USING FUNDAMENTAL FREQUENCY WITH APPLICATION TO MULTI-TALKER SEGMENTATION

This paper shows that time-varying pitch properties can be used advantageously within the segmentation step of a multi-talker diarization system. First, a study is conducted to verify that changes in pitch are strong indicators of changes in the speaker. It is then highlighted that an individual’s pitch varies smoothly and can therefore be predicted by means of a Kalman filter. Subsequently, it is shown that if the pitch is not predictable, this is most likely due to a change in the speaker.
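The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar random-walk pitch model, the variances, the gating threshold, and the function name `detect_pitch_changes` are all assumptions chosen for illustration. The filter predicts each frame's pitch from the previous estimate and flags a frame as a possible speaker change when the prediction error is too large relative to its expected variance.

```python
def detect_pitch_changes(pitch_hz, process_var=4.0, meas_var=25.0, gate=9.0):
    """Return frame indices where a scalar Kalman filter fails to predict the pitch.

    Assumed model (hypothetical, not from the paper): pitch follows a random
    walk with variance `process_var` and is observed with variance `meas_var`.
    """
    changes = []
    est = pitch_hz[0]   # current pitch estimate in Hz
    var = meas_var      # variance of that estimate
    for i in range(1, len(pitch_hz)):
        # Predict: the random-walk model keeps the mean, uncertainty grows.
        pred = est
        pred_var = var + process_var
        # Innovation (prediction error) and its expected variance.
        innov = pitch_hz[i] - pred
        innov_var = pred_var + meas_var
        if innov * innov / innov_var > gate:
            # Pitch was not predictable: flag the frame and restart the filter.
            changes.append(i)
            est, var = pitch_hz[i], meas_var
            continue
        # Update: blend prediction and measurement using the Kalman gain.
        gain = pred_var / innov_var
        est = pred + gain * innov
        var = (1.0 - gain) * pred_var
    return changes


# Toy pitch track: one talker near 120 Hz, then another near 210 Hz.
track = [120, 121, 119, 122, 120, 210, 212, 209, 211]
print(detect_pitch_changes(track))
```

In this toy track the only frame that exceeds the gate is the jump from roughly 120 Hz to 210 Hz. A real system would also need to handle unvoiced frames and pitch-tracking errors, which this sketch ignores.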

ICASSP_2019_SLIDES_Speaker_Change_Detection_Using_Fundamental_Frequency_With_Application_To_Multi-talker_Segmentation.pdf (567)

Categories: Audio and Acoustic Signal Processing

216 Views

NEUROMORPHIC VISION SENSING FOR CNN-BASED ACTION RECOGNITION

Neuromorphic vision sensing (NVS) hardware is now gaining traction as a low-power/high-speed visual sensing technology that circumvents the limitations of conventional active pixel sensing (APS) cameras. While object detection and tracking models have been investigated in conjunction with NVS, there is currently little work on NVS for higher-level semantic tasks, such as action recognition.

NEUROMORPHIC VISION SENSING FOR CNN-BASED ACTION RECOGNITION.pdf (493)

Categories: Pattern recognition and classification (MLR-PATT)
Image/Video Processing

59 Views

Bilinear Representation for Language-based Image Editing Using Conditional Generative Adversarial Networks

The task of Language-Based Image Editing (LBIE) aims at generating a target image by editing the source image based on the given language description. The main challenge of LBIE is to disentangle the semantics in image and text and then combine them to generate realistic images. Therefore, the editing performance is heavily dependent on the learned representation.

icassp_poster_改.pdf (762)

Categories: Image/Video Storage, Retrieval

31 Views

Segmentation, Classification, and Visualization of Orca Calls using Deep Learning

Audiovisual media are increasingly used to study the communication and behavior of animal groups, e.g. by placing microphones in the animals’ habitat, which results in huge datasets containing only a small number of animal interactions. The Orcalab has recorded orca whales since 1973 using stationary underwater hydrophones and has made the recordings publicly available on the Orchive. There exist over 15 000 manually extracted orca/noise annotations and about 20 000 h of unseen audio data. To analyze the behavior and communication of killer whales, we need to interpret the different call types.

seg_class_visual_icassp_2019.pdf

Presentation Slides ICASSP 2019 (458)

Categories: Bioacoustics and Medical Acoustics

152 Views

Secure MIMO Interference Channel with Confidential Messages and Delayed CSIT

Presentation slides for ICASSP 2019

ICASSP 2019 Slide.pdf (503)

Categories: Communications and Network Security
MIMO Communications and Signal Processing

26 Views

ATOM SELECTION IN CONTINUOUS DICTIONARIES: RECONCILING POLAR AND SVD APPROXIMATIONS

This work deals with an efficient atom selection procedure in a continuous dictionary, as required for instance in a Frank-Wolfe approach within a BLASSO problem for the one-dimensional deconvolution problem. We show that efficient maximization of a correlation between any given vector and an atom sweeping a continuous dictionary can be performed through a particular piecewise linear approximation of dictionaries: the polar approximation. We finally identify the polar approximation as being optimal in a mean square error

posterICASSP19_svd.pdf (415)

Categories: Audio and Acoustic Signal Processing

7 Views

DEEP LEARNING THE EEG MANIFOLD FOR PHONOLOGICAL CATEGORIZATION FROM ACTIVE THOUGHTS

Speech-related Brain-Computer Interfaces (BCI) aim primarily at finding an alternative vocal communication pathway for people with speaking disabilities. As a step towards full decoding of imagined speech from active thoughts, we present a BCI system for subject-independent classification of phonological categories exploiting a novel deep learning based

ICASSP 2019_2585.pdf (505)

Categories: Neural network learning (MLR-NNLR)
Biomedical signal processing

14 Views
