Image, Video, and Multidimensional Signal Processing

CONTINUOUS ULTRASOUND BASED TONGUE MOVEMENT VIDEO SYNTHESIS FROM SPEECH

Tongue movement plays an important role in pronunciation. Visualizing the movement of the tongue can improve speech intelligibility and also helps in learning a second language. However, hardly any research has addressed this topic. In this paper, a framework to synthesize continuous ultrasound tongue movement video from speech is presented. Two different mapping methods are introduced as the most important parts of the framework.

icassp-2016-ultrasound.pdf (792)

Categories: Image, Video, and Multidimensional Signal Processing

7 Views

Microtexture Inpainting through Gaussian Conditional Simulation

Image inpainting consists of filling missing regions of an image by inferring their content from the surroundings.
In the case of texture images, inpainting can be formulated as conditional simulation of a stochastic texture model.
Many texture synthesis methods have thus been adapted to texture inpainting, but these methods offer no theoretical guarantees, since the conditional sampling is in general only approximate.
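
As a rough illustration of what exact conditional simulation means for a Gaussian model, the sketch below fills a hole in a toy 1-D signal. It draws an unconditional Gaussian sample and then applies the standard kriging correction so that the result matches the known pixels. The exponential covariance, the 8-pixel signal, and the helper names (`cholesky`, `solve_spd`, `conditional_sample`) are assumptions made for this example; this is not necessarily the algorithm used in the paper.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L @ L.T == a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_spd(a, b):
    """Solve a @ x = b for symmetric positive definite a via Cholesky."""
    n = len(a)
    low = cholesky(a)
    y = [0.0] * n
    for i in range(n):  # forward substitution: low @ y = b
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: low.T @ x = y
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def conditional_sample(cov, known, rng):
    """Sample a zero-mean Gaussian vector with covariance cov, given known = {index: value}."""
    n = len(cov)
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Unconditional sample z = low @ white.
    z = [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(n)]
    idx = sorted(known)
    cov_kk = [[cov[i][j] for j in idx] for i in idx]
    residual = [known[i] - z[i] for i in idx]
    weights = solve_spd(cov_kk, residual)
    # Kriging correction: z + cov[:, idx] @ inv(cov_kk) @ (known - z[idx]).
    return [z[i] + sum(cov[i][j] * w for j, w in zip(idx, weights)) for i in range(n)]


# Toy 1-D "texture": 8 pixels with an exponential covariance (an assumed model).
n = 8
cov = [[math.exp(-abs(i - j) / 2.0) for j in range(n)] for i in range(n)]
known = {0: 0.5, 1: 0.2, 6: -0.3, 7: -0.1}  # pixels 2..5 form the hole
filled = conditional_sample(cov, known, random.Random(0))
```

In this sketch the known pixels are reproduced up to floating-point error, and the hole pixels are a draw from the exact conditional Gaussian distribution rather than an approximation.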

presentation.pdf (967)

Categories: Image, Video, and Multidimensional Signal Processing

15 Views

An improved local binary pattern operator for texture classification

icassp2016_presentation_lu.pdf (944)

Categories: Image, Video, and Multidimensional Signal Processing

31 Views

Mining Representative Actions for Actor Identification

Previous work on actor identification has mainly focused on static features based on face identification and costume detection, without considering the abundant dynamic information contained in videos. In this paper, we propose a novel method to mine representative actions of each actor, and show the remarkable power of such actions for the actor identification task. Videos are first divided into shots and represented by BoW based on spatial-temporal features. Then we integrate the prototype

Mining Representative Actions for Actor Identification - wlxie.pdf (140)

Categories: Image, Video, and Multidimensional Signal Processing

5 Views

NONCONVEX COMPRESSIVE SENSING RECONSTRUCTION FOR TENSOR USING STRUCTURES IN MODES

poster.pdf (1015)

Categories: Image, Video, and Multidimensional Signal Processing

4 Views

Manga-Specific Features and Latent Style Model for Manga Style Analysis

A latent style model describing manga styles based on the proposed manga-specific features is constructed to facilitate novel style-based applications. Two manga-specific features, i.e., screentone features showing texture and shade, and panel features showing panel arrangement, are first proposed to describe manga pages. Based on the latent Dirichlet allocation technique, we discover latent style elements embedded in manga documents, which are described by visual words derived from the manga-specific features.

Manga-Specific Features and Latent Style Model for Manga Style Analysis.pdf (79)

Categories: Image, Video, and Multidimensional Signal Processing

15 Views

News Story Clustering with Fisher Embedding

An automatic news story clustering system is presented to facilitate efficient news browsing and summarization. We describe news content by considering both what objects appear and how these objects move in news stories. With Fisher embedding, we encode local features, semantic features, and dense trajectories as separate Fisher vectors, based on which the similarity between news stories can be evaluated well and thus better clustering performance can be obtained.

News Story Clustering with Fisher Embedding.pdf (97)

Categories: Image, Video, and Multidimensional Signal Processing

9 Views

Intrinsic Two-Dimensional Local Structures for Micro-Expression Recognition

An elapsed facial emotion involves changes in facial contours due to the motions (such as contraction or stretching) of facial muscles located at the eyes, nose, lips, etc. Thus, important information such as the corners of facial contours located in various regions of the face is crucial to the recognition of facial expressions, and this is even more apparent for micro-expressions. In this paper, we propose the first known notion of employing intrinsic two-dimensional (i2D) local structures to represent these features for micro-expression recognition.

mi2dbp_icassp2016.pdf (766)

Categories: Image, Video, and Multidimensional Signal Processing

10 Views

Spatio-Temporal Mid-Level Feature Bank for Action Recognition in Low Quality Video

It is a great challenge to perform high-level recognition tasks on videos that are poor in quality. In this paper, we propose a new spatio-temporal mid-level (STEM) feature bank for recognizing human actions in low quality videos. The feature bank comprises a trio of local spatio-temporal features, i.e., shape, motion and textures, which respectively encode structural, dynamic and statistical information in video. These features are encoded into mid-level representations and aggregated to construct STEM.

stem_icassp2016.pdf (777)

Categories: Image, Video, and Multidimensional Signal Processing

25 Views

Discriminant Correlation Analysis for Feature Level Fusion with Application to Multimodal Biometrics

In this paper, we present Discriminant Correlation Analysis (DCA), a feature level fusion technique that incorporates the class associations in correlation analysis of the feature sets. DCA performs an effective feature fusion by maximizing the pair-wise correlations across the two feature sets, and at the same time, eliminating the between-class correlations and restricting the correlations to be within classes.

DCA_ICASSP16_Poster.pdf (1847)

Categories: Information Forensics and Security
Biometrics
Applications
Multimedia Forensics
Machine Learning for Signal Processing
Applications in Data Fusion (MLR-FUSI)
Pattern recognition and classification (MLR-PATT)
Other applications of machine learning (MLR-APPL)
Image, Video, and Multidimensional Signal Processing
Image/Video Processing
Multimedia Signal Processing
Multimodal signal processing
Sensor Array and Multichannel Signal Processing

121 Views
