
MMSP 2019

MMSP 2019 is the IEEE 21st International Workshop on Multimedia Signal Processing, organized by the Multimedia Signal Processing Technical Committee (MMSP TC) of the IEEE Signal Processing Society (SPS). The workshop brings together researchers and developers from different fields of multimedia signal processing to share their experience, exchange ideas, explore future research directions, and network.

Injective State-Image Mapping facilitates Visual Adversarial Imitation Learning


The growing use of virtual autonomous agents in applications like games and entertainment demands better control policies for natural-looking movements and actions. Unlike the conventional approach of hard-coding motion routines, we propose a deep learning method for obtaining control policies by directly mimicking raw video demonstrations. Previous methods in this domain rely on extracting low-dimensional features from expert videos followed by a separate hand-crafted reward estimation step.
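Where prior pipelines estimate a reward by hand, adversarial imitation learning derives the reward from a learned discriminator. A minimal sketch of the usual GAIL-style reward (a generic formulation for intuition, not necessarily this paper's exact one):

```python
import numpy as np

def adversarial_reward(d_logits):
    """GAIL-style reward from discriminator logits: observations the
    discriminator rates as expert-like (high logit) get higher reward.
    This replaces a separate hand-crafted reward estimation step."""
    d = 1.0 / (1.0 + np.exp(-np.asarray(d_logits, dtype=float)))  # P(expert)
    return -np.log(1.0 - d + 1e-8)  # larger when D(s) is close to 1

# An expert-like observation (logit 2.0) earns more reward than a
# clearly non-expert one (logit -2.0).
r = adversarial_reward([2.0, -2.0])
```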

Paper Details

Authors:
Daiki Kimura, Asim Munawar, Ryuki Tachibana
Submitted On:
24 September 2019 - 4:46pm

Document Files

mmps_final.pdf

[1] Daiki Kimura, Asim Munawar, Ryuki Tachibana, "Injective State-Image Mapping facilitates Visual Adversarial Imitation Learning", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4836. Accessed: Oct. 18, 2019.

Learning Multiple Sound Source 2D Localization


In this paper, we propose novel deep-learning-based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment using multiple microphone arrays. To this end, we use an encoding-decoding architecture and propose two improvements to it to accomplish the task. In addition, we propose two novel localization representations that increase accuracy.
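The abstract does not detail the localization representations; one common way to represent 2D source positions as a target for an encoder-decoder network is a Gaussian heatmap over a room grid. A sketch under that assumption (grid size, room dimensions, and the spread sigma are illustrative values, not taken from the paper):

```python
import numpy as np

def coords_to_heatmap(sources, grid=(40, 40), room=(4.0, 4.0), sigma=0.25):
    """Encode 2D source coordinates as a Gaussian heatmap over the room.
    `sources` is a list of (x, y) positions in metres; the returned grid
    has a peak of height ~1 near each source."""
    h, w = grid
    ys = np.linspace(0.0, room[1], h)[:, None]   # grid row centres (y)
    xs = np.linspace(0.0, room[0], w)[None, :]   # grid col centres (x)
    heat = np.zeros(grid)
    for sx, sy in sources:
        g = np.exp(-((xs - sx) ** 2 + (ys - sy) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, g)               # overlay multiple sources
    return heat

# Two sources in a 4 m x 4 m room.
hm = coords_to_heatmap([(1.0, 1.0), (3.0, 2.5)])
```

A network trained against such heatmaps can localize a variable number of sources by peak-picking the predicted map.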

Paper Details

Authors:
Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante
Submitted On:
29 September 2019 - 4:58am

Document Files

2019-mmsp-presentation_v6_sigport.pdf

[1] Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante, "Learning Multiple Sound Source 2D Localization", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4835. Accessed: Oct. 18, 2019.

End-to-End Conditional GAN-based Architectures for Image Colourisation


This work investigates recent advances in conditional adversarial networks to develop an end-to-end architecture based on Convolutional Neural Networks (CNNs) that directly maps realistic colours onto an input greyscale image. Observing that existing colourisation methods sometimes exhibit a lack of colourfulness, this paper proposes a method to improve colourisation results. In particular, the method uses Generative Adversarial Networks (GANs) and focuses on improving training stability to enable better generalisation on large multi-class image datasets.
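"Lack of colourfulness" can be quantified; the Hasler-Süsstrunk colourfulness metric is a common choice, shown here only as an illustration of how such a deficiency is measured (the paper may use a different measure):

```python
import numpy as np

def colourfulness(img):
    """Hasler-Suesstrunk colourfulness of an RGB image (H x W x 3,
    values in [0, 255]); a pure greyscale image scores exactly 0."""
    r, g, b = (img[..., i].astype(float) for i in range(3))
    rg = r - g                  # red-green opponent axis
    yb = 0.5 * (r + g) - b      # yellow-blue opponent axis
    sigma = np.hypot(rg.std(), yb.std())
    mu = np.hypot(rg.mean(), yb.mean())
    return sigma + 0.3 * mu

grey = np.full((8, 8, 3), 128, dtype=np.uint8)      # pure grey image
vivid = np.zeros((8, 8, 3), dtype=np.uint8)
vivid[:, :4, 0] = 255                               # red half
vivid[:, 4:, 2] = 255                               # blue half
```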

Paper Details

Authors:
Marc Gorriz Blanch, Marta Mrak, Alan Smeaton, Noel O'Connor
Submitted On:
24 September 2019 - 12:11pm

Document Files

MMSP2019_poster.pdf

[1] Marc Gorriz Blanch, Marta Mrak, Alan Smeaton, Noel O'Connor, "End-to-End Conditional GAN-based Architectures for Image Colourisation", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4834. Accessed: Oct. 18, 2019.

Securing physical documents with digital signatures

Paper Details

Authors:
Christian Winter, Waldemar Berchtold, Jan Niklas Hollenbeck
Submitted On:
24 September 2019 - 10:04am

Document Files

Securing_physical_documents.pdf

[1] Christian Winter, Waldemar Berchtold, Jan Niklas Hollenbeck, "Securing physical documents with digital signatures", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4832. Accessed: Oct. 18, 2019.

Semantic Segmentation in Compressed Videos


Existing approaches for semantic segmentation in videos usually extract each frame as an RGB image, then apply standard image-based semantic segmentation models to each frame, which is time-consuming. In this paper, we tackle this problem by exploiting the nature of video compression techniques. A compressed video contains three types of frames: I-frames, P-frames, and B-frames. I-frames are represented as regular images, P-frames are represented as motion vectors and residual errors, and B-frames are bidirectionally predicted frames that can be reconstructed from both preceding and following reference frames.
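The P-frame decoding step described above (a motion-compensated copy from the reference frame plus a residual) can be sketched as follows; the block size and the integer-pel, single-channel simplifications are mine, not the paper's:

```python
import numpy as np

def reconstruct_p_frame(reference, motion_vectors, residual, block=8):
    """Decode a P-frame: each block is copied from the motion-shifted
    location in the reference frame, then the residual error is added.
    `motion_vectors` holds one integer (dy, dx) per block."""
    h, w = reference.shape
    out = np.zeros_like(reference, dtype=float)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = motion_vectors[by // block, bx // block]
            sy = np.clip(by + dy, 0, h - block)   # clamp to frame bounds
            sx = np.clip(bx + dx, 0, w - block)
            out[by:by + block, bx:bx + block] = reference[sy:sy + block,
                                                          sx:sx + block]
    return out + residual
```

Because the motion vectors and residuals are already in the bitstream, a segmentation model can consume them directly instead of fully decoding every frame to RGB.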

Paper Details

Authors:
Ang Li, Yang Wang
Submitted On:
23 September 2019 - 9:26pm

Document Files

Semantic Segmentation in Compressed Videos .pdf

[1] Ang Li, Yang Wang, "Semantic Segmentation in Compressed Videos", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4825. Accessed: Oct. 18, 2019.

Dynamic Guidance For Depth Map Restoration

Paper Details

Submitted On:
23 September 2019 - 9:03am

Document Files

Dynamic Guidance For Depth Map Restoration-RanZhu(paper id 80).pdf

[1] "Dynamic Guidance For Depth Map Restoration", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4821. Accessed: Oct. 18, 2019.

Youtube UGC Dataset for Video Compression Research


This poster introduces a large-scale UGC dataset (1,500 video clips of 20 seconds each) sampled from millions of Creative Commons YouTube videos. The dataset covers popular categories like Gaming and Sports, and new features like High Dynamic Range (HDR). Besides a novel sampling method based on features extracted during encoding, challenges for UGC compression and quality evaluation are also discussed, and shortcomings of traditional reference-based metrics on UGC are addressed.

Paper Details

Authors:
Sasi Inguva, Yilin Wang, Balu Adsumilli
Submitted On:
22 September 2019 - 2:40am

Document Files

mmsp_poster.pdf

[1] Sasi Inguva, Yilin Wang, Balu Adsumilli, "Youtube UGC Dataset for Video Compression Research", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4814. Accessed: Oct. 18, 2019.

An Occlusion Probability Model for Improving the Rendering Quality of Views


Occlusion, a common phenomenon on object surfaces, can seriously affect the information collected by a light field. In most prior light field rendering (LFR) algorithms, occlusions are idealized and neglected when visualizing light field datasets. However, the 3D spatial structure of some features may be lost when incorrect samples are captured at occlusion discontinuities. To solve this problem, we propose an occlusion probability (OCP) model to improve the captured information and the rendering quality of views with occlusion for LFR.

Paper Details

Submitted On:
20 September 2019 - 5:41am

Document Files

occlusion_MMSP2019.pdf

[1] "An Occlusion Probability Model for Improving the Rendering Quality of Views", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4772. Accessed: Oct. 18, 2019.

EVS AND OPUS AUDIO CODERS PERFORMANCE EVALUATION FOR ORIENTAL AND ORCHESTRAL MUSICAL INSTRUMENTS


In modern telecommunication systems, channel bandwidth and the quality of the reconstructed decoded audio signal are major telecommunication resources, and new speech and audio coders must be carefully designed and implemented to use them well. EVS and OPUS are recent audio coders that improve the quality of the reconstructed audio signal at different output bitrates and can operate on different input signal types. The performance of these coders must be evaluated in terms of the quality of the reconstructed signals.
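One simple objective measure for such an evaluation is the signal-to-noise ratio between the original and decoded waveforms. The sketch below is illustrative only; formal codec evaluation would also rely on subjective listening tests or standardised tools such as PESQ or POLQA:

```python
import numpy as np

def snr_db(original, decoded):
    """SNR in dB of a decoded signal against the original: higher means
    the coder reconstructed the waveform more faithfully."""
    original = np.asarray(original, dtype=float)
    noise = original - np.asarray(decoded, dtype=float)
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))

# A 440 Hz tone with light synthetic "coding noise" for illustration.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)
degraded = clean + 0.01 * np.random.default_rng(0).standard_normal(t.size)
```

Running the same comparison on each coder's output at several bitrates yields a rate-quality curve for the two coders.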

Paper Details

Authors:
Yasser A. Zenhom, Eman Mohammed, Micheal N. Mikhael, Hala A. Mansour
Submitted On:
19 September 2019 - 7:46pm

Document Files

The performance criteria of the EVS and OPUS audio coders and describe the performance evaluation results for oriental and orche

[1] Yasser A. Zenhom, Eman Mohammed, Micheal N. Mikhael, Hala A. Mansour, "EVS AND OPUS AUDIO CODERS PERFORMANCE EVALUATION FOR ORIENTAL AND ORCHESTRAL MUSICAL INSTRUMENTS", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4753. Accessed: Oct. 18, 2019.

Discrete Cosine Basis Oriented Motion Modeling for Fisheye and 360 Degree Video Coding


Motion modeling plays a central role in video compression. This role is even more critical for fisheye video sequences, since wide-angle fisheye imagery has special characteristics such as radial distortion. While the translational motion model employed by modern video coding standards, such as HEVC, is sufficient in most cases, using higher-order models is beneficial; for this reason, the upcoming video coding standard, VVC, employs a 4-parameter affine model. Discrete cosine basis
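For intuition, a 4-parameter affine model assigns each pixel a motion vector derived from a zoom/rotation pair plus a translation; the parameterisation below is a generic textbook form, not the exact VVC control-point syntax:

```python
import numpy as np

def affine_motion_field(a, b, c, d, width, height):
    """Per-pixel motion vectors from a 4-parameter affine model:
    (a, b) encode zoom and rotation, (c, d) encode translation.
    A purely translational model is the special case a = b = 0."""
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    mvx = a * xs - b * ys + c
    mvy = b * xs + a * ys + d
    return mvx, mvy

# Pure translation: every pixel gets the same (2, -1) vector.
mvx, mvy = affine_motion_field(0.0, 0.0, 2.0, -1.0, 4, 4)
```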

Paper Details

Authors:
Ashek Ahmmed, Manoranjan Paul
Submitted On:
18 September 2019 - 5:18am

Document Files

Motion modelling for 360 videos

[1] Ashek Ahmmed, Manoranjan Paul, "Discrete Cosine Basis Oriented Motion Modeling for Fisheye and 360 Degree Video Coding", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4668. Accessed: Oct. 18, 2019.
