
Multimedia Signal Processing

Automatic identification of speakers from head gestures in a narration


In this work, we focus on quantifying the speaker identity information encoded in speakers' head gestures while they narrate a story. We hypothesize that head gestures over a long duration exhibit speaker-specific patterns. To establish this, we pose a classification problem: identifying speakers from their head gestures. We represent each head orientation as a triplet of Euler angles, and a sequence of head orientations as a head gesture.
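The representation above — a head gesture as a sequence of (yaw, pitch, roll) Euler-angle triplets — can be sketched as follows. The fixed-length summary statistics below are illustrative assumptions for feeding a speaker classifier, not the paper's actual features:

```python
import numpy as np

def gesture_features(euler_sequence):
    """Summarize a head-gesture sequence (N x 3 Euler angles:
    yaw, pitch, roll) into a fixed-length feature vector.
    Hypothetical summary statistics; the paper's actual
    features and classifier are not reproduced here."""
    seq = np.asarray(euler_sequence, dtype=float)  # shape (N, 3)
    mean = seq.mean(axis=0)
    std = seq.std(axis=0)
    # mean frame-to-frame angular change captures gesture dynamics
    vel = np.abs(np.diff(seq, axis=0)).mean(axis=0)
    return np.concatenate([mean, std, vel])        # length 9
```

A classifier trained on such vectors from long-duration recordings would test whether the summaries carry speaker-specific patterns.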

Paper Details

Authors:
Sanjeev Kadagathur Vadiraj, Achuth Rao M V, Prasanta Kumar Ghosh
Submitted On:
16 May 2020 - 10:52am

Document Files

ICASSP_2020_Presentation.pdf


[1] Sanjeev Kadagathur Vadiraj, Achuth Rao M V, Prasanta Kumar Ghosh, "Automatic identification of speakers from head gestures in a narration", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5375. Accessed: May 30, 2020.

Presentation: Trapezoidal Segment Sequencing: A Novel Approach for Fusion of Human-produced Continuous Annotations


Generating accurate ground truth representations of human subjective experiences and judgements is essential for advancing our understanding of human-centered constructs such as emotions. Often, this requires the collection and fusion of annotations from several people where each one is subject to valuation disagreements, distraction artifacts, and other error sources.

Paper Details

Authors:
Shrikanth Narayanan
Submitted On:
14 May 2020 - 7:44am

Document Files

_Presentation__2020_ICASSP_Trapezoidal_Segment_Fusion.pdf


[1] Shrikanth Narayanan, "Presentation: Trapezoidal Segment Sequencing: A Novel Approach for Fusion of Human-produced Continuous Annotations", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5287. Accessed: May 30, 2020.

ENCODER-RECURRENT DECODER NETWORK FOR SINGLE IMAGE DEHAZING

Paper Details

Authors:
An Dang, Toan Vu, Jia-Ching Wang
Submitted On:
17 April 2020 - 4:16am

Document Files

ICASSP2020.pdf


[1] An Dang, Toan Vu, Jia-Ching Wang, "ENCODER-RECURRENT DECODER NETWORK FOR SINGLE IMAGE DEHAZING", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5102. Accessed: May 30, 2020.

Convolutional Neural Network based Fast Intra Mode Prediction for H.266/FVC Video Coding

Paper Details

Authors:
Ting-Lan Lin, Kai-Wen Liang, Jing-Ya Huang, Yu-Liang Tu, and Pao-Chi Chang
Submitted On:
24 April 2020 - 2:51pm

Document Files

Convolutional Neural Network based Fast Intra Mode Prediction.pdf


[1] Ting-Lan Lin, Kai-Wen Liang, Jing-Ya Huang, Yu-Liang Tu, and Pao-Chi Chang, "Convolutional Neural Network based Fast Intra Mode Prediction for H.266/FVC Video Coding", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5079. Accessed: May 30, 2020.

Segmentation of Text-Lines and Words from JPEG Compressed Printed Text Documents Using DCT Coefficients


Segmenting a document image into text-lines and words finds applications in many areas of Document Image Analysis (DIA), such as OCR, word spotting, and document retrieval. However, carrying out segmentation directly on compressed document images is still an unexplored and challenging research area. Since JPEG is the most widely accepted compression algorithm, this paper attempts to segment a JPEG-compressed printed text document image into text-lines and words without fully decompressing the image.
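As a rough illustration of compressed-domain segmentation: JPEG stores an 8x8 DCT per block, and the DC coefficient alone gives each block's average intensity, so a projection profile over block rows can locate text-line bands. The function below is a hypothetical sketch of that idea; the paper's actual method (which works with the full set of DCT coefficients) is not reproduced here:

```python
import numpy as np

def text_line_rows(dc_coeffs, threshold=0.1):
    """Locate candidate text-line bands from the DC coefficients of
    JPEG 8x8 blocks (one value per block, shape rows x cols).
    Illustrative only; assumes dark ink on a light background."""
    dc = np.asarray(dc_coeffs, dtype=float)
    # darker blocks (lower DC) suggest ink; invert so text scores high
    ink = dc.max() - dc
    profile = ink.sum(axis=1)              # horizontal projection profile
    if profile.max() == 0:
        return []
    active = profile / profile.max() > threshold
    # group consecutive active block-rows into text-line bands
    lines, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i
        elif not a and start is not None:
            lines.append((start, i - 1))
            start = None
    if start is not None:
        lines.append((start, len(active) - 1))
    return lines                           # [(first_row, last_row), ...]
```

Each returned band spans 8-pixel block rows; word segmentation would apply the same idea to vertical profiles within a band.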

Paper Details

Authors:
Mohammed Javed, P Nagabhushan, Watanabe Osamu
Submitted On:
7 April 2020 - 5:04am

Document Files

DCC2020 Paper ID 181


[1] Mohammed Javed, P Nagabhushan, Watanabe Osamu, "Segmentation of Text-Lines and Words from JPEG Compressed Printed Text Documents Using DCT Coefficients", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5001. Accessed: May 30, 2020.

Multimodal active speaker detection and virtual cinematography for video conferencing


Active speaker detection (ASD) and virtual cinematography (VC) can significantly improve the remote user experience of a video conference by automatically panning, tilting, and zooming a video conferencing camera: users subjectively rate an expert video cinematographer's footage significantly higher than unedited video. We describe a new automated ASD and VC system that performs within 0.3 MOS of an expert cinematographer, based on subjective ratings on a 1-5 scale.

Paper Details

Authors:
Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle
Submitted On:
12 February 2020 - 12:55am

Document Files

ICASSP 2020 ASD.pdf


[1] Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle, "Multimodal active speaker detection and virtual cinematography for video conferencing", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/4980. Accessed: May 30, 2020.

A Fast Iterative Method for Removing Sparse Noise from Sparse Signals


Reconstructing a signal corrupted by impulsive noise is of high importance in several applications, including impulsive noise removal from images, audio, and video, and separating text from images. Investigating this problem, we propose a new method to reconstruct a noise-corrupted signal where both the signal and the noise are sparse, but in different domains. We apply our algorithm to impulsive noise removal (Salt-and-Pepper Noise (SPN) and Random-Valued Impulsive Noise (RVIN)) from images and compare our results with other notable algorithms in the literature.
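The underlying setup — signal sparse under one transform, noise sparse in the sample domain — is commonly attacked by alternating thresholding. The sketch below shows that generic idea, not the paper's specific iterative method; `analysis`/`synthesis` stand for any orthonormal transform pair, and the sparsity levels are assumed known:

```python
import numpy as np

def remove_sparse_noise(y, analysis, synthesis, k_signal, k_noise, iters=50):
    """Generic alternating-thresholding sketch for y = x + n, where
    x is k_signal-sparse under `analysis` and n is k_noise-sparse in
    the sample domain. NOT the paper's exact algorithm, just the
    alternating-projection idea such methods build on."""
    def keep_largest(v, k):
        out = np.zeros_like(v)
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
        return out

    x = np.zeros_like(y)
    for _ in range(iters):
        n = keep_largest(y - x, k_noise)             # sparse noise estimate
        c = keep_largest(analysis(y - n), k_signal)  # sparse signal coeffs
        x = synthesis(c)                             # back to sample domain
    return x, n
```

With large impulses and a mildly sparse signal, a few iterations typically isolate the spikes while the thresholded transform coefficients recover the clean signal.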

Paper Details

Authors:
Nematollah Zarmehi, Farokh Marvasti, Saeed Gazor
Submitted On:
7 November 2019 - 3:48pm

Document Files

Sahar Sadrizadeh.pdf


[1] Nematollah Zarmehi, Farokh Marvasti, Saeed Gazor, "A Fast Iterative Method for Removing Sparse Noise from Sparse Signals", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4926. Accessed: May 30, 2020.

Dynamic Guidance For Depth Map Restoration

Paper Details

Submitted On:
23 September 2019 - 9:03am

Document Files

Dynamic Guidance For Depth Map Restoration-RanZhu(paper id 80).pdf


[1] "Dynamic Guidance For Depth Map Restoration", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4821. Accessed: May 30, 2020.

RATE-DISTORTION DRIVEN SEPARATION OF DIFFUSE AND SPECULAR COMPONENTS IN MULTIVIEW IMAGERY


In this work we explore an overcomplete representation of multiview imagery for the purpose of compression. We present a rate-distortion (R-D) driven approach to decompose multiview datasets into two additive parts, which can be interpreted as the diffuse and specular components. We apply different transforms to each component such that the compressibility of the input data is improved. We describe a framework which performs the R-D optimized separation in a registered domain to avoid the complexity of warping
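A minimal sketch of an additive diffuse/specular split on a registered multiview stack: the diffuse part is view-consistent, the specular part is the per-view residual. Taking the per-pixel median across views is a simple heuristic stand-in for the paper's R-D optimized separation, shown only to make the additive decomposition concrete:

```python
import numpy as np

def split_diffuse_specular(registered_views):
    """Heuristic additive split of a registered multiview stack
    (V x H x W) into a view-consistent 'diffuse' part and a
    view-dependent 'specular' residual, so that each view equals
    diffuse + its specular layer. Illustrative only; the paper
    drives this separation with a rate-distortion criterion."""
    stack = np.asarray(registered_views, dtype=float)
    diffuse = np.median(stack, axis=0)   # shared across all views
    specular = stack - diffuse           # per-view residual layer
    return diffuse, specular
```

Each part can then be coded with a transform suited to it (the shared diffuse image once, the sparser specular residuals per view).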

Paper Details

Authors:
Maryam Haghighat, Reji Mathew and David Taubman
Submitted On:
21 September 2019 - 7:18am

Document Files

ICIP2019_Haghighat_Poster.pdf


[1] Maryam Haghighat, Reji Mathew and David Taubman, "RATE-DISTORTION DRIVEN SEPARATION OF DIFFUSE AND SPECULAR COMPONENTS IN MULTIVIEW IMAGERY", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4805. Accessed: May 30, 2020.

Influence of viewpoint on visual saliency models for volumetric content


In order to predict where humans look in a 3D immersive environment, saliency can be computed using either 3D saliency models or view-based approaches (2D projection). In fact, building a complete 3D model is still a challenging task that is not investigated enough in the research field, while 2D imaging approaches have been extensively studied and have shown solid performance.

Paper Details

Authors:
Mona Abid, Matthieu Perreira Da Silva, Patrick Le Callet
Submitted On:
20 September 2019 - 3:59am

Document Files

Lecture_ICIP_2019.pdf


[1] Mona Abid, Matthieu Perreira Da Silva, Patrick Le Callet, "Influence of viewpoint on visual saliency models for volumetric content", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4763. Accessed: May 30, 2020.
