Semantic Segmentation in Compressed Videos

Citation Author(s):
Ang Li, Yang Wang
Submitted by:
Yiwei Lu
Last updated:
23 September 2019 - 9:26pm
Presenters Name:
Yiwei Lu



Existing approaches for semantic segmentation in
videos usually extract each frame as an RGB image, then apply
standard image-based semantic segmentation models on each
frame. This is time-consuming. In this paper, we tackle this
problem by exploring the nature of video compression techniques.
A compressed video contains three types of frames: I-frames,
P-frames, and B-frames. I-frames are represented as regular
images, P-frames are represented as motion vectors and residual
errors, and B-frames are bidirectionally predicted frames that can be
regarded as a special case of a P-frame. We propose a method
that directly operates on I-frames (as RGB images) and P-frames
(motion vectors and residual errors) in a video. Our
proposed model uses a ConvLSTM model to capture the temporal
information in the video required for producing the semantic
segmentation on P-frames. Our experimental results show that
our method performs much faster than other alternatives while
achieving similar accuracy.
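
To make the P-frame representation described above concrete, the following is a minimal sketch of how a decoder could rebuild a P-frame from a reference frame, motion vectors, and residual errors. It is not the paper's model. The function name `reconstruct_p_frame`, the per-pixel motion vectors (real codecs typically store one vector per block), the single-channel frames, and the border clamping are all assumptions made for illustration.

```python
def reconstruct_p_frame(reference, motion_vectors, residual):
    """Rebuild a P-frame from a reference frame, motion vectors, and residuals.

    reference: 2D list of pixel intensities (the previously decoded frame).
    motion_vectors: 2D list of (dy, dx) offsets, one per pixel (an assumption;
        codecs usually store one vector per block), pointing from the current
        pixel to its source location in the reference frame.
    residual: 2D list of corrections added after motion compensation.
    """
    height, width = len(reference), len(reference[0])
    frame = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = motion_vectors[y][x]
            # Clamp source coordinates to the frame borders (an assumption).
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            # Motion-compensated prediction plus the residual error.
            row.append(reference[src_y][src_x] + residual[y][x])
        frame.append(row)
    return frame


# Toy 2x3 example: every pixel copies its left neighbour in the reference.
reference = [[10, 20, 30],
             [40, 50, 60]]
motion_vectors = [[(0, -1)] * 3 for _ in range(2)]
residual = [[0, 0, 0],
            [0, 0, 5]]
p_frame = reconstruct_p_frame(reference, motion_vectors, residual)
```

The sketch shows why a P-frame is cheaper to store than an I-frame: only the offsets and the corrections are kept. A method that works directly on these two signals can skip full RGB decoding for P-frames.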

Dataset Files

Semantic Segmentation in Compressed Videos .pdf