Bi-Prediction Enhancement with Deep Frame Prediction Network for Versatile Video Coding
- Submitted by:
- Tao Hao
- Last updated:
- 27 February 2021 - 12:38pm
- Document Type:
- Poster
- Document Year:
- 2021
- Presenters:
- Tao Hao
Bi-prediction is a fundamental module of inter prediction in the block-based hybrid video coding framework. The bi-prediction process relies on block-based motion estimation (ME) and motion compensation (MC) with simple motion models. Unfortunately, this MEMC-based approach cannot guarantee prediction performance for videos with irregular motion. In this paper, a novel inter prediction scheme based on a deep frame prediction network (DFP-net) is proposed to enhance bi-prediction accuracy, especially in complicated scenes. Specifically, the proposed DFP-net is composed of multi-scale motion alignment, fusion with temporal and spatial correlation, and frame synthesis modules. The DFP-net can precisely extract and fuse motion features at various scales and fully exploit temporal and spatial correlation to generate the prediction frame in a data-driven manner. Moreover, the DFP-net is integrated into VTM-6.2 to provide an additional prediction frame for bi-prediction. Since the prediction generated by DFP-net is closer to the to-be-coded frame in terms of temporal distance and texture, it can be appended to the reference list to improve the content diversity of the references. In this manner, the proposed bi-prediction scheme achieves an average BD-rate saving of 1.8% over VTM-6.2.
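The idea of appending an extra prediction frame to the reference list can be illustrated with a toy sketch. This is not the DFP-net or the VTM-6.2 integration: it assumes zero motion, flattened 4-sample blocks, equal-weight averaging, and a sum-of-absolute-differences (SAD) cost. The helper names (`bi_predict`, `sad`, `best_bi_prediction`) and all sample values are hypothetical, and the `synthesized` block is a hand-picked stand-in for a network-generated frame.

```python
from itertools import combinations_with_replacement


def bi_predict(block_a, block_b):
    """Equal-weight bi-prediction: rounded average of two prediction blocks."""
    return [(a + b + 1) >> 1 for a, b in zip(block_a, block_b)]


def sad(block_a, block_b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def best_bi_prediction(target, references):
    """Try every pair of references; return (lowest SAD, index pair)."""
    best = None
    for i, j in combinations_with_replacement(range(len(references)), 2):
        cost = sad(target, bi_predict(references[i], references[j]))
        if best is None or cost < best[0]:
            best = (cost, (i, j))
    return best


# Toy data (hypothetical values, zero motion assumed).
past = [10, 20, 30, 40]         # co-located block in the past reference
future = [30, 40, 50, 60]       # co-located block in the future reference
target = [18, 33, 37, 52]       # block to be coded
synthesized = [18, 32, 38, 52]  # stand-in for a synthesized prediction frame

baseline = best_bi_prediction(target, [past, future])
extended = best_bi_prediction(target, [past, future, synthesized])

print("baseline (cost, pair):", baseline)
print("extended (cost, pair):", extended)
```

In this toy setting the extended reference list can only match or lower the best achievable prediction error, since every baseline pair is still available. A real encoder would also weigh the signalling cost of the chosen reference indices, which this sketch ignores.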