Beyond Keypoint Coding: Temporal Evolution Inference with Compact Feature Representation for Talking Face Video Compression

We propose a talking face video compression framework that implicitly transforms the temporal evolution of faces into a compact feature representation. More specifically, this temporal evolution, which is complex, non-linear, and difficult to extrapolate, is modelled in an end-to-end inference framework built upon very compact features. Learning a dense motion map from the compact feature representation enables high-quality rendering of face videos. As a result, the proposed framework can accommodate ultra-low-bandwidth video communication while maintaining the quality of the reconstructed videos. Experimental results demonstrate that the proposed scheme outperforms both the state-of-the-art video coding standard, Versatile Video Coding (VVC), and the latest generative compression scheme, Face Video-to-Video Synthesis (Face_vid2vid), in terms of both objective and subjective quality assessment.

DCC2022_pre.pptx
