
Sparse Directed Graph Learning for Head Movement Prediction in 360 Video Streaming

Citation Author(s):
Gene Cheung, Patrick Le Callet, Jack Z. G. Tan
Submitted by:
Xue Zhang
Last updated:
20 May 2020 - 7:49pm
Document Type:
Presentation Slides

High-definition 360 videos encoded at fine quality are typically too large to stream in their entirety over bandwidth (BW)-constrained networks. One popular remedy is to interactively extract and send only the spatial sub-region corresponding to a viewer's current field-of-view (FoV) in a head-mounted display (HMD), enabling more BW-efficient streaming. Because of the non-negligible round-trip-time (RTT) delay between server and client, accurate head movement prediction that foretells a viewer's future FoVs is essential. Existing approaches are either overly simplistic in their modelling and predict poorly when RTT is large, or over-reliant on data-driven learning, resulting in inflexible models that are not robust to RTT heterogeneity. In this paper, we cast the head movement prediction task as a sparse directed graph learning problem, where three sources of relevant information---a 360 image saliency map, collected viewers' head movement traces, and a biological head rotation model---are aggregated into a unified Markov model. Specifically, we formulate a constrained optimization problem that minimizes an $\ell_2$-norm fidelity term and a sparsity term, corresponding to trace data / saliency consistency and a sparse graph model prior, respectively. We solve the problem with a hybrid strategy that alternates between iterative reweighted least squares (IRLS) and Frank-Wolfe optimization. Extensive experiments show that our head movement prediction scheme noticeably outperforms existing proposals across a wide range of RTTs.
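The exact objective appears in the slides; as a hedged reconstruction consistent with the abstract, one could learn a column-stochastic Markov transition matrix $\mathbf{P}$ over discretized FoV states by solving

$$ \min_{\mathbf{P}\,\ge\,0,\ \mathbf{1}^{\!\top}\mathbf{P}\,=\,\mathbf{1}^{\!\top}} \ \|\mathbf{Y}-\mathbf{P}\mathbf{X}\|_2^2 \;+\; \mu\sum_{i,j}\frac{P_{ij}^2}{P_{ij}^2+\epsilon}, $$

where the columns of $\mathbf{X}$ and $\mathbf{Y}$ would be saliency-weighted FoV occupancy vectors at consecutive time instants taken from the traces, and the second term is a smooth surrogate for an $\ell_0$-like sparsity prior; the symbols $\mathbf{X}$, $\mathbf{Y}$, $\mu$, and $\epsilon$ are illustrative, not taken from the paper. Fixing the reweighting $w_{ij} = 1/(P_{ij}^2+\epsilon)$ turns the surrogate into a weighted quadratic (the IRLS step), and the remaining quadratic program over a product of probability simplices is exactly the setting where Frank-Wolfe is attractive: its linear minimization oracle over a simplex simply selects a vertex.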

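The alternating strategy is short enough to sketch in code. The Python below is a minimal illustration under the assumptions stated above; the function name `learn_sparse_markov`, the variables `X`, `Y`, `mu`, `eps`, and the loop counts are all hypothetical, so this is a sketch of the technique, not the authors' implementation.

```python
# Hedged sketch of a hybrid IRLS + Frank-Wolfe solver for a sparse
# column-stochastic transition matrix P, as a plausible instantiation of the
# abstract's description. Names and constants are assumptions.
import numpy as np

def learn_sparse_markov(X, Y, mu=0.1, eps=1e-3, outer=10, inner=50, seed=0):
    """Minimize ||Y - P X||_F^2 + mu * sum_ij P_ij^2 / (P_ij^2 + eps)
    over {P >= 0, each column of P on the probability simplex}."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    P = rng.random((n, n))
    P /= P.sum(axis=0, keepdims=True)           # feasible start: column-stochastic
    for _ in range(outer):
        # IRLS step: freeze the reweighting so the sparsity term is quadratic.
        w = 1.0 / (P ** 2 + eps)
        for k in range(inner):
            # Gradient of the (now quadratic) objective in P.
            grad = 2.0 * (P @ X - Y) @ X.T + 2.0 * mu * w * P
            # Frank-Wolfe linear oracle: per column, the minimizing simplex
            # vertex is the one-hot vector at the smallest gradient entry.
            S = np.zeros_like(P)
            S[grad.argmin(axis=0), np.arange(n)] = 1.0
            gamma = 2.0 / (k + 2.0)              # standard FW step size
            P = (1.0 - gamma) * P + gamma * S    # convex combo stays feasible
    return P
```

A toy call on synthetic occupancy snapshots, e.g. `P = learn_sparse_markov(X, Y)` with each column of `X` and `Y` normalized to sum to one, returns a matrix whose columns sum to one by construction, since every Frank-Wolfe iterate is a convex combination of feasible points.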