
Txt2Vid-Web: Web-based, Text-to-Video, Video Conferencing Pipeline

Citation Author(s):
Arjun Barrett, Laura Gomezjurado, Shuvam Mukherjee, Arz Bshara, Sahasrajit Sarmasarkar, Pulkit Tandon and Tsachy Weissman
Submitted by:
Pulkit Tandon
Last updated:
28 February 2023 - 2:26am
Document Type:
Presentation Slides
Document Year:
2023
Event:
Paper Code:
2023001234
 

Video conferencing tools have seen a significant increase in usage in the past few years, but they still consume substantial bandwidth, from ~100 kbps to a few Mbps. In this work, we present Txt2Vid-Web: a practical, web-based, low-bandwidth video conferencing platform that builds upon the Txt2Vid work [1]. We introduce multiple improvements over the existing Txt2Vid framework: implementing it on a browser application stack, which makes it much more accessible and portable; reducing the implementation complexity of the platform via WebGL; and implementing a new WebGL shader for ConvTranspose in ONNX Runtime, thereby enabling it to run in the web browsers of modern laptops. We use WebRTC to establish peer-to-peer data channels over network connections and utilize SDP’s multimedia negotiation scheme over SRTP connections, allowing our platform to provide high-quality video calls for the majority of connections and fall back to Txt2Vid when bandwidth limitations overconstrain SDP’s chosen codecs. We verified our platform via a subjective study (n = 126) comparing five different audio-video (AV) contents compressed via standard codecs and via Txt2Vid-Web. We chose bitrates of {6 kbps, 10 kbps} for encoding the audio and {15 kbps, 35 kbps, 100 kbps} for encoding the video with the standard AV codecs. Results show that, at a similar quality of experience, our platform requires 100–500× less bandwidth than standard codecs (H.264 and VP9 for video, Opus for audio). We envision that our platform can open up many novel applications. To this end, we open-source both our new tool as a GitHub repository (https://github.com/tpulkit/txt2vid_browser) and our subjective study dataset (https://tinyurl.com/SubjectiveStudyDataset).

[1] Pulkit Tandon et al., “Txt2vid: Ultra-low bitrate compression of talking-head videos via text,” IEEE Journal on Selected Areas in Communications, 2022.
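
The fallback behavior described in the abstract (standard codecs for most connections, Txt2Vid when bandwidth is too constrained) can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from the txt2vid_browser repository: the `chooseCallMode` function, the `ModePolicy` type, and in particular the 21 kbps default threshold. That threshold is only an illustrative assumption, formed by adding the lowest audio and video bitrates used in the subjective study (6 kbps + 15 kbps). The actual platform relies on SDP negotiation rather than a fixed cutoff.

```typescript
// Illustrative sketch only: all names and the default threshold are assumptions,
// not the API or logic of the actual Txt2Vid-Web implementation.

type CallMode = "standard-av" | "txt2vid";

interface ModePolicy {
  // Minimum estimated bitrate (kbps) at which standard AV codecs are assumed usable.
  minStandardKbps: number;
}

// Assumed cutoff: lowest audio (6 kbps) plus lowest video (15 kbps) bitrate
// from the subjective study. Using it as a threshold is our assumption.
const LOWEST_STUDY_POINT_KBPS = 6 + 15;

function chooseCallMode(
  estimatedKbps: number,
  policy: ModePolicy = { minStandardKbps: LOWEST_STUDY_POINT_KBPS }
): CallMode {
  if (!Number.isFinite(estimatedKbps) || estimatedKbps < 0) {
    throw new RangeError("estimatedKbps must be a non-negative finite number");
  }
  // Prefer standard codecs whenever the link can sustain them;
  // otherwise fall back to the text-driven Txt2Vid pipeline.
  return estimatedKbps >= policy.minStandardKbps ? "standard-av" : "txt2vid";
}
```

In a browser, the bandwidth estimate passed to such a function would have to come from connection statistics. How that estimate is obtained, and how the switch between modes is signaled to the peer, is outside the scope of this sketch.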
