PhD Proposal: Towards Immersive Streaming for Videos and Light Fields
David Li
Abstract
As virtual and augmented reality devices evolve with new applications, the ability to create and transmit immersive content becomes ever more critical. In particular, mobile, standalone devices have power, computing, and bandwidth limitations that require careful thought about how to deliver content to users. In this proposal, we examine techniques to enable adaptive streaming of two types of content: 360-degree panoramic videos and light fields.
With the rapidly increasing resolutions of 360-degree cameras, head-mounted displays, and live-streaming services, streaming high-resolution panoramic videos over limited-bandwidth networks is becoming a critical challenge. Foveated video streaming can address this challenge for virtual reality head-mounted displays equipped with eye tracking. We introduce a new log-rectilinear transformation that combines summed-area table filtering with off-the-shelf video codecs to enable foveated streaming of 360-degree videos to VR headsets with built-in eye tracking. Our technique results in a 30% decrease in flickering and a 10% decrease in bit rate with H.264 streaming while maintaining similar or better quality.
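As a rough illustration of the summed-area table filtering mentioned above, the following minimal sketch shows the general technique only, not the pipeline from the proposal. The function names, the toy 4x4 image, and the window sizes are all hypothetical. The point is that once the table is built, the mean over any rectangular window costs four lookups, so larger averaging windows in the periphery are no more expensive than small ones near the gaze point.

```python
def summed_area_table(img):
    """Build a table where sat[y][x] is the sum of img[0:y][0:x].

    The table has one extra leading row and column of zeros so that
    box queries need no special handling at the image border.
    """
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_mean(sat, y0, x0, y1, x1):
    """Mean of the pixels in rows y0..y1-1 and columns x0..x1-1."""
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total / ((y1 - y0) * (x1 - x0))


# Toy grayscale image (hypothetical data): pixel value is 4*y + x.
img = [[float(4 * y + x) for x in range(4)] for y in range(4)]
sat = summed_area_table(img)

# A small window, as might be used near the gaze point,
# and a large window, as might be used in the far periphery.
small = box_mean(sat, 1, 1, 2, 2)
large = box_mean(sat, 0, 0, 4, 4)
```

How the window size is chosen for each output pixel depends on the log-rectilinear mapping, which is not sketched here.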
Neural representations have shown great promise in compactly representing radiance and light fields. However, existing neural representations are not suited for streaming because decoding can only be done at a single level of detail and requires downloading the entire neural network model. To resolve these challenges, we present a progressive multi-scale light field network that encodes light fields with multiple levels of detail across various subsets of the network weights. With our approach, light field networks can begin rendering with less than 7% of the model weights and progressively depict greater levels of detail as more model weights are streamed.
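To make the idea of rendering from a subset of the weights more concrete, here is a toy sketch under a strong assumption: that the subsets are nested prefixes of each layer's units, so a narrower network is obtained by evaluating only the first `width` hidden units. The abstract does not specify how the subsets are arranged, so this layout, the two-layer model, its weights, and the names `relu_layer` and `render` are all hypothetical.

```python
def relu_layer(weights, biases, inputs, width):
    """Evaluate only the first `width` units of a dense ReLU layer."""
    out = []
    for j in range(width):
        acc = biases[j]
        for i, v in enumerate(inputs):
            acc += weights[j][i] * v
        out.append(max(0.0, acc))
    return out


def render(model, ray, width):
    """Map a ray to a scalar using the first `width` hidden units only."""
    hidden = relu_layer(model["w1"], model["b1"], ray, width)
    # The output layer reads only the hidden units that have arrived.
    return model["b2"] + sum(model["w2"][i] * h for i, h in enumerate(hidden))


# Hypothetical weights for a 2-input, 4-hidden-unit, 1-output network.
model = {
    "w1": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]],
    "b1": [0.0, 0.0, 0.0, 0.0],
    "w2": [1.0, 1.0, 0.5, 0.5],
    "b2": 0.0,
}
ray = [1.0, 2.0]  # toy 2D input; a real light field ray has more coordinates

coarse = render(model, ray, width=2)  # only half of the hidden units streamed
full = render(model, ray, width=4)    # all hidden units available
```

This shows evaluation only. For the narrow prefix to produce a meaningful low-detail image, the sub-networks would have to be trained for that purpose, which this sketch does not cover.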
Examining Committee
Bio
David Li is a PhD student working in computer graphics, advised by Professor Amitabh Varshney. His research interests are in delivering immersive content with 360-degree videos and neural representations. He has also worked on projects in VR data visualization and interactive 3D graphics.
This talk is organized by Tom Hurst