PhD Defense: Towards Immersive Streaming for Videos and Light Fields
David Li
IRB 4107
Abstract
As virtual and augmented reality devices evolve with new applications, the ability to create and transmit immersive content becomes increasingly important. In particular, mobile, standalone devices have power, computing, and bandwidth limitations, necessitating careful approaches to content delivery. In this dissertation, we explore techniques to enable adaptive streaming of 360-degree panoramic videos and light fields.
First, we introduce a new log-rectilinear transformation incorporating summed-area table filtering to enable foveated streaming of 360-degree videos suitable for VR headsets with built-in eye-tracking. Compared to using log-polar sampling, our technique results in a 30% decrease in flickering and a 10% decrease in bit rate with H.264 streaming while maintaining similar or better quality.
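To make the filtering step concrete, here is a minimal Python/NumPy sketch of foveated downsampling with a summed-area table. The specific log mapping, the constant alpha, and the function names are illustrative assumptions, not the dissertation's exact log-rectilinear transform:

    import numpy as np

    def summed_area_table(img):
        # S[i, j] holds the sum of img[:i, :j]; the extra zero row and
        # column make the box lookups below branch-free.
        S = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
        S[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
        return S

    def box_mean(S, y0, y1, x0, x1):
        # Mean of img[y0:y1, x0:x1] in O(1) via four table lookups.
        area = (y1 - y0) * (x1 - x0)
        return (S[y1, x1] - S[y0, x1] - S[y1, x0] + S[y0, x0]) / area

    def foveated_downsample(img, gaze, out_h, out_w, alpha=0.05):
        # Resample a grayscale frame so the effective filter kernel stays
        # small near the gaze point and grows logarithmically toward the
        # periphery (the log factor of 4 is an arbitrary example value).
        H, W = img.shape
        S = summed_area_table(img)
        out = np.zeros((out_h, out_w))
        for v in range(out_h):
            for u in range(out_w):
                y, x = v * H // out_h, u * W // out_w
                d = np.hypot(y - gaze[0], x - gaze[1])
                r = max(1, int(4 * np.log1p(alpha * d)))
                y0, y1 = max(0, y - r), min(H, y + r)
                x0, x1 = max(0, x - r), min(W, x + r)
                out[v, u] = box_mean(S, y0, y1, x0, x1)
        return out

Because each box average costs the same four lookups regardless of kernel size, large peripheral kernels are as cheap as foveal ones, and properly area-filtering the periphery is what suppresses the temporal flickering that naive subsampling produces.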
Next, we present a progressive multi-scale light field network that encodes light fields with multiple levels of detail across various subsets of the network weights. With our approach, light field networks can begin rendering with less than 7% of the model weights and progressively depict greater levels of detail as more weights are streamed.
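As a rough PyTorch illustration of encoding levels of detail in weight subsets, the sketch below evaluates level i using only the first widths[i] neurons of each hidden layer, so a coarse rendering needs just a prefix of the weights. The dimensions, the four-level schedule, and the prefix-slicing scheme are assumptions for illustration, not the dissertation's exact architecture:

    import torch
    import torch.nn as nn

    class ProgressiveMLP(nn.Module):
        def __init__(self, in_dim=4, out_dim=3, width=256, depth=4,
                     widths=(16, 64, 128, 256)):
            super().__init__()
            self.widths = widths
            self.first = nn.Linear(in_dim, width)
            self.hidden = nn.ModuleList(
                nn.Linear(width, width) for _ in range(depth - 1))
            self.last = nn.Linear(width, out_dim)

        def forward(self, x, lod):
            # Level `lod` touches only the first w rows and columns of
            # each weight matrix, i.e. a subset of the full model.
            w = self.widths[lod]
            h = torch.relu(x @ self.first.weight[:w].T + self.first.bias[:w])
            for layer in self.hidden:
                h = torch.relu(h @ layer.weight[:w, :w].T + layer.bias[:w])
            return h @ self.last.weight[:, :w].T + self.last.bias

    net = ProgressiveMLP()
    rays = torch.rand(8, 4)    # (u, v, s, t) two-plane ray coordinates
    coarse = net(rays, lod=0)  # usable before the full model arrives
    fine = net(rays, lod=3)    # full-width rendering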
Finally, we present continuous levels of detail for light field networks, which address flickering artifacts during transitions between levels of detail and enable more granular adaptation to available resources. Furthermore, by supporting rendering at every possible network width, we reduce the model size deltas from over a hundred rows and columns per layer to a single row and column per layer, enabling smoother streaming.
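For the continuous variant, a fractional width can fade one neuron in at a time rather than jumping between discrete levels. The sketch below, reusing ProgressiveMLP from above, shows one plausible masking formulation; it is an assumption, not necessarily the dissertation's exact scheme:

    import torch

    def soft_width_mask(width, w):
        # 1.0 for fully active neurons, frac(w) for the neuron fading
        # in, 0.0 beyond; e.g. w = 2.3 over width 4 gives [1, 1, 0.3, 0].
        idx = torch.arange(width, dtype=torch.float32)
        return (w - idx).clamp(0.0, 1.0)

    def forward_continuous(net, x, w):
        # Masked neurons output exactly zero, so the rendering varies
        # continuously as w grows and no popping occurs between levels.
        mask = soft_width_mask(net.first.out_features, w)
        h = torch.relu(net.first(x)) * mask
        for layer in net.hidden:
            h = torch.relu(layer(h)) * mask
        return net.last(h)

    image_a = forward_continuous(net, rays, w=130.0)
    image_b = forward_continuous(net, rays, w=130.4)  # neuron 131 at 40%

Under this formulation, growing w by one activates exactly one additional row and column of each hidden weight matrix, matching the single-row-and-column streaming deltas described above.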
Bio
David Li is a Computer Science PhD candidate at the University of Maryland, College Park, advised by Professor Amitabh Varshney. His research interests are in delivering immersive content with 360-degree videos and neural representations. He has also worked on projects involving VR data visualization and interactive 3D graphics.
This talk is organized by Migo Gui.