Virtual and augmented reality (VR and AR) are bridging the gap between the physical and the virtual worlds. The ultimate goal of VR and AR technology is to present three-dimensional (3D) images at high frame rates for realistic, immersive, and interactive viewing experiences. As the demand for higher resolution in VR and AR devices increases, the computational complexity and the data requirements also increase. This puts a burden on underlying critical resources, such as memory, processing time, and energy, which are essential for storing, rendering, processing, and displaying information. To address these challenges, this research explores methods that harness the inherent structure and redundancy present in the data. By focusing on three key areas -- rendering, displays, and compression -- this dissertation aims to enable efficient AR/VR systems that improve resource utilization without compromising the user experience.
First, we focus on developing computationally efficient rendering techniques. We begin by discussing various foveated rendering approaches. With the advent of real-time eye tracking systems and the increasing resolution and field of view of modern AR and VR headsets, foveated rendering has become crucial for achieving real-time performance. We review the current state of the field and provide a taxonomy of foveation techniques that can serve as a guide for developing foveated rendering methods. Then, we investigate methods to improve image quality from sparse Monte Carlo samples in volumetric rendering.
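The sketch below is a minimal illustration of the general idea behind eccentricity-based foveation, not a method developed in this dissertation; the function names and falloff constants are assumptions chosen for clarity. It computes a per-pixel relative shading rate that is highest at the gaze point and decays with angular distance from it.

    # Minimal eccentricity-based foveation sketch (illustrative assumptions only).
    import numpy as np

    def shading_rate_map(width, height, gaze_px, px_per_degree, slope=0.05):
        """Return a per-pixel relative shading rate in (0, 1].

        gaze_px: (x, y) gaze position in pixels, e.g. from an eye tracker.
        px_per_degree: display pixels per degree of visual angle.
        slope: assumed rate at which required detail decays with eccentricity.
        """
        ys, xs = np.mgrid[0:height, 0:width]
        # Angular eccentricity of each pixel from the gaze point, in degrees.
        ecc_deg = np.hypot(xs - gaze_px[0], ys - gaze_px[1]) / px_per_degree
        # Full shading rate at the fovea, progressively reduced in the periphery.
        return 1.0 / (1.0 + slope * ecc_deg)

    rates = shading_rate_map(1920, 1080, gaze_px=(960, 540), px_per_degree=15.0)
    # A renderer could quantize `rates` into coarse/medium/fine shading tiers.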
Monte Carlo path tracing has the potential to create stunning visualizations of volumetric data. However, the computational cost of achieving noise-free images is extremely high due to the large number of samples required per pixel. We show how deep-learning-based denoising techniques can be integrated with Monte Carlo volumetric rendering to achieve high-quality images at interactive rates.
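As a rough structural sketch of such a pipeline (and not the network developed in this work), a low-sample-per-pixel volumetric render can be passed through a small convolutional denoiser that predicts a residual correction; the layer sizes and the stand-in input below are assumptions.

    # Minimal sketch: denoise a few-samples-per-pixel Monte Carlo render with a CNN.
    import torch
    import torch.nn as nn

    class TinyDenoiser(nn.Module):
        def __init__(self, in_channels=3, hidden=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, hidden, 3, padding=1), nn.ReLU(),
                nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
                nn.Conv2d(hidden, 3, 3, padding=1),
            )

        def forward(self, noisy):
            # Predict a residual so the network only has to learn the noise.
            return noisy + self.net(noisy)

    denoiser = TinyDenoiser().eval()
    noisy = torch.rand(1, 3, 256, 256)   # stand-in for a 1-4 spp volumetric render
    with torch.no_grad():
        clean = denoiser(noisy)          # denoised frame, suitable for display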
Next, we present our research towards developing energy-efficient holographic displays.
Holographic displays are considered true 3D displays, with the potential to emulate all the depth cues of human vision. Nanophotonic phased arrays (NPAs) are an emerging technology for holographic displays, offering compact form factors and very high refresh rates. However, scaling NPAs to large sizes is limited by their significant power consumption and circuit complexity. We present algorithms to generate sparse holographic patterns and show that we can produce high-resolution images with high visual quality using as few as 10% of the display elements.
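To illustrate the flavor of sparse hologram synthesis (this is a generic Gerchberg-Saxton-style sketch, not the dissertation's algorithm), one can activate only a small fraction of the array elements and iteratively optimize their phases so the far-field image matches a target; the selection scheme and iteration count below are assumptions.

    # Minimal sparse-hologram sketch: random ~10% of emitters, phase-only retrieval.
    import numpy as np

    def sparse_hologram(target, active_fraction=0.1, iterations=50, seed=0):
        rng = np.random.default_rng(seed)
        mask = rng.random(target.shape) < active_fraction        # which emitters are on
        field = np.exp(1j * 2 * np.pi * rng.random(target.shape))  # random initial phases
        for _ in range(iterations):
            far = np.fft.fft2(field * mask)                 # propagate to the image plane
            far = target * np.exp(1j * np.angle(far))       # impose the target amplitude
            field = np.fft.ifft2(far)                       # back to the array plane
            field = np.exp(1j * np.angle(field))            # unit-amplitude emitters
        return mask, np.angle(field)                        # phases for the active elements

    target = np.zeros((256, 256)); target[96:160, 96:160] = 1.0   # toy target image
    mask, phases = sparse_hologram(target)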
Finally, we explore techniques for efficient compression of multi-view images. As the quantity of 3D data being acquired and processed continues to grow, so do storage demands and data transmission challenges. We present a deep learning-based multi-view image compression framework that integrates a novel view-aware entropy model with recent advancements in single-view image compression. By achieving superior compression performance, our approach enables more efficient use of memory resources when dealing with multi-view data.
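The following is a minimal structural sketch of view-conditioned entropy modeling in general, not the model proposed in this dissertation: the distribution of the current view's latent is predicted from the latent of an already-decoded neighboring view, so redundant views cost fewer bits. The layer sizes and Gaussian model are assumptions.

    # Minimal sketch: entropy model conditioned on a reference view's latent.
    import torch
    import torch.nn as nn

    class ViewConditionedEntropyModel(nn.Module):
        def __init__(self, latent_ch=64):
            super().__init__()
            # Predict per-element mean and scale from the reference view's latent.
            self.param_net = nn.Sequential(
                nn.Conv2d(latent_ch, 128, 3, padding=1), nn.ReLU(),
                nn.Conv2d(128, 2 * latent_ch, 3, padding=1),
            )

        def forward(self, y_current, y_reference):
            mean, scale = self.param_net(y_reference).chunk(2, dim=1)
            scale = nn.functional.softplus(scale) + 1e-6
            # Estimated bits under a Gaussian model; lower when views predict
            # each other well, which is where the multi-view gain comes from.
            dist = torch.distributions.Normal(mean, scale)
            bits = -dist.log_prob(y_current) / torch.log(torch.tensor(2.0))
            return bits.sum()

    model = ViewConditionedEntropyModel()
    y_cur, y_ref = torch.randn(1, 64, 16, 16), torch.randn(1, 64, 16, 16)
    estimated_bits = model(y_cur, y_ref)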
Susmija is a Ph.D. candidate in Computer Science at the University of Maryland, College Park, advised by Prof. Amitabh Varshney. Her research interests include computer graphics, 3D displays, and computer vision, with a focus on developing immersive and interactive visual experiences. She received her master's and bachelor's degrees in Computer Science from the Indian Institute of Technology, Kharagpur.