Obtaining 3D representations from real-world observations is a long-standing problem in computer vision and graphics, with important applications in virtual and augmented reality. Inverse rendering is a technique for inferring 3D information, such as geometry, materials, and lighting, from a set of 2D images based on image formation principles. To recover scene parameters, it uses iterative gradient descent to minimize a loss between the input images and images rendered by a differentiable renderer. Physically-based inverse rendering, in particular, emphasizes accurate modeling of light transport physics, which can be prohibitively expensive. For example, inter-reflections between objects in a scene, known as global illumination, require simulating multi-bounce light transport. Differentiating this process across millions of pixels and parameters over many light bounces can exhaust memory, especially with automatic differentiation, which records a large computation graph during the forward pass and differentiates through it in the backward pass. Consequently, many works in the literature simplify the problem by limiting light transport to one or two bounces, often resulting in inaccurate reconstructions.
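To illustrate this optimization loop, here is a minimal sketch in PyTorch. It is not the thesis's pipeline: `render`, `albedo`, and `lighting` are hypothetical placeholders, and a toy per-pixel shading function stands in for a physically-based differentiable renderer; only the structure (render, compare to observations, backpropagate, update) is the point.

```python
import torch

# Hypothetical stand-in for a differentiable renderer. In a physically-based
# pipeline this would be a differentiable path tracer; here a toy per-pixel
# shading model keeps the sketch self-contained and runnable.
def render(albedo, lighting):
    return albedo * lighting

lighting = torch.rand(16, 16)                            # known illumination (toy)
true_albedo = torch.rand(16, 16)                         # unknown scene parameter
target_images = render(true_albedo, lighting).detach()   # "observed" images

# Initial guess for the unknown parameters, refined by gradient descent.
albedo = torch.full((16, 16), 0.5, requires_grad=True)
optimizer = torch.optim.Adam([albedo], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    rendered = render(albedo, lighting)
    # Photometric loss between rendered and observed images.
    loss = torch.nn.functional.mse_loss(rendered, target_images)
    loss.backward()    # automatic differentiation through the renderer
    optimizer.step()
```

In a real physically-based setting, `render` would simulate multi-bounce light transport across many views, which is exactly where the memory cost of differentiating through the renderer becomes the bottleneck described above.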
This doctoral thesis focuses on improving the efficiency of global illumination algorithms using neural networks, with the ultimate goal of making real-world inverse rendering more accessible. In particular, we present Neural Radiosity, a method that finds the solution of the rendering equation with a single neural network by minimizing the residual of the equation. We integrate Neural Radiosity into an inverse rendering pipeline and introduce a radiometric prior as a regularization term alongside the photometric loss. Because inverse rendering requires differentiating the rendering algorithm, we further apply the idea of Neural Radiosity to solve the differential rendering equation. Finally, by coupling inverse rendering with generative AI, we present a method for synthesizing 3D assets: we use an image diffusion model to generate realistic material details on renderings of a scene, and backpropagate the new details into the scene description using inverse rendering. To achieve multi-view consistency with an image model, we propose biasing its attention mechanism without retraining the model. Together, our contributions advance the state of the art in global illumination for inverse rendering, showing that this previously prohibitive goal is more attainable with neural methods. This thesis also demonstrates the potential of combining inverse rendering with generative AI for 3D content creation.
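To make the Neural Radiosity objective above concrete, the following is a sketch in standard rendering-equation notation (illustrative notation, not quoted from the thesis): a network \(L_\theta(x,\omega_o)\) represents outgoing radiance, and training minimizes the norm of the residual obtained by substituting the network into both sides of the rendering equation,

\[
r_\theta(x,\omega_o) = L_\theta(x,\omega_o) - E(x,\omega_o)
  - \int_{\mathcal{H}^2} f(x,\omega_i,\omega_o)\, L_\theta\!\big(x'(x,\omega_i),\,-\omega_i\big)\, \lvert\cos\theta_i\rvert\, \mathrm{d}\omega_i,
\qquad
\mathcal{L}(\theta) = \lVert r_\theta \rVert_2^2,
\]

where \(E\) is emitted radiance, \(f\) is the BRDF, \(x'(x,\omega_i)\) is the surface point visible from \(x\) in direction \(\omega_i\), and the squared norm is estimated by Monte Carlo sampling of surface points, outgoing directions, and incident directions. Roughly speaking, the radiometric prior mentioned above reuses such a residual as a regularizer alongside the photometric loss during inverse rendering.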
Saeed Hadadan is a PhD candidate in Computer Science at the University of Maryland, College Park, under the supervision of Prof. Matthias Zwicker. His research focuses on Computer Graphics, Neural Networks, and Generative AI, with applications in rendering, inverse rendering, and 3D reconstruction. He earned his MSc from the University of Maryland in 2022 and his BSc from Sharif University of Technology, Iran, in 2019. His work includes Neural Radiosity, a novel approach to solving the rendering equation using a single neural network, which he later extended to global-illumination-aware real-world 3D reconstruction. He has collaborated with NVIDIA and Disney Research on research topics in Computer Graphics and Generative AI. His recent research includes Generative AI for physically-based material synthesis and generative data augmentation in NeRF-like reconstruction models.