Generative models are often treated as content creation tools. This proposal reframes them as data processing machines: efficient operators that map rough or incomplete inputs to the target domain.
In Magic Fixup, we show that a quickly edited user image can be transformed into a photorealistic output, demonstrating how generative models can refine coarse, approximate inputs. We then show how multiview generative models can convert photos of objects captured under widely varying illumination, images that standard reconstruction methods cannot use, into consistently lit photos suitable for conventional 3D reconstruction pipelines.
A key challenge is that controllable diffusion models typically require task-specific training. This proposal explores coupling diffusion models to unlock new capabilities without retraining. By combining models that operate in different domains (for example, 2D relighting and 3D generation), we can construct new functionality, such as 3D relighting, from existing models. This paradigm of composable generative operators promises to combinatorially expand the range of applications, including inverse problems and scientific domains, using the models we already have.
Hadi is a fourth-year PhD student advised by Jia-Bin Huang. He has interned at Adobe Research and Google DeepMind, working on photo editing and 3D reconstruction. He received his Master's and Bachelor's degrees from Cornell University.