Large-scale generative visual models, such as DALL·E2 and Stable Diffusion, have made content creation as little effort as writing a short text description. Meanwhile, these models also spark concerns among artists, designers, and photographers about job security and proper credit for their contributions to the training images. This leads to many questions: Will generative models make creators’ jobs obsolete? Should creators stop publicly sharing their work? Should we ban generative models altogether?
In this talk, I argue that human creators and generative models can coexist. To achieve it, we need to involve creators in the loop of both model inference and model creation while crediting their efforts for their involvement. I will first explore our recent efforts in model rewriting, which allows creators to freely control the model’s behavior by adding, altering, or removing concepts and rules. I will demonstrate several applications, including creating new visual effects, customizing models with multiple personal concepts, and removing copyrighted content. I will then discuss our data attribution algorithm for assessing the influence of each training image for a generated sample. Collectively, we aim to allow creators to leverage the models while retaining control over the creation process and data ownership.
Jun-Yan Zhu is an Assistant Professor at CMU’s School of Computer Science. Prior to joining CMU, he was a Research Scientist at Adobe Research and a postdoc at MIT CSAIL. He obtained his Ph.D. from UC Berkeley and B.E. from Tsinghua University. He studies computer vision, computer graphics, and computational photography. His current research focuses on generative models for visual content creation. He has received the Packard Fellowship, the NSF CAREER Award, the ACM SIGGRAPH Outstanding Doctoral Dissertation Award, and the UC Berkeley EECS David J. Sakrison Memorial Prize for outstanding doctoral research, among other awards.