Talks

PhD Proposal: Visual Content Synthesis at Scale

Songwei Ge

IRB-3137 Brendan Iribe Center for Computer Science and Engineering (IRB)

Wednesday, April 24, 2024, 12:30-2:00 pm

Abstract

The visual content humans can synthesize has always been in a symbiotic and continually evolving relationship with technological development. Impressionism developed alongside synthetic pigments, filmmaking began with the zoopraxiscope, and video games grew with computer-generated imagery. As technology has developed, vast amounts of visual data have accumulated, from paintings to web images and videos. The availability of such data is unprecedented in human history. Generative models are one effective way to make use of this data. Scalable models trained on a large volume of visual data can synthesize high-quality visual content. Combined with various controllability tools, they empower individuals to create their desired artistic content through natural-language instructions or intuitive user interfaces, without the skill training that was previously required. In this thesis proposal, we explore the architectures and training of scalable generative models, analyze evaluation metrics and training data, and examine applications to various domains and tasks.

Examining Committee

Chair:	Dr. David Jacobs
Department Representative:	Dr. Tom Goldstein
Members:	Dr. Jia-Bin Huang

Bio

Songwei is a fourth-year PhD student in Computer Science at the University of Maryland, advised by Prof. Jia-Bin Huang and Prof. David Jacobs. He has interned at NVIDIA and Meta and was a recipient of the NVIDIA Research Fellowship. He received his Master's degree from CMU and Bachelor's degrees from Renmin University of China. His research primarily focuses on generative models applied to images and videos.

This talk is organized by Migo Gui